• Slowing down non-critical paths

    From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Thu Sep 11 12:12:26 2025
    From Newsgroup: comp.arch

    The notion that some paths are critical (and those determine the
    frequency that can be used) implies that there are paths which are not
    critical, because they're much shorter, e.g. the low-bits of an addition
    don't need to wait for the carry-propagation, so they're typically ready
    long before the end of the cycle.
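
    A rough sketch of the slack described above, assuming a plain
    ripple-carry model with one hypothetical unit of delay per carry stage
    and one for the final sum. Real adders use faster carry schemes, so the
    numbers are only illustrative, and the function names are made up for
    this example:

```python
def sum_arrival_times(width, carry_delay=1.0, sum_delay=1.0):
    """Arrival time of each sum bit, least significant bit first.

    Ripple-carry model (an assumption): the carry into bit i has passed
    through i stages, then one more delay produces the sum bit.
    """
    return [i * carry_delay + sum_delay for i in range(width)]


def slack_per_bit(width, cycle_time=None, carry_delay=1.0, sum_delay=1.0):
    """Time each sum bit sits idle before the end of the cycle."""
    arrivals = sum_arrival_times(width, carry_delay, sum_delay)
    if cycle_time is None:
        # Assume the cycle is set by the slowest (most significant) bit.
        cycle_time = max(arrivals)
    return [cycle_time - t for t in arrivals]


if __name__ == "__main__":
    for bit, slack in enumerate(slack_per_bit(8)):
        print(f"bit {bit}: slack {slack:.1f} delay units")
```

    In this model the top bit has no slack and the slack grows towards the
    low bits, which are the paths one might consider slowing down.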

    Do CPUs try and save power by making those paths slower (using lower
    voltage, or slower and less power-hungry transistors)?


    Stefan
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Sep 11 19:00:39 2025
    From Newsgroup: comp.arch


    Stefan Monnier <monnier@iro.umontreal.ca> posted:

    The notion that some paths are critical (and those determine the
    frequency that can be used) implies that there are paths which are not
    critical, because they're much shorter, e.g. the low-bits of an addition
    don't need to wait for the carry-propagation, so they're typically ready
    long before the end of the cycle.

    Do CPUs try and save power by making those paths slower (using lower
    voltage, or slower and less power-hungry transistors)?

    I admit to using minimum-sized transistors in the logic units (slowing
    them down). But those things rarely consumed enough power to be worth
    "much" effort.

    On the other hand, there is an entire "art" to transistor sizing which
    makes an 11-gate integer adder take only 8 FO4 gate delays, thus
    fitting in ½ cycle and allowing for input muxing (forwarding) and output
    muxing (result drive) in a combined single cycle::

    |<----------1 cycle---------->|
    | inMux | integer ADD |outMux |

    {{ this can also be timed as::

    |<----------1 cycle---------->|
    | integer ADD |outMux | inMux |
    or
    |<----------1 cycle---------->|
    |outMux | inMux | integer ADD |

    I have participated in designs using all three of these timings}}
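
    The arithmetic behind the three timings can be sketched as follows. The
    8 FO4 adder delay is the figure given above; treating the half cycle as
    exactly 8 FO4 (so 16 FO4 per cycle), ignoring latch overhead and clock
    skew, and splitting the remaining 8 FO4 evenly between the two muxes are
    all assumptions made for illustration:

```python
ADDER_FO4 = 8                # from the post: adder sized to 8 FO4 delays
CYCLE_FO4 = 2 * ADDER_FO4    # assumption: half a cycle is exactly 8 FO4
IN_MUX_FO4 = 4               # hypothetical split of the other half cycle
OUT_MUX_FO4 = 4              # hypothetical split of the other half cycle

STAGES = {
    "inMux": IN_MUX_FO4,
    "integer ADD": ADDER_FO4,
    "outMux": OUT_MUX_FO4,
}

# The three rotations of the same work within one clock cycle.
ORDERINGS = [
    ("inMux", "integer ADD", "outMux"),
    ("integer ADD", "outMux", "inMux"),
    ("outMux", "inMux", "integer ADD"),
]


def stage_boundaries(order):
    """Return (stage, start, end) in FO4 delays after the clock edge."""
    boundaries = []
    t = 0
    for name in order:
        delay = STAGES[name]
        boundaries.append((name, t, t + delay))
        t += delay
    return boundaries


def fits_in_cycle(order):
    """True if the stages, taken in this order, finish within one cycle."""
    return stage_boundaries(order)[-1][2] <= CYCLE_FO4


if __name__ == "__main__":
    for order in ORDERINGS:
        print(stage_boundaries(order), "fits:", fits_in_cycle(order))
```

    Under these assumptions the total is the same for every rotation; the
    timings differ only in which stage sits next to the clock edge.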

    So, on a small scale I agree that you can slow down a few paths,
    many of them in the control paths of the machine rather than in
    the data path.

    On the other hand: slowing down interconnect wires {that are "data
    path height" long or longer} is seldom profitable, especially as
    one aims at higher frequencies.

    A different thing happens in <say> the exponent section of FMAC.
    You don't make it slower per se, instead you allow things to progress
    without spending the effort to make them more concurrent and thus faster.
    The multiplier tree and Baugh-Wooley adder take most of the time, so the
    exponent section only has to be "that fast".
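
    One way to picture "only has to be that fast" is as picking the
    cheapest implementation that still meets the budget set by the slower
    neighbouring path. The sketch below does that; the option names, delays
    and relative power figures are invented for illustration and do not
    describe any real design:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    delay_fo4: float       # hypothetical delay in FO4 units
    relative_power: float  # hypothetical cost, 1.0 = cheapest style


def cheapest_that_fits(options, budget_fo4):
    """Pick the lowest-power option whose delay fits the budget, else None."""
    fitting = [o for o in options if o.delay_fo4 <= budget_fo4]
    if not fitting:
        return None
    return min(fitting, key=lambda o: o.relative_power)


# Hypothetical numbers: the mantissa path sets the budget, and the
# exponent logic may be built in a serial or a more concurrent style.
MANTISSA_PATH_FO4 = 30.0
EXPONENT_OPTIONS = [
    Option("serial", delay_fo4=28.0, relative_power=1.0),
    Option("partly concurrent", delay_fo4=18.0, relative_power=1.6),
    Option("fully concurrent", delay_fo4=12.0, relative_power=2.5),
]

if __name__ == "__main__":
    choice = cheapest_that_fits(EXPONENT_OPTIONS, MANTISSA_PATH_FO4)
    print("chosen exponent implementation:", choice.name)
```

    With these made-up figures the serial version already fits, so nothing
    is gained by spending effort on the faster ones.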

