• Intel's Software Defined Super Cores

    From John Savard@quadibloc@invalid.invalid to comp.arch on Mon Sep 15 23:54:12 2025
    From Newsgroup: comp.arch

    When I saw a post about a new way to do OoO, I had thought it might be
    talking about this:

    https://www.techradar.com/pro/is-it-a-bird-is-it-a-plane-no-its-super-core-intels-latest-patent-revives-ancient-anti-hyperthreading-cpu-technique-in-attempt-to-boost-processor-performance-but-will-it-be-enough

    Basically, Intel proposes to boost single-thread performance by splitting programs into chunks that can be performed in parallel on different cores, where the cores are intimately connected in order to make this work.

    This is a sound idea, but one may not find enough opportunities to use it.

    Although it's called "inverse hyperthreading", this technique could be combined with SMT - put the chunks into different threads on the same
    core, rather than on different cores, and then one wouldn't need to add
    extra connections between cores to make it work.

    John Savard
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From John Savard@quadibloc@invalid.invalid to comp.arch on Tue Sep 16 00:03:51 2025
    From Newsgroup: comp.arch

    On Mon, 15 Sep 2025 23:54:12 +0000, John Savard wrote:

    Although it's called "inverse hyperthreading", this technique could be combined with SMT - put the chunks into different threads on the same
    core, rather than on different cores, and then one wouldn't need to add
    extra connections between cores to make it work.

    On further reflection, this may be equivalent to re-inventing out-of-order execution.

    John Savard
  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Mon Sep 15 17:19:36 2025
    From Newsgroup: comp.arch

    On 9/15/2025 4:54 PM, John Savard wrote:
    When I saw a post about a new way to do OoO, I had thought it might be talking about this:

    https://www.techradar.com/pro/is-it-a-bird-is-it-a-plane-no-its-super-core-intels-latest-patent-revives-ancient-anti-hyperthreading-cpu-technique-in-attempt-to-boost-processor-performance-but-will-it-be-enough

    Basically, Intel proposes to boost single-thread performance by splitting programs into chunks that can be performed in parallel on different cores, where the cores are intimately connected in order to make this work.

    This is a sound idea, but one may not find enough opportunities to use it.

    Although it's called "inverse hyperthreading", this technique could be combined with SMT - put the chunks into different threads on the same
    core, rather than on different cores, and then one wouldn't need to add
    extra connections between cores to make it work.

    Two weeks ago, I saw this in Tom's Hardware.

    https://www.tomshardware.com/pc-components/cpus/intel-patents-software-defined-supercore-mimicking-ultra-wide-execution-using-multiple-cores

    But at this point, it is just a patent. While it *might* get included
    in a future product, it seems a long way away, if ever.
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Mon Sep 15 17:56:28 2025
    From Newsgroup: comp.arch

    On 9/15/2025 4:54 PM, John Savard wrote:
    When I saw a post about a new way to do OoO, I had thought it might be talking about this:

    https://www.techradar.com/pro/is-it-a-bird-is-it-a-plane-no-its-super-core-intels-latest-patent-revives-ancient-anti-hyperthreading-cpu-technique-in-attempt-to-boost-processor-performance-but-will-it-be-enough

    Basically, Intel proposes to boost single-thread performance by splitting programs into chunks that can be performed in parallel on different cores, where the cores are intimately connected in order to make this work.

    We would have to somehow tell the system that the program only uses a
    single thread, right? I'm not exactly sure how the sync would work
    for multi-threaded and/or multi-process programs.

    A single threaded program runs, then it calls into a function that
    creates a thread. Humm...


    This is a sound idea, but one may not find enough opportunities to use it.

    Although it's called "inverse hyperthreading", this technique could be combined with SMT - put the chunks into different threads on the same
    core, rather than on different cores, and then one wouldn't need to add
    extra connections between cores to make it work.

    Can one get something kind of akin to it by a clever use of affinity
    masks? But, those are not 100% guaranteed?
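
    For reference, a minimal sketch of what an affinity mask does, assuming
    Linux and CPython (os.sched_getaffinity and os.sched_setaffinity exist only
    on some platforms). The helper name pin_to_one_cpu is hypothetical. Note
    that a mask only restricts where the OS may schedule an existing thread; it
    does not split one instruction stream across cores, which is what the
    patent describes.

```python
import os


def pin_to_one_cpu():
    """Pin the calling process to one of its allowed CPUs, then restore.

    Returns (before, after, target), or None when the platform or sandbox
    does not support changing the affinity mask.
    """
    if not hasattr(os, "sched_getaffinity"):
        return None  # assumption: non-Linux platform without these calls
    try:
        before = os.sched_getaffinity(0)   # 0 means the calling process
        target = min(before)               # pick any CPU we are allowed on
        os.sched_setaffinity(0, {target})  # restrict to that single CPU
        after = os.sched_getaffinity(0)
        os.sched_setaffinity(0, before)    # restore the original mask
    except OSError:
        return None  # the environment refused the request
    return before, after, target


if __name__ == "__main__":
    print(pin_to_one_cpu())
```

    Whether the mask is a guarantee or only a hint depends on the operating
    system, so the "not 100% guaranteed" caveat is worth checking per platform.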
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Tue Sep 16 10:13:35 2025
    From Newsgroup: comp.arch

    Basically, Intel proposes to boost single-thread performance by splitting programs into chunks that can be performed in parallel on different cores, where the cores are intimately connected in order to make this work.

    Sounds like [multiscalar processors](doi:multiscalar processor)


    Stefan
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Tue Sep 16 10:15:04 2025
    From Newsgroup: comp.arch

    Sounds like [multiscalar processors](doi:multiscalar processor)
    ^^^^^^^^^^^^^^^^^^^^^
    10.1145/223982.224451

    [ I guess it can be useful to actually look at what one pastes before
    pressing "send", eh? ]


    Stefan
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Tue Sep 16 15:10:09 2025
    From Newsgroup: comp.arch

    Stefan Monnier <monnier@iro.umontreal.ca> schrieb:

    [ I guess it can be useful to actually look at what one pastes before
    pressing "send", eh? ]

    This is sooooo 2010s. Next, you'll be claiming it makes sense to
    think before writing, and where would we be then? Not in the age
    of modern social media, that's for sure.
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Sep 16 15:50:38 2025
    From Newsgroup: comp.arch


    John Savard <quadibloc@invalid.invalid> posted:

    When I saw a post about a new way to do OoO, I had thought it might be talking about this:

    https://www.techradar.com/pro/is-it-a-bird-is-it-a-plane-no-its-super-core-intels-latest-patent-revives-ancient-anti-hyperthreading-cpu-technique-in-attempt-to-boost-processor-performance-but-will-it-be-enough

    Basically, Intel proposes to boost single-thread performance by splitting programs into chunks that can be performed in parallel on different cores, where the cores are intimately connected in order to make this work.

    This is a sound idea, but one may not find enough opportunities to use it.

    Although it's called "inverse hyperthreading", this technique could be combined with SMT - put the chunks into different threads on the same
    core, rather than on different cores, and then one wouldn't need to add extra connections between cores to make it work.

    Andy Glew was working on stuff like this 10-15 years ago.

    John Savard
  • From George Neuner@gneuner2@comcast.net to comp.arch on Tue Sep 16 13:01:30 2025
    From Newsgroup: comp.arch

    On Tue, 16 Sep 2025 00:03:51 -0000 (UTC), John Savard <quadibloc@invalid.invalid> wrote:

    On Mon, 15 Sep 2025 23:54:12 +0000, John Savard wrote:

    Although it's called "inverse hyperthreading", this technique could be
    combined with SMT - put the chunks into different threads on the same
    core, rather than on different cores, and then one wouldn't need to add
    extra connections between cores to make it work.

    On further reflection, this may be equivalent to re-inventing out-of-order execution.

    John Savard

    Sounds more like dynamic micro-threading.

    Over the years I've seen a handful of papers about compile-time
    micro-threading: that is, the compiler itself identifies separable
    dependency chains in serial code and rewrites them into deliberately
    threaded code to be executed simultaneously.
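
    To make the idea concrete, here is a hand-written sketch of the kind of
    rewrite such a compiler would perform. It is an illustration only, not taken
    from any of those papers or from the Intel patent; the function names and the
    two example chains (a running sum and a rolling hash) are made up. Python is
    used for brevity, and CPython's global interpreter lock means this shows the
    structure of the transformation rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

MOD = 1_000_003  # arbitrary modulus for the illustrative hash chain


def serial(xs):
    """Original serial loop: two independent dependency chains interleaved."""
    total = 0  # chain A: each step depends only on the previous total
    h = 1      # chain B: each step depends only on the previous h
    for x in xs:
        total = total + x
        h = (h * 31 + x) % MOD
    return total, h


def chain_sum(xs):
    """Chain A extracted into its own unit of work."""
    total = 0
    for x in xs:
        total += x
    return total


def chain_hash(xs):
    """Chain B extracted into its own unit of work."""
    h = 1
    for x in xs:
        h = (h * 31 + x) % MOD
    return h


def micro_threaded(xs):
    """Rewritten form: the separated chains run as two threads."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        fa = pool.submit(chain_sum, xs)
        fb = pool.submit(chain_hash, xs)
        return fa.result(), fb.result()


if __name__ == "__main__":
    data = list(range(1000))
    print(serial(data) == micro_threaded(data))
```

    The split is only legal because neither chain reads a value the other
    writes; proving that for arbitrary code is the hard part.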

    It is not easy to do under the best of circumstances and I've never
    seen anything about doing it dynamically at run time.

    To make a thread worth rehosting to another core, it would need to be
    (at least) many 10s of instructions in length. To figure this out
    dynamically at run time, it seems like you'd need the decode window to
    be 1000s of instructions and a LOT of "figure-it-out" circuitry.


    YMMV, but to me it doesn't seem worth the effort.