• Re: sort history, PDP-11 history, was Variable-length instructions

    From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon Feb 16 22:16:11 2026
    From Newsgroup: comp.arch


    John Levine <johnl@taugh.com> posted:

    According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
    John Levine <johnl@taugh.com> writes:
    These days z/Series has the optional Enhanced-Sort Facility, added in September 2020, which adds a complex SORT LISTS instruction that does in-memory sorts or merges, presumably in microcode so it can run at full memory speed.

    If the microarchitecture is worth anything, microcode does not run any faster than architectural code, when doing the same operations.

    One possible reason for the additional instruction is that there is a hardware feature that provides a speedup; in that case the hardware feature, not the microcode, is the reason for the speedup. Is there
    such a hardware feature for SORT LISTS, and if so, what is it?

    It says here it's a combination of hardware and microcode, but the references are all broken or paywalled. It also says that the instruction can run for quite
    a while, so it might be that the microcode can run the memory at full speed better
    than ordinary instructions can.

    Sounds like a co-processor or an attached-processor using µcode for communication {setup and tear down}. Basically, the core-CPU ships a request
    to Sorter(tm); the sorter does its job, and reports back at lower latency than
    an interrupt would. Core-caches will contain post-sorted data.

    https://blog.share.org/Technology-Article/peeking-under-the-hood-of-sort-acceleration-on-z15

    This isn't their first sort speedup. A while ago they added instructions that do
    the inner loop of heapsort. Not sure whether there's hardware for that.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Tue Feb 17 01:19:58 2026
    From Newsgroup: comp.arch

    According to John Levine <johnl@taugh.com>:
    One possible reason for the additional instruction is that there is a hardware feature that provides a speedup; in that case the hardware feature, not the microcode, is the reason for the speedup. Is there
    such a hardware feature for SORT LISTS, and if so, what is it?

    I found a 2020 IBM JR&D article that describes it in some detail.
    It's some fairly simple hardware that keeps pointers to the keys
    in fast SRAM so it can do what IBM calls a looser tree merge sort
    and everyone else seems to call a loser tree merge sort.
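
    For anyone who hasn't met the loser tree, here is a minimal software sketch of a k-way merge driven by one. It is not IBM's implementation and says nothing about how SORT LISTS is actually built; the function name and the layout (internal nodes 1..k-1, leaves k..2k-1) are my own assumptions for illustration. The tree holds only run indices, loosely analogous to keeping pointers to the keys in a small fast memory, and each output element costs one leaf-to-root replay with one comparison per level.

```python
def loser_tree_merge(runs):
    """Merge already-sorted lists with a loser tree (illustrative sketch only)."""
    k = len(runs)
    if k == 0:
        return []
    pos = [0] * k  # read cursor into each run

    def beats(a, b):
        # True if the head of run a should be output before the head of run b.
        # Exhausted runs lose to live ones; ties go to the lower run index,
        # which keeps the merge stable.
        a_done = pos[a] >= len(runs[a])
        b_done = pos[b] >= len(runs[b])
        if a_done or b_done:
            return (a_done, a) < (b_done, b)
        ka, kb = runs[a][pos[a]], runs[b][pos[b]]
        return ka < kb or (ka == kb and a < b)

    # Build: play the initial tournament bottom-up. Leaf for run i is node k+i.
    # winner[] is scratch space; only loser[] and the champion are kept.
    winner = [0] * (2 * k)
    loser = [0] * k
    for i in range(k):
        winner[k + i] = i
    for node in range(k - 1, 0, -1):
        left, right = winner[2 * node], winner[2 * node + 1]
        if beats(left, right):
            winner[node], loser[node] = left, right
        else:
            winner[node], loser[node] = right, left
    champion = winner[1]

    out = []
    for _ in range(sum(len(r) for r in runs)):
        out.append(runs[champion][pos[champion]])
        pos[champion] += 1
        # Replay: the run that just advanced climbs from its leaf to the root,
        # meeting only the stored loser at each node.
        cur = champion
        node = (k + cur) // 2
        while node >= 1:
            if beats(loser[node], cur):
                loser[node], cur = cur, loser[node]
            node //= 2
        champion = cur
    return out


if __name__ == "__main__":
    print(loser_tree_merge([[1, 4, 9], [2, 3, 10], [], [5]]))
    # prints [1, 2, 3, 4, 5, 9, 10]
```

    The point of storing losers rather than winners is that the replay never has to look at a sibling node: the only opponent that matters at each level is already sitting in the node being visited.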
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly