• Re: condition bits, Concertina IV Has Arrived

    From David Brown@david.brown@hesbynett.no to comp.arch on Mon May 25 17:18:18 2026
    From Newsgroup: comp.arch

    On 25/05/2026 16:28, Anton Ertl wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 24/05/2026 23:39, quadi wrote:
    On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:
    quadi <quadibloc@ca.invalid> posted:

    It makes sense to trap on a floating-point overflow, but trapping on an
    integer overflow is usually a terrible idea.

    Most programming environments I have had contact with don't trap on floating-point overflow.

    So, detecting something went wrong and you should inform the programmer
    is a bad idea ???

    The question is if an integer overflow means that something went
    wrong.

    At the source code level, that is often the case - but not always. I
    think it is quite clear that if you do something the language does not
    allow, the code is wrong, but it might give the correct results for some
    tools nonetheless. And overflow will often mean something went wrong
    even when the language (or compiler options) specifically allow it. At
    the object code level, things may be different again. (For an obvious example, if you are using a double-width integer type then the source
    code may have no overflow but the implementation might use two "add-with-carry" instructions where overflow is a natural part of the implementation.)
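A minimal sketch of the double-width case, assuming a hypothetical `u128` struct built from two 64-bit halves (the type and function names are illustrative, not from any library). The low-half addition wraps on purpose, and the wrap is what produces the carry:

```c
#include <stdint.h>

/* Hypothetical 128-bit unsigned integer made of two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128;

/* Add two 128-bit values. The low-half addition is allowed to wrap
   modulo 2^64; the wrap is detected and fed into the high half,
   which is what an add / add-with-carry pair does in hardware. */
static u128 u128_add(u128 a, u128 b)
{
    u128 r;
    r.lo = a.lo + b.lo;               /* may wrap, by design */
    uint64_t carry = (r.lo < a.lo);   /* 1 if the low half wrapped */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

At the source level the 128-bit sum of {lo = UINT64_MAX, hi = 0} and {lo = 1, hi = 0} does not overflow, yet the low-half add wraps to 0. A trap on that instruction would be a false positive.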

    Despite their eagerness to "optimize" based on the assumption
    that signed integer overflow does not happen, the GCC developers have
    avoided making -ftrapv the default, even on platforms like MIPS and
    Alpha where the implementation of -ftrapv just means to use different instructions (e.g., add instead of addu on MIPS, and addv instead of
    add on Alpha).

    An awkward thing about using trap on overflow is determining how
    precisely it is defined. Supposing you have the expression "a + b - a".
    Perhaps "a + b" overflows. I would hope that when using debug-related compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
    would check for overflow on "a + b", and report it at runtime.
    (Unfortunately, gcc does not do that unless the partial expression is
    assigned to a variable.) But in "normal" usage, I'd expect the
    expression to be simplified, resulting in just "b" and no overflow.

    If "trap on overflow" has precise semantics in the code, then this
    disables a range of useful optimisations and re-arrangements. If it is
    just "use trapping arithmetic instructions", then it will miss many
    possible cases of actual overflow in the code, which we might want to
    catch. And "trap on overflow" might either trigger when there is no
    overflow in the original code, or hinder optimisations. (Consider the expression "x / 2 + y / 2" - the compiler could implement that as a
    combined "(x + y) / 2", but that might introduce overflow.)

    It is not easy to see how a tool can avoid false positives and false
    negatives and also conveniently optimise and re-arrange code.


    The hardware, of course, cannot always enable trapping on overflow if it
    is going to efficiently support a range of programming languages. But
    as an optional feature it can be helpful for catching a few bugs in
    code, so it can be a good idea (both for signed and unsigned overflow).

    This supposedly helpful feature has been neglected by C compiler
    developers, and you see in the progression from MIPS (1986) to Alpha
    (1992) and then RISC-V (2011) that the hardware architects have
    accepted that:

    MIPS: add traps on signed overflow, you need to write addu if you
    don't want that.

    Alpha: add ignores signed overflow, you need to write addv if you want
    the trapping.

    RISC-V: add ignores signed overflow, there is no add that traps on
    signed overflow (and detecting signed overflow is pretty
    involved if both operands are unknown to the compiler).
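To give a feel for what "pretty involved" means when there is no trapping add and no overflow flag, here is one generic, portable way to detect signed-add overflow in C. This is a sketch of a well-known sign-bit trick, not the sequence any particular compiler or the RISC-V manual prescribes; the function name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a + b would overflow int64_t.
   The addition is done on unsigned values, where wraparound is
   well defined. Overflow happened exactly when both operands have
   the same sign and the sum has the opposite sign. */
static bool add_overflows_i64(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;
    uint64_t sum = ua + ub;
    /* The top bit of (ua ^ sum) & (ub ^ sum) is set only if the sign
       of the sum differs from the sign of both operands. */
    return (((ua ^ sum) & (ub ^ sum)) >> 63) != 0;
}
```

That is an add, two XORs, an AND, a shift and then a branch, compared with a single trapping add on MIPS or Alpha.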

    - anton

    Compilers have not always been good at taking advantage of all the
    features provided by hardware - nor have languages been good at exposing
    the possibilities in the language so that programmers can take advantage
    of them.


    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 16:45:07 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Wed, 20 May 2026 01:35:01 +0000, MitchAlsup wrote:

    You will find you have no <marketable> choice; you need to support::

    Integer{S8, S16, S32, S64, U8, U16, U32, U64}
    Float {FP8, FP16, FP32, FP64 and some way to get FP128}

    After realizing that I did need a second instruction for unsigned
    _division_ I then learned, to my shock, that division was not one, but
    two, instructions, at least in my architecture, for integers.

    And there didn't seem to be enough opcode space left for Divide Extensibly Unsigned.

    My 66000 has an instruction bit that denotes the signedness of integer calculations {Signed, unSigned}. This bit is available as another OpCode
    bit for non-integer calculation instructions.

    I was able to re-adjust the 32-bit operate instructions so that the two places where only 96 opcodes were provided for the basic operate instructions could now provide 128 opcodes.

    The 16-bit and 24-bit short instructions could not be so modified. But
    there were a few unused opcodes; so Divide Extensibly Unsigned could still fit in, just out of place.

    But that meant that this one operation would be missing from the minimum- length immediate instructions, and would still be treated as out of the basic instruction set, getting immediate instructions that were 16 bits longer, for them.

    The Pigeonhole Principle has finally bit me!

    John Savard
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 16:49:59 2026
    From Newsgroup: comp.arch


    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    David Brown <david.brown@hesbynett.no> writes:
    On 24/05/2026 23:39, quadi wrote:
    On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:
    -----------------
    This supposedly helpful feature has been neglected by C compiler
    developers, and you see in the progression from MIPS (1986) to Alpha
    (1992) and then RISC-V (2011) that the hardware architects have
    accepted that:

    MIPS: add traps on signed overflow, you need to write addu if you
    don't want that.

    Alpha: add ignores signed overflow, you need to write addv if you want
    the trapping.

    RISC-V: add ignores signed overflow, there is no add that traps on
    signed overflow (and detecting signed overflow is pretty
    involved if both operands are unknown to the compiler).

    The worst of all possible semantic encodings


    - anton
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon May 25 16:43:07 2026
    From Newsgroup: comp.arch

    David Brown <david.brown@hesbynett.no> writes:
    On 25/05/2026 16:28, Anton Ertl wrote:
    Despite their eagerness to "optimize" based on the assumption
    that signed integer overflow does not happen, the GCC developers have
    avoided making -ftrapv the default, even on platforms like MIPS and
    Alpha where the implementation of -ftrapv just means to use different
    instructions (e.g., add instead of addu on MIPS, and addv instead of
    add on Alpha).

    An awkward thing about using trap on overflow is determining how
    precisely it is defined. Supposing you have the expression "a + b - a".
    Perhaps "a + b" overflows. I would hope that when using debug-related
    compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
    would check for overflow on "a + b", and report it at runtime.
    (Unfortunately, gcc does not do that unless the partial expression is
    assigned to a variable.) But in "normal" usage, I'd expect the
    expression to be simplified, resulting in just "b" and no overflow.

    OTOH, cases like a+b+c where the result is in range, while an
    intermediate result is out of range are one of the reasons why I
    prefer -fwrapv over -ftrapv. As for your preference of nasal demons,
    given enough information, the compiler might "optimize" "a+b-a" into,
    e.g., 0.
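The a+b+c case can be sketched in standard C by spelling the wrapping out with unsigned arithmetic, which is roughly the behaviour -fwrapv gives for the signed expression. The function name is hypothetical, and the final conversion back to long is implementation-defined for out-of-range values (GCC documents it as reduction modulo 2^N):

```c
/* Sum three longs with wraparound semantics.
   Unsigned arithmetic wraps modulo 2^N by definition, so an
   out-of-range intermediate result is harmless as long as the
   final result fits in a long. */
static long wrapping_add3(long a, long b, long c)
{
    unsigned long u = (unsigned long)a + (unsigned long)b + (unsigned long)c;
    /* Implementation-defined if u > LONG_MAX; modular on GCC. */
    return (long)u;
}
```

With a = LONG_MAX, b = 1, c = -1 the intermediate a+b is out of range, but the final result is LONG_MAX, which is in range. Under trapping semantics the same evaluation order would trap.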

    Anyway, the definition of -ftrapv is not very precise; for gcc-12.2:

    |'-ftrapv'
    | This option generates traps for signed overflow on addition,
    | subtraction, multiplication operations.

    As for what gcc-12.2 does for your example on AMD64:

    long foo(long a, long b)
    {
      return a+b-a;
    }

    is compiled with gcc -O3 -ftrapv to:

    0: 48 89 f0 mov %rsi,%rax
    3: c3 ret

    If "trap on overflow" has precise semantics in the code, then this
    disables a range of useful optimisations and re-arrangements. If it is
    just "use trapping arithmetic instructions", then it will miss many
    possible cases of actual overflow in the code, which we might want to
    catch.

    Which would you prefer by default?

    The gcc developers apparently took the latter approach, even when you
    ask for -ftrapv explicitly. So what, IYO, speaks against doing that
    by default on machines like MIPS and Alpha?

    And "trap on overflow" might either trigger when there is no
    overflow in the original code, or hinder optimisations. (Consider the
    expression "x / 2 + y / 2" - the compiler could implement that as a
    combined "(x + y) / 2", but that might introduce overflow.)

    x/2+y/2 produces a different result from (x+y)/2 when both x and y are
    odd integers.
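A small check of that claim, with hypothetical function names. C integer division truncates toward zero, so each odd operand loses a half in the separate form:

```c
/* The two forms being compared. */
static long half_sum_separate(long x, long y)
{
    return x / 2 + y / 2;   /* each division truncates on its own */
}

static long half_sum_combined(long x, long y)
{
    return (x + y) / 2;     /* one truncation; x + y may overflow */
}
```

For x = 3, y = 5 the separate form gives 1 + 2 = 3 while the combined form gives 8 / 2 = 4, so a compiler cannot substitute one for the other even before overflow is considered.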

    gcc-12.2 compiles

    long bar(long x, long y)
    {
      return x/2+y/2;
    }


    on AMD64 to:

    gcc -O3 -ftrapv         gcc -O3
    mov %rdi,%rax           mov %rdi,%rax
    sub $0x8,%rsp           mov %rsi,%rdx
    shr $0x3f,%rax          shr $0x3f,%rax
    add %rax,%rdi           shr $0x3f,%rdx
    mov %rsi,%rax           add %rdi,%rax
    shr $0x3f,%rax          add %rsi,%rdx
    sar %rdi                sar %rax
    add %rax,%rsi           sar %rdx
    sar %rsi                add %rdx,%rax
    call __addvdi3@PLT      ret
    add $0x8,%rsp
    ret

    so the -ftrapv introduces an additional mov and a call; I would have
    expected that the + would be compiled to an ADD instruction followed
    by a JO instruction.
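For comparison, the ADD-plus-conditional-branch shape can be requested explicitly in source with the GCC/Clang builtin `__builtin_add_overflow`. This is a sketch with a hypothetical wrapper name. On x86-64 such code is commonly lowered to an add followed by a jump on the overflow flag, though the exact output depends on compiler version and options and is not guaranteed:

```c
#include <stdlib.h>

/* Add two longs and abort on signed overflow.
   __builtin_add_overflow stores the (wrapped) result in *res and
   returns nonzero if the mathematical result did not fit. */
static long trapping_add(long a, long b)
{
    long sum;
    if (__builtin_add_overflow(a, b, &sum))
        abort();            /* stand-in for the trap */
    return sum;
}
```

Unlike the -ftrapv output above, there is no library call in the non-overflowing path.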

    Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
    produces ILP32 code) produces a call to __addvsi3 instead of the
    expected add instruction:

    gcc -O3 -ftrapv         gcc -O3
    lui gp,0x0              srl v0,a0,0x1f
    addiu gp,gp,0           srl v1,a1,0x1f
    addu gp,gp,t9           addu v0,v0,a0
    srl v1,a0,0x1f          addu a1,v1,a1
    lw t9,__addvsi3(gp)     sra v0,v0,0x1
    srl v0,a1,0x1f          sra a1,a1,0x1
    addiu sp,sp,-32         jr ra
    addu a0,v1,a0           addu v0,v0,a1
    addu a1,v0,a1
    sra a0,a0,0x1
    sw ra,28(sp)
    sw gp,16(sp)
    jalr t9
    sra a1,a1,0x1
    lw ra,28(sp)
    jr ra
    addiu sp,sp,32

    The call costs a lot of overhead.

    It is not easy to see how a tool can avoid false positives and false
    negatives and also conveniently optimise and re-arrange code.

    It can't. But it does not try to avoid false negatives even when
    explicitly asked for trapping on overflow.

    If some overflow trapping when it can be done without additional
    instructions would be preferable over no overflow, gcc would compile
    signed adds that survive after optimization into add on MIPS rather
    than addu, by default. Given that it does not, the GCC developers
    probably found out that it is not preferable. I guess they would get
    too many customer complaints, including for "relevant" code, i.e.,
    code where the usual "it's UB, so your code is broken" excuse does not
    work.

    The fact that they don't even try to make -ftrapv produce efficient
    code indicates that there is no "relevant" interest in efficient
    -ftrapv. It would be interesting to know who came up with the idea of
    adding -ftrapv, and why they are still keeping it.

    Compilers have not always been good at taking advantage of all the
    features provided by hardware

    GCC is pretty good at implementing -fwrapv. For the two examples
    above, "gcc -O3 -fwrapv" produces the same code on AMD64 and MIPS as
    "gcc -O3".

    nor have languages been good at exposing
    the possibilities in the language so that programmers can take advantage
    of them.

    Yes. But I leave that for another day.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 19:20:01 2026
    From Newsgroup: comp.arch


    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    David Brown <david.brown@hesbynett.no> writes:
    On 25/05/2026 16:28, Anton Ertl wrote:
    Despite their eagerness to "optimize" based on the assumption
    that signed integer overflow does not happen, the GCC developers have
    avoided making -ftrapv the default, even on platforms like MIPS and
    Alpha where the implementation of -ftrapv just means to use different
    instructions (e.g., add instead of addu on MIPS, and addv instead of
    add on Alpha).

    An awkward thing about using trap on overflow is determining how
    precisely it is defined. Supposing you have the expression "a + b - a".
    Perhaps "a + b" overflows. I would hope that when using debug-related
    compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
    would check for overflow on "a + b", and report it at runtime.
    (Unfortunately, gcc does not do that unless the partial expression is
    assigned to a variable.) But in "normal" usage, I'd expect the
    expression to be simplified, resulting in just "b" and no overflow.

    OTOH, cases like a+b+c where the result is in range, while an
    intermediate result is out of range are one of the reasons why I
    prefer -fwrapv over -ftrapv. As for your preference of nasal demons,
    given enough information, the compiler might "optimize" "a+b-a" into,
    e.g., 0.

    a/0/b/


    Anyway, the definition of -ftrapv is not very precise; for gcc-12.2:

    |'-ftrapv'
    | This option generates traps for signed overflow on addition,
    | subtraction, multiplication operations.

    As for what gcc-12.2 does for your example on AMD64:

    long foo(long a, long b)
    {
      return a+b-a;
    }

    is compiled with gcc -O3 -ftrapv to:

    0: 48 89 f0 mov %rsi,%rax
    3: c3 ret

    If "trap on overflow" has precise semantics in the code, then this
    disables a range of useful optimisations and re-arrangements. If it is
    just "use trapping arithmetic instructions", then it will miss many
    possible cases of actual overflow in the code, which we might want to
    catch.

    Which would you prefer by default?

    What you do want is compiled code that can trap on overflow and avoid
    trapping on overflow without code substitution or being re-compiled.
    This way production code can avoid trapping but if the debugger is
    turned on, you can trap.

    The gcc developers apparently took the latter approach, even when you
    ask for -ftrapv explicitly. So what, IYO, speaks against doing that
    by default on machines like MIPS and Alpha?

    Both architectures got this one wrong--IMO--and so does RISC-V.

    And "trap on overflow" might either trigger when there is no
    overflow in the original code, or hinder optimisations. (Consider the
    expression "x / 2 + y / 2" - the compiler could implement that as a
    combined "(x + y) / 2", but that might introduce overflow.)

    x/2+y/2 produces a different result from (x+y)/2 when both x and y are
    odd integers.

    gcc-12.2 compiles

    long bar(long x, long y)
    {
      return x/2+y/2;
    }


    on AMD64 to:

    gcc -O3 -ftrapv         gcc -O3
    mov %rdi,%rax           mov %rdi,%rax
    sub $0x8,%rsp           mov %rsi,%rdx
    shr $0x3f,%rax          shr $0x3f,%rax
    add %rax,%rdi           shr $0x3f,%rdx
    mov %rsi,%rax           add %rdi,%rax
    shr $0x3f,%rax          add %rsi,%rdx
    sar %rdi                sar %rax
    add %rax,%rsi           sar %rdx
    sar %rsi                add %rdx,%rax
    call __addvdi3@PLT      ret
    add $0x8,%rsp
    ret

    so the -ftrapv introduces an additional mov and a call; I would have
    expected that the + would be compiled to an ADD instruction followed
    by a JO instruction.

    Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
    produces ILP32 code) produces a call to __addvsi3 instead of the
    expected add instruction:

    gcc -O3 -ftrapv         gcc -O3
    lui gp,0x0              srl v0,a0,0x1f
    addiu gp,gp,0           srl v1,a1,0x1f
    addu gp,gp,t9           addu v0,v0,a0
    srl v1,a0,0x1f          addu a1,v1,a1
    lw t9,__addvsi3(gp)     sra v0,v0,0x1
    srl v0,a1,0x1f          sra a1,a1,0x1
    addiu sp,sp,-32         jr ra
    addu a0,v1,a0           addu v0,v0,a1
    addu a1,v0,a1
    sra a0,a0,0x1
    sw ra,28(sp)
    sw gp,16(sp)
    jalr t9
    sra a1,a1,0x1
    lw ra,28(sp)
    jr ra
    addiu sp,sp,32

    The call costs a lot of overhead.

    Architectures without overflow traps are notorious for excess instruction
    count when overflow detection is desired or mandated.

    It is not easy to see how a tool can avoid false positives and false
    negatives and also conveniently optimise and re-arrange code.

    It can't. But it does not try to avoid false negatives even when
    explicitly asked for trapping on overflow.

    Granted, Optimization can do a lot of strange code emission and movement
    when one does not care about precise overflow semantics. But, as a whole,
    we are a society where we want high HP automobiles more than we want safe automobiles ('we' not including *.gov's).

    If some overflow trapping when it can be done without additional
    instructions would be preferable over no overflow, gcc would compile
    signed adds that survive after optimization into add on MIPS rather
    than addu, by default. Given that it does not, the GCC developers
    probably found out that it is not preferable. I guess they would get
    too many customer complaints, including for "relevant" code, i.e.,
    code where the usual "it's UB, so your code is broken" excuse does not
    work.

    It is much harder than that. For example: does a signed shift left
    overflow when significant bits are shifted out ?? What if the sub-
    sequent instruction shifts the result back and the pair are acting
    as a bit-field extract ?? My 66000 has bit field extracts for exactly
    this reason. Floating-point has a lot of these cases, too.
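The shift-pair idiom mentioned here can be sketched as follows, using unsigned types so that discarding high bits is well defined in C. The function name and the (offset, width) convention are assumptions for illustration and do not describe the My 66000 instruction:

```c
#include <stdint.h>

/* Extract `width` bits starting at bit `offset` (bit 0 = least
   significant) using a left shift followed by a right shift.
   Precondition: width >= 1 and offset + width <= 64.
   The left shift deliberately throws away significant high bits;
   treating that as an "overflow" would break the idiom. */
static uint64_t extract_bits(uint64_t value, unsigned offset, unsigned width)
{
    unsigned left = 64u - offset - width;   /* bits to discard on top */
    return (value << left) >> (64u - width);
}
```

For example, extract_bits(0xABCD, 4, 8) yields 0xBC. A signed variant would use an arithmetic right shift to sign-extend the field, which is exactly where the question "did the left shift overflow?" becomes awkward.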

    The fact that they don't even try to make -ftrapv produce efficient
    code indicates that there is no "relevant" interest in efficient
    -ftrapv. It would be interesting to know who came up with the idea of
    adding -ftrapv, and why they are still keeping it.

    Compilers have not always been good at taking advantage of all the
    features provided by hardware

    GCC is pretty good at implementing -fwrapv. For the two examples
    above, "gcc -O3 -fwrapv" produces the same code on AMD64 and MIPS as
    "gcc -O3".

    nor have languages been good at exposing
    the possibilities in the language so that programmers can take advantage
    of them.

    Yes. But I leave that for another day.

    A whole new kettle of fish...

    - anton
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon May 25 20:26:24 2026
    From Newsgroup: comp.arch

    On Mon, 25 May 2026 10:23:00 +0200, David Brown wrote:

    The hardware, of course, cannot always enable trapping on overflow if it
    is going to efficiently support a range of programming languages.

    Yes. And I am used to FORTRAN, which did not trap on integer overflows.

    John Savard
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon May 25 20:32:15 2026
    From Newsgroup: comp.arch

    On Mon, 25 May 2026 19:20:01 +0000, MitchAlsup wrote:
    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
    David Brown <david.brown@hesbynett.no> writes:
    On 25/05/2026 16:28, Anton Ertl wrote:

    Despite their eagerness to "optimize" based on the assumption
    that signed integer overflow does not happen, the GCC developers
    have avoided making -ftrapv the default, even on platforms like MIPS
    and Alpha where the implementation of -ftrapv just means to use
    different instructions (e.g., add instead of addu on MIPS, and addv
    instead of add on Alpha).

    Both architectures got this one wrong--IMO--and so does RISC-V.

    You may not have been replying to what Anton Ertl wrote above, since there
    was a lot in between that I snipped. But it does mention two architectures that took an approach to trapping on integer overflow... that I also tend
    to disagree with.

    What I'm used to is the System/360. While it made the mistake of having
    two condition code bits instead of NZVC, the idea of having "trap on
    overflow" controlled by a bit in the PSW is... what I assumed to be normal
    and correct.

    I could be wrong, as I haven't examined that approach critically and given full consideration to the alternatives.

    John Savard
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Mon May 25 20:32:15 2026
    From Newsgroup: comp.arch

    David Brown <david.brown@hesbynett.no> schrieb:
    On 24/05/2026 23:39, quadi wrote:
    On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:
    quadi <quadibloc@ca.invalid> posted:

    It makes sense to trap on a floating-point overflow, but trapping on an
    integer overflow is usually a terrible idea.

    So, detecting something went wrong and you should inform the programmer
    is a bad idea ???

    No, so being able to turn the trap for integer overflow on should
    definitely be allowed. But that shouldn't be the default behavior.
    Otherwise, programs like random number generators wouldn't work.

    John Savard

    That does not make sense. Code such as random number generators should
    be written so that they are correct in the language they are written in.

    In principle, yes.

    In practice, people often used whatever "worked" on their systems.
    Implementors have a certain right because they control what their
    compiler does or does not do. But users did so, as well, with
    Numerical Recipes a(n in)famous example.

    And yes, this bites people. You can see this at https://gcc.gnu.org/gcc-13/porting_to.html :

    # GCC 13 includes new optimizations which may change behavior
    # on integer overflow. Traditional code, like linear congruential
    # pseudo-random number generators in old programs and relying on
    # a specific, non-standard behavior may now generate unexpected
    # results. The option -fsanitize=undefined can be used to detect
    # such code at runtime.

    # It is recommended to use the intrinsic subroutine RANDOM_NUMBER for
    # random number generators or, if the old behavior is desired, to use
    # the -fwrapv option. Note that this option can impact performance.


    If that is C, signed integer overflow is UB while unsigned integers
    have wrapping behaviour - thus if your code depends on wrapping, and it
    is written in C, it needs to use unsigned types or compiler-specific extensions, flags, etc. (Or C23 ckd_add and other checked arithmetic functions.)
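As an illustration of the "use unsigned types" route, here is a sketch of a 32-bit linear congruential step written so that the wraparound is defined behaviour. The function name is hypothetical; the multiplier and increment are the ones commonly associated with the Numerical Recipes quick generator. It assumes `int` is no wider than 32 bits, so that `uint32_t` operands are not promoted to a signed type:

```c
#include <stdint.h>

/* One step of a linear congruential generator modulo 2^32.
   All arithmetic is on uint32_t, so the multiply and add wrap
   modulo 2^32 by definition and no compiler flag is needed. */
static uint32_t lcg_next(uint32_t state)
{
    return state * UINT32_C(1664525) + UINT32_C(1013904223);
}
```

The same recurrence written with a signed 32-bit type relies on overflow behaviour the language does not define, which is the kind of code the GCC 13 porting note is warning about.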

    If it is written in Zig, you need to use the specific modulo arithmetic functions even for unsigned arithmetic. If it is written in Java,
    signed integer arithmetic is fine.

    It all depends on the language and/or any options the language and tools might support - and code should be written to work correctly according
    to the language rules.

    Fortran has no standard way of implementing this unless you
    restrict yourself to sizes which do not overflow a signed integer.
    Implementing LCGRNGs was one reason why I pushed for unsigned
    arithmetic (modulo 2**n) in Fortran. The attempt failed (not
    taken up by WG5 after being endorsed by J3), but I implemented it
    for gfortran anyway.

    The hardware, of course, cannot always enable trapping on overflow if it
    is going to efficiently support a range of programming languages. But
    as an optional feature it can be helpful for catching a few bugs in
    code, so it can be a good idea (both for signed and unsigned overflow).

    Sanitizers are also fairly good now, but of course cost performance.
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon May 25 20:34:41 2026
    From Newsgroup: comp.arch

    On Mon, 25 May 2026 16:49:59 +0000, MitchAlsup wrote:
    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    RISC-V: add ignores signed overflow, there is no add that traps on
    signed overflow (and detecting signed overflow is pretty
    involved if both operands are unknown to the compiler).

    The worst of all possible semantic encodings

    Although I thought that making trapping on fixed-point overflow the
    default is a bad idea, I agree that making it impossible to do so, or even test for fixed-point overflow, is a much worse idea.

    John Savard
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon May 25 20:45:20 2026
    From Newsgroup: comp.arch

    On Mon, 25 May 2026 16:45:07 +0000, MitchAlsup wrote:

    My 66000 has an instruction bit that denotes the signedness of integer calculations {Signed, unSigned}. This bit is available as another OpCode
    bit for non-integer calculation instructions.

    That's nice. It's not an option I can consider, as having lots of
    orthogonal modifiers on instructions would tend to increase their length.
    A major goal of the Concertina II, III, and IV architectures is for instructions not to be longer than similar instructions on the Motorola
    68020 or the IBM System/360 if at all possible.

    Basically, the selling point is... "Your programs only get 10% bigger, if that, and yet you have 32 registers, so they run faster!".

    Or they _would_, if the design didn't have so many extra transistors for supporting both IBM-format and Intel-format Decimal Floating Point, old-
    style IBM floats, simple floating (You too can work with numbers that go around the world 2 1/2 times!), packed decimal, mixed-radix arithmetic...

    But, hey, supporting these things in hardware is faster than doing them in software!

    And are people even going to _read_ the part of the manual that
    explains... as is noted in the description of the original Concertina architecture...

    This chip has 8-way simultaneous multi-threading, but only for programs
    which do not make use of extensions to the register set.

    Only two programs per core may use the extended register banks with 128 elements.

    Only one program per core may use the vector registers for long vector instructions. The 256-bit short vector registers, on the other hand, like
    the integer and floating-point registers, are available to all
    simultaneous threads.

    John Savard
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon May 25 20:32:35 2026
    From Newsgroup: comp.arch

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
    What you do want is compiled code that can trap on overflow and avoid
    trapping on overflow without code substitution or being re-compiled.
    This way production code can avoid trapping but if the debugger is
    turned on, you can trap.

    Why do you consider that desirable?

    long bar(long x, long y)
    {
      return x/2+y/2;
    }
    ...
    Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
    produces ILP32 code) produces a call to __addvsi3 instead of the
    expected add instruction:

    gcc -O3 -ftrapv         gcc -O3
    lui gp,0x0              srl v0,a0,0x1f
    addiu gp,gp,0           srl v1,a1,0x1f
    addu gp,gp,t9           addu v0,v0,a0
    srl v1,a0,0x1f          addu a1,v1,a1
    lw t9,__addvsi3(gp)     sra v0,v0,0x1
    srl v0,a1,0x1f          sra a1,a1,0x1
    addiu sp,sp,-32         jr ra
    addu a0,v1,a0           addu v0,v0,a1
    addu a1,v0,a1
    sra a0,a0,0x1
    sw ra,28(sp)
    sw gp,16(sp)
    jalr t9
    sra a1,a1,0x1
    lw ra,28(sp)
    jr ra
    addiu sp,sp,32

    The call costs a lot of overhead.

    Architectures without overflow traps are notorious for excess instruction
    count when overflow detection is desired or mandated.

    MIPS' add traps on overflow. gcc could have emitted almost the same
    code for gcc -O3 -ftrapv as for gcc -O3, except that the last
    instruction would be an add, not an addu. But apparently nobody gives
    a damn about the efficiency of -ftrapv, possibly rightly so.

    If some overflow trapping when it can be done without additional
    instructions would be preferable over no overflow, gcc would compile
    signed adds that survive after optimization into add on MIPS rather
    than addu, by default. Given that it does not, the GCC developers
    probably found out that it is not preferable. I guess they would get
    too many customer complaints, including for "relevant" code, i.e.,
    code where the usual "it's UB, so your code is broken" excuse does not
    work.

    It is much harder than that. For example: does a signed shift left
    overflow when significant bits are shifted out ??

    -ftrapv specifies trapping on overflow only for additions,
    subtractions, and multiplications.
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
  • From BGB@cr88192@gmail.com to comp.arch on Mon May 25 16:34:50 2026
    From Newsgroup: comp.arch

    On 5/25/2026 9:28 AM, Anton Ertl wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 24/05/2026 23:39, quadi wrote:
    On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:
    quadi <quadibloc@ca.invalid> posted:

    It makes sense to trap on a floating-point overflow, but trapping on an
    integer overflow is usually a terrible idea.

    Most programming environments I have had contact with don't trap on floating-point overflow.


    Many just go Inf...

    Division by zero is usually handled by going Inf (or NaN for 0/0).

    Contrast with integer division by zero which does usually trap.
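A quick way to see the floating-point side of this contrast, assuming an IEEE 754 implementation running with its default non-trapping exception handling. The helper names are made up, and `volatile` is used only to keep the compiler from folding the operations at compile time:

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Overflow: the result is too large to represent, so it becomes +Inf. */
static bool overflow_gives_inf(void)
{
    volatile double big = DBL_MAX;
    double r = big * 2.0;
    return isinf(r);
}

/* Nonzero divided by zero gives an infinity. */
static bool nonzero_div_zero_gives_inf(void)
{
    volatile double zero = 0.0;
    return isinf(1.0 / zero);
}

/* Zero divided by zero gives a NaN. */
static bool zero_div_zero_gives_nan(void)
{
    volatile double zero = 0.0;
    return isnan(zero / zero);
}
```

None of these stops the program by default; the integer expression 1 / 0, by contrast, is undefined behaviour in C and typically raises a hardware exception on targets whose divide instruction traps.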


    So, detecting something went wrong and you should inform the programmer
    is a bad idea ???

    The question is if an integer overflow means that something went
    wrong. Despite their eagerness to "optimize" based on the assumption
    that signed integer overflow does not happen, the GCC developers have
    avoided making -ftrapv the default, even on platforms like MIPS and
    Alpha where the implementation of -ftrapv just means to use different instructions (e.g., add instead of addu on MIPS, and addv instead of
    add on Alpha).


    Integer overflow happens far too often for trapping to be a good solution.


    We almost need a separate "integer that should not overflow" type, with
    more explicit "do something special if it does" semantics.


    Though, more likely to be useful would be a "detect if an overflow had happened" mechanism.

    errno_t ovfstate;
    __int_no_overflow x, y, z;
    ...
    __start_errsense(&ovfstate);
    z=x+y;
    __end_errsense(&ovfstate);
    if(ovfstate&ERRSENSE_FLAG_OVERFLOW)
    ...

    Which would be awkward, but probably more useful than, say, raising a
    signal and/or terminating the program.
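
    A rough approximation of the "detect if an overflow had happened" idea is already possible with the GCC/Clang overflow builtins. The sketch below is only an illustration: `ovf_state`, `checked_add` and `checked_mul` are hypothetical names, and the sticky flag stands in for the `__start_errsense` / `__end_errsense` pair above, which does not exist in any compiler I know of.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sticky overflow state; the name is illustrative only. */
typedef struct {
    bool overflowed;   /* set once any checked operation overflows */
} ovf_state;

/* Adds x and y, returns the wrapped result, records overflow in *st. */
static int64_t checked_add(ovf_state *st, int64_t x, int64_t y)
{
    int64_t r;
    if (__builtin_add_overflow(x, y, &r))   /* GCC/Clang builtin */
        st->overflowed = true;
    return r;
}

/* Multiplies x and y, returns the wrapped result, records overflow in *st. */
static int64_t checked_mul(ovf_state *st, int64_t x, int64_t y)
{
    int64_t r;
    if (__builtin_mul_overflow(x, y, &r))   /* GCC/Clang builtin */
        st->overflowed = true;
    return r;
}
```

    A caller would zero the state, run a sequence of checked operations, and test `overflowed` once at the end, much like checking the floating-point exception flags after a block of FP code.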


    The hardware, of course, cannot always enable trapping on overflow if it
    is going to efficiently support a range of programming languages. But
    as an optional feature it can be helpful for catching a few bugs in
    code, so it can be a good idea (both for signed and unsigned overflow).

    This supposedly helpful feature has been neglected by C compiler
    developers, and you see in the progression from MIPS (1986) to Alpha
    (1992) and then RISC-V (2011) that the hardware architects have
    accepted that:

    MIPS: add traps on signed overflow, you need to write addu if you
    don't want that.

    Alpha: add ignores signed overflow, you need to write addv if you want
    the trapping.

    RISC-V: add ignores signed overflow, there is no add that traps on
    signed overflow (and detecting signed overflow is pretty
    involved if both operands are unknown to the compiler).
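
    To make "pretty involved" concrete: without a flags register or a trapping add, signed overflow has to be reconstructed from the operands and the wrapped sum. A minimal sketch in C (the function name is hypothetical, and this is one common formulation, not necessarily the sequence a given compiler emits):

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a + b would overflow int64_t, using only wrapping
   unsigned arithmetic and bit operations (no flags, no builtins). */
static bool add_overflows(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t sum = ua + ub;            /* well-defined modulo 2^64 */
    /* Overflow iff both operands differ in sign from the sum,
       i.e. same-sign inputs produced an opposite-sign result. */
    return (((ua ^ sum) & (ub ^ sum)) >> 63) != 0;
}
```

    That is an add plus several logical operations and a test for every checked addition, versus a single instruction on an ISA with a trapping or flag-setting add.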


    In practice, given:
    We have instructions like ADDW, etc, whose behavior is explicitly to sign-extend the results of 32-bit ADD;
    Behavior in practice is often to meticulously follow wrap-on-overflow semantics;
    Exceptions to wrap-on-overflow usually exist as edge cases;
    Various programs exist that will actively break if wrap-on-overflow is
    not the observed behavior in C land;
    ...

    The expectation that 'int' can meaningfully do something other than
    wrap on overflow is more of a fantasy.


    Or, like some other "portability boogeymen":
    Non two's complement integer arithmetic;
    Big endian machines;
    Machines that don't allow unaligned loads and stores;
    Types with sizes other than the "usually accepted" set;
    ...


    The argument has often been, "but, 64-bit machines might not provide
    native 32-bit arithmetic".

    But, often in 64-bit machines, a pattern emerges:
    Most ops are full 64-bit;
    A subset of instructions have variants that produce sign and/or zero
    extended results;
    The instructions which produce these results, typically being, the ones
    needed to preserve the usual wrap-on-overflow semantics in those places
    where something could happen that would produce a deviation from the
    expected semantics.

    The ones that have zero-extension usually treat unsigned integers as zero-extended.

    The reverse has also been done; treating unsigned as sign-extended, as
    in the standard RISC-V ABI, but IMO this is stupid. Even in the absence
    of a native zero-extension op (as in plain RV64G), the mess that results
    from sign-extending unsigned is worse than the cost of explicit zero extension.

    Best case here being to keep values using "native extension":
    'int' : Always sign extended;
    'unsigned int': Always zero extended.
    Then 32-bit types are a strict subset of the 64-bit range, and
    up-promotion becomes free. Not sure why some people don't see this as
    obvious though. Well, and people keep making the choice of adding
    garbage edge cases to RISC-V that would have been entirely unnecessary
    if people weren't being stupid about the ABI rules.
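
    The "native extension" rule can be modelled in plain C. This is only a sketch of the semantics (all function names are made up for illustration): 32-bit results are produced by a 64-bit add followed by re-extension, which is roughly what an ADDW-style instruction does in one step.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of v to 64 bits, avoiding
   implementation-defined narrowing conversions. */
static int64_t sext32(uint64_t v)
{
    uint32_t lo = (uint32_t)v;
    return (lo & 0x80000000u) ? (int64_t)lo - 4294967296LL : (int64_t)lo;
}

/* Zero-extend the low 32 bits of v to 64 bits. */
static uint64_t zext32(uint64_t v)
{
    return v & 0xFFFFFFFFu;
}

/* 32-bit signed add, result kept sign-extended in a 64-bit register. */
static int64_t add32_signed(int64_t a, int64_t b)
{
    return sext32((uint64_t)a + (uint64_t)b);
}

/* 32-bit unsigned add, result kept zero-extended in a 64-bit register. */
static uint64_t add32_unsigned(uint64_t a, uint64_t b)
{
    return zext32(a + b);
}
```

    With both invariants maintained, widening 'int' to 'long' or 'unsigned int' to 'unsigned long' needs no instruction at all, since the register already holds the correct 64-bit value.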

    But yeah...


    But, all this would not be expected to happen unless one accepts that it
    is already generally accepted that wrap-on-overflow for 'int' and
    similar is the only really practical or viable solution here.





    Otherwise, recently:
    In my case I decided to live with a "breaking change" in XG3 and to
    change some things that may matter later. Then ended up tweaking some
    other things on my annoyance list (since I was already breaking existing binaries, better to cluster breakage to a singular event if doing it).

    ADD, ADDS.L, and ADDU.L have all been changed from Imm10u/n to Imm10s.
    The Imm10u cases are now Imm10s;
    The Imm10n sub-case is now dropped/reserved.
    May be reused later.
    This reclaims 3 out of the 20 Imm10 spots.
    Was mostly a case of it being harder to justify the encoding space.
    Old behavior will need to remain for XG1 and XG2.
    In this case, XG3 will explicitly deviate from XG1 and XG2 here.
    Does mean that XG3 now has less ADD/SUB Imm range than XG2, but...
    Only goes from 97.1% hit rate to 95.9%,
    no significant effect on overall code density.
    Could use the RV Imm12 ops (ADDI / ADDIW), but:
    Hit rate for the RV ops here is negligible;
    Much of these also happen to miss on one or both registers.

    The MULS.L and MULU.L ops were also switched to Imm10s.
    This means all of the Imm10 ALU ops are now unified on Imm10s.

    Relocated TST and TSTN from the F0-8 block (with the XMOV instructions)
    to the F0-9 block (with the other CMPxx 3R ops).

    A few very rarely used instructions were demoted from 32-bit to 64-bit encodings.


    Have experimentally added some 32-bit:
    Bcc Rm, Imm6s, (PC, Disp6s)
    instructions, where:
    Imm6s: Hits ~ 80% of these cases;
    Disp6s: Hits ~ 60% of these cases;
    Imm5s + Disp7s would hit slightly better, but,
    would have needed more new decoder logic...
    Resulting in it hitting about half over the:
    Bcc Rm, Imm17s, (PC, Disp10s)
    Cases, for an overall code-density improvement of ~ 0.5%, ...
    Dominant use-case: Final compare-and-branch in a short "for()" loop.
    Secondary use-case: Short non-predicated "if()" branches.
    But, is out-weighed by said predicated "if()" branches.
    Would likely see more use here if not using predication.
    If it would have hit for 100% of these, would have saved ~ 1%.

    This is debatable.

    This reused the encoding spots previously used for the Load-Disp5us ops,
    which still exist for XG1 and XG2 (decoder special-case handling), but
    were N/A in XG3 (they would be in effect entirely redundant with the
    Disp10s forms in XG3; but had non-redundant edge-cases in XG1 and XG2).

    Like with the Imm17s+Disp10s ops, these will still depend on the IMMB extension, as they still need the same basic mechanism.

    Was a fairly low-priority feature, in any case.


    Seemingly running low on obvious optimization paths.


    - anton

  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 22:49:58 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Mon, 25 May 2026 10:23:00 +0200, David Brown wrote:

    The hardware, of course, cannot always enable trapping on overflow if it
    is going to efficiently support a range of programming languages.

    Yes. And I am used to FORTRAN, which did not trap on integer overflows.

    WATfor and WATfive trapped on integer overflows.

    John Savard
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 22:51:42 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Mon, 25 May 2026 19:20:01 +0000, MitchAlsup wrote:
    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
    David Brown <david.brown@hesbynett.no> writes:
    On 25/05/2026 16:28, Anton Ertl wrote:

    Despite their eagerness to "optimize" based on the assumption
    that signed integer overflow does not happen, the GCC developers
    have avoided making -ftrapv the default, even on platforms like MIPS
    and Alpha where the implementation of -ftrapv just means to use
    different instructions (e.g., add instead of addu on MIPS, and addv
    instead of add on Alpha).

    Both architectures got this one wrong--IMO--and so does RISC-V.

    You may not have been replying to what Anton Ertl wrote above, since there was a lot in between that I snipped. But it does mention two architectures that took an approach to trapping on integer overflow... that I also tend
    to disagree with.

    What I'm used to is the System/360. While it made the mistake of having
    two condition code bits instead of NZVC, the idea of having "trap on overflow" controlled by a bit in the PSW is... what I assumed to be normal and correct.

    And what My 66000 does....

    I purport that ANY Industrial quality ISA should provide a means to
    trap on integer overflow.

    I could be wrong, as I haven't examined that approach critically and given full consideration to the alternatives.

    John Savard
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 22:59:10 2026
    From Newsgroup: comp.arch


    Thomas Koenig <tkoenig@netcologne.de> posted:

    David Brown <david.brown@hesbynett.no> schrieb:
    On 24/05/2026 23:39, quadi wrote:
    On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:
    quadi <quadibloc@ca.invalid> posted:

    It makes sense to trap on a floating-point overflow, but trapping on an integer overflow is usually a terrible idea.

    So, detecting something went wrong and you should inform the programmer is a bad idea ???

    No, so being able to turn the trap for integer overflow on should
    definitely be allowed. But that shouldn't be the default behavior.
    Otherwise, programs like random number generators wouldn't work.

    John Savard

    That does not make sense. Code such as random number generators should
    be written so that they are correct in the language they are written in.

    In principle, yes.

    Principle is better in theory than in practice.

    In practice, people often used whatever "worked" on their systems.

    Face it, the poor slug writing the code may not have the faintest
    grasp of the system qualities we are discussing, and does not care
    to learn as long as he can slug through the writing and his program
    does not blow up catastrophically while it is under his purview.

    That defines a lot of what is wrong with SW programming today.

    Implementors have a certain right because they control what their
    compiler does or does not do.

    You would be surprised at how little influence implementors have
    on compilers and other software.

    But users did so, as well, with
    Numerical Recipes a(n in)famous example.

    And yes, this bites people. You can see this at https://gcc.gnu.org/gcc-13/porting_to.html :

    # GCC 13 includes new optimizations which may change behavior
    # on integer overflow. Traditional code, like linear congruential
    # pseudo-random number generators in old programs and relying on
    # a specific, non-standard behavior may now generate unexpected
    # results. The option -fsanitize=undefined can be used to detect
    # such code at runtime.

    My VAX favorite was:

    for( int i = 1; i; i+=i )

    Traps instead of exiting the loop normally.
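
    For comparison, here is a sketch of the same doubling loop written so that the wraparound is defined by the language: with an unsigned 32-bit counter the loop is guaranteed to terminate when the top bit is shifted out. The function name is hypothetical.

```c
#include <stdint.h>

/* Counts the iterations of the doubling loop using a 32-bit unsigned
   counter, where wrapping to zero is defined behaviour in C. */
static int doubling_iterations(void)
{
    int n = 0;
    for (uint32_t i = 1; i; i += i)   /* 1, 2, 4, ..., 2^31, then 0 */
        n++;
    return n;
}
```

    The counter takes the values 2^0 through 2^31, so the body runs 32 times before `i` wraps to zero.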

    # It is recommended to use the intrinsic subroutine RANDOM_NUMBER for
    # random number generators or, if the old behavior is desired, to use
    # the -fwrapv option. Note that this option can impact performance.


    If that is C, signed integer overflow is UB while unsigned integers
    have wrapping behaviour - thus if your code depends on wrapping, and it
    is written in C, it needs to use unsigned types or compiler-specific extensions, flags, etc. (Or C23 ckd_add and other checked arithmetic functions.)
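
    As an illustration of "use unsigned types": a linear congruential generator modulo 2^32 can be written so that it never relies on signed overflow. The multiplier and increment below are widely used example constants and are my choice for the sketch; the thread does not name a specific generator.

```c
#include <stdint.h>

/* One step of a 32-bit linear congruential generator, modulo 2^32.
   The constants are illustrative; unsigned arithmetic makes the
   wraparound well-defined (assuming int is no wider than 32 bits,
   so uint32_t operands are not promoted to a signed type). */
static uint32_t lcg_next(uint32_t state)
{
    return state * UINT32_C(1664525) + UINT32_C(1013904223);
}
```

    The same recurrence written with 'int' would be exactly the kind of code the GCC 13 porting note warns about.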

    If it is written in Zig, you need to use the specific modulo arithmetic functions even for unsigned arithmetic. If it is written in Java,
    signed integer arithmetic is fine.

    It all depends on the language and/or any options the language and tools might support - and code should be written to work correctly according
    to the language rules.

    Fortran has no standard way of implementing this unless you
    restrict yourself to sizes which do not overflow a signed integer.

    Old FORTRAN had no unSigned integer type and no way to avoid overflows.

    Implementing LCGRNGs was one reason why I pushed for unsigned
    arithmetic (modulo 2**n) in Fortran. The attempt failed (not
    taken up by WG5 after being endorsed by J3), but I implemented it
    for gfortran anyway.

    The hardware, of course, cannot always enable trapping on overflow if it is going to efficiently support a range of programming languages. But
    as an optional feature it can be helpful for catching a few bugs in
    code, so it can be a good idea (both for signed and unsigned overflow).

    Sanitizers are also fairly good now, but of course cost performance.
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 23:00:32 2026
    From Newsgroup: comp.arch


    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
    What you do want is compiled code that can trap on overflow and avoid trapping on overflow without code substitution or being re-compiled.
    This way production code can avoid trapping but if the debugger is
    turned on, you can trap.

    Why do you consider that desirable?

    So you can debug production/released code to find subtle errors.

    long bar(long x, long y)
    {
    return x/2+y/2;
    }
    ...
    Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
    produces ILP32 code) produces a call to __addvsi3 instead of the
    expected add instruction:

    gcc -O3 -ftrapv         gcc -O3
    lui gp,0x0              srl v0,a0,0x1f
    addiu gp,gp,0           srl v1,a1,0x1f
    addu gp,gp,t9           addu v0,v0,a0
    srl v1,a0,0x1f          addu a1,v1,a1
    lw t9,__addvsi3(gp)     sra v0,v0,0x1
    srl v0,a1,0x1f          sra a1,a1,0x1
    addiu sp,sp,-32         jr ra
    addu a0,v1,a0           addu v0,v0,a1
    addu a1,v0,a1
    sra a0,a0,0x1
    sw ra,28(sp)
    sw gp,16(sp)
    jalr t9
    sra a1,a1,0x1
    lw ra,28(sp)
    jr ra
    addiu sp,sp,32

    The call costs a lot of overhead.

    Architectures without overflow traps are notorious for excess instruction count when overflow detection is desired or mandated.

    MIPS' add traps on overflow. gcc could have emitted almost the same
    code for gcc -O3 -ftrapv as for gcc -O3, except that the last
    instruction would be an add, not an addu. But apparently nobody gives
    a damn about the efficiency of -ftrapv, possibly rightly so.

    If some overflow trapping when it can be done without additional
    instructions would be preferable over no overflow, gcc would compile
    signed adds that survive after optimization into add on MIPS rather
    than addu, by default. Given that it does not, the GCC developers
    probably found out that it is not preferable. I guess they would get
    too many customer complaints, including for "relevant" code, i.e.,
    code where the usual "it's UB, so your code is broken" excuse does not
    work.

    It is much harder than that. For example: does a signed shift left
    overflow when significant bits are shifted out ??

    -ftrapv specifies trapping on overflow only for additions,
    subtractions, and multiplications.


  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 23:03:03 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Mon, 25 May 2026 16:45:07 +0000, MitchAlsup wrote:

    My 66000 has an instruction bit that denotes the signedness of integer calculations {Signed, unSigned}. This bit is available as another OpCode bit for non-integer calculation instructions.

    That's nice. It's not an option I can consider, as having lots of
    orthogonal modifiers on instructions would tend to increase their length.

    And harm instruction Entropy.

    A major goal of the Concertina II, III, and IV architectures is for instructions not to be longer than similar instructions on the Motorola 68020 or the IBM System/360 if at all possible.

    Basically, the selling point is... "Your programs only get 10% bigger, if that, and yet you have 32 registers, so they run faster!".

    Mine are getting 30% smaller and needing fewer instructions at the same
    time

    Or they _would_, if the design didn't have so many extra transistors for supporting both IBM-format and Intel-format Decimal Floating Point, old- style IBM floats, simple floating (You too can work with numbers that go around the world 2 1/2 times!), packed decimal, mixed-radix arithmetic...

    But, hey, supporting these things in hardware is faster than doing them in software!

    And are people even going to _read_ the part of the manual that
    explains... as is noted in the description of the original Concertina architecture...

    This chip has 8-way simultaneous multi-threading, but only for programs which do not make use of extensions to the register set.

    Another One Bites the Dust.....

    Only two programs per core may use the extended register banks with 128 elements.

    Only one program per core may use the vector registers for long vector instructions. The 256-bit short vector registers, on the other hand, like the integer and floating-point registers, are available to all
    simultaneous threads.

    John Savard
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 23:05:06 2026
    From Newsgroup: comp.arch


    BGB <cr88192@gmail.com> posted:

    On 5/25/2026 9:28 AM, Anton Ertl wrote:
    --------------
    Integer overflow happens far too often for trapping to be a good solution.

    Even on 64-bit variables/machines ??
  • From BGB@cr88192@gmail.com to comp.arch on Mon May 25 20:02:52 2026
    From Newsgroup: comp.arch

    On 5/25/2026 3:34 PM, quadi wrote:
    On Mon, 25 May 2026 16:49:59 +0000, MitchAlsup wrote:
    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    RISC-V: add ignores signed overflow, there is no add that traps on
    signed overflow (and detecting signed overflow is pretty
    involved if both operands are unknown to the compiler).

    The worst of all possible semantic encodings

    Although I thought that making trapping on fixed-point overflow the
    default is a bad idea, I agree that making it impossible to do so, or even test for fixed-point overflow, is a much worse idea.


    Possibly true.

    The lack of things like ADD-with-Carry or ADD-with-Overflow are
    annoyance points on RISC-V.


    Though, it is less obvious what a useful behavior is at the language level:
    "signal()" ? ...
    Something like try/catch (mostly N/A to C)?
    Something similar to FENV_ACCESS?
    ...


    Well, and that if trapping were applied globally:
    Overhead due to trap detection/handling code causing excessive bloat;
    Overflow traps from any code that naively assumes wrap-on-overflow
    semantics;
    ...

    In some codebases, it is already enough of a pain to hunt and fix all
    the out-of-bounds and uninitialized variables mess.
    Signed integer overflows would likely "turn it up to 11";
    Then, how does one fix it? Ask that people start adding a bunch of casts
    to make it work?...

    One might say:
    Add "if()" cases to deal with the overflows, but, ... this only makes
    sense for cases where the overflows are not the expected behavior.

    Then again, could maybe classify code, say:
    1, signed, value doesn't (or shouldn't) go out-of-range;
    2, unsigned, value doesn't (or shouldn't) go out-of-range;
    3, signed, value is expected to be modulo;
    4, unsigned, value is expected to be modulo.

    "nasal demons" types assume 1 and 4 as dominant.
    Or, 1 as exclusive vs 3.

    For compilers, we often need to assume 3 and 4.
    Because, failure to uphold 3 results in misbehaving programs.
    And, if 3 were uncommon, RISC-V's "ADDW"/etc would be pure stupidity.
    Instead:
    Something like plain ADD plus ADDWU would have made sense.
    But, they dropped ADDWU instead (also stupid IMO).

    While, granted, a lot of 1 code likely exists, 3 code tends to generate
    the vast majority of overflows; and if there is any reasonable
    expectation for 'int' to overflow, and it is not desired for int to
    overflow.

    We mostly ignore 2 vs 4, because standard specifies 4 making 2 to be
    purely a programming error, in which case "2" becomes "should have used
    a bigger signed type instead".


    Then again, could maybe make sense to add a semantic distinction, say:
    "int" (plain):
    Maybe a case could be made that overflow be assumed unexpected.
    "signed int":
    Maybe make separate from plain case, explicitly modulo;
    So, could be made distinct;
    Explicitly like the "unsigned" case in being modulo.
    "unsigned int":
    Remains the same, no real controversy here.

    Or, say:
    char, short, int, long, long long:
    For code, assume that overflow may be unexpected / undesirable;
    signed char, signed int, signed long, signed long long:
    Assume signed modulo;
    Compiler should, ideally, always produce wrap-on-overflow semantics.
    unsigned ...:
    Unsigned modulo.

    For a compiler, then:
    -ftrapv:
    May ideally trap on lack of "signed";
    Explicit "signed", continues to wrap.
    -fwrapv:
    Both default and signed will wrap.
    Neither:
    Dunno, probably better for compiler to assume "-fwrapv" semantics;
    Maybe assume UB opts are safe if no "signed".


    Well, and for the programmer POV:
    If assuming maximum portability:
    Only unsigned overflow wrapping is "safe".
    If assuming "any reasonable system":
    Both will wrap in most cases;
    Absent "-fwrapv", UB opts may occur in certain obscure edge cases.
    Though usually in the form of "early" vs "late" type promotion;
    In most cases, where it does occur, early promotion is benign.
    Vs whatever "nasal demons" people may assert.
    What else, that it late promotes?
    (as "-fwrapv" semantics would dictate...)


    Like, say:
    int x;
    long z;
    ...
    z = 42 - x;
    //Oh no! UB opt has turned this into a 64-bit RSUB instruction!

    Yeah...
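
    The early-versus-late promotion difference can be spelled out in C. This is a sketch with hypothetical function names: `sub_late` models what strict wrap-on-overflow ("-fwrapv" style) semantics require for `z = 42 - x`, and `sub_early` models the 64-bit subtract a compiler may pick when it assumes the 32-bit subtraction cannot overflow.

```c
#include <stdint.h>

/* "Late" promotion: subtract in 32 bits with wraparound, then widen.
   The wrapped value is converted to signed by hand to avoid an
   implementation-defined narrowing conversion. */
static int64_t sub_late(int32_t x)
{
    uint32_t w = UINT32_C(42) - (uint32_t)x;   /* modulo 2^32 */
    return (w > (uint32_t)INT32_MAX) ? (int64_t)w - 4294967296LL
                                     : (int64_t)w;
}

/* "Early" promotion: widen first, then subtract in 64 bits. */
static int64_t sub_early(int32_t x)
{
    return INT64_C(42) - (int64_t)x;
}
```

    The two agree whenever `42 - x` fits in 32 bits. They differ only for x below -2147483605, for example x = INT32_MIN, where the early form gives 2147483690 and the late form gives -2147483606.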


    Granted, ATM, for BGBCC, wouldn't make much difference at present. Could
    maybe make sense to add a distinction either to strengthen semantic
    analysis, or if I decided to change away from my existing "assume wrap
    on overflow semantics as sole option" policy. Or maybe adding an
    "-fno-wrapv" option, with "wrapv" remaining default but allowing an
    option to opt-out, sort of like how there is an "-fptropts" option to
    "opt into" strict-aliasing / TBAA semantics, vs the default semantics of "assume every explicit store may alias" semantics. Though, may still
    assume that loads may be cached and reordered, unless "volatile" is
    used, which explicitly disallows caching and reordering loads, though at present is a little "shotgun" and will basically disable caching
    throughout the whole basic block; which works as a detractor to the
    "casually use volatile as a way to dispel TBAA" interpretation (works on
    GCC, and is less adverse for performance than the "use memcpy" option on
    some other compilers, ...).


    Or, say:
    Bare pointer cast and deref:
    GCC: adverse (falls afoul of default semantics);
    MSVC: benign;
    BGBCC: benign.
    Volatile pointer cast and deref:
    GCC: benign (doesn't use TBAA on volatile pointers);
    MSVC: benign;
    BGBCC: detrimental, disables caching and ld/st reordering;
    Using memcpy:
    GCC: benign;
    MSVC:
    Old (15+ years):
    Adverse (actually calls memcpy, significant impact);
    Some intermediate versions would do an inline for "REP MOVSB".
    Also kinda crap, but less bad vs calling "memcpy()".
    Mostly only matters if still targeting WinXP or similar.
    Newer: Mild detriment in some cases.
    Inline loads/stores
    may fail to optimize to plain register moves for locals.
    BGBCC;
    Mostly similar to newer MSVC here;
    Works, just less efficient than plain "cast and deref".

    ...


  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon May 25 15:27:29 2026
    From Newsgroup: comp.arch

    An awkward thing about using trap on overflow is determining how
    precisely it is defined.

    Indeed, this is a nasty part of language design.

    [ IMO, the only sane choice (beside wrapping and explicit `ckd_add`) is
    to treat overflow not as an exception (in the sense of `try..catch`
    thingies, not in the CPU hardware sense of the word) but as an
    execution error comparable to memory exhaustion. ]

    Luckily, for `comp.arch` the same problem doesn't plague ISAs because
    it's accepted that a CPU should stick religiously to the literal
    semantics of the machine code, no matter how far it is from what
    really happens inside the machine.


    === Stefan
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Tue May 26 05:39:02 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> schrieb:
    On Mon, 25 May 2026 10:23:00 +0200, David Brown wrote:

    The hardware, of course, cannot always enable trapping on overflow if it
    is going to efficiently support a range of programming languages.

    Yes. And I am used to FORTRAN, which did not trap on integer overflows.

    Incorrect.

    Integer overflow is illegal in Fortran, so what the compiler then
    does is not determined (see my post on random number generators).

    Example:

    $ cat overfl.f90
    program main
    integer :: a, b
    a = 12345678
    b = 2345678
    print *,a*b
    end program main
    $ gfortran -fsanitize=undefined overfl.f90
    $ ./a.out
    overfl.f90:5:13: runtime error: signed integer overflow: 12345678 * 2345678 cannot be represented in type 'integer(kind=4)'
    -1979197244
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue May 26 08:18:17 2026
    From Newsgroup: comp.arch

    On 25/05/2026 18:43, Anton Ertl wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 25/05/2026 16:28, Anton Ertl wrote:
    Despite their eagerness to "optimize" based on the assumption
    that signed integer overflow does not happen, the GCC developers have
    avoided making -ftrapv the default, even on platforms like MIPS and
    Alpha where the implementation of -ftrapv just means to use different
    instructions (e.g., add instead of addu on MIPS, and addv instead of
    add on Alpha).

    An awkward thing about using trap on overflow is determining how
    precisely it is defined. Supposing you have the expression "a + b - a".
    Perhaps "a + b" overflows. I would hope that when using debug-related
    compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
    would check for overflow on "a + b", and report it at runtime.
    (Unfortunately, gcc does not do that unless the partial expression is
    assigned to a variable.) But in "normal" usage, I'd expect the
    expression to be simplified, resulting in just "b" and no overflow.

    OTOH, cases like a+b+c where the result is in range, while an
    intermediate result is out of range are one of the reasons why I
    prefer -fwrapv over -ftrapv. As for your preference of nasal demons,
    given enough information, the compiler might "optimize" "a+b-a" into,
    e.g., 0.

    Anyway, the definition of -ftrapv is not very precise; for gcc-12.2:

    |'-ftrapv'
    | This option generates traps for signed overflow on addition,
    | subtraction, multiplication operations.


    My understanding is that the GCC developers would rather deprecate
    -ftrapv entirely, and encourage the use of -fsanitize instead as a way
    to detect run-time errors. I don't know the details of the internals,
    but I believe the GCC developers see the sanitize options as more
    accurate and more likely to be further developed in the future.

    As for what gcc-12.2 does for your example on AMD64:

    long foo(long a, long b)
    {
    return a+b-a;
    }

    is compiled with gcc -O3 -ftrapv to:

    0: 48 89 f0 mov %rsi,%rax
    3: c3 ret
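
    Why folding `a+b-a` to `b` is exact under wrapping semantics can be checked with a small sketch (the function name is hypothetical). Unsigned 64-bit arithmetic stands in for wrapping signed arithmetic, since both operate on the same bit patterns:

```c
#include <stdint.h>

/* Evaluates a + b - a step by step in wrapping (modulo 2^64)
   arithmetic, without any algebraic simplification in the source. */
static uint64_t add_then_sub(uint64_t a, uint64_t b)
{
    uint64_t t = a + b;   /* may wrap around */
    return t - a;         /* wraps back, so the result always equals b */
}
```

    Under modular arithmetic the intermediate wrap cancels out, so the two-instruction code above gives the same value as the unsimplified expression. Under a strict "every operation traps on overflow" reading, the same folding silently removes a trap that `a + b` would have raised, which is the imprecision being discussed.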

    If "trap on overflow" has precise semantics in the code, then this
    disables a range of useful optimisations and re-arrangements. If it is
    just "use trapping arithmetic instructions", then it will miss many
    possible cases of actual overflow in the code, which we might want to
    catch.

    Which would you prefer by default?

    I don't know for sure. A "by default" choice has to be suitable for a
    wide variety of users and a wide variety of cases, and preferably err on
    the side of caution. For my own personal use, I'm happy with UB
    overflow and would have preferred that as the default even for unsigned arithmetic (but of course with a way to specify wrapping when I need
    it). But that's for /my/ use - I don't think that should necessarily be
    the default for others. Let those who are willing to spend the time and effort learning the details and the care needed use compiler flags to
    get the highest efficiency from their code, and let the defaults help
    others catch their bugs. However, the logical endpoint of that is that
    C should only be used by those that have a detailed understanding of the language and need it for peak efficiency, while other programmers should
    work with other languages that have more error handling.



    The gcc developers apparently took the latter approach, even when you
    ask for -ftrapv explicitly. So what, IYO, speaks against doing that
    by default on machines like MIPS and Alpha.

    And "trap on overflow" might either trigger when there is no
    overflow in the original code, or hinder optimisations. (Consider the
    expression "x / 2 + y / 2" - the compiler could implement that as a
    combined "(x + y) / 2", but that might introduce overflow.)

    x/2+y/2 produces a different result from (x+y)/2 when both x and y are
    odd integers.
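
    A quick sketch of the difference, with hypothetical function names. C integer division truncates toward zero, so each odd operand loses a half before the addition in the first form:

```c
/* Halve each operand first: never overflows, but drops one unit
   when both operands are odd. */
static long half_each(long x, long y)
{
    return x / 2 + y / 2;
}

/* Add first, then halve: keeps that unit, but x + y may overflow
   (undefined behaviour for signed long in C). */
static long half_sum(long x, long y)
{
    return (x + y) / 2;
}
```

    For x = 3 and y = 5 the first gives 1 + 2 = 3 and the second gives 8 / 2 = 4; for x = 4 and y = 6 both give 5.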


    True. Can we pretend that is not the case, and still see my point? The
    point is that the compiler can, during re-arrangements, introduce new overflows as long as it knows the final results are correct (since the compiler knows the details of how instructions are actually implemented).

    gcc-12.2 compiles

    long bar(long x, long y)
    {
    return x/2+y/2;
    }


    on AMD64 to:

    gcc -O3 -ftrapv         gcc -O3
    mov %rdi,%rax           mov %rdi,%rax
    sub $0x8,%rsp           mov %rsi,%rdx
    shr $0x3f,%rax          shr $0x3f,%rax
    add %rax,%rdi           shr $0x3f,%rdx
    mov %rsi,%rax           add %rdi,%rax
    shr $0x3f,%rax          add %rsi,%rdx
    sar %rdi                sar %rax
    add %rax,%rsi           sar %rdx
    sar %rsi                add %rdx,%rax
    call __addvdi3@PLT      ret
    add $0x8,%rsp
    ret

    so the -ftrapv introduces an additional mov and a call; I would have
    expected that the + would be compiled to an ADD instruction followed
    by a JO instruction.

    Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
    produces ILP32 code) produces a call to __addvsi3 instead of the
    expected add instruction:

    gcc -O3 -ftrapv         gcc -O3
    lui gp,0x0              srl v0,a0,0x1f
    addiu gp,gp,0           srl v1,a1,0x1f
    addu gp,gp,t9           addu v0,v0,a0
    srl v1,a0,0x1f          addu a1,v1,a1
    lw t9,__addvsi3(gp)     sra v0,v0,0x1
    srl v0,a1,0x1f          sra a1,a1,0x1
    addiu sp,sp,-32         jr ra
    addu a0,v1,a0           addu v0,v0,a1
    addu a1,v0,a1
    sra a0,a0,0x1
    sw ra,28(sp)
    sw gp,16(sp)
    jalr t9
    sra a1,a1,0x1
    lw ra,28(sp)
    jr ra
    addiu sp,sp,32

    The call costs a lot of overhead.

    Agreed. I don't know why GCC uses a function call here. In my quick
    godbolt testing, clang uses the "add, jump-on-overflow" sequence.

    Using

    -fsanitize=signed-integer-overflow -fsanitize-trap

    gives an add followed by a jump-on-overflow sequence.
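
    For comparison, a source-level sketch of the same "add, then branch on overflow" pattern. It assumes a GCC- or Clang-compatible compiler, since it relies on the __builtin_add_overflow and __builtin_trap builtins; the wrapper names checked_add and trapping_add are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores a + b in *out when the sum fits in a long;
   returns false on signed overflow. Relies on the GCC/Clang builtin,
   which itself returns true when the operation overflowed. */
bool checked_add(long a, long b, long *out)
{
    return !__builtin_add_overflow(a, b, out);
}

/* Trapping variant: stops the program via __builtin_trap() on overflow. */
long trapping_add(long a, long b)
{
    long sum;
    if (__builtin_add_overflow(a, b, &sum))
        __builtin_trap();
    return sum;
}

/* Self-check with illustrative values. */
void check_checked_add(void)
{
    long r;
    assert(checked_add(1, 2, &r) && r == 3);
    assert(!checked_add(LONG_MAX, 1, &r));
}
```

    Because the check is explicit in the source, the programmer rather than a compiler flag decides which additions are checked.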


    It is not easy to see how a tool can avoid false positives and false
    negatives and also conveniently optimise and re-arrange code.

    It can't. But it does not try to avoid false negatives even when
    explicitly asked for trapping on overflow.

    If some overflow trapping, when it can be done without additional
    instructions, would be preferable over no overflow trapping, gcc would compile
    signed adds that survive after optimization into add on MIPS rather
    than addu, by default. Given that it does not, the GCC developers
    probably found out that it is not preferable. I guess they would get
    too many customer complaints, including for "relevant" code, i.e.,
    code where the usual "it's UB, so your code is broken" excuse does not
    work.

    If "-ftrapv" is to have any use at all, then overflow is no longer UB -
    it has to be defined to trap. But I have to conclude that in GCC,
    -ftrapv is too vaguely defined and too inconsistently and inefficiently
    implemented to be of any use. This matches my understanding that the
    "-fsanitize=signed-integer-overflow -fsanitize-trap" flags are preferred
    by the GCC developers.


    The fact that they don't even try to make -ftrapv produce efficient
    code indicates that there is no "relevant" interest in efficient
    -ftrapv. It would be interesting to know who came up with the idea of
    adding -ftrapv, and why they are still keeping it.

    Compilers have not always been good at taking advantage of all the
    features provided by hardware

    GCC is pretty good at implementing -fwrapv. For the two examples
    above, "gcc -O3 -fwrapv" produces the same code on AMD64 and MIPS as
    "gcc -O3".
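
    A minimal sketch of getting wrapping behaviour for one addition without the flag, by doing the add in unsigned arithmetic. The name wrapping_add is hypothetical. The conversion back to long is implementation-defined in ISO C; GCC documents it as reduction modulo 2^N, which is the assumption here.

```c
#include <assert.h>
#include <limits.h>

/* Two's-complement wrapping add that does not depend on -fwrapv.
   Unsigned addition is defined to wrap; converting the result back to
   long is implementation-defined (assumed here: modulo 2^N, as GCC
   documents). */
long wrapping_add(long a, long b)
{
    return (long)((unsigned long)a + (unsigned long)b);
}

/* Self-check with illustrative values. */
void check_wrapping_add(void)
{
    assert(wrapping_add(2, 3) == 5);
    assert(wrapping_add(LONG_MAX, 1) == LONG_MIN);
}
```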

    That is my experience too (though I expect your experience here vastly outweighs mine).


    nor have languages been good at exposing
    the possibilities in the language so that programmers can take advantage
    of them.

    Yes. But I leave that for another day.


    Good idea :-)

  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue May 26 08:27:28 2026
    From Newsgroup: comp.arch

    On 26/05/2026 01:00, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
    What you do want is compiled code that can trap on overflow and avoid
    trapping on overflow without code substitution or being re-compiled.
    This way production code can avoid trapping but if the debugger is
    turned on, you can trap.

    Why do you consider that desirable?

    So you can debug production/released code to find subtle errors.

    I think that when an unexpected error is detected (whether it is with
    hardware acceleration, like trap on overflow, or via explicit generated
    code), the way to handle it depends strongly on the situation. If a
    debugger is present, then it is most helpful to lead to a debugger break
    so that the developer can figure out what went wrong. When not
    debugging, there is no sensible default handling that works for jet
    engine controllers and video game frame generators.

    But I do support the aim of having the same generated code when
    debugging and when shipping - I am not a fan of "release" builds and
    "debug" builds. (Of course you might temporarily do builds with
    different flags while chasing down a particular bug.)

  • From quadi@quadibloc@ca.invalid to comp.arch on Tue May 26 15:13:31 2026
    From Newsgroup: comp.arch

    On Sun, 24 May 2026 16:39:25 +0000, quadi wrote:

    On Sun, 24 May 2026 15:24:22 +0000, John Levine wrote:

    Sure they did. S/360 had separate unsigned versions of add and subtract
    instructions. The results were the same but the condition codes were
    different and the unsigned versions couldn't overflow.

    Ah, I didn't remember that!

    I just looked it up. It was, and is, the Add Logical instruction.

    John Savard
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue May 26 18:02:51 2026
    From Newsgroup: comp.arch


    BGB <cr88192@gmail.com> posted:

    On 5/25/2026 3:34 PM, quadi wrote:
    On Mon, 25 May 2026 16:49:59 +0000, MitchAlsup wrote:
    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    RISC-V: add ignores signed overflow, there is no add that traps on
    signed overflow (and detecting signed overflow is pretty
    involved if both operands are unknown to the compiler).
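
    To give a feel for what detection costs without a flag or a trapping add, here is one common bit-trick written in C. It is a sketch only, not the sequence any particular compiler emits, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Detect signed overflow of a + b using only add, xor, and, and a shift.
   Overflow occurred iff both operands have the same sign and the wrapped
   sum has the opposite sign, i.e. the sign bit of
   (a ^ sum) & (b ^ sum) is set. */
bool add_overflows_i64(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t sum = ua + ub;                 /* wraps, no undefined behaviour */
    return (((ua ^ sum) & (ub ^ sum)) >> 63) != 0;
}

/* Self-check with illustrative values. */
void check_add_overflows(void)
{
    assert(!add_overflows_i64(1, 2));
    assert(add_overflows_i64(INT64_MAX, 1));
    assert(add_overflows_i64(INT64_MIN, -1));
}
```

    That is several extra operations per addition, against a single flag test or trapping add on ISAs that provide one.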

    The worst of all possible semantic encodings

    Although I thought that making trapping on fixed-point overflow the
    default is a bad idea, I agree that making it impossible to do so, or even
    test for fixed-point overflow, is a much worse idea.


    Possibly true.

    The lack of things like ADD-with-Carry or ADD-with-Overflow are
    annoyance points on RISC-V.


    Though, it is less obvious what a useful behavior is at the language level:
    "signal()" ? ...
    Something like try/catch (mostly N/A to C)?
    Something similar to FENV_ACCESS?
    ...

    The important property is that overflow is detected precisely.
    Whether {trap, signal, throw} is performed is an environmental choice
    not an ISA choice.

    Well, and that if trapping were applied globally:
    Overhead due to trap detection/handling code causing excessive bloat;
    Overflow traps from any code that naively assumes wrap-on-overflow
    semantics;
    ...

    In some codebases, it is already enough of a pain to hunt and fix all
    the out-of-bounds and uninitialized variables mess.
    Signed integer overflows would likely "turn it up to 11";
    Then, how does one fix it? Ask that people start adding a bunch of casts
    to make it work?...

    One might say:
    Add "if()" cases to deal with the overflows, but, ... this only makes
    sense for cases where the overflows are not the expected behavior.

    If(overflow(??)) requires some flag to carry overflow from point of
    detection to if(()).

    And what happens if there is more than 1 overflow ??

    Then again, could maybe classify code, say:
    1, signed, value doesn't (or shouldn't) go out-of-range;
    2, unsigned, value doesn't (or shouldn't) go out-of-range;
    3, signed, value is expected to be modulo;
    4, unsigned, value is expected to be modulo.
    5, a language hint about in-range, wrap, trap, signal, throw

    "nasal demons" types assume 1 and 4 as dominant.
    Or, 1 as exclusive vs 3.

    For compilers, we often need to assume 3 and 4.
    Because, failure to uphold 3 results in misbehaving programs.
    And, if 3 were uncommon, RISC-V's "ADDW"/etc would be pure stupidity.

    You would prefer::

    AND R7,Rleft,#~(~0<<31)
    AND R8,Rright,#~(~0<<31)
    ADD Rd,R7,R8
    AND Rd,Rd,#~(~0<<31)

    That is ADDW range limits operands and performs a shorter ADD.
    Matching C's int a,b; semantic. In general the integer instructions
    ending with W apply C's int properties to the arithmetic. If compilers
    were (WERE) really good at range determination those instructions would
    be unnecessary--but they are not.
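
    A small C model of what a 32-bit add in a 64-bit register amounts to, assuming the RISC-V ADDW behaviour of keeping the low 32 bits of the sum and sign-extending them. The name addw_model is hypothetical and the sketch is illustrative, not a statement about any implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Models a 32-bit add whose result is sign-extended into a 64-bit
   register. Upper bits of the inputs are ignored. */
int64_t addw_model(int64_t rs1, int64_t rs2)
{
    uint32_t low = (uint32_t)rs1 + (uint32_t)rs2;   /* low 32 bits, wraps */
    /* Portable sign extension: flip the sign bit, then subtract 2^31. */
    return (int64_t)(low ^ 0x80000000u) - 0x80000000LL;
}

/* Self-check with illustrative values. */
void check_addw_model(void)
{
    assert(addw_model(5, 7) == 12);
    assert(addw_model(INT32_MAX, 1) == INT32_MIN);  /* wraps like C's int */
}
```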

    I (My 66000) had to put in sized integer calculation reasons, and by
    doing so, gained 2%-4% in code density and a bit more in latency.
    -----------------------
  • From BGB@cr88192@gmail.com to comp.arch on Tue May 26 14:28:56 2026
    From Newsgroup: comp.arch

    On 5/26/2026 1:02 PM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    On 5/25/2026 3:34 PM, quadi wrote:
    On Mon, 25 May 2026 16:49:59 +0000, MitchAlsup wrote:
    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    RISC-V: add ignores signed overflow, there is no add that traps on
    signed overflow (and detecting signed overflow is pretty
    involved if both operands are unknown to the compiler).

    The worst of all possible semantic encodings

    Although I thought that making trapping on fixed-point overflow the
    default is a bad idea, I agree that making it impossible to do so, or even
    test for fixed-point overflow, is a much worse idea.


    Possibly true.

    The lack of things like ADD-with-Carry or ADD-with-Overflow are
    annoyance points on RISC-V.


    Though, it is less obvious what a useful behavior is at the language level:
    "signal()" ? ...
    Something like try/catch (mostly N/A to C)?
    Something similar to FENV_ACCESS?
    ...

    The important property is that overflow is detected precisely.
    Whether {trap, signal, throw} is performed is an environmental choice
    not an ISA choice.

    Yeah.

    Say:
    ADDV Rs, Rt, Rd
    BT __trap_overflow

    Which is how I would assume doing it, if I were to re-add ADDV to my ISA
    (this had existed in SuperH and BJX1, but got lost along the way, but
    could re-add if needed; just it was less often needed than even ADC/ADDC).



    Well, and that if trapping were applied globally:
    Overhead due to trap detection/handling code causing excessive bloat;
    Overflow traps from any code that naively assumes wrap-on-overflow
    semantics;
    ...

    In some codebases, it is already enough of a pain to hunt and fix all
    the out-of-bounds and uninitialized variables mess.
    Signed integer overflows would likely "turn it up to 11";
    Then, how does one fix it? Ask that people start adding a bunch of casts
    to make it work?...

    One might say:
    Add "if()" cases to deal with the overflows, but, ... this only makes
    sense for cases where the overflows are not the expected behavior.

    If(overflow(??)) requires some flag to carry overflow from point of
    detection to if(()).

    And what happens if there is more than 1 overflow ??


    Dunno.
    You would need to set a start point and an end/detection point, and have
    some way for the compiler to know to track overflows.

    Say:
    ADDV ...
    OR?T Re, 0x100, Re

    Then a way to feed Re back into C land to act upon.
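
    In C terms, the "accumulate into Re, test later" idea could look roughly like the sketch below. It assumes a GCC- or Clang-compatible compiler for __builtin_add_overflow; the function name and the sticky variable are hypothetical stand-ins for the register in the pseudo-assembly above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sums n values with wrapping results while OR-ing every overflow
   indication into one sticky flag that is tested once at the end.
   Any number of overflows in between collapse into a single "yes". */
bool sum_with_sticky_overflow(const int32_t *v, size_t n, int32_t *out)
{
    int32_t acc = 0;
    bool sticky = false;                /* plays the role of "Re" */
    for (size_t i = 0; i < n; i++)
        sticky |= __builtin_add_overflow(acc, v[i], &acc);
    *out = acc;                         /* wrapped sum */
    return sticky;
}
```

    The flag records that at least one overflow happened between the start point and the test, not where or how many times.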


    There could maybe either be a 32-bit variant (ADDV.L), or some shorthand
    way to detect that the value has gone outside of 32-bit range.


    Then again, could maybe classify code, say:
    1, signed, value doesn't (or shouldn't) go out-of-range;
    2, unsigned, value doesn't (or shouldn't) go out-of-range;
    3, signed, value is expected to be modulo;
    4, unsigned, value is expected to be modulo.
    5, a language hint about in-range, wrap, trap, signal, throw

    Well, possible, but C doesn't have any hints here...

    But, yeah:
    Leaving plain 'int' as the "probably shouldn't overflow" and 'signed
    int' and 'unsigned int' as "wrap on overflow expected" could make sense.



    "nasal demons" types assume 1 and 4 as dominant.
    Or, 1 as exclusive vs 3.

    For compilers, we often need to assume 3 and 4.
    Because, failure to uphold 3 results in misbehaving programs.
    And, if 3 were uncommon, RISC-V's "ADDW"/etc would be pure stupidity.

    You would prefer::

    AND R7,Rleft,#~(~0<<31)
    AND R8,Rright,#~(~0<<31)
    ADD Rd,R7,R8
    AND Rd,Rd,#~(~0<<31)

    That is ADDW range limits operands and performs a shorter ADD.
    Matching C's int a,b; semantic. In general the integer instructions
    ending with W apply C's int properties to the arithmetic. If compilers
    were (WERE) really good at range determination those instructions would
    be unnecessary--but they are not.

    I (My 66000) had to put in sized integer calculation reasons, and by
    doing so, gained 2%-4% in code density and a bit more in latency.
    -----------------------

    OK.

    Ironically, the 4-op sequence above would have been a single "ADDWU" instruction in the RV BitManip drafts, but ADDWU was dropped as arguably
    it didn't make a big enough difference on SPEC scores. They decided to
    keep a whole bunch of other random crap though that serves no real
    purpose other than to micro-optimize the benchmarks...

    I revived this for my own extensions, but left out ADDIWU as it was
    still not common enough to justify the encoding space cost (if one has jumbo-prefixes, this could be handled well enough via
    immediate-synthesis, and the 64-bit encoding wasn't too bad for
    something that is comparably infrequent).

    ...


  • From George Neuner@gneuner2@comcast.net to comp.arch on Tue May 26 15:29:08 2026
    From Newsgroup: comp.arch

    On Mon, 25 May 2026 23:05:06 GMT, MitchAlsup
    <user5857@newsgrouper.org.invalid> wrote:


    BGB <cr88192@gmail.com> posted:

    On 5/25/2026 9:28 AM, Anton Ertl wrote:
    --------------
    Integer overflow happens far too often for trapping to be a good solution.

    Even on 64-bit variables/machines ??

    Yes if there are options for 8/16/32 bit ops in 64 bit registers.
  • From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Tue May 26 22:09:28 2026
    From Newsgroup: comp.arch

    David Brown wrote:
    On 26/05/2026 01:00, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
    What you do want is compiled code that can trap on overflow and avoid
    trapping on overflow without code substitution or being re-compiled.
    This way production code can avoid trapping but if the debugger is
    turned on, you can trap.

    Why do you consider that desirable?

    So you can debug production/released code to find subtle errors.

    I think that when an unexpected error is detected (whether it is with
    hardware acceleration, like trap on overflow, or via explicit generated
    code), the way to handle it depends strongly on the situation. If a
    debugger is present, then it is most helpful to lead to a debugger break
    so that the developer can figure out what went wrong. When not
    debugging, there is no sensible default handling that works for jet
    engine controllers and video game frame generators.

    But I do support the aim of having the same generated code when
    debugging and when shipping - I am not a fan of "release" builds and
    "debug" builds.  (Of course you might temporarily do builds with
    different flags while chasing down a particular bug.)

    I tend to like "Release with sometimes hard-to-grok debug info",
    typically resulting in a separate file with a best effort debug map of
    the executable.

    Then I can at least get some help when running the debugger and trying
    to binary search my way into the spot where the bug resides.

    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue May 26 20:54:30 2026
    From Newsgroup: comp.arch


    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    David Brown wrote:
    On 26/05/2026 01:00, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
    What you do want is compiled code that can trap on overflow and avoid
    trapping on overflow without code substitution or being re-compiled.
    This way production code can avoid trapping but if the debugger is
    turned on, you can trap.

    Why do you consider that desirable?

    So you can debug production/released code to find subtle errors.

    I think that when an unexpected error is detected (whether it is with
    hardware acceleration, like trap on overflow, or via explicit generated
    code), the way to handle it depends strongly on the situation. If a
    debugger is present, then it is most helpful to lead to a debugger break
    so that the developer can figure out what went wrong. When not
    debugging, there is no sensible default handling that works for jet
    engine controllers and video game frame generators.

    But I do support the aim of having the same generated code when
    debugging and when shipping - I am not a fan of "release" builds and
    "debug" builds. (Of course you might temporarily do builds with
    different flags while chasing down a particular bug.)

    I tend to like "Release with sometimes hard-to-grok debug info",
    typically resulting in a separate file with a best effort debug map of
    the executable.

    Encrypt the debug information (and put it in a {1234-5678-9101-1121-...}
    folder) so that only the owner (not licensee) of the code can debug it.

    Then I can at least get some help when running the debugger and trying
    to binary search my way into the spot where the bug resides.

    Terje

  • From BGB@cr88192@gmail.com to comp.arch on Tue May 26 19:13:21 2026
    From Newsgroup: comp.arch

    On 5/26/2026 2:29 PM, George Neuner wrote:
    On Mon, 25 May 2026 23:05:06 GMT, MitchAlsup <user5857@newsgrouper.org.invalid> wrote:


    BGB <cr88192@gmail.com> posted:

    On 5/25/2026 9:28 AM, Anton Ertl wrote:
    --------------
    Integer overflow happens far too often for trapping to be a good solution.

    Even on 64-bit variables/machines ??

    Yes if there are options for 8/16/32 bit ops in 64 bit registers.

    32-bit overflow is the dominant scenario here.
    While 8 and 16-bit ranges do overflow readily, the normal semantics are
    for them to auto-promote to 32 bits before then being narrowed back down
    to 8 or 16 bits, so they don't count.
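
    A short illustration of that promotion rule, assuming only standard C integer promotions; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Both operands are promoted to int before the add, so the sum itself
   cannot overflow for 8-bit inputs. */
int add_u8_promoted(uint8_t a, uint8_t b) { return a + b; }

/* Only the conversion back to 8 bits discards the high bits. */
uint8_t add_u8_narrowed(uint8_t a, uint8_t b) { return (uint8_t)(a + b); }

/* Self-check with illustrative values. */
void check_u8_promotion(void)
{
    assert(add_u8_promoted(200, 100) == 300);
    assert(add_u8_narrowed(200, 100) == 44);    /* 300 mod 256 */
}
```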


    Ironically, for my BS2 language, the semantics were in cases like this
    to instead auto-promote to 64 bits; but can't really do this for C as it
    gives different results in some cases (and early promotion is itself a
    bug, even if early promotion would often be the most natural semantics
    for a 64-bit machine).


    Well, and there is the usual thing that one can't usually allow a
    variable to hold values outside the range of what would be allowed for
    that variable.


    Well, except for floating-point types, where typically code doesn't care
    about out-of-range values (if a value fails to go to 0 or Inf in a
    computation in local variables, typically no one cares).

    For float, it isn't obvious because the dynamic range of Binary32 is
    already quite large. A "short float" effectively having Binary64's
    dynamic range when used in scalar computations is a bit incredible, but
    given these smaller formats are non-standard anyways, it is reasonable to
    be like "these formats are only necessarily confined to their formal
    range when in-memory, otherwise all bets are off".

    Or: precision and dynamic range >= requested format.

    Code can't entirely rely on the higher precision though, as the format
    may also revert to its defined precision without warning (even if
    intermediate computations may potentially wildly exceed it).

    But, then again, this would be analogous to if one has an FPU with
    native Binary128, occasionally performing "double" calculations at
    Binary128 precision even though "double" is stated as Binary64.

    Well, or implementing some operations by widening temporarily to a higher-precision format before narrowing the result.


    Though, OTOH, the main use-case for things like scalar "short float" is
    more for saving memory in structs and arrays, not for trying to rely on
    its crappy range and precision.

    So, floating point math is very different from integer math in this regard.

    ...

  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Wed May 27 10:59:31 2026
    From Newsgroup: comp.arch

    MitchAlsup [2026-05-26 20:54:30] wrote:
    Encrypt the debug information (and put it in
    a {1234-5678-9101-1121-...} folder) so that only the owner (not
    licensee) of the code can debug it.

    I resent that. All code should be Free Software.


    === Stefan
  • From quadi@quadibloc@ca.invalid to comp.arch on Wed May 27 18:19:49 2026
    From Newsgroup: comp.arch

    On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:
    MitchAlsup [2026-05-26 20:54:30] wrote:

    Encrypt the debug information (and put it in a
    {1234-5678-9101-1121-...} folder) so that only the owner (not
    licensee) of the code can debug it.

    I resent that. All code should be Free Software.

    It is wonderful that we have the open-source software movement.

    However, people have the right to the fruit of their labors. To give them
    away for free is generous, but it should remain a personal choice.

    Of course, copyright has been misused, and deserves a critical
    examination, not the sort of uncritical expansion given to it by
    legislators in the United States - and imposed on the rest of the world by trade threats.

    John Savard
  • From BGB@cr88192@gmail.com to comp.arch on Wed May 27 15:24:09 2026
    From Newsgroup: comp.arch

    On 5/25/2026 5:59 PM, MitchAlsup wrote:

    Thomas Koenig <tkoenig@netcologne.de> posted:

    David Brown <david.brown@hesbynett.no> schrieb:
    On 24/05/2026 23:39, quadi wrote:
    On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:
    quadi <quadibloc@ca.invalid> posted:

    It makes sense to trap on a floating-point overflow, but trapping on an
    integer overflow is usually a terrible idea.

    So, detecting something went wrong and you should inform the programmer
    is a bad idea ???

    No, so being able to turn the trap for integer overflow on should
    definitely be allowed. But that shouldn't be the default behavior.
    Otherwise, programs like random number generators wouldn't work.

    John Savard

    That does not make sense. Code such as random number generators should
    be written so that they are correct in the language they are written in.

    In principle, yes.

    Principle is better in theory than in practice.

    In practice, people often used whatever "worked" on their systems.

    Face it, the poor slug writing the code may not have the faintest
    grasp of the system qualities we are discussing, and does not care
    to learn as long as he can slug through the writing and his program
    does not blow up catastrophically while it is under his purview.

    That defines a lot of what is wrong with SW programming today.

    Implementors have a certain right because they control what their
    compiler does or does not do.

    You would be surprised at how little influence implementors have
    on compilers and other software.


    Yeah.

    You can design the ISA and compiler as one likes.
    But, if existing C code breaks, well then this is not good.


    One might think:
    You know, wrap on overflow, and type promotion where it overflows and
    wraps, and *then* promotes to the wider type on the final assignment, is
    kinda stupid and sucks.

    And, if one goes by "well, signed overflow is UB anyways", then they
    should be able to turn it into a "promote first, then ADD" scenario (may
    be both potentially faster, and less likely to lose information).

    I would be inclined to agree.

    But... there is old code around that will quietly break if the integer overflow and promotion doesn't follow the specific behavior that mimics
    how it would have behaved on 32-bit systems.
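
    The two orderings can be contrasted in a small sketch. The wrapping case is modelled with unsigned arithmetic so that the sketch itself has no undefined behaviour; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit add that wraps, then widens: the behaviour that code written
   for 32-bit systems implicitly relies on. */
int64_t wrap_then_promote(int32_t a, int32_t b)
{
    uint32_t low = (uint32_t)a + (uint32_t)b;
    return (int64_t)(low ^ 0x80000000u) - 0x80000000LL; /* sign-extend */
}

/* Widen first, then add: no information is lost. */
int64_t promote_then_add(int32_t a, int32_t b)
{
    return (int64_t)a + (int64_t)b;
}

/* Self-check with illustrative values. */
void check_promotion_order(void)
{
    assert(wrap_then_promote(INT32_MAX, 1) == INT32_MIN);
    assert(promote_then_add(INT32_MAX, 1) == 2147483648LL);
    assert(wrap_then_promote(2, 3) == promote_then_add(2, 3));
}
```

    The results agree whenever the 32-bit sum is in range and diverge only on overflow, which is why such breakage tends to be quiet.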


    I vaguely remember a case of this involving some robot enemies that
    drive around in ROTT, where if the integer overflow failed to work in
    just the right way, they would all miss their way-points and end up
    crashing into walls or similar.

    Where, the robot enemies followed a path defined as a series of
    waypoints (in a grid world), and once the robot hits a particular spot
    on the grid cell, it will change directions and head along the path.
    But, the particular way the expression to handle this was written was sensitive to the type promotion and wrap-on-overflow semantics in C.

    Also a similar case involving the "elevators", which were effectively
    timed teleporters between different parts of the map (would close door,
    play elevator sound, then right at the end as the door opens, it would teleport the player to the other location and initiate a screen shaking
    effect at around the same time). If the overflow was wrong, the teleport
    would fail and the player would still be in the original location.


    One could fix this stuff with casts or similar, but, when does one draw
    the line exactly?...

    Easier sometimes to make it work than to try to justify that the code was
    already broken due to reliance on UB.

    Well, and to match the behavior of the other compilers, needed to
    implement the behavior the way ROTT expected.


    Where, as noted, ROTT uses fixed-point math with "fixed" as a signed
    32-bit integer, and some cases involve calculations with coordinates
    well outside the world bounds with the seeming intention that these
    high-order components simply disappear into the ether (with the world essentially treated as a wrapping modulo space).


    But, as noted, it differed from my BS2 language, where the default was effectively to auto-promote values to the widest reasonable integer type
    in these cases and then drop down to the final range afterwards (to
    avoid some integer overflows in cases they would happen in C).

    Well, and within BGBCC, there was some non-zero bleed-over between C and
    BS2 (where originally I had been implementing BS2 via BGBCC, with the intention that it would compile to an IL image that would then be run in
    the VM).

    The original VM however, while fast, ended up with horrible code-bloat.
    Had gotten creative with the use of the C preprocessor in ways that were ultimately a terrible idea (errm, trying to use it sorta like a
    poor-man's version of C++ templates). Binaries got huge, build times
    sucked. This VM was a dead end.


    Ironically, some of my current ISA projects were built on some of the groundwork left by this experiment, but also as a warning for something
    not to do.

    Or, when I learned the merit of actually writing all the opcode handler functions and similar by hand and not trying to do combinatorial stuff
    via the preprocessor.


    Also for the follow up VM (for BS2), had gone back to ye-olde stack
    machine (vs a Register IR model). But, some parts of this were relevant
    to targeting an "actual CPU".

    The way JX2VM works isn't too far removed from those VMs in some ways,
    apart from JX2VM's general avoidance of getting too clever with the C preprocessor.

    ...


  • From quadi@quadibloc@ca.invalid to comp.arch on Sat May 30 04:02:45 2026
    From Newsgroup: comp.arch

    On Mon, 25 May 2026 23:03:03 +0000, MitchAlsup wrote:

    Another One Bites the Dust.....

    Yes, it certainly is true that Concertina IV retains a lot of baggage
    which might be considered silly from even the original Concertina design.

    And, since I have a "set flag" instruction still... I needed to have predicated instructions. So I added those in... giving an instruction
    format which included either a predicated 32-bit instruction, or a
    predicated pair of 16-bit short instructions... which now could have full register specifications! And with predicated instructions, I also brought
    back the break bit.

    So even without block structure, I brought back VLIW features!

    I was so dismayed by how limited my 16-bit short instructions were, that
    this was nice - but having two 16-bit short instructions inside a 48-bit instruction was not a gain on using 24-bit short instructions instead!

    Well, I added a new 80-bit instruction format, which no longer allowed predication, but which allowed those short instructions to be used with
    less overhead.

    I felt I could do even better. I wanted to add 112-bit instructions, to
    split the 16 bits of overhead between three pairs of these nicer short instructions. It was hard to find the opcode space for them, but I finally
    did it.

    John Savard
  • From quadi@quadibloc@ca.invalid to comp.arch on Sat May 30 04:06:14 2026
    From Newsgroup: comp.arch

    On Mon, 25 May 2026 23:03:03 +0000, MitchAlsup wrote:
    quadi <quadibloc@ca.invalid> posted:

    A major goal of the Concertina II, III, and IV architectures is for
    instructions not to be longer than similar instructions on the Motorola
    68020 or the IBM System/360 if at all possible.

    Basically, the selling point is... "Your programs only get 10% bigger,
    if that, and yet you have 32 registers, so they run faster!".

    Mine are getting 30% smaller and needing fewer instructions at the same
    time

    Well, then you're obviously doing something amazing with MY 68000, and I
    don't have the experience to know which modifier bits, if added, would
    save instructions often enough to more than pay for the space they take up.

    I have to be content with doing the best I can, despite not being capable
    of doing much more than slavishly copying existing commercial
    architectures.

    John Savard
  • From quadi@quadibloc@ca.invalid to comp.arch on Sat May 30 15:47:00 2026
    From Newsgroup: comp.arch

    On Sat, 30 May 2026 04:02:45 +0000, quadi wrote:

    So even without block structure, I brought back VLIW features!

    I had a little opcode space remaining. So now I have made what is perhaps
    my maddest addition to the Concertina IV architecture yet!

    In the normal instruction set of the Concertina IV, it was necessary to
    extend the 32-bit instruction set to intrude, ever so slightly, into the portion of the opcode space where instructions begin with 11.

    This was because in the 3/4 of the opcode space initially allocated to 32-
    bit instructions, there wasn't quite enough room for a Halfword Immediate instruction that was 32 bits long, but allowed all 32 registers to be used
    as destination registers.

    Well, for the primary instruction set, this was no real problem. It may
    have made decoding the lengths of instructions less simple and elegant,
    but there was still enough space for instructions longer than 32 bits and
    for the short instructions, both 16-bit and 24-bit - which chopped that remaining space up into pieces anyways.

    But in the 48-bit instructions with an instruction that can be predicated,
    and the 80-bit and 112-bit instructions with two or three instructions
    which can be indicated explicitly as parallelizable... there's a field
    that can _only_ be used for a 32-bit instruction.

    So in there, the opcode space of 32-bit instructions starting with 11 is almost completely unused... but I can't use it for paired 15-bit short instructions because of that Halfword Immediate instruction.

    Well, now the Halfword Immediate instruction for that case has been
    modified, so that paired short instructions including short instructions
    other than register-to-register operate instructions can be used.

    John Savard
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat May 30 19:03:18 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Mon, 25 May 2026 23:03:03 +0000, MitchAlsup wrote:
    quadi <quadibloc@ca.invalid> posted:

    A major goal of the Concertina II, III, and IV architectures is for
    instructions not to be longer than similar instructions on the Motorola
    68020 or the IBM System/360 if at all possible.

    Basically, the selling point is... "Your programs only get 10% bigger,
    if that, and yet you have 32 registers, so they run faster!".

    Mine are getting 30% smaller and needing fewer instructions at the same time

    Well, then you're obviously doing something amazing with MY 68000, and I

    s/68/66/

    don't have the experience to know which modifier bits, if added, would
    save instructions often enough to more than pay for the space they take up.

    1) never use instructions to paste constant bits together
    2) never use LDs to fetch constants from data-memory
    3) provide ENTER and EXIT to setup and tear-down stack frames
    4) provide [Rbase + Rindex<<scale + Displacement] addressing
    5) encode orthogonal features in a single encode field
    6) spend years reading ASM code from your compiler

    The rest (encoding) is the easy part.

    I have to be content with doing the best I can, despite not being capable
    of doing much more than slavishly copying existing commercial
    architectures.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat May 30 19:15:56 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Sat, 30 May 2026 04:02:45 +0000, quadi wrote:

    So even without block structure, I brought back VLIW features!

    I had a little opcode space remaining. So now I have made what is perhaps
    my maddest addition to the Concertina IV architecture yet!

    In the normal instruction set of the Concertina IV, it was necessary to extend the 32-bit instruction set to intrude, ever so slightly, into the portion of the opcode space where instructions begin with 11.

    This was because in the 3/4 of the opcode space initially allocated to 32-bit instructions, there wasn't quite enough room for a Halfword Immediate instruction that was 32 bits long, but allowed all 32 registers to be used as destination registers.

    Yet, My 66000 only has 29 instructions that use 16-bit (or larger) in instruction constants (immediates and displacements)--this includes 2 instructions for Branch on Bit, 2 instructions for branch on condition,
    2 26-bit branch instructions, 13 Disp16 memory references, {9 integer,
    and 2 miscellaneous instructions} with 16-bit immediates.

    Only 29 from an OpCode space of 64 slots with 6 permanently reserved to
    prevent executing code. So, only 1/2 my Major OpCode space is used with immediates--with 16-slots available for the future (22 if you count the reserved slots).

    Well, for the primary instruction set, this was no real problem. It may
    have made decoding the lengths of instructions less simple and elegant,
    but there was still enough space for instructions longer than 32 bits and for the short instructions, both 16-bit and 24-bit - which chopped that remaining space up into pieces anyways.

    It costs me only 6 gates (2 gates of delay) to decode the length of an instruction--whereas it takes 4 gates to decode S/360 2-bit code for instruction length.

    But in the 48-bit instructions with an instruction that can be predicated, and the 80-bit and 112-bit instructions with two or three instructions
    which can be indicated explicitly as parallelizable... there's a field
    that can _only_ be used for a 32-bit instruction.

    An architecture is just as much about what you leave out as what you
    put in.

    So in there, the opcode space of 32-bit instructions starting with 11 is almost completely unused... but I can't use it for paired 15-bit short instructions because of that Halfword Immediate instruction.

    Based on my above: you should not need more than 1/2 OpCode space for instructions with 16-bit immediates.

    Well, now the Halfword Immediate instruction for that case has been modified, so that paired short instructions including short instructions other than register-to-register operate instructions can be used.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Sun May 31 01:22:51 2026
    From Newsgroup: comp.arch

    On Sat, 30 May 2026 15:47:00 +0000, quadi wrote:
    On Sat, 30 May 2026 04:02:45 +0000, quadi wrote:

    So even without block structure, I brought back VLIW features!

    I had a little opcode space remaining. So now I have made what is
    perhaps my maddest addition to the Concertina IV architecture yet!

    At least this reminded me that embedding instructions inside long
    instructions is, in one very important respect, very different from having
    a block structure for program code. So I have now added a warning about
    how branching to an embedded instruction will not work unless a number of strict conditions are met.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Sun May 31 02:57:08 2026
    From Newsgroup: comp.arch

    On Sat, 30 May 2026 15:47:00 +0000, quadi wrote:

    So in there, the opcode space of 32-bit instructions starting with 11 is almost completely unused... but I can't use it for paired 15-bit short instructions because of that Halfword Immediate instruction.

    Well, now the Halfword Immediate instruction for that case has been
    modified, so that paired short instructions including short instructions other than register-to-register operate instructions can be used.

    I felt that this, while tempting, was still a crazy idea. But now I see
    what my subconscious motivation could have been.

    Adding this additional, seemingly redundant, short instruction
    capability... now makes it possible to think of removing the one feature
    of Concertina IV that I dislike the most: the 24-bit short instructions.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Sun May 31 12:05:00 2026
    From Newsgroup: comp.arch

    On Sun, 31 May 2026 01:22:51 +0000, quadi wrote:

    At least this reminded me that embedding instructions inside long instructions is, in one very important respect, very different from
    having a block structure for program code. So I have now added a warning about how branching to an embedded instruction will not work unless a
    number of strict conditions are met.

    And now I've added the Branch to Embedded instruction, which points to the
    larger instruction and then indicates which embedded instruction within it
    is to receive control, as a way of avoiding these restrictions, should
    anyone ever need such an instruction.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Sun May 31 17:26:41 2026
    From Newsgroup: comp.arch

    On Sun, 31 May 2026 12:05:00 +0000, quadi wrote:

    And now I've added the Branch to Embedded instruction, which points to
    the larger instruction, and then indicates which embedded instruction
    within it to which control is to be transferred as a method of avoiding
    these restrictions, should anyone ever need such an instruction.

    And now a minor change: since the opcode space was available, the shift instructions, not only the operate instructions, among the 24-bit short instructions, may now affect the condition codes.

    Oh yes, and I've added 144-bit instructions that provide four embedded
    32-bit instructions with an explicit indication of parallelism.

    John Savard

    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Sun May 31 11:05:37 2026
    From Newsgroup: comp.arch

    On 5/30/2026 12:15 PM, MitchAlsup wrote:

    quadi <quadibloc@ca.invalid> posted:

    snip

    But in the 48-bit instructions with an instruction that can be predicated,
    and the 80-bit and 112-bit instructions with two or three instructions
    which can be indicated explicitly as parallelizable... there's a field
    that can _only_ be used for a 32-bit instruction.

    An architecture is just as much about what you leave out as what you
    put in.

    John's answer - leave out as little as possible, preferably nothing! :-)
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sun May 31 18:40:48 2026
    From Newsgroup: comp.arch


    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> posted:

    On 5/30/2026 12:15 PM, MitchAlsup wrote:

    quadi <quadibloc@ca.invalid> posted:

    snip

    But in the 48-bit instructions with an instruction that can be predicated,
    and the 80-bit and 112-bit instructions with two or three instructions
    which can be indicated explicitly as parallelizable... there's a field
    that can _only_ be used for a 32-bit instruction.

    An architecture is just as much about what you leave out as what you
    put in.

    John's answer - leave out as little as possible, preferably nothing! :-)

    Which is why his architecture is converging so rapidly.



    NOT.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 01:14:12 2026
    From Newsgroup: comp.arch

    On Wed, 20 May 2026 17:47:59 +0000, John Levine wrote:

    Having looked into this in some detail, both when IBM used bigendian
    order on S/360 and DEC used little-endian on the PDP-11, neither
    documented the reasons for the byte order choice at all. Not even a
    little bit.

    I suppose that, at the time, it was something that nobody felt was
    important enough to document.

    But to people who were around back then, the reasons would have been
    obvious.

    IBM mainframes were designed to ooze quality! So here and there, an extra transistor or two was added if something seemed better. That's why the IBM 7090 used sign-magnitude arithmetic for integers.

    And that's why the IBM 360 jumped ahead to the end of an integer and
    worked backwards to add, because putting things in reverse order would
    have shouted cheap.

    Plus, the 360 came in a variety of bus widths. So when would you start
    putting the small part first? (They didn't know the answer the PDP-11 came
    up with. Nobody back then could even imagine it, it was so new.)

    The original PDP-11 only came with a 16-bit bus. But its designers aspired
    to the level of consistency that the 360 had, but they wanted to do it on
    a rock-bottom minicomputer budget. DEC minis, in fact, were cheaper than
    most other brands of minicomputer at the time.

    So they were going to put the most significant 16-bit word of a 32-bit
    integer last. But they got the brilliant idea - that more pedestrian
    designers would never even have considered for a second, or even thought
    of as possible - of numbering the bytes in a word backwards too, so as to
    attain consistency.

    The PDP-11 made little-endian a thing. It was so new that the people
    designing the floating-point unit didn't get the memo. But the concept of making little-endian consistent, instead of something you did in one particular case, the case where something was twice the size of your
    biggest register... that was only born with the PDP-11.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 01:40:59 2026
    From Newsgroup: comp.arch

    On Sun, 31 May 2026 11:05:37 -0700, Stephen Fuld wrote:
    On 5/30/2026 12:15 PM, MitchAlsup wrote:

    An architecture is just as much about what you leave out as what you
    put in.

    Those are words of wisdom, undoubtedly.

    John's answer - leave out as little as possible, preferably nothing!

    So why do I choose openly to defy good sense, and neglect them?

    That's a fair question.

    My answer, though, is a simple one. I've opened my eyes, and looked at the world around me.

    When it comes to desktop computers, the ones people generally use when
    trying to solve a problem more serious than could be dealt with on a smartphone... what processor is in them?

    Well, there _is_ the Macintosh, which also used x86 for a time, but is now using ARM.

    But in general, x86 is dominant. There's too much software written to run
    on x86 Windows.

    So what I've learned is that the world of computer architectures seems to
    be like _Highlander_... "There can be only one".

    And if that one leaves out a feature, then that means that feature is basically not available. I want everyone to have a chance to efficiently
    solve their problems, whatever special instructions or data formats they
    may need.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Mon Jun 1 02:20:36 2026
    From Newsgroup: comp.arch

    It appears that quadi <quadibloc@ca.invalid> said:
    On Wed, 20 May 2026 17:47:59 +0000, John Levine wrote:

    Having looked into this in some detail, both when IBM used bigendian
    order on S/360 and DEC used little-endian on the PDP-11, neither
    documented the reasons for the byte order choice at all. Not even a
    little bit.

    I suppose that, at the time, it was something that nobody felt was
    important enough to document.

    Evidently.

    But to people who were around back then, the reasons would have been
    obvious.

    As I may have said once or twice before, we have plenty of guesses, but
    since there is no documentation, the guesses are a waste of time.

    IBM mainframes were designed to ooze quality! So here and there, an extra
    transistor or two was added if something seemed better. That's why the IBM
    7090 used sign-magnitude arithmetic for integers.

    The 7090 used sign-magnitude because the vacuum tube 709 used sign-magnitude because the 704 used sign-magnitude and they quite reasonably wanted to keep them program compatible. The preceding 701 was also sign-magnitude but had a strange addressing scheme which let you treat memory (which was flaky Williams tubes) as either 36-bit full words or 18-bit half words. Full words were addressed by even negative addresses from -0000 to -4094 while half words were even and odd positive addresses from +0000 to +4095. The 704 did not do that, thank heavens.

    I presume you are aware that the 704 and successors did indexing by two's complement subtraction, which is not sign-magnitude. There is no documentation for that either, and I have looked quite hard. Pretty please, do not guess unless you can cite sources.

    And that's why the IBM 360 jumped ahead to the end of an integer and
    worked backwards to add, because putting things in reverse order would
    have shouted cheap.

    IBM's 702, 705, and 7080 decimal mainframes addressed the low digit of a number and I can assure you they were not cheap.

    The original PDP-11 only came with a 16-bit bus. But its designers aspired
    to the level of consistency that the 360 had, but they wanted to do it on
    a rock-bottom minicomputer budget. DEC minis, in fact, were cheaper than
    most other brands of minicomputer at the time.

    I am familiar with this guess, but having looked at a lot of contemporary DEC documentation, there is no reason to believe it's true. If they saved any transistors by making it little-endian, the difference was trivial.

    You should look at the DG Nova, designed by some DEC renegades, really cheap due
    to using then-new MSI chips, and word addressed with a bigendian feel.

    The PDP-11 made little-endian a thing. It was so new that the people
    designing the floating-point unit didn't get the memo.

    Nor did the people designing the extended multiplier, but they got it
    mostly consistent in the VAX.
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Mon Jun 1 05:36:10 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> schrieb:

    So what I've learned is that the world of computer architectures seems to
    be like _Highlander_... "There can be only one".

    That is what people thought about the /360 until the Minis came
    along, where companies were content to serve new markets and
    customers at lower margins, but higher volume.

    And then RISC, and PCs... and the low end that PCs are being attacked
    from right now is mobile devices, and ARM.

    For this kind of cycle, I highly recommend reading
    https://en.wikipedia.org/wiki/The_Innovator%27s_Dilemma (the book,
    not the Wikipedia article itself). It talks a lot about hard drives,
    but parallels to computers are obvious.
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 1 07:47:42 2026
    From Newsgroup: comp.arch

    John Levine <johnl@taugh.com> writes:
    I presume you are aware that the 704 and successors did indexing by two's
    complement subtraction, which is not sign-magnitude.

    Looking at the 704 manual <https://ia802904.us.archive.org/12/items/bitsavers_ibm7042466_32932660/24-6661-2_704_Manual_1955_text.pdf>,
    it says:

    |Type A instructions use two 15 -bit fields (decrement and address)
    |containing numbers in the octal range 00000 to 77777.

    I did not find other descriptions of addresses; given this
    description, it seems that the addresses and the index registers are
    unsigned. Appendix A discusses binary arithmetic, but explains
    subtraction with borrows rather than addition of the 2s-complement
    (borrows is probably easier to understand given the background of the
    readers, but adding a 1s-complement and one is easier to implement).

    In any case, I don't think that the IBM 704 manual documents
    2s-complement representation of negative numbers for any purpose.

    So why did the S/360 architects go for 2s-complement?

    One speculation is that they wanted 32-bit (unsigned) addresses and
    wanted to be able to use the same adder for the addresses as for the
    integers. But the S/360 only has 24-bit addresses, so going for,
    e.g., sign-magnitude and only declaring the positive numbers <2^24 to
    be valid addresses would also have worked with one adder.

    An alternative speculation is that they really wanted to extend the
    range of the S/360 implementations as far as possible, also on the
    lower end, and the 2s-complement representation for negative numbers
    is cheaper to implement, in particular when you implement a
    bit-serial, nybble-serial, or somesuch machine.

    [quadi <quadibloc@ca.invalid> said:]
    The PDP-11 made little-endian a thing. It was so new that the people
    designing the floating-point unit didn't get the memo.

    Nor did the people designing the extended multiplier, but they got it
    mostly consistent in the VAX.

    This all indicates that byte-ordering decisions worked like in our
    student group. The "right" choice seemed so obvious to everyone that
    we did not communicate about it nor document it nor document the
    reasons for it, and different contributors took different "right"
    choices.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 1 08:36:22 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> writes:
    But they got the brilliant idea - that more pedestrian
    designers would never even have considered for a second, or even thought
    of as possible - of numbering the bytes in a word backwards too, so as to
    attain consistency.

    The designers of the DataPoint 2200 did that, too, in their
    instruction encoding, for no technical reason that I am aware of. And
    the Datapoint 2200 came out within months of the PDP-11, so it is
    unlikely that they took inspiration from the PDP-11 in this decision.

    When you introduce byte addressing, you have to make the byte-ordering decision. Some designers decide for big-endian, and some for
    little-endian, and the decision is mostly arbitrary. And, as John
    Levine writes, the designers of the S/360 and the PDP-11 did not
    document their reasons for that.

    For the 6502, the decision is not arbitrary when implementing the
    addressing modes "ABS,X", "ABS,Y" and "(IND),Y". So for that they
    decided to go for little-endian to simplify the implementation.
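
    To make the "ABS,X" point concrete, here is a minimal Python sketch of
    the data flow only (it is not a cycle-accurate model, and the function
    name is made up for illustration). With a little-endian operand the low
    address byte arrives first, so the index can be added to it while the
    high byte is still being fetched; only the carry remains to be applied
    afterwards.

```python
def abs_x_effective_address(operand_bytes, x):
    """Form a 16-bit ABS,X address from a little-endian 2-byte operand.

    operand_bytes: (low, high) in the order they appear in memory.
    x: 8-bit index register value.
    """
    lo, hi = operand_bytes          # little-endian: low byte is fetched first
    lo_sum = lo + x                 # 8-bit add can start before 'hi' is known
    carry = lo_sum >> 8             # 0 or 1, carried out of the low byte
    hi_sum = (hi + carry) & 0xFF    # high byte only needs the carry applied
    return (hi_sum << 8) | (lo_sum & 0xFF)


if __name__ == "__main__":
    # Operand bytes F0 12 encode address 0x12F0; adding X = 0x20 crosses a page.
    print(hex(abs_x_effective_address((0xF0, 0x12), 0x20)))
```

    With a big-endian operand the high byte would arrive first, and the
    8-bit adder would have nothing useful to do until the second byte came in.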

    Its predecessor, the 6800, does not have any operations where 16-bit
    numbers coming from memory are added to something else (at least I did
    not find such operations), and therefore the decision could be made
    arbitrarily, and they decided on big-endian. But I think that the
    6809 and the 68000 have addressing modes where the big-endian nature
    complicates the implementation.

    I looked at how this turned out for the offspring of the Datapoint
    2200: For the Z80, I did not find any instruction where the
    little-endian byte order provided an advantage: when a 16-bit value is
    accessed in memory, it is used directly instead of being added or
    somesuch. For the 8088, in theory little-endian might provide an
    advantage when it comes to addressing modes such as disp16[BX], but
    AFAIK in practice the 8088 was internally mostly an 8086, with a
    16-bit adder, so it loaded the whole 16-bit number anyway before doing
    the full 16-bit add (am I wrong?). Likewise for the 386SX.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Mon Jun 1 16:04:26 2026
    From Newsgroup: comp.arch

    Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:

    So why did the S/360 architects go for 2s-complement?

    Brooks (who was program manager for /360) writes about this in
    "The Design of Design". Unique zero and unified hardware were
    his main points, IIRC.
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Paul Clayton@paaronclayton@gmail.com to comp.arch on Sun May 31 18:38:28 2026
    From Newsgroup: comp.arch

    On 5/30/26 3:15 PM, MitchAlsup wrote:
    [snip]
    It costs me only 6 gates (2 gates of delay) to decode the length of an instruction--whereas it takes 4 gates to decode S/360 2-bit code for instruction length.

    Does the current version of My 66000 have three instruction
    lengths or four? You mentioned before dropping "large" constants
    as store operands, but I am not certain what that means.

    Earlier, if I understood correctly, the longest instruction was
    a store of a 64-bit constant with a 64-bit displacement,
    requiring five 32-bit words.

    If My 66000 has the same variability in instruction length as
    S/360 (three sizes), then presumably the extra length decode
    effort provides some other advantage, perhaps more flexibility
    in length allocation (with a 2-bit size indicator, major opcodes
    can only be allocated at 25% granularity)?
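
    For comparison, the S/360 length rule referred to above can be sketched
    in a few lines of Python. This shows only the S/360 side (the top two
    bits of the first opcode byte select 2, 4, 4, or 6 bytes); it says
    nothing about how My 66000 decodes lengths, and the function name is
    hypothetical.

```python
def s360_instruction_length(first_byte):
    """Return the S/360 instruction length in bytes from the first opcode byte.

    Bits 0-1 (the two most significant bits) encode the length:
    00 -> 2 bytes (RR), 01 and 10 -> 4 bytes (RX, RS, SI), 11 -> 6 bytes (SS).
    """
    length_code = (first_byte >> 6) & 0b11
    return (2, 4, 4, 6)[length_code]


if __name__ == "__main__":
    for opcode in (0x1A, 0x5A, 0x92, 0xD2):   # AR, A, MVI, MVC
        print(hex(opcode), s360_instruction_length(opcode))
```

    Each value of the 2-bit code claims a quarter of the opcode space,
    which is the 25% allocation granularity mentioned above.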

    There may be an advantage in having different lengths have
    different detection speed.

    Since My 66000 only uses the extra words for immediates, there
    *may* even be an advantage to detecting some illegal opcodes and
    speculating that such are from constant words. (An illegal
    opcode field can indicate an immediate, a faulting instruction,
    or a skipped instruction.) Such could introduce variable timing
    for parsing a given fetch chunk, but that might be handled by
    reducing the number of parsed instructions emitted and inserting
    the slowly parsed instructions into the start of the next group
    of parsed instructions.

    My guess is that such would just be silly complexity even at
    16-wide parsing, especially given the likely minuscule (typical)
    timing benefit (if any!). Process variation probably would have
    vastly more impact on frequency than trying to exploit a
    statistical bias in encoding. (The concept just seemed
    interesting.)

    Given that register dependencies also "carry", there may be some
    opportunity for "width pipelining" (like the staggered ALUs of
    the Pentium 4) in parsing, extracting register names, renaming
    (at least with RAT-based renaming), and even insertion into a
    scheduler. If a dependency means it would not be useful to
    insert the operation into a scheduler, this additional delay
    might be exploited.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 17:59:28 2026
    From Newsgroup: comp.arch

    On Mon, 01 Jun 2026 02:20:36 +0000, John Levine wrote:

    As I may have said once or twice before, we have plenty of guesses, but
    since there is no documentation, the guesses are a waste of time.

    I understand that you would like to have actual documentation. But it
    doesn't appear to exist.

    But I don't think my "guesses" are wild. I'm familiar with the other
    computers that existed in those years, with the milieu in which the
    System/360 and the PDP-11 existed.

    I presume you are aware that the 704 and successors did indexing by
    two's complement subtraction, which is not sign-magnitude. There is no documentation for that either, and I have looked quite hard. Pretty
    please, do not guess unless you can cite sources.

    I admit that the fact that one subtracts the index on an IBM 704 seems
    very weird to me. Since the IBM 704 was made out of vacuum tubes, saving
    them was probably more important than saving mere discrete transistors,
    let alone transistors on a microchip with a billion of them.

    My guess that sign-magnitude arithmetic was regarded as more prestigious, until IBM outgrew that notion with the 360, does have a source, although
    not an IBM source.

    A 24-bit computer was advertised as having sign-magnitude integer
    arithmetic, unlike cheaper machines which either used one's complement
    integer arithmetic, or, even worse, two's complement integer arithmetic.

    I think it was the DDP-24, but offhand I'm not completely sure.

    To guess - or to attempt to derive intelligence from the available
    information - one might think that IBM considered indexing to be less important or less visible than ordinary integer arithmetic per se.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 18:08:38 2026
    From Newsgroup: comp.arch

    On Mon, 01 Jun 2026 07:47:42 +0000, Anton Ertl wrote:

    This all indicates that byte-ordering decisions worked like in our
    student group. The "right" choice seemed so obvious to everyone that we
    did not communicate about it nor document it nor document the reasons
    for it, and different contributors took different "right"
    choices.

    As has been argued many times, byte-ordering is completely arbitrary, and
    so either choice is just as good. Given that widespread belief, that kind
    of behavior is not surprising.

    Some will think that of course a computer should be little-endian, because arithmetic is faster and simpler that way (if you're doing any multi-word arithmetic).
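
    The multi-word argument can be sketched as follows in Python (the
    16-bit word size and the function name are assumptions chosen for
    illustration). When the least significant word sits at the lowest
    address, a single pass in ascending address order is also the order in
    which the carry has to propagate.

```python
def add_multiword_le(a_words, b_words, word_bits=16):
    """Add two equal-length multi-word integers stored least significant word first.

    Returns (result_words, carry_out). Words are visited in ascending
    address order, which matches the direction of carry propagation.
    """
    mask = (1 << word_bits) - 1
    carry = 0
    result = []
    for a, b in zip(a_words, b_words):
        total = a + b + carry
        result.append(total & mask)   # keep the low word_bits bits
        carry = total >> word_bits    # 0 or 1, passed to the next word
    return result, carry


if __name__ == "__main__":
    # 0x0001FFFF + 0x00000001, stored low word first.
    print(add_multiword_le([0xFFFF, 0x0001], [0x0001, 0x0000]))
```

    With big-endian storage the same loop has to start at the highest
    address and walk backwards.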

    Some will think that of course a computer should be big-endian, because
    that's just the natural way we write numbers, and anything else would be hopelessly confusing.

    As I've pointed out, though, there is *one* particular case where there actually is a genuine difference between big-endian and little-endian.

    If, like the System/360, your computer performs BCD arithmetic and not
    just binary arithmetic, and if, unlike the System/360, you did your BCD arithmetic in the same registers you use for binary arithmetic...

    Then, because binary arithmetic is done in the same registers as BCD arithmetic, they should both have the same endianness.

    And because BCD numbers are directly related to character strings
    representing numbers - just take the last four bits of each digit
    character - they ought to have the same endianness. And character strings
    that represent numbers _are_ big-endian.
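
    A minimal Python sketch of the "last four bits of each digit character"
    step, assuming ASCII digits and plain packed BCD with two digits per
    byte (no sign nibble, so this is not the S/360 packed decimal format).
    The function name is made up for illustration.

```python
def digits_to_packed_bcd(text):
    """Pack a string of ASCII decimal digits into BCD, two digits per byte.

    The digit order of the string is preserved, so the first (most
    significant) character ends up in the first byte: big-endian.
    """
    if not text or any(ch not in "0123456789" for ch in text):
        raise ValueError("expected a non-empty string of ASCII digits")
    nibbles = [ord(ch) & 0x0F for ch in text]   # low four bits are the digit value
    if len(nibbles) % 2:
        nibbles.insert(0, 0)                    # pad with a leading zero digit
    pairs = zip(nibbles[0::2], nibbles[1::2])
    return bytes((hi << 4) | lo for hi, lo in pairs)


if __name__ == "__main__":
    print(digits_to_packed_bcd("1234").hex())
```

    The same masking works for EBCDIC digit characters, whose low four bits
    are also the digit value.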

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 18:10:37 2026
    From Newsgroup: comp.arch

    On Mon, 01 Jun 2026 05:36:10 +0000, Thomas Koenig wrote:
    quadi <quadibloc@ca.invalid> schrieb:

    So what I've learned is that the world of computer architectures seems
    to be like _Highlander_... "There can be only one".

    That is what people thought about the /360 until the Minis came along,
    where companies were content with lower margins to serve new markets and customers at lower margins, but higher volume.

    And then RISC, and PCs... and the low end that PCs are being attacked
    from right now is mobile devices, and ARM.

    For this kind of cycle, I highly recommend reading https://en.wikipedia.org/wiki/The_Innovator%27s_Dilemma (the book not
    the Wikipedia article itself) It talks a lot about hard drives, but
    parallels to computers are obvious.

    This made me think of a different kind of cycle, called the "wheel of reincarnation", discussed in a book on interactive graphical displays.

    John Savard

    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 1 18:01:36 2026
    From Newsgroup: comp.arch

    Thomas Koenig <tkoenig@netcologne.de> writes:
    Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:

    So why did the S/360 architects go for 2s-complement?

    Brooks (who was program manager for /360) writes about this in
    "The Design of Design". Unique zero and unified hardware were
    his main points, IIRC.

    This made me remember: G. M. Amdahl, G. A. Blaauw, F. P. Brooks, Jr.: "Architecture of the IBM System/360" <https://www.ece.ucdavis.edu/~vojin/CLASSES/EEC272/S2005/Papers/IBM360-Amdahl_april64.pdf>,
    which John Levine pointed to. It says on page 92:

    |Sign representations. For the fixed-point arithmetic system, which is
    |binary, the two's complement representation for negative numbers was
    |selected. The well-known virtues of this system are the unique
    |representation of zero and the absence of recomplementation. These
    |substantial advantages are augmented by several properties especially
    |useful in address arithmetic, particularly in the large models, where
    |address arithmetic has its own hardware. With two's complement
    |notation, this indexing hardware requires no true/complement gates
    |and thus works faster. In the smaller, serial models, the fact that
    |high-order bits of address arithmetic can be elided without changing
    |the low-order bits also permits a gain in speed. The same truncation
    |property simplifies double-precision calculations. Furthermore, for
    |table calculation, rounding or truncation to an integer changes all
    |variables in the same direction, thus giving a more acceptable
    |distribution than does an absolute-value-plus-sign representation.
    |
    |The established commercial rounding convention made the use of
    |complement notation awkward for decimal data; therefore,
    |absolute-value-plus-sign is used here.

    What is "recomplementation"?

    As an aside: When listing authors in alphabetic order, choose your
    co-authors wisely: You have a name like "Brooks", and yet only get the
    last spot out of three:-).
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
  • From John Levine@johnl@taugh.com to comp.arch on Mon Jun 1 18:13:34 2026
    From Newsgroup: comp.arch

    According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
    John Levine <johnl@taugh.com> writes:
    I presume you are aware that the 704 and successors did indexing by two's
    complement subtraction, which is not sign-magnitude.

    Looking at the 704 manual
    <https://ia802904.us.archive.org/12/items/bitsavers_ibm7042466_32932660/24-6661-2_704_Manual_1955_text.pdf>,
    In any case, I don't think that the IBM 704 manual documents
    2s-complement representation of negative numbers for any purpose.

    The documentation was a bit sparse, but see item 7 on page 17.

    The manual for the 7090 which had a superset of the 704's instruction set
    is more complete. See "Complement Arithmetic" on page 10 where it says

    Effective addresses are always formed in the computer by the addition
    of the 2's complement of the contents of the index register.

    https://bitsavers.org/pdf/ibm/7090/22-6528-4_7090Manual.pdf
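
    A minimal sketch of that rule, assuming 15-bit addresses as on the
    704/7090 (the function names are illustrative and not taken from the
    manual): adding the two's complement of the index register is the same
    as subtracting the index modulo 2^15.

```python
ADDR_BITS = 15                  # assumed address width (704/7090)
MASK = (1 << ADDR_BITS) - 1     # 0o77777


def twos_complement(x):
    """Two's complement of x within ADDR_BITS bits."""
    return (~x + 1) & MASK


def effective_address(y, xr):
    """Add the two's complement of index register xr to address field y."""
    return (y + twos_complement(xr)) & MASK


if __name__ == "__main__":
    # Address field 0o1000 with index 3 selects 0o1000 - 3.
    print(oct(effective_address(0o1000, 3)))
```

    Because the index is subtracted, a growing index walks toward lower
    addresses.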

    So why did the S/360 architects go for 2s-complement?

    One speculation ...

    We don't have to guess, because they told us in the Amdahl et al article
    in 1964 in the IBMSJ.

    Sign representations. For the fixed-point arithmetic
    system, which is binary, the two’s complement representation
    for negative numbers was selected. The well-known
    virtues of this system are the unique representation
    of zero and the absence of recomplementation. These
    substantial advantages are augmented by several properties
    especially useful in address arithmetic, particularly in the
    large models, where address arithmetic has its own hardware.
    With two’s complement notation, this indexing
    hardware requires no true/complement gates and thus
    works faster. In the smaller, serial models, the fact that
    high-order bits of address arithmetic can be elided without
    changing the low-order bits also permits a gain in
    speed. The same truncation property simplifies
    double-precision calculations. Furthermore, for table calculation,
    rounding or truncation to an integer changes all variables
    in the same direction, thus giving a more acceptable
    distribution than does an absolute-value-plus-sign
    representation.

    They go on to explain why decimal numbers are still sign magnitude,
    mostly because it made rounding easier, and float because it made
    normalizing easier.
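
    The truncation property in the quoted passage can be illustrated with a
    small sketch (the helper names and the 32-bit/8-bit widths are
    assumptions chosen for the example): in two's complement, the low-order
    bits of a sum depend only on the low-order bits of the operands, which
    is not true of sign-magnitude.

```python
def to_twos(x, bits):
    """Two's complement encoding of x in the given number of bits."""
    return x & ((1 << bits) - 1)


def to_sign_mag(x, bits):
    """Sign-magnitude encoding: top bit is the sign, the rest is |x|."""
    sign = 1 << (bits - 1) if x < 0 else 0
    return sign | abs(x)


def low(x, k):
    """Keep only the k low-order bits."""
    return x & ((1 << k) - 1)


a, b = 5, -3
# Two's complement: truncate first, then add, and the low byte still matches.
full = low(to_twos(a + b, 32), 8)
narrow = low(low(to_twos(a, 32), 8) + low(to_twos(b, 32), 8), 8)
print(full == narrow)

# Sign-magnitude: adding the truncated encodings gives 5 + 3, not 5 - 3.
sm_full = low(to_sign_mag(a + b, 32), 8)
sm_narrow = low(low(to_sign_mag(a, 32), 8) + low(to_sign_mag(b, 32), 8), 8)
print(sm_full == sm_narrow)
```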

    Nor did the people designing the extended multiplier, but they got it
    mostly consistent in the Vax.

    This all indicates that byte-ordering decisions worked like in our
    student group. The "right" choice seemed so obvious to everyone that
    we did not communicate about it nor document it nor document the
    reasons for it, and different contributors took different "right"
    choices.

    That would seem to be the case. Sometimes things are obscure at the
    time and obvious in retrospect, sometimes the converse.
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
  • From John Levine@johnl@taugh.com to comp.arch on Mon Jun 1 18:26:58 2026
    From Newsgroup: comp.arch

    According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
    |Sign representations. For the fixed-point arithmetic system, which is
    |binary, the two's complement representation for negative numbers was
    |selected. The well-known virtues of this system are the unique
    |representation of zero and the absence of recomplementation.

    What is "recomplementation"?

    To do sign magnitude arithmetic, you basically do it in one's
    complement: bit flip negative operands to make them one's complement,
    do the arithmetic, then bit flip the result if it's negative. That
    last bit flip is recomplementation.

    Straight one's complement doesn't have the recomplementation but does
    have end-around carry if there's a carry out of the high bit, and
    shares with sign-magnitude the question of how you handle +0 and -0,
    which are different bit patterns but mathematically equal.
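
    A rough sketch of the steps described above, using an assumed 8-bit
    word (1 sign bit, 7 magnitude bits) and made-up function names; it
    ignores overflow and is meant only to show where the recomplementation
    and the end-around carry occur.

```python
BITS = 8                      # assumed word size for the illustration
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)        # sign bit
MAG = SIGN - 1                # magnitude bits


def sm_encode(x):
    """Integer to sign-magnitude word."""
    return (SIGN if x < 0 else 0) | abs(x)


def sm_decode(w):
    """Sign-magnitude word to integer (-0 decodes to 0)."""
    m = w & MAG
    return -m if w & SIGN else m


def flip_if_negative(w):
    """Flip the magnitude bits of a negative word (sign-magnitude <-> one's complement)."""
    return w ^ MAG if w & SIGN else w


def ones_add(a, b):
    """One's complement add with end-around carry."""
    s = a + b
    if s > MASK:                  # carry out of the high bit
        s = (s & MASK) + 1        # feed it back into the low end
    return s


def sm_add(a, b):
    """Sign-magnitude add done in one's complement."""
    r = ones_add(flip_if_negative(a), flip_if_negative(b))
    return flip_if_negative(r)    # this last flip is the recomplementation


if __name__ == "__main__":
    print(sm_decode(sm_add(sm_encode(3), sm_encode(-5))))
    # 5 + (-5) comes out as the negative-zero pattern, not as 0x00.
    print(hex(sm_add(sm_encode(5), sm_encode(-5))))
```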
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 19:56:38 2026
    From Newsgroup: comp.arch

    On Sun, 31 May 2026 18:40:48 +0000, MitchAlsup wrote:

    Which is why his architecture is converging so rapidly.

    NOT.

    Indeed, it's not converging as rapidly as I'd like.

    I decided that one of my 32-bit instructions really needed to be allocated twice as much opcode space as I had originally given it.

    Even if that meant dropping the 24-bit short instructions to make the
    room! (Now that I have paired 15-bit short instructions, which also
    include short shift instructions, and short branch instructions, I felt I didn't need them as badly, and I had disliked having instructions that
    were an odd number of bytes long.)

    Well, after making the changes, I still had room - 1/4 as much as I had
    before - for 24-bit short instructions.

    I wasn't happy. So I noticed that I actually had some unused space that I could squeeze out. So now the 24-bit short instructions have 1/2 as much
    space as they used to, which meant the only thing I had to give up was the ability to change the condition codes. Fine, when you want to do that, use
    a full 32-bit operate instruction. So I was happy.

    John Savard

  • From John Levine@johnl@taugh.com to comp.arch on Mon Jun 1 20:00:51 2026
    From Newsgroup: comp.arch

    According to quadi <quadibloc@ca.invalid>:
    I admit that the fact that one subtracts the index on an IBM 704 seems
    very weird to me. Since the IBM 704 was made out of vacuum tubes, saving
    them, instead of mere discrete transistors, let alone transistors on a
    microchip with a billion of them, was probably more important.

    We can guess that someone thought that counting down indexes was important
    but they turned out to be wrong. Fortran stored arrays in reverse order to make indexing easier.

    My guess that sign-magnitude arithmetic was regarded as more prestigious,
    until IBM outgrew that notion with the 360, does have a source, although
    not an IBM source.

    My equally uninformed guess is that their tab machines and their commercial
    computers were decimal sign magnitude, so binary sign magnitude was a
    short step away. It evidently took a while to realize that while the
    two's complement negative representation seemed less intuitive, the logic
    was a lot simpler.


    A 24-bit computer was advertised as having sign-magnitude integer
    arithmetic, unlike cheaper machines which either used one's complement
    integer arithmetic, or, even worse, two's complement integer arithmetic.

    I think it was the DDP-24, but offhand I'm not completely sure.

    To guess - or to attempt to derive intelligence from the available
    information - one might think that IBM considered indexing to be less
    important or less visible than ordinary integer arithmetic per se.

    John Savard
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 21:57:21 2026
    From Newsgroup: comp.arch

    On Mon, 01 Jun 2026 20:00:51 +0000, John Levine wrote:

    My equally uninformed guess is that their tab machines and their
    commercial computers were decimal sign magnitude, so binary sign
    magnitude was a short step away. It evidently took a while to realize
    that while the two's complement negative representation seemed less
    intuitive, the logic was a lot simpler.

    I agree with that. Remember, IBM made tab machines long before they got
    into computers, and commercial computers, not scientific ones, were their
    core business later.

    John Savard
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 23:40:45 2026
    From Newsgroup: comp.arch

    On Mon, 01 Jun 2026 19:56:38 +0000, quadi wrote:

    Well, after making the changes, I still had room - 1/4 as much as I had before - for 24-bit short instructions.

    I wasn't happy. So I noticed that I actually had some unused space that
    I could squeeze out. So now the 24-bit short instructions have 1/2 as
    much space as they used to, which meant the only thing I had to give up
    was the ability to change the condition codes.

    When it was 1/4 as much, I was no longer able to fit in a modified form of
    the Halfword Immediate instruction as an embedded 32-bit instruction
    strictly confined to the opcode space of 32-bit instructions that don't
    begin with 11.

    But when it was 1/2 as much, I didn't realize that I had enough space to
    put that back in. So I've made the fix.

    John Savard
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 01:22:37 2026
    From Newsgroup: comp.arch


    John Levine <johnl@taugh.com> posted:

    According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
    |Sign representations. For the fixed-point arithmetic system, which is
    |binary, the two's complement representation for negative numbers was
    |selected. The well-known virtues of this system are the unique
    |representation of zero and the absence of recomplementation.

    What is "recomplementation"?

    To do sign magnitude arithmetic, you basically do it in one's
    complement: bit flip negative operands to make them one's complement,
    do the arithmetic, then bit flip the result if it's negative. That
    last bit flip is recomplementation.

    In microarchitecture, you can make the registers 2^(3+n)+1 bits long.
    Then simply record that the mantissa is complemented (or not) when
    used as an operand. We do this all the time in microarchitecture to
    save gates/time/... depending on the implementation technology
    constraints.

    Straight one's complement doesn't have the recomplementation but does
    have end-around carry if there's a carry out of the high bit, and
    shares with sign-magnitude the question of how you handle +0 and -0,
    which are different bit patterns but mathematically equal.



  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 01:52:28 2026
    From Newsgroup: comp.arch


    Paul Clayton <paaronclayton@gmail.com> posted:

    On 5/30/26 3:15 PM, MitchAlsup wrote:
    [snip]
    It costs me only 6 gates (2 gates of delay) to decode the length of an instruction--whereas it takes 4 gates to decode S/360 2-bit code for instruction length.

    Does the current version of My 66000 have three instruction
    lengths or four? You mentioned before dropping "large" constants
    as store operands, but I am not certain what that means.

    1-word, 2-words, and 3-words.

    Earlier, if I understood correctly, the longest instruction was
    a store of a 64-bit constant with a 64-bit displacement,
    requiring five 32-bit words.

    Yes, it was. We measured its use at 0.2%.

    If My 66000 has the same variability in instruction length as
    S/360 (three sizes), then presumably the extra length decode
    effort provides some other advantage, perhaps more flexibility
    in length allocation (with a 2-bit size indicator, major opcodes
    can only be allocated at 25% granularity)?

    There are 64 slots in the Major Opcode: 42 are in use, 6 are permanently
    reserved, and 16 are free for the future.
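
    For comparison, the S/360 2-bit length code mentioned above can be
    sketched as follows (the function name is made up for the example). The
    two high-order bits of the first opcode byte select the length, so each
    code covers a quarter of the 256 opcodes, which is the 25% granularity
    referred to in the question.

```python
def s360_length_bytes(opcode):
    """Instruction length in bytes from the two high-order opcode bits.

    00 -> one halfword, 01 and 10 -> two halfwords, 11 -> three halfwords.
    """
    top2 = (opcode >> 6) & 0b11
    return (2, 4, 4, 6)[top2]


if __name__ == "__main__":
    # AR (0x1A), A (0x5A), MVI (0x92), MVC (0xD2)
    for op in (0x1A, 0x5A, 0x92, 0xD2):
        print(hex(op), s360_length_bytes(op))
```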

    There may be an advantage in having different lengths have
    different detection speed.

    Only 1/8th of the Major group is allowed to have VLI. And all of
    these have the same 4-bit encoding--which is called operand routing
    and is responsible for {inversion, negation, constant substitution}

    Since My 66000 only uses the extra words for immediates, there
    *may* even be an advantage to detecting some illegal opcodes and
    speculating that such are from constant words.

    One of the reasons for the 6 permanently reserved slots is to prevent
    that.

    (An illegal
    opcode field can indicate an immediate, a faulting instruction,
    or a skipped instruction.) Such could introduce variable timing
    for parsing a given fetch chunk, but that might be handled by
    reducing the number of parsed instructions emitted and inserting
    the slowly parsed instructions into the start of the next group
    of parsed instructions.

    My 66000 is specified such that ALL unspecified patterns must be
    detected and raise UNIMPLEMENTED. And not just on Major OpCodes,
    every unimplemented pattern must be detected. It is better to
    prevent mayhem than to allow it to damage all future implementations
    {no Carry when shift-count == 0 on x86 comes to mind}.

    When performing LL/SC sequences--some sequences are not allowed
    and will also raise UNIMPLEMENTED. Silently doing unexpected stuff
    is worse than doing nothing.

    My guess is that such would just be silly complexity even at 16-
    wide parsing, especially given the likely minuscule (typical)
    timing benefit (if any!). Process variation probably would have
    vastly more impact on frequency than trying to exploit a
    statistical bias in encoding. (The concept just seemed
    interesting.)

    Given that register dependencies also "carry", there may be some
    opportunity for "width pipelining" (like the staggered ALUs of
    the Pentium 4) in parsing, extracting register names, renaming
    (at least with RAT-based renaming), and even insertion into a
    scheduler. If a dependency means it would not be useful to
    insert the operation into a scheduler, this additional delay
    might be exploited.
  • From John Levine@johnl@taugh.com to comp.arch on Tue Jun 2 01:57:26 2026
    From Newsgroup: comp.arch

    According to MitchAlsup <user5857@newsgrouper.org.invalid>:
    What is "recomplementation"?

    To do sign magnitude arithmetic, you basically do it in one's
    complement: bit flip negative operands to make them one's complement,
    do the arithmetic, then bit flip the result if it's negative. That
    last bit flip is recomplementation.

    In microarchitecture, you can make the registers 2^(3+n)+1 bits long.
    Then simply record that the mantissa is complemented (or not) when
    used as an operand. We do this all the time in microarchitecture to
    save gates/time/... depending on the implementation technology
    constraints.

    You can do that now, not so much when building computers out of vacuum
    tubes in the 1950s.

    Also, that works OK for registers, but at some point you need to
    store values in memory at which point I'd think you'd need to do
    the recomplementing.
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon Jun 1 14:51:14 2026
    From Newsgroup: comp.arch

    quadi [2026-05-27 18:19:49] wrote:
    On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:
    MitchAlsup [2026-05-26 20:54:30] wrote:
    Encrypt the debug information (and put it in a
    {1234-5678-9101-1121-...} folder) so that only the owner (not
    licensee) of the code can debug it.
    I resent that. All code should be Free Software.
    [...]
    However, people have the right to the fruit of their labors. To give them away for free is generous, but it should remain a personal choice.

    You don't need to encrypt the debug information of your programs in
    order to earn a decent living.


    === Stefan
  • From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 06:19:05 2026
    From Newsgroup: comp.arch

    On Mon, 01 Jun 2026 14:51:14 -0400, Stefan Monnier wrote:
    quadi [2026-05-27 18:19:49] wrote:

    However, people have the right to the fruit of their labors. To give
    them away for free is generous, but it should remain a personal choice.

    You don't need to encrypt the debug information of your programs in
    order to earn a decent living.

    Perhaps. But if someone can write a program that is so useful that it
    could make him wealthy beyond the dreams of avarice, who am I to judge him
    for seeking to maximize its revenue potential?

    John Savard
  • From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 06:20:53 2026
    From Newsgroup: comp.arch

    On Mon, 01 Jun 2026 19:56:38 +0000, quadi wrote:

    I decided that one of my 32-bit instructions really needed to be
    allocated twice as much opcode space as I had originally given it.

    There was another 32-bit instruction that was also short of opcode space -
    but this time, I didn't even have to extensively reorganize the opcodes of other instructions in order to remedy that.

    John Savard
  • From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 07:47:36 2026
    From Newsgroup: comp.arch

    On Tue, 02 Jun 2026 06:19:05 +0000, quadi wrote:
    On Mon, 01 Jun 2026 14:51:14 -0400, Stefan Monnier wrote:
    quadi [2026-05-27 18:19:49] wrote:

    However, people have the right to the fruit of their labors. To give
    them away for free is generous, but it should remain a personal
    choice.

    You don't need to encrypt the debug information of your programs in
    order to earn a decent living.

    Perhaps. But if someone can write a program that is so useful that it
    could make him wealthy beyond the dreams of avarice, who am I to judge
    him for seeking to maximize its revenue potential?

    Perhaps this answer was too casual, and a more detailed and serious answer
    is needed.

    To say that one doesn't "need" to encrypt debug information "to earn a
    decent living" is true enough, but you're also implying that this is all anyone has the right to expect.

    To me, this implies a mindset that says that everyone should remain a
    laborer, and that it's wrong to transition to rent-seeking.

    I don't share that view. While there are excesses in the free-enterprise system as we have it now, I have no quarrel with its basic principles. I
    see the ownership of property, including intellectual property, and
    including capital property, as fully legitimate.

    So a person can write a program once and make a living from selling copies
    of it, instead of just from providing services to its users. If the
    program is good, there's nothing illegitimate about that. And to defend
    the program against piracy and reverse-engineering is also basically legitimate.

    However, to encrypt debug information is strange. Why would a copy of the debug information in any form be included with distributed copies of
    software? I suppose it could be there in an encrypted form to be used in conjunction with remote diagnostic tools, in the case of software which
    has to be maintained on customer premises, unlike mass-market applications.

    John Savard
  • From jgd@jgd@cix.co.uk (John Dallman) to comp.arch on Tue Jun 2 14:01:40 2026
    From Newsgroup: comp.arch

    In article <10uks4f$1dqo$1@gal.iecc.com>, johnl@taugh.com (John Levine)
    wrote:

    Having looked into this in some detail, both when IBM used
    big-endian order on S/360 and DEC used little-endian on the
    PDP-11, neither documented the reasons for the byte order
    choice at all. Not even a little bit.

    Brooks and Blaauw, two of the S/360 architects, consider the subject in
    their much later book _Computer Architecture_, on p. 99:

    "The more logical convention, the Big Endian, considers the whole
    storage space as one stream of bits. Bits, bytes and words are
    numbered from left to right, following the convention of writing
    in Western culture."

    That explains why IBM mainframes number the most significant bit as zero,
    the opposite way around to all the platforms I've worked on, which number
    the least significant bit as zero.

    I've always found the latter convention helpful for doing hex arithmetic
    in my head or on paper. I _think_ big-endian SPARC, MIPS and POWER all
    regard the least significant bit as bit zero, but I can no longer easily
    check that.

    John
  • From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 14:54:24 2026
    From Newsgroup: comp.arch

    On Tue, 02 Jun 2026 14:00:00 +0100, John Dallman wrote:

    Brooks and Blaauw, two of the S/360 architects, consider the subject in
    their much later book _Computer Architecture_, on p. 99:

    "The more logical convention, the Big Endian, considers the whole
    storage space as one stream of bits. Bits, bytes and words are numbered
    from left to right, following the convention of writing in Western
    culture."

    On the other hand, Arabic is written from right to left, and yet the Arabs also write numbers with the most significant digit on the left. Hence, little-endian would seem more logical to them for the same reason.

    Since this is, therefore, a cultural matter, and not something universal,
    like the laws of physics or mathematics, we can't tell what a man from
    Mars would prefer.

    So, while they can call it "the more logical convention", this isn't
    something everyone would agree with. The famous article on the subject,
    "On Holy Wars and a Plea for Peace", by Danny Cohen from 1981, argued
    that which standard was chosen mattered much less than everyone choosing
    the same one for compatibility, but he wasn't shy about expressing his
    personal preference for little-endian, referring to those who practiced
    big-endian as "outlaws".

    The case for big-endian is...

    It makes computers easier to understand for most people in Western
    societies.
    It makes core dumps easier to read.
    Multi-precision compare is faster.

    The case for little-endian is...

    People don't need to poke around in core dumps or even program in
    assembler very much these days. We have compilers.
    Multi-precision add and subtract is faster, and it's much more common than compare.

    At least, those are the usual arguments, and from that set of arguments,
    it does seem like there's little difference and it's just a personal preference.
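
    The multi-precision part of those arguments can be sketched like this,
    assuming numbers held as lists of 32-bit limbs with the least
    significant limb first (the names and limb size are choices made for
    the example). Addition has to start at the least significant limb so
    the carry can propagate, while comparison can stop at the first
    difference if it starts at the most significant limb.

```python
LIMB_BITS = 32
LIMB_MASK = (1 << LIMB_BITS) - 1


def mp_add(a, b):
    """Add two equal-length limb lists stored least significant limb first."""
    out, carry = [], 0
    for x, y in zip(a, b):          # ascending index = ascending significance
        s = x + y + carry
        out.append(s & LIMB_MASK)
        carry = s >> LIMB_BITS
    return out, carry


def mp_compare(a, b):
    """Return -1, 0 or 1, scanning from the most significant limb down."""
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            return -1 if x < y else 1
    return 0


if __name__ == "__main__":
    print(mp_add([0xFFFFFFFF, 0], [1, 0]))
    print(mp_compare([5, 1], [9, 0]))
```

    With this layout the add walks memory in ascending address order and the
    compare walks it in descending order; a most-significant-first layout
    reverses the two.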

    But, as I've noted, I've finally come up with a more compelling
    justification for big-endian. It still assumes that, if you're processing
    text data, that text data will be from a society that writes from left to right.

    Think of text records that include words and numbers in character format.
    Like

    00134700 John Smith
    00250000 Richard Roe

    and so on.

    The numerical portion has the most significant digit on the left, the alphabetic portion has the first character on the left. Thus, these
    characters will be stored in memory at succeeding addresses from left to right; the most significant digit is stored at the lower address.

    So numbers as text strings are stored in big-endian order.

    That means that it's simplest to convert a text number to a packed decimal number that's in the same order.
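
    A small sketch of that conversion, under the simplifying assumption of
    an unsigned packed format with two digits per byte (real packed-decimal
    formats such as the S/360's also carry a sign code in the last nibble,
    which is left out here; the function name is made up):

```python
def pack_decimal(text):
    """Pack decimal digit characters two per byte, keeping left-to-right order."""
    if len(text) % 2:
        text = "0" + text           # pad to a whole number of bytes
    return bytes((int(text[i]) << 4) | int(text[i + 1])
                 for i in range(0, len(text), 2))


if __name__ == "__main__":
    packed = pack_decimal("00134700")
    # The hex dump of the packed bytes reads the same as the text digits.
    print(packed.hex())
```

    The digits never change order: the first character of the text ends up
    in the first (lowest-addressed) byte of the packed number.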

    And an ALU that performs binary arithmetic can be modified to also perform decimal arithmetic by changing when carries take place out of each group
    of four bits. If that's done, binary and decimal numbers ought to have the same endianness, so that one doesn't need two load and store instructions
    for the accumulator or the registers.
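
    The carry adjustment can be sketched digit by digit (a software
    illustration with an invented function name, not a description of any
    particular ALU): each group of four bits is added in binary, but the
    carry out of the group happens at 10 instead of 16.

```python
def bcd_add(a, b, digits=8):
    """Add two packed-BCD integers one 4-bit digit at a time."""
    result, carry = 0, 0
    for i in range(digits):
        shift = 4 * i
        s = ((a >> shift) & 0xF) + ((b >> shift) & 0xF) + carry
        carry = 1 if s > 9 else 0   # decimal carry: at 10, not at 16
        if carry:
            s -= 10
        result |= s << shift
    return result, carry


if __name__ == "__main__":
    total, carry_out = bcd_add(0x00134700, 0x00250000)
    print(hex(total), carry_out)
```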

    I know some computers, regardless of endianness, number the least
    significant bit as one instead of zero. Either way, this convention is considered to make sense for wiring a 12-bit DAC to a 16-bit data bus,
    since now each number corresponds to the same power of two no matter how
    wide your bus is.

    Of course, one can argue for considering fixed-point numbers as fractions
    in [-1,1), but that is needed far less often than using them as integers.
    A few computers were designed this way; it meant that integers had to be represented with a wasted bit, or that a shift was usually needed after a multiply, so it did not get popular.

    The IBM 360 made bit numbering consistent not just to make reading the
    manuals easier, but out of habit - since their most recent previous
    computer with a 64-bit word was the STRETCH (or 7030)... which had the
    ability to do bit-addressing as a prominent feature. There, consistency actually mattered.

    John Savard
  • From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Tue Jun 2 17:50:38 2026
    From Newsgroup: comp.arch

    Stefan Monnier wrote:
    quadi [2026-05-27 18:19:49] wrote:
    On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:
    MitchAlsup [2026-05-26 20:54:30] wrote:
    Encrypt the debug information (and put it in a
    {1234-5678-9101-1121-...} folder) so that only the owner (not
    licensee) of the code can debug it.
    I resent that. All code should be Free Software.
    [...]
    However, people have the right to the fruit of their labors. To give them
    away for free is generous, but it should remain a personal choice.

    You don't need to encrypt the debug information of your programs in
    order to earn a decent living.

    I'd say rather the opposite!

    In the current environment where every language is expected to be
    compatible with a generic IDE like Visual Studio Code, via open source interface specifications, having a proprietary debug format seems like a
    good way to strongly limit your potential customer base.

    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Tue Jun 2 16:13:28 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> writes:

    The case for big-endian is...

    <snip>

    It makes core dumps easier to read.

    Actually the program that analyzes the core dump can handle
    endianness without the programmer even being aware of it.

    It's been more than half a century since programmers looked at raw
    memory dumps.....

  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 17:04:55 2026
    From Newsgroup: comp.arch


    John Levine <johnl@taugh.com> posted:

    According to MitchAlsup <user5857@newsgrouper.org.invalid>:
    What is "recomplementation"?

    To do sign magnitude arithmetic, you basically do it in one's
    complement: bit flip negative operands to make them one's complement,
    do the arithmetic, then bit flip the result if it's negative. That
    last bit flip is recomplementation.

    In microarchitecture, you can make the registers 2^(3+n)+1 bits long.
    Then simply record that the mantissa is complemented (or not) when
    used as an operand. We do this all the time in microarchitecture to
    save gates/time/... depending on the implementation technology
    constraints.

    You can do that now, not so much when building computers out of vacuum
    tubes in the 1950s.

    Also, that works OK for registers, but at some point you need to
    store values in memory at which point I'd think you'd need to do
    the recomplementing.

    Sure, but there is plenty of time to re-complement when storing
    the value.
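
    One possible reading of the extra register bit, as a toy model only
    (the class, its names and the 8-bit width are assumptions for the
    example, with 8 being the n = 0 case of 2^(3+n)): the stored bits are
    left alone, a flag records that they are to be read complemented, and
    the flip is resolved only when the value leaves the register file.

```python
WIDTH = 8                     # assumed data width for the illustration
MASK = (1 << WIDTH) - 1


class LazyReg:
    """Register with one extra bit meaning 'contents are complemented'."""

    def __init__(self, raw, complemented=False):
        self.raw = raw & MASK
        self.complemented = complemented   # the extra (+1) bit

    def invert(self):
        """Logical bit flip: toggle the flag, leave the stored bits alone."""
        return LazyReg(self.raw, not self.complemented)

    def value(self):
        """Resolve to the true bits, for example when storing to memory."""
        return self.raw ^ MASK if self.complemented else self.raw


if __name__ == "__main__":
    r = LazyReg(0b10100101).invert()
    print(bin(r.raw), r.complemented, bin(r.value()))
```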

  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 17:09:26 2026
    From Newsgroup: comp.arch


    jgd@cix.co.uk (John Dallman) posted:

    In article <10uks4f$1dqo$1@gal.iecc.com>, johnl@taugh.com (John Levine) wrote:

    Having looked into this in some detail, both when IBM used
    big-endian order on S/360 and DEC used little-endian on the
    PDP-11, neither documented the reasons for the byte order
    choice at all. Not even a little bit.

    Brooks and Blaauw, two of the S/360 architects, consider the subject in
    their much later book _Computer Architecture_, on p. 99:

    "The more logical convention, the Big Endian, considers the whole
    storage space as one stream of bits. Bits, bytes and words are
    numbered from left to right, following the convention of writing
    in Western culture."

    That explains why IBM mainframes number the most significant bit as zero,
    the opposite way around to all the platforms I've worked on, which number
    the least significant bit as zero.

    I've always found the latter convention helpful for doing hex arithmetic
    in my head or on paper. I _think_ big-endian SPARC, MIPS and POWER all
    regard the least significant bit as bit zero, but I can no longer easily
    check that.

    Do you want to isolate the register bit as::

    bit = ((register) >> (register_bits - 1 - bit)) & 1;

    or

    bit = ((register) >> bit) & 1;
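
    The two conventions can be put side by side in a short sketch (function
    names and the default 64-bit width are assumptions for the example).
    With the most significant bit numbered zero, the shift count depends on
    the register width; with the least significant bit numbered zero, it
    does not.

```python
def bit_lsb0(reg, n):
    """Bit n where bit 0 is the least significant bit."""
    return (reg >> n) & 1


def bit_msb0(reg, n, width=64):
    """Bit n where bit 0 is the most significant bit of a width-bit register."""
    return (reg >> (width - 1 - n)) & 1


if __name__ == "__main__":
    x = 1 << 63
    print(bit_lsb0(x, 63), bit_msb0(x, 0))
```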

    John
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 17:13:40 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Tue, 02 Jun 2026 14:00:00 +0100, John Dallman wrote:

    Brooks and Blaauw, two of the S/360 architects, consider the subject in their much later book _Computer Architecture_, on p. 99:

    "The more logical convention, the Big Endian, considers the whole
    storage space as one stream of bits. Bits, bytes and words are numbered
    from left to right, following the convention of writing in Western
    culture."

    On the other hand, Arabic is written from right to left, and yet the Arabs also write numbers with the most significant digit on the left. Hence, little-endian would seem more logical to them for the same reason.

    Chinese and Japanese are written top to bottom ...

    Since this is, therefore, a cultural matter, and not something universal, like the laws of physics or mathematics, we can't tell what a man from
    Mars would prefer.

    Middle endian!! Start in the middle and then one step left followed by
    one step right--more or less like PDP-11 FP.
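
    For readers who have not met it, the mixed layout usually called
    "PDP-endian" can be sketched as follows (the function name is invented
    for the example): a 32-bit value is stored as two 16-bit words, most
    significant word first, with each word itself little-endian.

```python
import struct


def pdp_endian_bytes(value):
    """32-bit value as two little-endian 16-bit words, high word first."""
    hi, lo = (value >> 16) & 0xFFFF, value & 0xFFFF
    return struct.pack("<HH", hi, lo)


if __name__ == "__main__":
    v = 0x0A0B0C0D
    print(struct.pack(">I", v).hex())   # big-endian byte order
    print(struct.pack("<I", v).hex())   # little-endian byte order
    print(pdp_endian_bytes(v).hex())    # mixed order
```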

    So, while they can call it "the more logical convention", this isn't something everyone would agree with. The famous article on the subject,
    "On Holy Wars and a Plea for Peace", by Danny Cohen from 1981 thus termed
    it as being much less important which standard was chosen than for
    everyone to choose the same one for compatibility, but he wasn't shy about expressing his personal preference for little-endian, referring to those
    who practiced big-endian as "outlaws".

    The case for big-endian is...

    It makes computers easier to understand for most people in Western societies.

    Just core dumps--they can be read without dumping hex on one side and characters on the other.

    It makes core dumps easier to read.
    Multi-precision compare is faster.

    Not up to 256-bits.

    The case for little-endian is...

    It won.


    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Tue Jun 2 15:59:33 2026
    From Newsgroup: comp.arch

    jgd@cix.co.uk (John Dallman) writes:
    Brooks and Blaauw, two of the S/360 architects, consider the subject in
    their much later book _Computer Architecture_, on p. 99:

    "The more logical convention, the Big Endian, considers the whole
    storage space as one stream of bits. Bits, bytes and words are
    numbered from left to right, following the convention of writing
    in Western culture."

    That explains why IBM mainframes number the most significant bit as zero,
    the opposite way around to all the platforms I've worked on, which number
    the least significant bit as zero.

    I've always found the latter convention helpful for doing hex arithmetic
    in my head or on paper. I _think_ big-endian SPARC, MIPS and POWER all
    regard the least significant bit as bit zero, but I can no longer easily check that.

    Power(PC) gives the MSB bit of GPRs the number 0 and the LSB bit
    number 63. It's not clear how that works in 32-bit implementations,
    and if it plays a role at all. AFAICS, it plays no role (no
    instructions refer to the bit number as defined in the manual).

    The 68020 is bit-little-endian and byte-big-endian, and it has
    bitfield instructions, and from what I have read, this has led to
    problems (e.g., consider what to do if you have an array of 17-bit
    fields: how do you access the nth element of the array?).

    The 88000 is bit-little-endian and byte-big-endian (Section 2.2.3 of
    the manual is quite clear about this at the start, and then discusses
    the byte-little-endian option; AFAIK all 88000 machines are
    byte-big-endian). It has bit-field instructions that specify the
    bitfield as an offset from the LSB of the register and a width. Given
    that the bitfield instructions work on registers, and the load
    instructions require alignment, I don't expect the difference in order
    to cause many problems; maybe confusion if you try to deal with bit
    fields that cross words.

    MIPS also is bit-little-endian; there are byte-big-endian and byte-little-endian machines with MIPS CPUs. MIPS64r2 has bit-field instructions that use little-endian bit order, and before MIPS64r6 it
    also required alignment (MIPS64r6 allows either unaligned support or
    trapping on unaligned access). With unaligned support and big-endian
    byte order, problems like on the 68020 may arise.

    SPARCv9 <https://www.cs.utexas.edu/~novak/sparcv9.pdf> is
    bit-little-endian, and "uses big-endian byte order by default"
    (3.2.1.2) and I am not aware of any little-endian SPARC machine.
    AFAICS SPARC does not have instructions that use bit numbers, so the
    numbering of bits in the manual does not have any effect on the
    instruction set and programming.

    My take is, that in a world with different access widths (e.g.,
    accessing a register for a 32-bit value or a 64-bit value),
    bit-big-endian is a bad idea. And we already see that in the IBM 704
    manual which gives its most significant bits in its 38-bit accumulator
    the names (starting from the most significant)

    S (sign, maybe out of contest, but shown to the left of Q)
    Q (not present in memory)
    P (not present in memory; this is the carry bit for the ACL instruction)
    1

    If they had used bit-little-endian (and started at 0 instead of 1),
    they could have called P 35, Q 36, and S could be called 37 (but given
    that it is a sign/magnitude machine, S is ok).

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Tue Jun 2 18:25:12 2026
    From Newsgroup: comp.arch

    MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

    quadi <quadibloc@ca.invalid> posted:

    On Tue, 02 Jun 2026 14:00:00 +0100, John Dallman wrote:

    Brooks and Blaauw, two of the S/360 architects, consider the subject in
    their much later book _Computer Architecture_, on p. 99:

    "The more logical convention, the Big Endian, considers the whole
    storage space as one stream of bits. Bits, bytes and words are numbered
    from left to right, following the convention of writing in Western
    culture."

    On the other hand, Arabic is written from right to left, and yet the Arabs
    also write numbers with the most significant digit on the left. Hence,
    little-endian would seem more logical to them for the same reason.

    Chinese and Japanese is written top to bottom ...

    In classical times, yes, but modern texts are written left to right.


    Since this is, therefore, a cultural matter, and not something universal,
    like the laws of physics or mathematics, we can't tell what a man from
    Mars would prefer.

    Middle endian!! Start in the middle and then one step left followed by
    one step write--more or less like PDP-11 FP.

    Very Turing machine-like.
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Tue Jun 2 11:44:46 2026
    From Newsgroup: comp.arch

    On 6/2/2026 10:13 AM, MitchAlsup wrote:


    Since this is, therefore, a cultural matter, and not something universal,
    like the laws of physics or mathematics, we can't tell what a man from
    Mars would prefer.

    Middle endian!! Start in the middle and then one step left followed by
    one step write--more or less like PDP-11 FP.

    But then you have the "discussion" with those who want to start with a
    step to the right, followed by one to the left! :-). And that doesn't
    even address (pun intended), the issue of when you have an even number
    of bits/bytes/words, do you start with the one to the right of the
    "middle" or the left. :-)
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 19:07:38 2026
    From Newsgroup: comp.arch

    On Tue, 02 Jun 2026 18:25:12 +0000, Thomas Koenig wrote:
    MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

    Chinese and Japanese is written top to bottom ...

    In classical times, yes, but modern texts are written left to right.

    In Taiwan and Hong Kong, books written top to bottom, and then right to
    left, and bound like books written right to left, were still being printed
    in the 1960s.

    As far as endianness is concerned, however, the Chinese wrote numbers with
    the most significant digit on the top, so for purposes of this discussion, Chinese was big-endian even traditionally.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 19:10:51 2026
    From Newsgroup: comp.arch

    On Tue, 02 Jun 2026 17:50:38 +0200, Terje Mathisen wrote:

    In the current environment where every language is expected to be
    compatible with a generic IDE like Visual Studio Code, via open source interface specifications, having a proprietary debug format seems like a
    good way to strongly limit your potential customer base.

    You appear to have understood his post in a different way than I did.

    I wasn't thinking of the kind of debug information provided by a compiler.

    I was thinking of leaving debug information in when one was distributing software to customers.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 19:37:17 2026
    From Newsgroup: comp.arch


    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> posted:

    On 6/2/2026 10:13 AM, MitchAlsup wrote:


    Since this is, therefore, a cultural matter, and not something universal,
    like the laws of physics or mathematics, we can't tell what a man from
    Mars would prefer.

    Middle endian!! Start in the middle and then one step left followed by
    one step write--more or less like PDP-11 FP.

    But then you have the "discussion" with those who want to start with a
    step to the right, followed by one to the left! :-). And that doesn't
    even address (pun intended), the issue of when you have an even number
    of bits/bytes/words, do you start with the one to the right of the
    "middle" or the left. :-)

    You could do random endian where an LFSR based on the address sequence determines MEL or MER.


    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Tue Jun 2 22:00:06 2026
    From Newsgroup: comp.arch

    According to John Dallman <jgd@cix.co.uk>:
    In article <10uks4f$1dqo$1@gal.iecc.com>, johnl@taugh.com (John Levine) wrote:

    Having looked into this in some detail, both when IBM used
    big-endian order on S/360 and DEC used little-endian on the
    PDP-11, neither documented the reasons for the byte order
    choice at all. Not even a little bit.

    Brooks and Blaauw, two of the S/360 architects, consider the subject in
    their much later book _Computer Architecture_, on p. 99:

    "The more logical convention, the Big Endian, considers the whole
    storage space as one stream of bits. Bits, bytes and words are
    numbered from left to right, following the convention of writing
    in Western culture."

    I'd forgotten about that. Given who they were it's not surprising they found their preconceptions to be "more logical".

    On the next page they said (written in 1997):

    "Unlike Swift's, the computer Endian controversy is not pointless. The Little Endian design has many complications in use; we much prefer the
    Big Endian. Having two active conventions is very painful. Several recent
    Big Endian RISC computers, including the MIPS, the Motorola 88000, and
    the Intel i860 provide a data-movement operation that can perform the Big Endian-Little Endian permutation [Hennessy and Patterson, 1990]. We predict
    that Little Endian addressing will die out, just as decimal addressing did."

    Uh huh.

    A few years later IBM added LOAD REVERSED and STORE REVERSED to z/Architecture and retroactively to S/390 mode on Z machines.
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 00:21:03 2026
    From Newsgroup: comp.arch

    On Tue, 02 Jun 2026 22:00:06 +0000, John Levine wrote:

    "We predict that Little Endian addressing will die out, just as
    decimal addressing did."

    Uh huh.

    A few years later IBM added LOAD REVERSED and STORE REVERSED to z/Architecture and retroactively to S/390 mode on Z machines.

    I certainly would not hazard such a bold prediction.

    The prediction, though, is not hard to understand. If big-endian is more straightforward and easier to understand, but just costs an extra
    transistor here and there, then in the age of billion-transistor chips,
    why wouldn't it die out?

    However, just because something is going to die out _eventually_ doesn't
    mean it will do so any time soon. Interoperating and communicating with
    that little-endian monster IBM created in 1981 is going to be important
    for generating revenue for decades to come.

    So the existence of load reversed and store reversed instructions doesn't prove they were wrong... even though I still would not dare to say they
    are definitely right. I just think it's not unreasonable to think as they
    did, provided you account for a sufficiently long timeframe.

    Of course, given a sufficiently long timeframe, we might all be speaking Arabic, in which case little-endian would be the logical choice. Although
    that would require fossil fuels being important for longer than the
    climate could sustain it...

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 00:37:06 2026
    From Newsgroup: comp.arch

    On Tue, 02 Jun 2026 15:59:33 +0000, Anton Ertl wrote:

    My take is, that in a world with different access widths (e.g.,
    accessing a register for a 32-bit value or a 64-bit value),
    bit-big-endian is a bad idea.

    There is an argument for that.

    But if a computer does have bit-field instructions, I tend to consider it insane for it to number bits in the opposite direction of its endianness.

    Even though the problem isn't necessarily all that bad; as long as the bit fields are genuinely contiguous, then only the names of the bits in a byte
    are encoded funny.

    So if a 32 bit number is stored in bytes 5001, 5000, 4999, and 4998, from
    most significant byte to least significant, and you specify a 9-bit field starting in bit 6 of byte 4999... and the bits are numbered in big-endian order... the same thing should happen as if you specified bit 1 of byte
    4999 on a little-endian machine with little-endian bit numbering. You get
    nine bits, the seven least significant of which are bits 0 through 6 of
    byte 4999, and the remaining two of which are bits 6 and 7 of byte 5000.

    In the more common case, where the machine is big-endian, and it is the
    bit numbering that's little-endian, specifying a nine-bit field starting
    in bit 6 of byte 4999 would give you bits 6 through 0 of byte 4999,
    followed by bits 7 and 6 of byte 5000. Here, though, you're going from
    most significant to least significant, but in both cases you're moving
    forward to higher addresses, just as you do when accessing multi-byte
    numbers with a byte address.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 00:41:43 2026
    From Newsgroup: comp.arch

    On Wed, 20 May 2026 05:38:07 +0000, Anton Ertl wrote:

    * The last descendent of the PDP-11 was canceled long before the most
    prominent big-endian architecture (SPARC) was canceled, and long
    before Power switched its Linux support to little-endian, so the
    PDP-11 had little, if any, influence on the outcome.

    The long decline of big-endian happened later.

    But there wouldn't have _been_ little-endian architectures to out-compete big-endian if it hadn't been for the PDP-11. That was where the idea of little-endian got started.

    It wasn't the first machine to store two-word numbers least-significant-
    word first. But it was the first machine to be little-endian in any other
    way but that. Little-endian, as something more than an ad-hoc way to
    handle one case of double-precision integers, wasn't a thing until the
    PDP-11 came along.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed Jun 3 00:55:35 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Tue, 02 Jun 2026 17:50:38 +0200, Terje Mathisen wrote:

    In the current environment where every language is expected to be compatible with a generic IDE like Visual Studio Code, via open source interface specifications, having a proprietary debug format seems like a good way to strongly limit your potential customer base.

    You appear to have understood his post in a different way than I did.

    I wasn't thinking of the kind of debug information provided by a compiler.

    I was thinking of leaving debug information in when one was distributing software to customers.

    Yes, you the vendor do not want random customer debugging the code,
    however, you want the ability to debug the code that was distributed
    on whatever medium on customer's system(s)--

    AND you want to debug one copy of the running code while others are using
    other processes running the code under normal use.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed Jun 3 01:03:26 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Tue, 02 Jun 2026 22:00:06 +0000, John Levine wrote:

    "We predict that Little Endian addressing will die out, just as
    decimal addressing did."

    Uh huh.

    A few years later IBM added LOAD REVERSED and STORE REVERSED to z/Architecture and retroactively to S/390 mode on Z machines.

    I certainly would not hazard such a bold prediction.

    The prediction, though, is not hard to understand. If big-endian is more straightforward and easier to understand, but just costs an extra
    transistor here and there, then in the age of billion-transistor chips,
    why wouldn't it die out?

    Linux has gone all in on LE. So, if you want to start a HW company,
    you are forced to either choose LE or develop your own Operating
    System (with all the accoutrement involved.)

    However, just because something is going to die out _eventually_ doesn't mean it will do so any time soon. Interoperating and communicating with
    that little-endian monster IBM created in 1981 is going to be important
    for generating revenue for decades to come.

    The whole internet is Dual-endian !! With part LE and other parts BE.

    So the existence of load reversed and store reversed instructions doesn't prove they were wrong... even though I still would not dare to say they
    are definitely right. I just think it's not unreasonable to think as they did, provided you account for a sufficiently long timeframe.

    A byte reverse instruction will also work.
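
For readers who want to see what such an instruction computes, here is a minimal C sketch. The names `byte_reverse32`, `load_be32` and `load_le32` are hypothetical helpers written for this illustration; they are not the z/Architecture instructions or any library API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the four bytes of a 32-bit value: what a register-to-register
   byte reverse instruction would compute. */
static inline uint32_t byte_reverse32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Assemble a big-endian 32-bit value from memory, byte by byte.
   The result does not depend on the byte order of the host. */
static inline uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* The same for a little-endian 32-bit value. */
static inline uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

A load in one byte order followed by `byte_reverse32` gives the same value as a load in the other byte order, which is why either a reversing load/store pair or a register byte reverse is enough for interoperation.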

    Of course, given a sufficiently long timeframe, we might all be speaking Arabic, in which case little-endian would be the logical choice. Although that would require fossil fuels being important for longer than the
    climate could sustain it...

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 01:52:13 2026
    From Newsgroup: comp.arch

    On Wed, 03 Jun 2026 01:03:26 +0000, MitchAlsup wrote:

    Linux has gone all in on LE.

    It's true that Linux doesn't support the big-endian version of RISC-V. But
    it runs on other big-endian architectures.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Wed Jun 3 08:29:27 2026
    From Newsgroup: comp.arch

    On 2026-Jun-02 13:13, MitchAlsup wrote:

    Middle endian!! Start in the middle and then one step left followed by
    one step write--more or less like PDP-11 FP.

    We're doing the time warp... again!


    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Wed Jun 3 13:54:01 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> schrieb:
    On Tue, 02 Jun 2026 22:00:06 +0000, John Levine wrote:

    "We predict that Little Endian addressing will die out, just as
    decimal addressing did."

    Uh huh.

    A few years later IBM added LOAD REVERSED and STORE REVERSED to
    z/Architecture and retroactively to S/390 mode on Z machines.

    I certainly would not hazard such a bold prediction.

    The prediction, though, is not hard to understand. If big-endian is more straightforward and easier to understand, but just costs an extra
    transistor here and there, then in the age of billion-transistor chips,
    why wouldn't it die out?

    It causes problems with badly-written software.

    Consider the following test program:

    #include <stdio.h>

    void printit(void *p)
    {
        char *c = p;
        printf("Value is: %d\n", *c);
    }

    int main(void)
    {
        int i = 42;
        printit(&i);
        return 0;
    }

    On a little-endian system, this prints
    Value is: 42

    On a big-endian system, this prints
    Value is: 0

    If software designers play games with this sort of thing
    (knowingly or unknowingly), then software that will run
    on a little-endian system will not run on a big-endian
    system.
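
As a hedged illustration of the difference, the sketch below contrasts extracting the low-order byte by value, which gives the same answer on any byte order, with reading the first byte in memory, which does not. The helper names are invented for this example, and it assumes `unsigned` is wider than one byte.

```c
#include <assert.h>
#include <string.h>

/* Low-order byte computed from the value: byte order plays no role. */
static inline unsigned low_byte(unsigned v)
{
    return v & 0xFFu;
}

/* First byte as stored in memory: this is the byte-order-dependent read
   that the test program performs through its char pointer. */
static inline unsigned first_stored_byte(unsigned v)
{
    unsigned char b;
    assert(sizeof v > 1);   /* the comparison needs a multi-byte type */
    memcpy(&b, &v, 1);
    return b;
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static inline int host_is_little_endian(void)
{
    return first_stored_byte(1u) == 1u;
}
```

Code that sticks to `low_byte`-style arithmetic ports between byte orders unchanged; code that relies on `first_stored_byte`-style reads carries a hidden little-endian assumption.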
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 15:33:53 2026
    From Newsgroup: comp.arch

    On Wed, 03 Jun 2026 13:54:01 +0000, Thomas Koenig wrote:

    It causes problems with badly-written software.

    I don't see that as a fault of big-endian.

    One has to exert oneself to write a program equivalent to

    INTEGER*2 IP
    EQUIVALENCE (I, IP)
    I = 42
    WRITE(6,11) IP
    STOP
    11 FORMAT(' ', 'VALUE IS: ', I3)
    END

    and so the fact that it will print

    VALUE IS: 0

    is not a bug, it's exactly what one should expect.

    John Savard
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Wed Jun 3 17:36:45 2026
    From Newsgroup: comp.arch

    According to quadi <quadibloc@ca.invalid>:
    On Wed, 03 Jun 2026 13:54:01 +0000, Thomas Koenig wrote:

    It causes problems with badly-written software.

    I don't see that as a fault of big-endian.

    Agreed. There were plenty of bugs porting BSD software from
    the little-endian Vax to big-endian 68000 series. Buggy software
    is buggy software.
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Wed Jun 3 17:13:08 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> writes:
    But there wouldn't have _been_ little-endian architectures to out-compete
    big-endian if it hadn't been for the PDP-11. That was where the idea of
    little-endian got started.

    The Datapoint 2200 would most likely have had the little-endian
    encoding if the PDP-11 had been big-endian, because it was developed independently and in parallel. The 8080's call and ret instructions
    were the first where little-endian extended beyond instruction
    encoding, but I expect that, once you have jump instructions with
    little-endian targets, you also want return addresses to be stored in little-endian byte order (simplifies implementation). And from there
    it goes to the 8086 which has 16-bit data memory accesses, and where
    you also stick with little-endian if you already have done so for jump
    targets and return addresses. And following the
    8086, IA-32 and AMD64 would have been little-endian, too.

    The 6502 would have been little-endian if the PDP-11 had been
    big-endian, for technical reasons. They ignored even the big-endian
    byte order of its predecessor, the 6800. And following the 6502, the
    ARM would have been little-endian even if the PDP-11 had been
    big-endian.

    So the architectures that dominate now would be little-endian even if
    the PDP-11 had been big-endian. Would they have been less successful
    if the PDP-11 had been big-endian? I doubt it. At a point around
    1990, most of the Unix market was big-endian, based on the 68000 being big-endian, and it seemed that if any byte order would win, it would
    be big-endian. IA-32 and VAX were expected to die because they were
    CISCs, and ARM was just a minor player in the RISC market at the time.

    And yet, IA-32/AMD64 and ARM's instruction sets outlived all the
    highly successful RISCs of the time. This would also have happened if
    the PDP-11 and the VAX had been big-endian.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Wed Jun 3 17:44:20 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> writes:
    On Tue, 02 Jun 2026 15:59:33 +0000, Anton Ertl wrote:

    My take is, that in a world with different access widths (e.g.,
    accessing a register for a 32-bit value or a 64-bit value),
    bit-big-endian is a bad idea.

    There is an argument for that.

    But if a computer does have bit-field instructions, I tend to consider it
    insane for it to number bits in the opposite direction of its endianness.

    So if big-endian bit numbering is a bad idea (and it is), big endian
    byte order is a bad idea, too.

    In the more common case, where the machine is big-endian, and it is the
    bit numbering that's little-endian, specifying a nine-bit field starting
    in bit 6 of byte 4999 would give you bits 6 through 0 of byte 4999,
    followed by bits 7 and 6 of byte 5000.

    This does not make sense. You have the 32-bit word with address 4998.
    If you access a 9-bit field at bit 6, it extends to bit 14, and these
    bits will be in the bytes at addresses 5001 and 5000 on your
    byte-big-endian machine. As long as you only access this bit field
    through a 32-bit access at this address, the difference does not play
    a role. But once you want to access it through a 32-bit access at
    4999 (now it's bit 14 through 22), a 16-bit access at 5000 (bit 6
    through 14 again), or a 32-bit access at 5000 (bit 22 through 30), the different orders become confusing.
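
The bit numbers in this example can be reproduced with a small C sketch. It assumes LSB-0 bit numbering within the accessed value and within each byte; `be_bit_number` and `le_bit_number` are hypothetical helpers written for this illustration only.

```c
#include <assert.h>

/* Bit number, within an access of nbytes at address base, of bit
   bit_in_byte (LSB-0) of the byte at byte_addr, on a byte-big-endian
   machine: the byte at the highest address holds bits 7..0. */
static inline unsigned be_bit_number(unsigned base, unsigned nbytes,
                                     unsigned byte_addr, unsigned bit_in_byte)
{
    assert(nbytes > 0 && byte_addr >= base && byte_addr < base + nbytes);
    assert(bit_in_byte < 8);
    return 8u * (base + nbytes - 1u - byte_addr) + bit_in_byte;
}

/* The same on a byte-little-endian machine: the byte at the lowest
   address holds bits 7..0, so the access width does not matter. */
static inline unsigned le_bit_number(unsigned base, unsigned nbytes,
                                     unsigned byte_addr, unsigned bit_in_byte)
{
    assert(nbytes > 0 && byte_addr >= base && byte_addr < base + nbytes);
    assert(bit_in_byte < 8);
    return 8u * (byte_addr - base) + bit_in_byte;
}
```

The field's lowest bit is bit 6 of byte 5001 and its highest is bit 6 of byte 5000. `be_bit_number` maps that pair to 6 and 14 for a 32-bit access at 4998, to 14 and 22 for a 32-bit access at 4999, and to 22 and 30 for a 32-bit access at 5000. In the little-endian formula the result depends only on the base address, not on the width.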

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Wed Jun 3 10:42:09 2026
    From Newsgroup: comp.arch

    Yes, you the vendor do not want random customer debugging the code,

    I also want a pony, but that doesn't make it right.

    The customer will usually not want to debug your code, but sometimes
    they will have to (e.g. because you the vendor don't exist any more or
    don't find that product of commercial value any more, ...).

    The customer deserves to be able to debug the code it's paid for.


    === Stefan
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Wed Jun 3 18:36:51 2026
    From Newsgroup: comp.arch

    Stefan Monnier <monnier@iro.umontreal.ca> writes:

    <snip discussion of including debug data in distributed software>

    Yes, you the vendor do not want random customer debugging the code,

    I also want a pony, but that doesn't make it right.

    The customer will usually not want to debug your code, but sometimes
    they will have to (e.g. because you the vendor don't exist any more or
    don't find that product of commercial value any more, ...).

    The customer deserves to be able to debug the code it's paid for.

    There are several reasons that a vendor may wish to refrain from
    distributing the DWARF (or windows equiv) data with a software
    package. For example, program identifiers may inadvertently
    identify other customers, internal proprietary
    information or internal codenames.

    Being able to debug code without the source code doesn't seem
    a particularly common use case, nor would it be a viable way
    to continue to use orphaned software, other than to, perhaps,
    get it working sufficiently to export any application
    data in an interchange form (e.g. csv or xml if supported
    by the application). I would certainly not recommend
    that it be used for production.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Wed Jun 3 19:22:49 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> schrieb:
    On Wed, 03 Jun 2026 13:54:01 +0000, Thomas Koenig wrote:

    [big-endian]
    It causes problems with badly-written software.

    I don't see that as a fault of big-endian.

    Neither do I. I am glad I still have access to a few big-endian
    machines. For example:

    $ lscpu
    Architecture: ppc64
    CPU op-mode(s): 32-bit, 64-bit
    Byte Order: Big Endian
    CPU(s): 64
    On-line CPU(s) list: 0-63
    Model name: POWER10 (architected), altivec supported
    [...]

    (which shows that big-endian Linux is still supported).

    But if the software you are targeting is primarily written on,
    and for, little-endian systems like x86, then the little-endian
    assumption will tend to creep in - certain things like writing
    an int to memory and reading back a char will "just work", and
    programmers may not know or care that they are violating language
    standards; they very rarely do.

    So what to do? Submit bug reports and patches and hope they are
    integrated, or just bite the bullet and offer a little-endian
    version as well? IBM chose the latter.

    And referring to the point above, the code


    One has to exert oneself to write a program equivalent to

    INTEGER*2 IP
    EQUIVALENCE (I, IP)
    I = 42
    WRITE(6,11) IP
    STOP
    11 FORMAT(' ', 'VALUE IS: ', I3)
    END

    also violates the FORTRAN standard going back to Fortran 66.
    After the "I = 42" statement, IP becomes undefined according to
    the language definition.

    and so the fact that it will print

    VALUE IS: 0

    is not a bug, it's exactly what one should expect.

    It could also launch World War III, provided the right operational
    hardware has been installed.
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed Jun 3 21:24:09 2026
    From Newsgroup: comp.arch


    Stefan Monnier <monnier@iro.umontreal.ca> posted:

    Yes, you the vendor do not want random customer debugging the code,

    I also want a pony, but that doesn't make it right.

    The customer will usually not want to debug your code, but sometimes
    they will have to (e.g. because you the vendor don't exist any more or
    don't find that product of commercial value any more, ...).

    The customer deserves to be able to debug the code it's paid for.

    Does MS allow you to debug W11 or Office ...
    Does Corel allow you to debug Draw ...
    Does Adobe allow you to debug PDF reader ...

    Which is why, sooner or later, open source should win.


    === Stefan
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From George Neuner@gneuner2@comcast.net to comp.arch on Wed Jun 3 19:05:54 2026
    From Newsgroup: comp.arch

    On Tue, 02 Jun 2026 15:59:33 GMT, anton@mips.complang.tuwien.ac.at
    (Anton Ertl) wrote:

    The 68020 is bit-little-endian and byte-big-endian, and it has
    bitfield instructions, and from what I have read, this has led to
    problems (e.g., consider what to do if you have an array of 17-bit
    fields: how do you access the nth element of the array?

    <array>[n] ?

    In C the default would be an aligned array of 32-bit containers each
    of which stored a 17-bit field.

    If you mean a /packed/ array in which the 17-bit fields are stored bit
    contiguously ... well that could get interesting.
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Jun 4 04:06:45 2026
    From Newsgroup: comp.arch

    On Mon, 01 Jun 2026 19:56:38 +0000, quadi wrote:

    I wasn't happy. So I noticed that I actually had some unused space that
    I could squeeze out. So now the 24-bit short instructions have 1/2 as
    much space as they used to, which meant the only thing I had to give up
    was the ability to change the condition codes.

    I found that I had some unused space within the 80-bit instructions, and
    that was enough to let me restore the 24-bit short instructions to their former glory.

    John Savard
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Jun 4 04:44:33 2026
    From Newsgroup: comp.arch

    On Thu, 04 Jun 2026 04:06:45 +0000, quadi wrote:

    I found that I had some unused space within the 80-bit instructions, and
    that was enough to let me restore the 24-bit short instructions to their former glory.

    Then another crazy idea came into my head. The 16-bit short instructions
    are limited to operating on the first eight registers. Some believe that
    this limitation will make them essentially useless.

    They have a lot more opcode space allocated to them than the 24-bit short instructions. If I took that space, and gave it to the 24-bit short instructions, I could perhaps add a 24-bit memory-reference instruction!

    Well, I tried, and found out that I could indeed almost do that, with, of course, a restriction to 12-bit displacements... but, at best, I could
    only use two registers as destination registers for those memory-reference instructions!

    So that idea had to be discarded.

    John Savard
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Thu Jun 4 12:45:39 2026
    From Newsgroup: comp.arch

    George Neuner <gneuner2@comcast.net> writes:
    On Tue, 02 Jun 2026 15:59:33 GMT, anton@mips.complang.tuwien.ac.at
    (Anton Ertl) wrote:

    The 68020 is bit-little-endian and byte-big-endian, and it has
    bitfield instructions, and from what I have read, this has led to
    problems (e.g., consider what to do if you have an array of 17-bit
    fields: how do you access the nth element of the array?
    [...]
    If you mean a /packed/ array in which the 17-bit fields are stored bit
    contiguously ... well that could get interesting.

    For a consistently little-endian architecture that has no alignment
    requirements, the access is relatively simple:

    nbit = n*17
    nbyte = nbit/8
    bitoffset = nbit%8
    tmp = load32b(array+nbyte)
    element = ext(tmp,bitoffset,17)

    (ext extracts the bitfield with length 17 at bitoffset from tmp; 88000
    and MIPS64r2 have such instructions).
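
    The pseudocode above can be sketched in portable C as follows. This
    is a minimal sketch under stated assumptions: load32le, ext and
    get17 are illustrative names, the 32-bit load is assembled from
    bytes so that it behaves as little-endian on any host, and the
    buffer is assumed to carry one byte of padding after the packed
    data because the load reads one byte past the last byte that holds
    the element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unaligned little-endian 32-bit load, independent of host byte order. */
static uint32_t load32le(const uint8_t *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Extract `width` bits starting at bit `bitoffset` (bit 0 = least
   significant), in the spirit of the ext instruction mentioned above. */
static uint32_t ext(uint32_t word, unsigned bitoffset, unsigned width)
{
    assert(width < 32 && bitoffset + width <= 32);
    return (word >> bitoffset) & ((1u << width) - 1u);
}

/* Fetch element n of a packed array of 17-bit fields. */
static uint32_t get17(const uint8_t *array, size_t n)
{
    size_t   nbit      = n * 17;               /* bit index of the field    */
    size_t   nbyte     = nbit / 8;             /* byte holding its low bit  */
    unsigned bitoffset = (unsigned)(nbit % 8); /* 0..7, so 7 + 17 <= 32     */
    return ext(load32le(array + nbyte), bitoffset, 17);
}
```

    Since bitoffset is at most 7, the field always fits inside the
    32-bit window, which is why a single load suffices here.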

    I leave the consistently big-endian version and the byte-big-endian
    bit-little-endian version as an exercise to those who think that these
    are good ideas. I guess, for consistent big-endian, given an
    appropriate definition of ext, it's pretty similar, if not the same as
    above. The inconsistent variants (e.g., 68020, 88000, MIPS64r2) are
    not so easy, however.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
  • From Brian G. Lucas@bagel99@gmail.com to comp.arch on Thu Jun 4 11:53:31 2026
    From Newsgroup: comp.arch

    On 6/3/26 12:36 PM, John Levine wrote:
    According to quadi <quadibloc@ca.invalid>:
    On Wed, 03 Jun 2026 13:54:01 +0000, Thomas Koenig wrote:

    It causes problems with badly-written software.

    I don't see that as a fault of big-endian.

    Agreed. There were plenty of bugs porting BSD software from
    the little-endian Vax to big-endian 68000 series. Buggy software
    is buggy software.

    That followed the port to the Interdata 7/32 which was big-endian,
    so BSD must not have learned from that.

  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Thu Jun 4 23:46:11 2026
    From Newsgroup: comp.arch

    Scott Lurndal [2026-06-03 18:36:51] wrote:
    Being able to debug code without the source code doesn't seem
    a particularly common use case,

    Indeed, the source code should also be available, of course.
    I started this thread by mentioning Free Software. 🙂


    === Stefan
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Fri Jun 5 08:36:53 2026
    From Newsgroup: comp.arch

    Stefan Monnier <monnier@iro.umontreal.ca> schrieb:
    Scott Lurndal [2026-06-03 18:36:51] wrote:
    Being able to debug code without the source code doesn't seem
    a particularly common use case,

    Indeed, the source code should also be available, of course.
    I started this thread by mentioning Free Software. 🙂

    I am a big proponent of free software, but it has a basic problem:
    Getting developers paid is not easy.

    An example is OpenFOAM. This is a very widely used CFD package,
    both in academia (because it costs nothing, and ANSYS is very
    expensive, also for universities) and also now in industry because
    people who come in from university have learned this during their
    PhDs (you need quite some time to learn).

    Funding? They want 500 k€ in 2026, which is far from excessive,
    see https://openfoam.org/news/funding-2026/ , both compared to
    commercial CFD companies and the value that OpenFOAM provides.
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
  • From George Neuner@gneuner2@comcast.net to comp.arch on Fri Jun 5 15:07:25 2026
    From Newsgroup: comp.arch

    On Wed, 03 Jun 2026 00:55:35 GMT, MitchAlsup
    <user5857@newsgrouper.org.invalid> wrote:


    quadi <quadibloc@ca.invalid> posted:

    On Tue, 02 Jun 2026 17:50:38 +0200, Terje Mathisen wrote:

    In the current environment where every language is expected to be
    compatible with a generic IDE like Visual Studio Code, via open source
    interface specifications, having a proprietary debug format seems like a
    good way to strongly limit your potential customer base.

    You appear to have understood his post in a different way than I did.

    I wasn't thinking of the kind of debug information provided by a compiler.

    I was thinking of leaving debug information in when one was distributing
    software to customers.

    Yes, you the vendor do not want random customer debugging the code,
    however, you want the ability to debug the code that was distributed
    on whatever medium on customer's system(s)--

    AND you want to debug one copy of the running code while others are using
    other processes running the code under normal use.

    That is true, but the issue at hand is how to achieve that. Leaving
    debug information /in/ the executable, I think, is a bad idea.

    However, many (most?) toolchains provide a way to separate debug
    symbols from the executable - either by generating a separate symbol
    database in the 1st place, or by allowing debug data to be stripped
    from the executables. If you have to debug at the client site, you
    simply take the symbol database with you.

    Another useful method is to write out debug information as the program
    executes and arrange that it either is suppressed or (alternatively)
    goes to /dev/null unless some undocumented flag is given.
    [Obviously where speed is paramount you can't be generating
    unnecessary output, so the utility of this method is situation
    dependent.]


    I have used both of these methods in the past.
    YMMV.
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Sat Jun 6 01:37:13 2026
    From Newsgroup: comp.arch

    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    MitchAlsup <user5857@newsgrouper.org.invalid> writes:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    long bar(long x, long y)
    {
    return x/2+y/2;
    }
    ...
    Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
    produces ILP32 code) produces a call to __addvsi3 instead of the
    expected add instruction:

    gcc -O3 -ftrapv                 gcc -O3
    lui   gp,0x0                    srl   v0,a0,0x1f
    addiu gp,gp,0                   srl   v1,a1,0x1f
    addu  gp,gp,t9                  addu  v0,v0,a0
    srl   v1,a0,0x1f                addu  a1,v1,a1
    lw    t9,__addvsi3(gp)          sra   v0,v0,0x1
    srl   v0,a1,0x1f                sra   a1,a1,0x1
    addiu sp,sp,-32                 jr    ra
    addu  a0,v1,a0                  addu  v0,v0,a1
    addu  a1,v0,a1
    sra   a0,a0,0x1
    sw    ra,28(sp)
    sw    gp,16(sp)
    jalr  t9
    sra   a1,a1,0x1
    lw    ra,28(sp)
    jr    ra
    addiu sp,sp,32

    The call costs a lot of overhead.

    Architectures without overflow traps are notorious for excess instruction
    count when overflow detection is desired or mandated.

    MIPS' add traps on overflow. gcc could have emitted almost the same
    code for gcc -O3 -ftrapv as for gcc -O3, except that the last
    instruction would be an add, not an addu. But apparently nobody gives
    a damn about the efficiency of -ftrapv, possibly rightly so.
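
    For comparison, the check that -ftrapv requests can also be spelled
    out by hand. The following is a minimal sketch, assuming GCC or
    Clang (__builtin_add_overflow is a compiler builtin, not ISO C);
    checked_add and bar_checked are hypothetical names, and the sketch
    reports overflow through a flag instead of trapping.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: add two longs and report whether the exact sum
   fits.  On overflow the builtin stores the wrapped result in *sum. */
static bool checked_add(long a, long b, long *sum)
{
    assert(sum != NULL);
    return !__builtin_add_overflow(a, b, sum);   /* true when it fits */
}

/* The function from the quoted post, with its final addition checked. */
static long bar_checked(long x, long y, bool *ok)
{
    long sum;
    *ok = checked_add(x / 2, y / 2, &sum);
    return sum;
}
```

    Note that for this particular function the check can never fire:
    both operands have been halved, so their sum always fits in a long.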

    My guess is that GCC developers care more about -ftrapv than about
    MIPS. AFAICS several architectures officially supported by GCC
    struggle to work at all. I suspect that maintainers of MIPS
    backend are happy that -ftrapv works and do not have resources
    to make it efficient.
    --
    Waldek Hebisch
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Sat Jun 6 07:57:46 2026
    From Newsgroup: comp.arch

    Waldek Hebisch <antispam@fricas.org> schrieb:
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:

    MIPS' add traps on overflow. gcc could have emitted almost the same
    code for gcc -O3 -ftrapv as for gcc -O3, except that the last
    instruction would be an add, not an addu. But apparently nobody gives
    a damn about the efficiency of -ftrapv, possibly rightly so.

    My guess is that GCC developers care more about -ftrapv than about
    MIPS.

    It is a common misconception to treat GCC developers as a
    monolithic group. There are hobbyists (such as myself) but
    I would guess only a small minority of work is done by them,
    with the notable exception of some front ends such as Fortran or
    (the most recent example) Algol 68. There are employees of
    different companies: Linux distributors like Red Hat or SUSE,
    large software companies like Google, hardware vendors like
    IBM, Intel or Qualcomm, ...

    For MIPS, there are not so many active people and commits.
    mips64-linux-gnu is a secondary platform, so if it fails
    bootstrap, a release would be held up, but a wrong-code
    regression will not.

    Counting changes since 2025-01-01 in the gcc/config directories
    can give a good idea of the relative activity for different
    subdirectories; I cut this off below 7, where the PDP-11 is (note
    that architecture names are often historical, so i386 includes
    x86_64, s390 includes Z, rs6000 includes POWER and so on).

    539 ./riscv
    435 ./i386
    432 ./aarch64
    177 ./loongarch
    100 ./arm
    85 ./s390
    75 ./avr
    72 ./xtensa
    60 ./rs6000
    51 ./gcn
    39 ./nvptx
    25 ./sparc
    25 ./mips
    20 ./arc
    19 ./pa
    19 ./bpf
    19 ./alpha
    18 ./pru
    15 ./sh
    13 ./rx
    13 ./cris
    12 ./or1k
    12 ./microblaze
    12 ./m68k
    11 ./lm32
    11 ./ia64
    11 ./h8300
    9 ./vax
    9 ./nds32
    8 ./mcore
    8 ./epiphany
    8 ./c6x
    7 ./visium
    7 ./rl78
    7 ./pdp11
    7 ./frv
    7 ./csky

    AFAICS several architectures officially supported by GCC
    struggle to work at all. I suspect that maintainers of MIPS
    backend are happy that -ftrapv works and do not have resources
    to make it efficient.

    First, they would need to know about this, which requires a PR,
    but resources may well be lacking.

    There are currently 28 open "missed-optimization" bugs with mips
    in their target field. Looking at a few architectures above,
    RISC-V has 118, x86 has 943, aarch64 has 305, power has 133.
    (Some bugs affect more than one architecture, of course).

    But it is worth submitting a PR nonetheless, if anybody cares enough :-)
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Sat Jun 6 08:57:09 2026
    From Newsgroup: comp.arch

    George Neuner <gneuner2@comcast.net> writes:
    That is true, but the issue at hand is how to achieve that. Leaving
    debug information /in/ the executable, I think, is a bad idea.

    On the contrary, it's an excellent idea. It means that the debug
    information goes with the code. No chance of confusing yourself by
    inadvertently associating the wrong debugging information with the
    code, and much less chance of not finding the correct debug
    information.

    Best of all, of course, is to deliver the source code.

    Another useful method is to write out debug information as the program
    executes and arrange that it either is suppressed or (alternatively)
    goes to /dev/null unless some undocumented flag is given.

    Undocumented features are forgotten and reimplemented. There's the
    story of Microsoft embedding some watermark into Microsoft BASIC
    twice, the second time apparently because they had forgotten about the
    first time.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Sun Jun 7 16:42:25 2026
    From Newsgroup: comp.arch

    George Neuner <gneuner2@comcast.net> writes:
    On Wed, 03 Jun 2026 00:55:35 GMT, MitchAlsup
    <user5857@newsgrouper.org.invalid> wrote:


    quadi <quadibloc@ca.invalid> posted:

    On Tue, 02 Jun 2026 17:50:38 +0200, Terje Mathisen wrote:

    In the current environment where every language is expected to be
    compatible with a generic IDE like Visual Studio Code, via open source
    interface specifications, having a proprietary debug format seems like a
    good way to strongly limit your potential customer base.

    You appear to have understood his post in a different way than I did.

    I wasn't thinking of the kind of debug information provided by a compiler.

    I was thinking of leaving debug information in when one was distributing
    software to customers.

    Yes, you the vendor do not want random customer debugging the code,
    however, you want the ability to debug the code that was distributed
    on whatever medium on customer's system(s)--

    AND you want to debug one copy of the running code while others are using
    other processes running the code under normal use.

    That is true, but the issue at hand is how to achieve that. Leaving
    debug information /in/ the executable, I think, is a bad idea.

    However, many (most?) toolchains provide a way to separate debug
    symbols from the executable - either by generating a separate symbol
    database in the 1st place, or by allowing debug data to be stripped
    from the executables. If you have to debug at the client site, you
    simply take the symbol database with you.

    Indeed, and that's been the common paradigm at my employers.

    I'll also note that many linux distributions include the debug
    symbols for the distribution in optionally loaded packages.


    Another useful method is to write out debug information as the program
    executes and arrange that it either is suppressed or (alternatively)
    goes to /dev/null unless some undocumented flag is given.

    We arrange for the application to be able to be configured
    (both statically before startup and dynamically during
    runtime) to produce additional debug logging. Generally
    arranged in the code to avoid significant impact to non-debug
    performance (e.g. using __builtin_expect with GCC toolchains).
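
    As an illustration of the pattern just described, here is a minimal
    sketch, assuming GCC or Clang for __builtin_expect. The names
    debug_enabled, debug_lines, debug_log, DEBUG_LOG and process are
    hypothetical and not taken from any real code base; in a real
    application the flag would be set from configuration at startup or
    changed while the program runs.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

static bool          debug_enabled = false;  /* runtime switch            */
static unsigned long debug_lines   = 0;      /* number of lines emitted   */

/* Slow path: format and write one debug line to stderr. */
static void debug_log(const char *fmt, ...)
{
    va_list ap;
    assert(fmt != NULL);
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    debug_lines++;
}

/* Fast path: one predictable branch when logging is off.  The hint
   tells the compiler that the condition is expected to be false. */
#define DEBUG_LOG(...)                                  \
    do {                                                \
        if (__builtin_expect(debug_enabled, 0))         \
            debug_log(__VA_ARGS__);                     \
    } while (0)

/* Hypothetical piece of application work with a debug trace point. */
static int process(int x)
{
    DEBUG_LOG("process(%d)\n", x);
    return x * 2;
}
```

    With the flag off, the arguments of DEBUG_LOG are not evaluated and
    nothing is formatted, so the cost in the normal case is a single
    test of debug_enabled.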

  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Sun Jun 7 15:05:24 2026
    From Newsgroup: comp.arch

    On 6/4/2026 8:46 PM, Stefan Monnier wrote:
    Scott Lurndal [2026-06-03 18:36:51] wrote:
    Being able to debug code without the source code doesn't seem
    a particularly common use case,

    Indeed, the source code should also be available, of course.
    I started this thread by mentioning Free Software. 🙂

    Note that free does not equal open source. There is a fair amount of
    software that is freely available for which the source is not. Many of
    these are reduced functionality versions of paid for software, e.g.
    Adobe PDF reader, but there are others.
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 8 01:19:17 2026
    From Newsgroup: comp.arch

    On Sun, 07 Jun 2026 15:05:24 -0700, Stephen Fuld wrote:

    Note that free does not equal open source. There is a fair amount of
    software that is freely available for which the source is not. Many of
    these are reduced functionality versions of paid for software, e.g.
    Adobe PDF reader, but there are others.

    Commonly, when this distinction is discussed in the open-source community,
    the phrases "free as in beer" and "free as in freedom" are used to
    distinguish between freeware that remains proprietary versus true
    open-source software under the GPL.

    John Savard

  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 8 06:05:59 2026
    From Newsgroup: comp.arch

    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/4/2026 8:46 PM, Stefan Monnier wrote:
    I started this thread by mentioning Free Software. 🙂

    Note that free does not equal open source. There is a fair amount of
    software that is freely available for which the source is not. Many of
    these are reduced functionality versions of paid for software, e.g.
    Adobe PDF reader, but there are others.

    The Adobe PDF reader is chained software (aka proprietary software),
    not free software.

    In the appendix of "1984" George Orwell wrote:

    |To give a single example, the word free still existed in Newspeak, but
    |could only be used in such statements as "The dog is free from lice"
    |or "This field is free from weeds." It could not be used in its old
    |sense of "politically free" or "intellectually free," since political
    |and intellectual freedom no longer existed even as concepts, and were
    |therefore of necessity nameless.

    Some of us obviously already write and think in Newspeak.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
  • From Michael S@already5chosen@yahoo.com to comp.arch on Mon Jun 8 09:25:32 2026
    From Newsgroup: comp.arch

    On Mon, 8 Jun 2026 01:19:17 -0000 (UTC)
    quadi <quadibloc@ca.invalid> wrote:

    On Sun, 07 Jun 2026 15:05:24 -0700, Stephen Fuld wrote:

    Note that free does not equal open source. There is a fair amount
    of software that is freely available for which the source is not.
    Many of these are reduced functionality versions of paid for
    software, e.g. Adobe PDF reader, but there are others.

    Commonly, when this distinction is discussed in the open-source
    community, the phrases "free as in beer" and "free as in freedom" are
    used to distinguish between freeware that remains proprietary versus
    true open-source software under the GPL.

    John Savard


    I strongly disagree with the statement that true open source software is
    equivalent to GPL.

  • From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Mon Jun 8 11:45:51 2026
    From Newsgroup: comp.arch

    Michael S wrote:
    On Mon, 8 Jun 2026 01:19:17 -0000 (UTC)
    quadi <quadibloc@ca.invalid> wrote:

    On Sun, 07 Jun 2026 15:05:24 -0700, Stephen Fuld wrote:

    Note that free does not equal open source. There is a fair amount
    of software that is freely available for which the source is not.
    Many of these are reduced functionality versions of paid for
    software, e.g. Adobe PDF reader, but there are others.

    Commonly, when this distinction is discussed in the open-source
    community, the phrases "free as in beer" and "free as in freedom" are
    used to distinguish between freeware that remains proprietary versus
    true open-source software under the GPL.

    John Savard


    I strongly disagree with the statement that true open source software is
    equivalent to GPL.

    The obviously "most free" sw must be public domain, right?

    Followed by free use but attribution required/requested?

    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"
  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Mon Jun 8 07:30:46 2026
    From Newsgroup: comp.arch

    On 6/7/2026 11:05 PM, Anton Ertl wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/4/2026 8:46 PM, Stefan Monnier wrote:
    I started this thread by mentioning Free Software. 🙂

    Note that free does not equal open source. There is a fair amount of
    software that is freely available for which the source is not. Many of
    these are reduced functionality versions of paid for software, e.g.
    Adobe PDF reader, but there are others.

    The Adobe PDF reader is chained software (aka proprietary software),
    not free software.

    I don't want to get into a semantic argument here. I don't know what
    you mean by the term "chained software". I only meant that anyone
    could use it without paying anything to anyone. In the sense that John
    talked about, it is free beer.

    If I misinterpreted Stefan's use of the word free, then I apologize.





    In the appendix of "1984" George Orwell wrote:

    |To give a single example, the word free still existed in Newspeak, but
    |could only be used in such statements as "The dog is free from lice"
    |or "This field is free from weeds." It could not be used in its old
    |sense of "politically free" or "intellectually free," since political
    |and intellectual freedom no longer existed even as concepts, and were
    |therefore of necessity nameless.

    Some of us obviously already write and think in Newspeak.

    I hardly think that using the word free to mean "you don't have to pay
    for it" is Newspeak.
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Mon Jun 8 15:19:57 2026
    From Newsgroup: comp.arch

    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/7/2026 11:05 PM, Anton Ertl wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/4/2026 8:46 PM, Stefan Monnier wrote:
    I started this thread by mentioning Free Software. 🙂

    Note that free does not equal open source. There is a fair amount of
    software that is freely available for which the source is not. Many of
    these are reduced functionality versions of paid for software, e.g.
    Adobe PDF reader, but there are others.

    The Adobe PDF reader is chained software (aka proprietary software),
    not free software.

    I don't want to get into a semantic argument here. I don't know what
    you mean by the term "chained software". I only meant that anyone
    could use it without paying anything to anyone. In the sense that John
    talked about, it is free beer.

    Acroread sends basic telemetry to Adobe every time you use it,
    so in a sense, it's not exactly free.

    xpdf on the other hand....

  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 8 16:18:37 2026
    From Newsgroup: comp.arch

    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/7/2026 11:05 PM, Anton Ertl wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/4/2026 8:46 PM, Stefan Monnier wrote:
    I started this thread by mentioning Free Software. 🙂

    Note that free does not equal open source. There is a fair amount of
    software that is freely available for which the source is not. Many of
    these are reduced functionality versions of paid for software, e.g.
    Adobe PDF reader, but there are others.

    The Adobe PDF reader is chained software (aka proprietary software),
    not free software.

    I don't want to get into a semantic argument here. I don't know what
    you mean by the term "chained software".

    Non-free software. I put the more commonly used term in parentheses.

    I only meant that anyone
    could use it without paying anything to anyone.

    That's not what "free software" means. The four essential freedoms of
    software are
    <https://www.gnu.org/philosophy/free-sw.en.html#fs-definition>:

    |* The freedom to run the program as you wish, for any purpose (freedom 0).
    |
    |* The freedom to study how the program works, and change it so it does
    | your computing as you wish (freedom 1). Access to the source code is
    | a precondition for this.
    |
    |* The freedom to redistribute copies so you can help others (freedom 2).
    |
    |* The freedom to distribute copies of your modified versions to others
    | (freedom 3). By doing this you can give the whole community a chance
    | to benefit from your changes. Access to the source code is a
    | precondition for this.
    |
    |A program is free software if it gives users adequately all of these
    |freedoms. Otherwise, it is nonfree.

    In the appendix of "1984" George Orwell wrote:

    |To give a single example, the word free still existed in Newspeak, but
    |could only be used in such statements as "The dog is free from lice"
    |or "This field is free from weeds." It could not be used in its old
    |sense of "politically free" or "intellectually free," since political
    |and intellectual freedom no longer existed even as concepts, and were
    |therefore of necessity nameless.

    Some of us obviously already write and think in Newspeak.

    I hardly think that using the word free to mean "you don't have to pay
    for it" is Newspeak.

    Orwell did not think about that meaning when he gave an example of
    Newspeak use of "free", so if the meaning "gratis" for "free" existed
    when he wrote the book in 1949, it was not widely-enough used to make
    it into the book. In any case, the meaning "free from lice" existed
    when Orwell wrote the book and still exists in Newspeak. Newspeak
    does not introduce new meanings, but eliminates the "freedom"
    meaning. And in your case, Newspeak obviously has been successful
    (not the Ingsoc variant ("free from lice"), but the surveillance
    capitalism variant ("you don't pay [money] for it")).

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 8 17:11:35 2026
    From Newsgroup: comp.arch

    On Mon, 08 Jun 2026 09:25:32 +0300, Michael S wrote:
    On Mon, 8 Jun 2026 01:19:17 -0000 (UTC)
    quadi <quadibloc@ca.invalid> wrote:
    On Sun, 07 Jun 2026 15:05:24 -0700, Stephen Fuld wrote:

    Note that free does not equal open source. There is a fair amount of
    software that is freely available for which the source is not. Many
    of these are reduced functionality versions of paid for software,
    e.g. Adobe PDF reader, but there are others.

    Commonly, when this distinction is discussed in the open-source
    community, the phrases "free as in beer" and "free as in freedom" are
    used to distinguish between freeware that remains proprietary versus
    true open-source software under the GPL.

    I strongly disagree with the statement that true open source software is
    equivalent to GPL.

    I did not mean to imply that _only_ GPL-licensed software is truly open source. The GPL license is only the most common example. There is, as
    another reply has already noted, also MIT-licensed software and public
    domain software.

    And the term "open source", of course, is broader than this, as well. It
    isn't incorrect to use that term for any software the source code of which
    is open to inspection, even if the software itself is proprietary.

    John Savard

  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Mon Jun 8 10:40:17 2026
    From Newsgroup: comp.arch

    On 6/8/2026 9:18 AM, Anton Ertl wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/7/2026 11:05 PM, Anton Ertl wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/4/2026 8:46 PM, Stefan Monnier wrote:
    I started this thread by mentioning Free Software. 🙂

    Note that free does not equal open source. There is a fair amount of
    software that is freely available for which the source is not. Many of
    these are reduced functionality versions of paid for software, e.g.
    Adobe PDF reader, but there are others.

    The Adobe PDF reader is chained software (aka proprietary software),
    not free software.

    I don't want to get into a semantic argument here. I don't know what
    you mean by the term "chained software".

    Non-free software. I put the more commonly used term in parentheses.

    OK.


    I only meant that anyone
    could use it without paying anything to anyone.

    That's not what "free software" means. The four essential freedoms of
    software are
    <https://www.gnu.org/philosophy/free-sw.en.html#fs-definition>:

    |* The freedom to run the program as you wish, for any purpose (freedom 0).
    |
    |* The freedom to study how the program works, and change it so it does
    | your computing as you wish (freedom 1). Access to the source code is
    | a precondition for this.
    |
    |* The freedom to redistribute copies so you can help others (freedom 2).
    |
    |* The freedom to distribute copies of your modified versions to others
    | (freedom 3). By doing this you can give the whole community a chance
    | to benefit from your changes. Access to the source code is a
    | precondition for this.
    |
    |A program is free software if it gives users adequately all of these
    |freedoms. Otherwise, it is nonfree.

    That is certainly *a* definition. It is obviously your preferred
    definition. But there are others.

    snipped the Orwell quotation.

    Can you accept that others might have a different definition without
    insulting them? (I take the assertion that I am using Newspeak as an
    insult.)
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon Jun 8 09:18:02 2026
    From Newsgroup: comp.arch

    Indeed, the source code should also be available, of course.
    I started this thread by mentioning Free Software. 🙂
    Note that free does not equal open source. There is a fair amount of software that is freely available for which the source is not. Many of
    these are reduced functionality versions of paid for software, e.g. Adobe
    PDF reader, but there are others.

    You may want to check on Wikipedia what is [Free Software](https://en.wikipedia.org/wiki/Free_software) before jumping
    to conclusions.

    I capitalized "Free Software" for a reason.


    === Stefan
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.arch on Mon Jun 8 22:43:40 2026
    From Newsgroup: comp.arch

    On Mon, 08 Jun 2026 16:18:37 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/7/2026 11:05 PM, Anton Ertl wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/4/2026 8:46 PM, Stefan Monnier wrote:
    I started this thread by mentioning Free Software. 🙂

    Note that free does not equal open source. There is a fair
    amount of software that is freely available for which the source
    is not. Many of these are reduced functionality versions of paid
    for software, e.g. Adobe PDF reader, but there are others.

    The Adobe PDF reader is chained software (aka proprietary
    software), not free software.

    I don't want to get into a semantic argument here. I don't know
    what you mean by the term "chained software".

    Non-free software. I put the more commonly used term in parentheses.

    I only meant that anyone
    could use it without paying anything to anyone.

    That's not what "free software" means. The four essential freedoms of software are
    <https://www.gnu.org/philosophy/free-sw.en.html#fs-definition>:

    |* The freedom to run the program as you wish, for any purpose (freedom 0).
    |
    |* The freedom to study how the program works, and change it so it does
    | your computing as you wish (freedom 1). Access to the source code is
    | a precondition for this.
    |
    |* The freedom to redistribute copies so you can help others (freedom 2).
    |
    |* The freedom to distribute copies of your modified versions to others
    | (freedom 3). By doing this you can give the whole community a chance
    | to benefit from your changes. Access to the source code is a
    | precondition for this.
    |
    |A program is free software if it gives users adequately all of these
    |freedoms. Otherwise, it is nonfree.

    In the appendix of "1984" George Orwell wrote:

    |To give a single example, the word free still existed in Newspeak, but
    |could only be used in such statements as "The dog is free from lice"
    |or "This field is free from weeds." It could not be used in its old
    |sense of "politically free" or "intellectually free," since political
    |and intellectual freedom no longer existed even as concepts, and were
    |therefore of necessity nameless.

    Some of us obviously already write and think in Newspeak.

    I hardly think that using the word free to mean "you don't have to
    pay for it" is Newspeak.

    Orwell did not think about that meaning when he gave an example of
    Newspeak use of "free", so if the meaning "gratis" for "free" existed
    when he wrote the book in 1949, it was not widely-enough used to make
    it into the book. In any case, the meaning "free from lice" existed
    when Orwell wrote the book and still exists in Newspeak. Newspeak
    does not introduce new meanings, but eliminates the "freedom"
    meaning. And in your case, Newspeak obviously has been successful
    (not the Ingsoc variant ("free from lice"), but the surveillance
    capitalism variant ("you don't pay [money] for it")).

    - anton
    I tend to think that it was the other way around.
    RMS invented a new meaning of the term "free software" and then he and
    his devotees started to insist that it is the only correct meaning.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.arch on Mon Jun 8 22:51:52 2026
    From Newsgroup: comp.arch

    On Mon, 08 Jun 2026 15:19:57 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/7/2026 11:05 PM, Anton Ertl wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 6/4/2026 8:46 PM, Stefan Monnier wrote:
    I started this thread by mentioning Free Software. 🙂

    Note that free does not equal open source. There is a fair
    amount of software that is freely available for which the source
    is not. Many of these are reduced functionality versions of paid
    for software, e.g. Adobe PDF reader, but there are others.

    The Adobe PDF reader is chained software (aka proprietary
    software), not free software.

    I don't want to get into a semantic argument here. I don't know
    what you mean by the term "chained software". I only meant that
    anyone could use it without paying anything to anyone. In the sense
    that John talked about, it is free beer.

    Acroread sends basic telemetry to Adobe every time you use it,
    so in a sense, it's not exactly free.

    xpdf on the other hand....

    I prefer SumatraPDF. GPL3, but not available outside Windows.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon Jun 8 18:03:43 2026
    From Newsgroup: comp.arch

    Anton Ertl [2026-06-08 16:18:37] wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    I hardly think that using the word free to mean "you don't have to pay
    for it" is Newspeak.
    Orwell did not think about that meaning when he gave an example of
    Newspeak use of "free", so if the meaning "gratis" for "free" existed
    when he wrote the book in 1949, it was not widely-enough used to make
    it into the book. In any case, the meaning "free from lice" existed
    when Orwell wrote the book and still exists in Newspeak. Newspeak
    does not introduce new meanings, but eliminates the "freedom"
    meaning. And in your case, Newspeak obviously has been successful
    (not the Ingsoc variant ("free from lice"), but the surveillance
    capitalism variant ("you don't pay [money] for it")).

    Well, Stephen is hardly using a recent meaning of the word "free".
    According to the OED, "free" as in "free of charge" traces back to the
    13th century, so it clearly existed in Orwell's time.

    But yes, I find it demoralizing that people within the computer world
    are still making this mistake, after more than 40 years of FSF.


    === Stefan
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon Jun 8 17:44:59 2026
    From Newsgroup: comp.arch

    Terje Mathisen [2026-06-08 11:45:51] wrote:
    Michael S wrote:
    The obviously "most free" sw must be public domain, right?

    As with most things related to freedom ... it depends.

    Public domain offers "more freedom" when you consider the point of view
    of the developers, who can use that software any way they want with no restrictions at all.

    But not when you consider the point of view of the end-users who may
    receive code compiled/derived from that public domain source code with
    no way to recover that public domain source code, or to change or fix
    it. It may even be illegal to try to recover it (since the DMCA
    disallows several forms of reverse engineering).

    From that end-user point of view, the GPL arguably ensures "more
    freedom" than public domain.


    === Stefan
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue Jun 9 10:06:48 2026
    From Newsgroup: comp.arch

    On 09/06/2026 00:03, Stefan Monnier wrote:
    Anton Ertl [2026-06-08 16:18:37] wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    I hardly think that using the word free to mean "you don't have to pay
    for it" is Newspeak.
    Orwell did not think about that meaning when he gave an example of
    Newspeak use of "free", so if the meaning "gratis" for "free" existed
    when he wrote the book in 1949, it was not widely-enough used to make
    it into the book. In any case, the meaning "free from lice" existed
    when Orwell wrote the book and still exists in Newspeak. Newspeak
    does not introduce new meanings, but eliminates the "freedom"
    meaning. And in your case, Newspeak obviously has been successful
    (not the Ingsoc variant ("free from lice"), but the surveillance
    capitalism variant ("you don't pay [money] for it")).

    Well, Stephen is hardly using a recent meaning of the word "free".
    According to the OED, "free" as in "free of charge" traces back to the
    13th century, so it clearly existed in Orwell's time.

    But yes, I find it demoralizing that people within the computer world
    are still making this mistake, after more than 40 years of FSF.


    It is not a mistake - it is merely a different but perfectly reasonable
    use of the same word. The FSF has done (and continues to do) wonderful
    things that are of huge benefit to the computing world, and I am a big
    fan of what they term "Free Software". But they do not own the word
    "free", nor do they have rights to determine the definition of the
    phrase "free software". People can, and do, use the phrase meaning
    "gratis software". In any discussion on the topic, it is good to be
    entirely clear on the intended meanings - but that applies equally to
    those who write "free software" meaning "libre software" and "free
    software" meaning "gratis software". Neither are "mistaken", and both
    can cause confusion. (In longer phrases, acronyms, or proper names of organisations, there should be no confusion - FOSS or FSF should be
    clear to all.)

    (As for the discussion about what is the most "free", or "libre",
    licensing model - you can argue about it until you are blue in the face,
    but no conclusion can be reached because it depends on the point of
    view. Freedoms are always a balance and a tradeoff to some extent.)

    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Tue Jun 9 17:24:24 2026
    From Newsgroup: comp.arch

    Michael S <already5chosen@yahoo.com> writes:
    On Mon, 08 Jun 2026 16:18:37 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:
    Orwell did not think about that meaning when he gave an example of
    Newspeak use of "free", so if the meaning "gratis" for "free" existed
    when he wrote the book in 1949, it was not widely-enough used to make
    it into the book. In any case, the meaning "free from lice" existed
    when Orwell wrote the book and still exists in Newspeak. Newspeak
    does not introduce new meanings, but eliminates the "freedom"
    meaning. And in your case, Newspeak obviously has been successful
    (not the Ingsoc variant ("free from lice"), but the surveillance
    capitalism variant ("you don't pay [money] for it")).
    - anton

    I tend to think that it was the other way around.
    RMS invented a new meaning of the term "free software"

    The story that I read was that all software originally was free in the
    FSF sense (i.e., provided the four freedoms).[1] Then some people
    removed some or all freedoms from some software, typically with the
    goal of making money from the software. Removing the freedoms and yet
    not asking for money is a later development; this has often been
    called shareware or freeware, but Stephen Fuld is the first one I have
    seen who has called it "free software", and actually misunderstood a
    reference to "Free Software" (capitalized).

    [1] As an example, <https://en.wikipedia.org/wiki/SHARE_(computing)>
    states:

    |Originally, IBM distributed what software it provided in source
    |form[2][3][4] and systems programmers commonly made small local
    |additions or modifications and exchanged them with other users.

    All four freedoms were exercised here, more than two decades before
    the Free Software Foundation.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Tue Jun 9 21:15:41 2026
    From Newsgroup: comp.arch

    Anton Ertl wrote:
    Michael S <already5chosen@yahoo.com> writes:
    On Mon, 08 Jun 2026 16:18:37 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:
    Orwell did not think about that meaning when he gave an example of
    Newspeak use of "free", so if the meaning "gratis" for "free" existed
    when he wrote the book in 1949, it was not widely-enough used to make
    it into the book. In any case, the meaning "free from lice" existed
    when Orwell wrote the book and still exists in Newspeak. Newspeak
    does not introduce new meanings, but eliminates the "freedom"
    meaning. And in your case, Newspeak obviously has been successful
    (not the Ingsoc variant ("free from lice"), but the surveillance
    capitalism variant ("you don't pay [money] for it")).
    - anton

    I tend to think that it was the other way around.
    RMS invented a new meaning of the term "free software"

    The story that I read was that all software originally was free in the
    FSF sense (i.e., provided the four freedoms).[1] Then some people
    removed some or all freedoms from some software, typically with the
    goal of making money from the software. Removing the freedoms and yet
    not asking for money is a later development; this has often been
    called shareware or freeware, but Stephen Fuld is the first one I have
    seen who has called it "free software", and actually misunderstood a reference to "Free Software" (capitalized).

    [1] As an example, <https://en.wikipedia.org/wiki/SHARE_(computing)>
    states:

    |Originally, IBM distributed what software it provided in source
    |form[2][3][4] and systems programmers commonly made small local
    |additions or modifications and exchanged them with other users.

    All four freedoms were exercised here, more than two decades before
    the Free Software Foundation.

    Yes, with one important restriction:

    The software was free, but you could not use it except on IBM hardware,
    which was quite expensive.

    When clones started to appear (Amdahl?) I believe the free sw
    disappeared, now it was explicitly licensed to only run on "real" IBM hardware?

    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Tue Jun 9 16:29:01 2026
    From Newsgroup: comp.arch

    On 6/9/2026 12:15 PM, Terje Mathisen wrote:
    Anton Ertl wrote:
    Michael S <already5chosen@yahoo.com> writes:
    On Mon, 08 Jun 2026 16:18:37 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:
    Orwell did not think about that meaning when he gave an example of
    Newspeak use of "free", so if the meaning "gratis" for "free" existed
    when he wrote the book in 1949, it was not widely-enough used to make
    it into the book.  In any case, the meaning "free from lice" existed
    when Orwell wrote the book and still exists in Newspeak.  Newspeak
    does not introduce new meanings, but eliminates the "freedom"
    meaning.  And in your case, Newspeak obviously has been successful
    (not the Ingsoc variant ("free from lice"), but the surveillance
    capitalism variant ("you don't pay [money] for it")).
    - anton

    I tend to think that it was the other way around.
    RMS invented a new meaning of the term "free software"

    The story that I read was that all software originally was free in the
    FSF sense (i.e., provided the four freedoms).[1] Then some people
    removed some or all freedoms from some software, typically with the
    goal of making money from the software.  Removing the freedoms and yet
    not asking for money is a later development; this has often been
    called shareware or freeware, but Stephen Fuld is the first one I have
    seen who has called it "free software", and actually misunderstood a
    reference to "Free Software" (capitalized).

    [1] As an example, <https://en.wikipedia.org/wiki/SHARE_(computing)>
    states:

    |Originally, IBM distributed what software it provided in source
    |form[2][3][4] and systems programmers commonly made small local
    |additions or modifications and exchanged them with other users.

    All four freedoms were exercised here, more than two decades before
    the Free Software Foundation.

    Yes, with one important restriction:

    The software was free, but you could not use it except on IBM hardware, which was quite expensive.

    When clones started to appear (Amdahl?) I believe the free sw
    disappeared, now it was explicitly licensed to only run on "real" IBM hardware?

    Sort of, but it was more complicated than that. In the 1960s (and
    before), IBM "bundled" (i.e. it was freely included) all software with
    the hardware. In 1969, the US government filed an anti-trust case
    against IBM, claiming, among other things, monopolization of the
    software market. One of the Government's goals was to support an
    independent software market (which couldn't exist if IBM gave everything
    away for free). The suit dragged on for years and was ultimately
    withdrawn, but IBM was scared about what the suit could do. So it
    initiated "unbundling" of software (and other things like education
    classes), now charging separately for each software product the customer wanted. This was ultimately successful for the government, leading to
    the success of companies like Syncsort (1971), and later several
    competitive database systems (e.g. Total, IDMS, etc.). But it also
    allowed Amdahl (in 1971), and later other PCM hardware companies, to
    sell competitive CPUs with the assurance that they could license the OS,
    etc. from IBM.
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Wed Jun 10 06:01:19 2026
    From Newsgroup: comp.arch

    Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:

    The story that I read was that all software originally was free in the
    FSF sense (i.e., provided the four freedoms).[1]

    There is an anecdote in "Abstracting Away the Machine".
    IBM supplied a customer with its Fortran compiler (Fortran I at
    the time). The customer noted that tape use was inefficient,
    leading to longer than necessary compile times, and asked for
    the source to improve it. Somebody at IBM refused, quipping "IBM
    does not supply source code". So the customer went ahead, reverse
    engineered the compiler and added the improvements anyway (which
    were huge). When IBM noticed that, they asked for the improvement,
    and the customer quipped back "$COMPANY does not supply object code",
    and refused.

    I'd have to search for the anecdote in the book to get the details
    exactly right.
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Thu Jun 11 15:14:26 2026
    From Newsgroup: comp.arch

    According to Stefan Monnier <monnier@iro.umontreal.ca>:
    But not when you consider the point of view of the end-users who may
    receive code compiled/derived from that public domain source code with
    no way to recover that public domain source code, or to change or fix
    it. It may even be illegal to try to recover it (since the DMCA
    disallows several forms of reverse engineering).

    If it's really public domain, there is no bar to reverse engineering
    since there is nobody who can complain about it. I agree there are
    other kinds of software where the executable is freely available but
    the authors choose not to provide source and could use the DMCA
    against people who reverse engineer.

    From that end-user point of view, the GPL arguably ensures "more
    freedom" than public domain.

    I definitely agree with "arguably".

    Speaking of the DMCA, the US Copyright Office is starting its tenth proceeding to update the list of exemptions to the DMCA for research, analysis and other non-infringing uses. Worth a look if you're interested in the topic.

    https://www.copyright.gov/1201/2027/
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From George Neuner@gneuner2@comcast.net to comp.arch on Thu Jun 11 13:43:29 2026
    From Newsgroup: comp.arch

    On Mon, 08 Jun 2026 17:44:59 -0400, Stefan Monnier
    <monnier@iro.umontreal.ca> wrote:

    Terje Mathisen [2026-06-08 11:45:51] wrote:
    Michael S wrote:
    The obviously "most free" sw must be public domain, right?

    As with most things related to freedom ... it depends.

    Public domain offers "more freedom" when you consider the point of view
    of the developers, who can use that software any way they want with no
    restrictions at all.

    But not when you consider the point of view of the end-users who may
    receive code compiled/derived from that public domain source code with
    no way to recover that public domain source code, or to change or fix
    it. It may even be illegal to try to recover it (since the DMCA
    disallows several forms of reverse engineering).

    From that end-user point of view, the GPL arguably ensures "more
    freedom" than public domain.


    === Stefan

    Not to mention that there are many countries that do not recognize
    public domain. And even where it technically is recognized, some
    countries have legal procedures that must be followed to relinquish
    your rights and so complicate actually putting something into public
    domain.

    Putting <whatever> under some kind of license - regardless of how
    permissive it is - actually is easier to do in many places, and is
    recognized in more places.

    YMMV.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Fri Jun 12 01:04:53 2026
    From Newsgroup: comp.arch

    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 25/05/2026 16:28, Anton Ertl wrote:
    Despite their eagerness to "optimize" based on the assumption
    that signed integer overflow does not happen, the GCC developers have
    avoided making -ftrapv the default, even on platforms like MIPS and
    Alpha where the implementation of -ftrapv just means to use different
    instructions (e.g., add instead of addu on MIPS, and addv instead of
    add on Alpha).

    An awkward thing about using trap on overflow is determining how
    precisely it is defined. Supposing you have the expression "a + b - a".
    Perhaps "a + b" overflows. I would hope that when using debug-related
    compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
    would check for overflow on "a + b", and report it at runtime.
    (Unfortunately, gcc does not do that unless the partial expression is
    assigned to a variable.) But in "normal" usage, I'd expect the
    expression to be simplified, resulting in just "b" and no overflow.

    OTOH, cases like a+b+c where the result is in range, while an
    intermediate result is out of range are one of the reasons why I
    prefer -fwrapv over -ftrapv. As for your preference of nasal demons,
    given enough information, the compiler might "optimize" "a+b-a" into,
    e.g., 0.
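    The a+b+c case can be illustrated with a minimal sketch. It assumes a
    two's-complement target and a compiler such as GCC that defines
    unsigned-to-signed conversion as reduction modulo 2^N; the helper name
    add3_wrapping is hypothetical and only approximates what -fwrapv
    promises for signed arithmetic.

```c
#include <limits.h>

/* Hypothetical helper (not from the thread): add three longs with
   wraparound semantics. The arithmetic is done on unsigned long, where
   wraparound is well defined by the C standard. */
long add3_wrapping(long a, long b, long c)
{
    unsigned long sum = (unsigned long)a + (unsigned long)b + (unsigned long)c;

    /* Converting an out-of-range value back to long is
       implementation-defined; GCC documents it as modulo 2^N.
       When the true sum fits in long, the conversion is exact. */
    return (long)sum;
}
```

    With a = LONG_MAX, b = 1, c = -1 the intermediate a+b is out of range,
    yet the wrapped computation still yields LONG_MAX, the in-range result.
    A trapping implementation that checks every step would stop at a+b.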

    Anyway, the definition of -ftrapv is not very precise; for gcc-12.2:

    |'-ftrapv'
    | This option generates traps for signed overflow on addition,
    | subtraction, multiplication operations.

    As for what gcc-12.2 does for your example on AMD64:

    long foo(long a, long b)
    {
      return a+b-a;
    }

    is compiled with gcc -O3 -ftrapv to:

    0: 48 89 f0 mov %rsi,%rax
    3: c3 ret

    That is what I expect from '-ftrapv': running code should either deliver
    the result as if using infinite precision arithmetic, or trap on overflow.
    A tighter specification could be that optimized code should not
    generate an overflow trap in cases where computing naively using C
    semantics does not lead to overflow. Since the result above
    agrees with the result obtained using infinite precision arithmetic,
    the code is fine and there is no need for runtime checks.
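    The difference between the two readings can be sketched in C. This is
    an illustration under stated assumptions, not what GCC emits: the
    helper names naive_overflows and simplified are hypothetical, and
    __builtin_add_overflow / __builtin_sub_overflow are the GCC and Clang
    checked-arithmetic builtins.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: evaluate a + b - a one operation at a time, as a
   step-by-step reading of the C expression would, and report whether
   any step overflowed. On success the value is stored in *result. */
bool naive_overflows(long a, long b, long *result)
{
    long t;

    if (__builtin_add_overflow(a, b, &t))
        return true;    /* a + b does not fit in long */

    /* Once a + b fits, (a + b) - a equals b and always fits; the check
       is kept only to mirror the per-operation structure. */
    if (__builtin_sub_overflow(t, a, result))
        return true;

    return false;
}

/* The algebraically simplified form, matching the two-instruction
   output shown above: the result is just b. */
long simplified(long a, long b)
{
    (void)a;
    return b;
}
```

    For a = LONG_MAX and b = 1, naive_overflows reports an overflow while
    simplified returns 1, which is also the infinite-precision result.
    Under the "as if infinite precision, or trap" reading both behaviours
    are acceptable; a rule that fixes the trap to a particular step would
    only allow the first.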

    Of course, languages like C++ which turn traps into exceptions
    and allow them to be used as part of a computation may have a problem
    here. More precisely, if they specify that the overflow exception
    must happen at a given computational step, then the optimizer may be
    forced to generate code which is there only to generate the trap
    and serves no other purpose. But the original intent of the
    overflow trap is to signal that a real machine using fixed-size
    numbers cannot deliver the same result as an ideal machine
    using infinite precision. If a language respects that intent,
    then the compiler can do a lot of optimizations.
    --
    Waldek Hebisch
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Fri Jun 12 01:57:06 2026
    From Newsgroup: comp.arch

    Thomas Koenig <tkoenig@netcologne.de> wrote:
    Waldek Hebisch <antispam@fricas.org> schrieb:
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:

    MIPS' add traps on overflow. gcc could have emitted almost the same
    code for gcc -O3 -ftrapv as for gcc -O3, except that the last
    instruction would be an add, not an addu. But apparently nobody gives
    a damn about the efficiency of -ftrapv, possibly rightly so.

    My guess is that GCC developers care more about -ftrapv than about
    MIPS.

    It is a common misconception to treat GCC developers as a
    monolithic group. There are hobbyists (such as myself) but
    I would guess only a small minority of work is done by them,
    with the notable exception of some front ends such as Fortran or
    (the most recent example) Algol 68. There are employees of
    different companies: Linux distributors like Red Hat or SUSE,
    large software companies like Google, hardware vendors like
    IBM, Intel or Qualcomm, ...

    Well, normal employees care about things that their employer
    tells them to do. Whatever the reason, GCC developers
    (that is, people contributing to GCC) each have their own agenda
    and care more about some things and less about others.

    For MIPS, there are not so many active people and commits.
    mips64-linux-gnu is a secondary platform, so if it fails
    bootstrap, a release would be held up, but a wrong-code
    regression will not.

    Counting changes since 2025-01-01 in the gcc/config directories
    can give a good idea of the relative activity for different
    subdirectories; I cut this off below 7, where the PDP-11 is (note
    that architecture names are often historical, so i386 includes
    x86_64, s390 includes Z, rs6000 includes POWER and so on).

    539 ./riscv
    435 ./i386
    432 ./aarch64
    177 ./loongarch
    100 ./arm
    85 ./s390
    75 ./avr
    72 ./xtensa
    60 ./rs6000
    51 ./gcn
    39 ./nvptx
    25 ./sparc
    25 ./mips
    20 ./arc
    19 ./pa
    19 ./bpf
    19 ./alpha
    18 ./pru
    15 ./sh
    13 ./rx
    13 ./cris
    12 ./or1k
    12 ./microblaze
    12 ./m68k
    11 ./lm32
    11 ./ia64
    11 ./h8300
    9 ./vax
    9 ./nds32
    8 ./mcore
    8 ./epiphany
    8 ./c6x
    7 ./visium
    7 ./rl78
    7 ./pdp11
    7 ./frv
    7 ./csky

    AFAICS several architectures officially supported by GCC
    struggle to work at all. I suspect that maintainers of the MIPS
    backend are happy that -ftrapv works and do not have the resources
    to make it efficient.

    First, they would need to know about this, which requires a PR,
    but resources may well be lacking.

    There are currently 28 open "missed-optimization" bugs with mips
    in their target field. Looking at a few architectures above,
    RISC-V has 118, x86 has 943, aarch64 has 305, power has 133.
    (Some bugs affect more than one architecture, of course).

    But it is worth submitting a PR nonetheless, if anybody cares enough :-)

    Frankly, I do not care enough. I mean, I like the fact that GCC
    supports several architectures. But I have use for x86_64, ARM (both
    32-bit and 64-bit), RISC-V and a few embedded processors, that is,
    I have the processors and can run GCC output on them. I even have some
    use for s390/z, that is, I have an emulator (Hercules) and have some
    interest in software running inside the emulator. But I have essentially
    no use for MIPS.

    And in a slightly different spirit, IIRC there were cases when bug
    reports caused a reaction of the sort "Yes, it is buggy. It would be
    too much effort to fix it, so we will just remove support".
    If support is removed, then the work needed to revive an architecture
    is likely to be significantly larger than in the case of a bitrotten,
    but still in-tree architecture.
    --
    Waldek Hebisch
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Thu Jun 11 15:21:14 2026
    From Newsgroup: comp.arch

    David Brown [2026-06-09 10:06:48] wrote:
    On 09/06/2026 00:03, Stefan Monnier wrote:
    Well, Stephen is hardly using a recent meaning of the word "free".
    According to the OED, "free" as in "free of charge" traces back to the
    13th century, so it clearly existed in Orwell's time.
    But yes, I find it demoralizing that people within the computer world
    are still making this mistake, after more than 40 years of FSF.
    It is not a mistake - it is merely a different but perfectly reasonable use of the same word.

    In an arbitrary context, I could agree, but here we're talking about
    a subthread that started with:

    On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:
    > MitchAlsup [2026-05-26 20:54:30] wrote:
    >> Encrypt the debug information (and put it in a
    >> {1234-5678-9101-1121-...} folder) so that only the owner (not
    >> licensee) of the code can debug it.
    > I resent that. All code should be Free Software.

    I think there is no ambiguity here.

    Treating this "Free Software" to refer to price rather than to freedom
    is an error that can be explained only by a lack of familiarity with the
    idea of software freedom.


    === Stefan
  • From David Brown@david.brown@hesbynett.no to comp.arch on Fri Jun 12 13:02:41 2026
    From Newsgroup: comp.arch

    On 11/06/2026 21:21, Stefan Monnier wrote:
    David Brown [2026-06-09 10:06:48] wrote:
    On 09/06/2026 00:03, Stefan Monnier wrote:
    Well, Stephen is hardly using a recent meaning of the word "free".
    According to the OED, "free" as in "free of charge" traces back to the
    13th century, so it clearly existed in Orwell's time.
    But yes, I find it demoralizing that people within the computer world
    are still making this mistake, after more than 40 years of FSF.
    It is not a mistake - it is merely a different but perfectly reasonable use
    of the same word.

    In an arbitrary context, I could agree, but here we're talking about
    a subthread that started with:

    On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:
    > MitchAlsup [2026-05-26 20:54:30] wrote:
    >> Encrypt the debug information (and put it in a
    >> {1234-5678-9101-1121-...} folder) so that only the owner (not
    >> licensee) of the code can debug it.
    > I resent that. All code should be Free Software.

    I think there is no ambiguity here.

    Treating this "Free Software" to refer to price rather than to freedom
    is an error that can be explained only by a lack of familiarity with the
    idea of software freedom.


    Fair enough - I agree that in that context, the term "Free Software" is unambiguous. The capitalisation is important, and that was lost by the
    post to which I replied. (There are a /lot/ of posts in this thread,
    and when switching between two computers I have undoubtedly skipped many
    of them.)

  • From John Levine@johnl@taugh.com to comp.arch on Sat Jun 13 03:01:00 2026
    From Newsgroup: comp.arch

    According to George Neuner <gneuner2@comcast.net>:
    As with most things related to freedom ... it depends.

    Not to mention that there are many countries that do not recognize
    public domain. And even where it technically is recognized, some
    countries have legal procedures that must be followed to relinquish
    your rights and so complicate actually putting something into public
    domain.

    I am not aware of any countries that do not have the public domain for
    material whose copyright has expired, or for whatever reason was not
    eligible for copyright in the first place. But you're right, in some
    places it is impossible or at least impractical to relinquish your
    rights and put something in the P.D. before it would get there anyway.

    Putting <whatever> under some kind of license - regardless of how
    permissive it is - actually is easier to do in many places, and is
    recognized in more places.

    Agreed. There are lots of licenses other than the GPL that are
    used successfully for open source software.

    R's,
    John
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
  • From George Neuner@gneuner2@comcast.net to comp.arch on Sat Jun 13 05:49:19 2026
    From Newsgroup: comp.arch

    On Sat, 13 Jun 2026 03:01:00 -0000 (UTC), John Levine
    <johnl@taugh.com> wrote:

    According to George Neuner <gneuner2@comcast.net>:
    As with most things related to freedom ... it depends.

    Not to mention that there are many countries that do not recognize
    public domain. And even where it technically is recognized, some
    countries have legal procedures that must be followed to relinquish
    your rights and so complicate actually putting something into public
    domain.

    I am not aware of any countries that do not have the public domain for
    material whose copyright has expired, or for whatever reason was not
    eligible for copyright in the first place. But you're right, in some
    places it is impossible or at least impractical to relinquish your
    rights and put something in the P.D. before it would get there anyway.

    The Berne convention defined an implicit copyright that exists by
    virtue of authorship and persists until the author's death. Though
    the US does not recognize or enforce these implicit copyrights, most
    signatories to either Berne (1886) or UCC (1952) conventions do
    recognize and enforce Berne copyrights.

    Explicit copyrights - filed with Copyright offices - can be
    voluntarily surrendered at any time. It is giving up the implicit
    copyright that is the problem with public domain.


    Putting <whatever> under some kind of license - regardless of how
    permissive it is - actually is easier to do in many places, and is
    recognized in more places.

    Agreed. There are lots of licenses other than the GPL that are
    used successfully for open source software.

    R's,
    John
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Sat Jun 13 10:52:08 2026
    From Newsgroup: comp.arch

    George Neuner <gneuner2@comcast.net> writes:
    The Berne convention defined an implicit copyright that exists by
    virtue of authorship and persists until the author's death. Though
    the US does not recognize or enforce these implicit copyrights, most
    signatories to either Berne (1886) or UCC (1952) conventions do
    recognize and enforce Berne copyrights.

    According to <https://en.wikipedia.org/wiki/Berne_convention>:

    |The United States acceded to the convention on 16 November 1988, and
    |the convention entered into force for the United States on 1 March
    |1989.

    How can the convention have entered into force in the US without the
    US recognizing or enforcing implicit copyrights?

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
  • From quadibloc@quadibloc@invalid.com (John Savard) to comp.arch on Sat Jun 13 14:20:57 2026
    From Newsgroup: comp.arch

    On Sat, 13 Jun 2026 10:52:08 GMT, anton@mips.complang.tuwien.ac.at
    (Anton Ertl) wrote:
    George Neuner <gneuner2@comcast.net> writes:

    The Berne convention defined an implicit copyright that exists by
    virtue of authorship and persists until the author's death. Though
    the US does not recognize or enforce these implicit copyrights, most
    signatories to either Berne (1886) or UCC (1952) conventions do
    recognize and enforce Berne copyrights.

    According to <https://en.wikipedia.org/wiki/Berne_convention>:

    |The United States acceded to the convention on 16 November 1988, and
    |the convention entered into force for the United States on 1 March
    |1989.

    How can the convention have entered into force in the US without the
    US recognizing or enforcing implicit copyrights?

    Implicit copyrights do exist now in the U.S. because of its
    ratification of the Berne convention.

    But there are some limitations.

    Nothing that entered the public domain prior to this ratification
    became copyrighted again; there was no retroactive effect.

    Also, U.S. parties are still incentivized to register their
    copyrights, because this is necessary to receive statutory damages and
    attorney's fees from a copyright lawsuit.

    John Savard
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Sat Jun 13 15:07:00 2026
    From Newsgroup: comp.arch

    John Levine <johnl@taugh.com> wrote:
    According to George Neuner <gneuner2@comcast.net>:
    As with most things related to freedom ... it depends.

    Not to mention that there are many countries that do not recognize
    public domain. And even where it technically is recognized, some
    countries have legal procedures that must be followed to relinquish
    your rights and so complicate actually putting something into public
    domain.

    I am not aware of any countries that do not have the public domain for
    material whose copyright has expired,

    My country (Poland) has a rule that once copyright has expired
    the distributor of a work should pay royalties to the state. I
    am not sure how it works in "interesting" cases, but clearly
    this is quite different from the US/UK meaning of public domain.

    Also, the law of my country declares some author rights as
    untransferable. Basically, the author can sue if he/she/it
    thinks that the artistic integrity of the work is violated.
    Theoretically, one could imagine some old, no longer commercially
    viable program being released as public domain, new people fixing
    old bugs, and the original developers suing because bug fixes deprive
    users of the original experience and hence violate artistic integrity.
    Probably not going to work in court, but we had a case when
    sensible improvements to buildings were blocked by architects.

    BTW. Our copyright law has a notion of "area of exploration" and
    states that copyright transfer is effective only for explicitly
    transferred rights. All rights in "areas of exploration" which
    are not explicitly transferred stay with authors. I guess that
    training an LLM would count as a new "area of exploration"...
    --
    Waldek Hebisch
  • From John Levine@johnl@taugh.com to comp.arch on Sat Jun 13 17:25:27 2026
    From Newsgroup: comp.arch

    According to Waldek Hebisch <antispam@fricas.org>:
    My country (Poland) has a rule that once copyright has expired
    the distributor of a work should pay royalties to the state. I
    am not sure how it works in "interesting" cases, but clearly
    this is quite different from the US/UK meaning of public domain.

    Do distributors pay state royalties on works of Shakespeare?
    The Bible? Wow.

    Also, the law of my country declares some author rights as
    untransferable. Basically, the author can sue if he/she/it
    thinks that the artistic integrity of the work is violated.

    Those are moral rights, introduced into Berne in the 1920s by
    everyone's favorite copyright advocate, Benito Mussolini.

    The US only recognizes moral rights for visual works like
    paintings and sculpture. There was an interesting case in
    2013 where the owner of a building containing artists' studios
    who had allowed elaborate graffiti on the outside of the building
    decided to tear it down, first whitewashing over the art. The
    artists sued and won $6.7 million. https://en.wikipedia.org/wiki/5_Pointz

    So don't let people doodle on your circuit boards, I guess.
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
  • From quadibloc@quadibloc@invalid.com (John Savard) to comp.arch on Sun Jun 14 19:14:39 2026
    From Newsgroup: comp.arch

    On Sat, 13 Jun 2026 15:07:00 -0000 (UTC), antispam@fricas.org (Waldek
    Hebisch) wrote:

    My country (Poland) has a rule that once copyright has expired
    the distributor of a work should pay royalties to the state. I
    am not sure how it works in "interesting" cases, but clearly
    this is quite different from the US/UK meaning of public domain.

    I once read a science-fiction story in which, after Earth joined an
    interplanetary confederation, works that were in the public domain
    became works for which the United Nations could charge royalties to
    people from other planets.

    The story was a detective story. It wasn't Shakespeare, but instead
    Bollywood movies that a criminal was modifying and re-selling to a
    planet to the culture of which those movies were well suited.

    Also, the law of my country declares some author rights as
    untransferable. Basically, the author can sue if he/she/it
    thinks that the artistic integrity of the work is violated.

    Most European countries recognize the moral rights of authors.

    John Savard
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Mon Jun 15 16:32:47 2026
    From Newsgroup: comp.arch

    John Levine <johnl@taugh.com> wrote:
    According to Waldek Hebisch <antispam@fricas.org>:
    My country (Poland) has a rule that once copyright has expired
    the distributor of a work should pay royalties to the state. I
    am not sure how it works in "interesting" cases, but clearly
    this is quite different from the US/UK meaning of public domain.

    Do distributors pay state royalties on works of Shakespeare?
    The Bible? Wow.

    I am not sure how they handle what in the English zone is called
    "derived works". If they reproduce a first edition of a Shakespeare
    work or, say, the Gutenberg Bible, distributors are supposed to pay
    (and I think that they do pay). Copyright to the currently sold
    Bible is attributed to translators.
    --
    Waldek Hebisch