Forum: War Ensemble BBS

Re: condition bits, Concertina IV Has Arrived

From David Brown@david.brown@hesbynett.no to comp.arch on Mon May 25 17:18:18 2026

From Newsgroup: comp.arch

On 25/05/2026 16:28, Anton Ertl wrote:

David Brown <david.brown@hesbynett.no> writes:

On 24/05/2026 23:39, quadi wrote:

On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:

quadi <quadibloc@ca.invalid> posted:

It makes sense to trap on a floating-point overflow, but trapping on an
integer overflow is usually a terrible idea.

Most programming environments I have had contact with don't trap on floating-point overflow.

So, detecting something went wrong and you should inform the programmer
is a bad idea ???

The question is if an integer overflow means that something went
wrong.

At the source code level, that is often the case - but not always. I
think it is quite clear that if you do something the language does not
allow, the code is wrong, but it might give the correct results for some
tools nonetheless. And overflow will often mean something went wrong
even when the language (or compiler options) specifically allow it. At
the object code level, things may be different again. (For an obvious example, if you are using a double-width integer type then the source
code may have no overflow but the implementation might use two "add-with-carry" instructions where overflow is a natural part of the implementation.)
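To illustrate the double-width case, here is a minimal sketch in C. The `u128` type and the `u128_add` name are made up for this example; the point is only that the low-half addition is expected to wrap, and the wrap becomes the carry into the high half.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit unsigned value built from two 64-bit halves. */
typedef struct { uint64_t lo, hi; } u128;

/* Add two 128-bit values. The low-half addition is allowed to wrap;
   the wrap is detected and fed into the high half as the carry. */
static u128 u128_add(u128 a, u128 b)
{
    u128 r;
    r.lo = a.lo + b.lo;            /* may wrap modulo 2^64: defined for unsigned */
    uint64_t carry = r.lo < a.lo;  /* 1 exactly when the low half wrapped */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* Usage: (2^64 - 1) + 1 == 2^64, i.e. lo = 0 and hi = 1. */
void u128_add_example(void)
{
    u128 a = { UINT64_MAX, 0 }, b = { 1, 0 };
    u128 r = u128_add(a, b);
    assert(r.lo == 0 && r.hi == 1);
}
```

A trapping add used for the low half would fire here even though the 128-bit sum is perfectly in range.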

Despite their eagerness to "optimize" based on the assumption
that signed integer overflow does not happen, the GCC developers have
avoided making -ftrapv the default, even on platforms like MIPS and
Alpha where the implementation of -ftrapv just means to use different instructions (e.g., add instead of addu on MIPS, and addv instead of
add on Alpha).

An awkward thing about using trap on overflow is determining how
precisely it is defined. Supposing you have the expression "a + b - a".
Perhaps "a + b" overflows. I would hope that when using debug-related
compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
would check for overflow on "a + b", and report it at runtime.
(Unfortunately, gcc does not do that unless the partial expression is
assigned to a variable.) But in "normal" usage, I'd expect the
expression to be simplified, resulting in just "b" and no overflow.

If "trap on overflow" has precise semantics in the code, then this
disables a range of useful optimisations and re-arrangements. If it is
just "use trapping arithmetic instructions", then it will miss many
possible cases of actual overflow in the code, which we might want to
catch. And "trap on overflow" might either trigger when there is no
overflow in the original code, or hinder optimisations. (Consider the expression "x / 2 + y / 2" - the compiler could implement that as a
combined "(x + y) / 2", but that might introduce overflow.)

It is not easy to see how a tool can avoid false positives and false
negatives and also conveniently optimise and re-arrange code.

The hardware, of course, cannot always enable trapping on overflow if it
is going to efficiently support a range of programming languages. But
as an optional feature it can be helpful for catching a few bugs in
code, so it can be a good idea (both for signed and unsigned overflow).

This supposedly helpful feature has been neglected by C compiler
developers, and you see in the progression from MIPS (1986) to Alpha
(1992) and then RISC-V (2011) that the hardware architects have
accepted that:

MIPS: add traps on signed overflow, you need to write addu if you
don't want that.

Alpha: add ignores signed overflow, you need to write addv if you want
the trapping.

RISC-V: add ignores signed overflow, there is no add that traps on
signed overflow (and detecting signed overflow is pretty
involved if both operands are unknown to the compiler).
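As a rough sketch of why the check is "pretty involved" on an ISA with neither flags nor a trapping add, here is one common way to detect signed overflow in plain C using only wrapping arithmetic and bit operations. The function name `add_overflows` is invented for this example, and the cast back to `int64_t` relies on the modular conversion that GCC and Clang document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a + b would overflow int64_t. The wrapped sum is
   stored in *sum either way. */
static bool add_overflows(int64_t a, int64_t b, int64_t *sum)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t us = ua + ub;   /* wrapping add, like a plain non-trapping add */
    *sum = (int64_t)us;      /* implementation-defined; modular on GCC/Clang */
    /* Overflow iff both operands share a sign and the sum's sign differs. */
    return (((ua ^ us) & (ub ^ us)) >> 63) != 0;
}

/* Usage: INT64_MAX + 1 overflows and wraps to INT64_MIN. */
void add_overflows_example(void)
{
    int64_t s;
    assert(add_overflows(INT64_MAX, 1, &s) && s == INT64_MIN);
    assert(!add_overflows(40, 2, &s) && s == 42);
}
```

Each checked add therefore costs several extra instructions plus a branch, compared with a single instruction where the hardware offers a trapping add or an overflow flag.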

- anton

Compilers have not always been good at taking advantage of all the
features provided by hardware - nor have languages been good at exposing
the possibilities in the language so that programmers can take advantage
of them.

--- Synchronet 3.22a-Linux NewsLink 1.2

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 16:45:07 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> posted:

On Wed, 20 May 2026 01:35:01 +0000, MitchAlsup wrote:

You will find you have no <marketable> choice; you need to support::

Integer{S8, S16, S32, S64, U8, U16, U32, U64}
Float {FP8, FP16, FP32, FP64 and some way to get FP128}

After realizing that I did need a second instruction for unsigned
_division_ I then learned, to my shock, that division was not one, but
two, instructions, at least in my architecture, for integers.

And there didn't seem to be enough opcode space left for Divide Extensibly Unsigned.

My 66000 has an instruction bit that denotes the signedness of integer calculations {Signed, unSigned}. This bit is available as another OpCode
bit for non-integer calculation instructions.

I was able to re-adjust the 32-bit operate instructions so that the two places where only 96 opcodes were provided for the basic operate instructions could now provide 128 opcodes.

The 16-bit and 24-bit short instructions could not be so modified. But
there were a few unused opcodes; so Divide Extensibly Unsigned could still fit in, just out of place.

But that meant that this one operation would be missing from the minimum-length immediate instructions, and would still be treated as out of the basic instruction set, getting immediate instructions that were 16 bits longer, for them.

The Pigeonhole Principle has finally bit me!

John Savard


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 16:49:59 2026

From Newsgroup: comp.arch

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

David Brown <david.brown@hesbynett.no> writes:

On 24/05/2026 23:39, quadi wrote:

On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:

-----------------

This supposedly helpful feature has been neglected by C compiler
developers, and you see in the progression from MIPS (1986) to Alpha
(1992) and then RISC-V (2011) that the hardware architects have
accepted that:

MIPS: add traps on signed overflow, you need to write addu if you
don't want that.

Alpha: add ignores signed overflow, you need to write addv if you want
the trapping.

RISC-V: add ignores signed overflow, there is no add that traps on
signed overflow (and detecting signed overflow is pretty
involved if both operands are unknown to the compiler).

The worst of all possible semantic encodings

- anton


From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon May 25 16:43:07 2026

From Newsgroup: comp.arch

David Brown <david.brown@hesbynett.no> writes:

On 25/05/2026 16:28, Anton Ertl wrote:

Despite their eagerness to "optimize" based on the assumption
that signed integer overflow does not happen, the GCC developers have
avoided making -ftrapv the default, even on platforms like MIPS and
Alpha where the implementation of -ftrapv just means to use different
instructions (e.g., add instead of addu on MIPS, and addv instead of
add on Alpha).

An awkward thing about using trap on overflow is determining how
precisely it is defined. Supposing you have the expression "a + b - a".
Perhaps "a + b" overflows. I would hope that when using debug-related
compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
would check for overflow on "a + b", and report it at runtime.
(Unfortunately, gcc does not do that unless the partial expression is
assigned to a variable.) But in "normal" usage, I'd expect the
expression to be simplified, resulting in just "b" and no overflow.

OTOH, cases like a+b+c where the result is in range, while an
intermediate result is out of range are one of the reasons why I
prefer -fwrapv over -ftrapv. As for your preference of nasal demons,
given enough information, the compiler might "optimize" "a+b-a" into,
e.g., 0.
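A small sketch of the a+b+c case, assuming GCC or Clang. Wrapping is modelled through unsigned arithmetic so the example does not depend on -fwrapv being passed; the function names are invented for illustration, and `__builtin_add_overflow` is the compilers' overflow-checking builtin.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* a + b + c with wrapping semantics, computed in unsigned arithmetic. */
static long sum3_wrapping(long a, long b, long c)
{
    unsigned long s = (unsigned long)a + (unsigned long)b + (unsigned long)c;
    return (long)s;  /* implementation-defined; modular on GCC and Clang */
}

/* True when the intermediate a + b alone is out of range for long. */
static bool first_add_overflows(long a, long b)
{
    long tmp;
    return __builtin_add_overflow(a, b, &tmp);
}

/* Usage: LONG_MAX + 1 is out of range, but LONG_MAX + 1 - 1 is not. */
void sum3_example(void)
{
    assert(first_add_overflows(LONG_MAX, 1));
    assert(sum3_wrapping(LONG_MAX, 1, -1) == LONG_MAX);
}
```

With wrapping the final result is the mathematically correct one, while a trap on the first addition would abort a computation whose result is in range.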

Anyway, the definition of -ftrapv is not very precise; for gcc-12.2:

|'-ftrapv'
| This option generates traps for signed overflow on addition,
| subtraction, multiplication operations.

As for what gcc-12.2 does for your example on AMD64:

long foo(long a, long b)
{
    return a+b-a;
}

is compiled with gcc -O3 -ftrapv to:

0: 48 89 f0 mov %rsi,%rax
3: c3 ret

If "trap on overflow" has precise semantics in the code, then this
disables a range of useful optimisations and re-arrangements. If it is
just "use trapping arithmetic instructions", then it will miss many
possible cases of actual overflow in the code, which we might want to
catch.

Which would you prefer by default?

The gcc developers apparently took the latter approach, even when you
ask for -ftrapv explicitly. So what, IYO, speaks against doing that
by default on machines like MIPS and Alpha.

And "trap on overflow" might either trigger when there is no
overflow in the original code, or hinder optimisations. (Consider the
expression "x / 2 + y / 2" - the compiler could implement that as a
combined "(x + y) / 2", but that might introduce overflow.)

x/2+y/2 produces a different result from (x+y)/2 when both x and y are
odd integers.
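Both points can be checked with a short C sketch (function names are made up; `__builtin_add_overflow` is a GCC/Clang builtin). With x = 3 and y = 5 the two forms disagree, and with two large operands the combined form overflows where the split form does not.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* The form as written in the source. */
static long half_sum_split(long x, long y)
{
    return x / 2 + y / 2;
}

/* The combined form. Returns false, leaving *out untouched, if x + y
   itself overflows. */
static bool half_sum_combined(long x, long y, long *out)
{
    long s;
    if (__builtin_add_overflow(x, y, &s))
        return false;
    *out = s / 2;
    return true;
}

void half_sum_example(void)
{
    long r;
    /* Both odd: 3/2 + 5/2 == 1 + 2 == 3, but (3 + 5)/2 == 4. */
    assert(half_sum_split(3, 5) == 3);
    assert(half_sum_combined(3, 5, &r) && r == 4);
    /* The split form stays in range where x + y does not. */
    assert(half_sum_split(LONG_MAX, LONG_MAX) == LONG_MAX - 1);
    assert(!half_sum_combined(LONG_MAX, LONG_MAX, &r));
}
```

So the rewrite is not a valid transformation in general, independent of whether overflow traps.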

gcc-12.2 compiles

long bar(long x, long y)
{
    return x/2+y/2;
}

on AMD64 to:

gcc -O3 -ftrapv         gcc -O3
mov %rdi,%rax           mov %rdi,%rax
sub $0x8,%rsp           mov %rsi,%rdx
shr $0x3f,%rax          shr $0x3f,%rax
add %rax,%rdi           shr $0x3f,%rdx
mov %rsi,%rax           add %rdi,%rax
shr $0x3f,%rax          add %rsi,%rdx
sar %rdi                sar %rax
add %rax,%rsi           sar %rdx
sar %rsi                add %rdx,%rax
call __addvdi3@PLT      ret
add $0x8,%rsp
ret

so the -ftrapv introduces an additional mov and a call; I would have
expected that the + would be compiled to an ADD instruction followed
by a JO instruction.
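For comparison, a checked add can be written explicitly with the GCC/Clang overflow builtin. This is only a sketch: the names `add_or_trap` and `bar_checked` are invented, and while compilers commonly lower this pattern on x86-64 to an add followed by a conditional jump on the overflow flag, the exact output depends on compiler version and options and has not been checked against gcc-12.2 here.

```c
#include <assert.h>

/* Checked signed add: trap on overflow, otherwise return the sum. */
static long add_or_trap(long a, long b)
{
    long sum;
    if (__builtin_add_overflow(a, b, &sum))
        __builtin_trap();  /* abnormal termination; GCC/Clang builtin */
    return sum;
}

/* Same computation as bar() above, with the addition checked. */
long bar_checked(long x, long y)
{
    return add_or_trap(x / 2, y / 2);
}

void bar_checked_example(void)
{
    assert(bar_checked(6, 4) == 5);
    assert(bar_checked(-6, 4) == -1);
}
```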

Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
produces ILP32 code) produces a call to __addvsi3 instead of the
expected add instruction:

gcc -O3 -ftrapv         gcc -O3
lui gp,0x0              srl v0,a0,0x1f
addiu gp,gp,0           srl v1,a1,0x1f
addu gp,gp,t9           addu v0,v0,a0
srl v1,a0,0x1f          addu a1,v1,a1
lw t9,__addvsi3(gp)     sra v0,v0,0x1
srl v0,a1,0x1f          sra a1,a1,0x1
addiu sp,sp,-32         jr ra
addu a0,v1,a0           addu v0,v0,a1
addu a1,v0,a1
sra a0,a0,0x1
sw ra,28(sp)
sw gp,16(sp)
jalr t9
sra a1,a1,0x1
lw ra,28(sp)
jr ra
addiu sp,sp,32

The call costs a lot of overhead.

It is not easy to see how a tool can avoid false positives and false
negatives and also conveniently optimise and re-arrange code.

It can't. But it does not try to avoid false negatives even when
explicitly asked for trapping on overflow.

If some overflow trapping, when it can be done without additional
instructions, were preferable to no overflow trapping, gcc would compile
signed adds that survive after optimization into add on MIPS rather
than addu, by default. Given that it does not, the GCC developers
probably found out that it is not preferable. I guess they would get
too many customer complaints, including for "relevant" code, i.e.,
code where the usual "it's UB, so your code is broken" excuse does not
work.

The fact that they don't even try to make -ftrapv produce efficient
code indicates that there is no "relevant" interest in efficient
-ftrapv. It would be interesting to know who came up with the idea of
adding -ftrapv, and why they are still keeping it.

Compilers have not always been good at taking advantage of all the
features provided by hardware

GCC is pretty good at implementing -fwrapv. For the two examples
above, "gcc -O3 -fwrapv" produces the same code on AMD64 and MIPS as
"gcc -O3".

nor have languages been good at exposing
the possibilities in the language so that programmers can take advantage
of them.

Yes. But I leave that for another day.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 19:20:01 2026

From Newsgroup: comp.arch

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

David Brown <david.brown@hesbynett.no> writes:

On 25/05/2026 16:28, Anton Ertl wrote:

Despite their eagerness to "optimize" based on the assumption
that signed integer overflow does not happen, the GCC developers have
avoided making -ftrapv the default, even on platforms like MIPS and
Alpha where the implementation of -ftrapv just means to use different
instructions (e.g., add instead of addu on MIPS, and addv instead of
add on Alpha).

An awkward thing about using trap on overflow is determining how
precisely it is defined. Supposing you have the expression "a + b - a".
Perhaps "a + b" overflows. I would hope that when using debug-related
compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
would check for overflow on "a + b", and report it at runtime.
(Unfortunately, gcc does not do that unless the partial expression is
assigned to a variable.) But in "normal" usage, I'd expect the
expression to be simplified, resulting in just "b" and no overflow.

OTOH, cases like a+b+c where the result is in range, while an
intermediate result is out of range are one of the reasons why I
prefer -fwrapv over -ftrapv. As for your preference of nasal demons,
given enough information, the compiler might "optimize" "a+b-a" into,
e.g., 0.

a/0/b/

Anyway, the definition of -ftrapv is not very precise; for gcc-12.2:

|'-ftrapv'
| This option generates traps for signed overflow on addition,
| subtraction, multiplication operations.

As for what gcc-12.2 does for your example on AMD64:

long foo(long a, long b)
{
    return a+b-a;
}

is compiled with gcc -O3 -ftrapv to:

0: 48 89 f0 mov %rsi,%rax
3: c3 ret

If "trap on overflow" has precise semantics in the code, then this
disables a range of useful optimisations and re-arrangements. If it is
just "use trapping arithmetic instructions", then it will miss many
possible cases of actual overflow in the code, which we might want to
catch.

Which would you prefer by default?

What you do want is compiled code that can trap on overflow and avoid
trapping on overflow without code substitution or being re-compiled.
This way production code can avoid trapping but if the debugger is
turned on, you can trap.

The gcc developers apparently took the latter approach, even when you
ask for -ftrapv explicitly. So what, IYO, speaks against doing that
by default on machines like MIPS and Alpha.

Both architectures got this one wrong--IMO--and so does RISC-V.

And "trap on overflow" might either trigger when there is no
overflow in the original code, or hinder optimisations. (Consider the >expression "x / 2 + y / 2" - the compiler could implement that as a >combined "(x + y) / 2", but that might introduce overflow.)

x/2+y/2 produces a different result from (x+y)/2 when both x and y are
odd integers.

gcc-12.2 compiles

long bar(long x, long y)
{
    return x/2+y/2;
}

on AMD64 to:

gcc -O3 -ftrapv         gcc -O3
mov %rdi,%rax           mov %rdi,%rax
sub $0x8,%rsp           mov %rsi,%rdx
shr $0x3f,%rax          shr $0x3f,%rax
add %rax,%rdi           shr $0x3f,%rdx
mov %rsi,%rax           add %rdi,%rax
shr $0x3f,%rax          add %rsi,%rdx
sar %rdi                sar %rax
add %rax,%rsi           sar %rdx
sar %rsi                add %rdx,%rax
call __addvdi3@PLT      ret
add $0x8,%rsp
ret

so the -ftrapv introduces an additional mov and a call; I would have
expected that the + would be compiled to an ADD instruction followed
by a JO instruction.

Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
produces ILP32 code) produces a call to __addvsi3 instead of the
expected add instruction:

gcc -O3 -ftrapv         gcc -O3
lui gp,0x0              srl v0,a0,0x1f
addiu gp,gp,0           srl v1,a1,0x1f
addu gp,gp,t9           addu v0,v0,a0
srl v1,a0,0x1f          addu a1,v1,a1
lw t9,__addvsi3(gp)     sra v0,v0,0x1
srl v0,a1,0x1f          sra a1,a1,0x1
addiu sp,sp,-32         jr ra
addu a0,v1,a0           addu v0,v0,a1
addu a1,v0,a1
sra a0,a0,0x1
sw ra,28(sp)
sw gp,16(sp)
jalr t9
sra a1,a1,0x1
lw ra,28(sp)
jr ra
addiu sp,sp,32

The call costs a lot of overhead.

Architectures without overflow traps are notorious for excess instruction
count when overflow detection is desired or mandated.

It is not easy to see how a tool can avoid false positives and false
negatives and also conveniently optimise and re-arrange code.

It can't. But it does not try to avoid false negatives even when
explicitly asked for trapping on overflow.

Granted, optimization can do a lot of strange code emission and movement
when one does not care about precise overflow semantics. But, as a whole,
we are a society where we want high-HP automobiles more than we want safe
automobiles ('we' not including *.gov's).

If some overflow trapping, when it can be done without additional
instructions, were preferable to no overflow trapping, gcc would compile
signed adds that survive after optimization into add on MIPS rather
than addu, by default. Given that it does not, the GCC developers
probably found out that it is not preferable. I guess they would get
too many customer complaints, including for "relevant" code, i.e.,
code where the usual "it's UB, so your code is broken" excuse does not
work.

It is much harder than that. For example: does a signed shift left
overflow when significant bits are shifted out ?? What if the
subsequent instruction shifts the result back and the pair are acting
as a bit-field extract ?? My 66000 has bit field extracts for exactly
this reason. Floating-point has a lot of these cases, too.
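The shift-pair idiom looks roughly like this in C. It is a sketch with invented function names, written on unsigned values so the left shift is well defined in the language; the signed variant relies on arithmetic right shift of negative values, which is implementation-defined but is what GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `width` bits starting at bit `lsb` (0 = least significant),
   zero-extended. Requires width >= 1 and lsb + width <= 64. */
static uint64_t extract_bits(uint64_t value, unsigned lsb, unsigned width)
{
    uint64_t up = value << (64 - lsb - width);  /* high bits discarded on purpose */
    return up >> (64 - width);
}

/* Same field, sign-extended from its top bit. */
static int64_t extract_bits_signed(uint64_t value, unsigned lsb, unsigned width)
{
    uint64_t up = value << (64 - lsb - width);
    return (int64_t)up >> (64 - width);  /* arithmetic shift on GCC/Clang */
}

void extract_bits_example(void)
{
    /* Bits 11..4 of 0xABCD are 0xBC, which is -68 as a signed 8-bit field. */
    assert(extract_bits(0xABCD, 4, 8) == 0xBC);
    assert(extract_bits_signed(0xABCD, 4, 8) == -68);
}
```

The left shift throws away significant bits by design, so flagging it as an overflow would be a false positive for the pair taken together.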

The fact that they don't even try to make -ftrapv produce efficient
code indicates that there is no "relevant" interest in efficient
-ftrapv. It would be interesting to know who came up with the idea of
adding -ftrapv, and why they are still keeping it.

Compilers have not always been good at taking advantage of all the
features provided by hardware

GCC is pretty good at implementing -fwrapv. For the two examples
above, "gcc -O3 -fwrapv" produces the same code on AMD64 and MIPS as
"gcc -O3".

nor have languages been good at exposing
the possibilities in the language so that programmers can take advantage
of them.

Yes. But I leave that for another day.

A whole new kettle of fish...

- anton


From quadi@quadibloc@ca.invalid to comp.arch on Mon May 25 20:26:24 2026

From Newsgroup: comp.arch

On Mon, 25 May 2026 10:23:00 +0200, David Brown wrote:

The hardware, of course, cannot always enable trapping on overflow if it
is going to efficiently support a range of programming languages.

Yes. And I am used to FORTRAN, which did not trap on integer overflows.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Mon May 25 20:32:15 2026

From Newsgroup: comp.arch

On Mon, 25 May 2026 19:20:01 +0000, MitchAlsup wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

David Brown <david.brown@hesbynett.no> writes:

On 25/05/2026 16:28, Anton Ertl wrote:

Despite their eagerness to "optimize" based on the assumption
that signed integer overflow does not happen, the GCC developers
have avoided making -ftrapv the default, even on platforms like MIPS
and Alpha where the implementation of -ftrapv just means to use
different instructions (e.g., add instead of addu on MIPS, and addv
instead of add on Alpha).

Both architectures got this one wrong--IMO--and so does RISC-V.

You may not have been replying to what Anton Ertl wrote above, since there
was a lot in between that I snipped. But it does mention two architectures that took an approach to trapping on integer overflow... that I also tend
to disagree with.

What I'm used to is the System/360. While it made the mistake of having
two condition code bits instead of NZVC, the idea of having "trap on
overflow" controlled by a bit in the PSW is... what I assumed to be normal
and correct.

I could be wrong, as I haven't examined that approach critically and given full consideration to the alternatives.

John Savard

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Mon May 25 20:32:15 2026

From Newsgroup: comp.arch

David Brown <david.brown@hesbynett.no> schrieb:

On 24/05/2026 23:39, quadi wrote:

On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:

quadi <quadibloc@ca.invalid> posted:

It makes sense to trap on a floating-point overflow, but trapping on an
integer overflow is usually a terrible idea.

So, detecting something went wrong and you should inform the programmer
is a bad idea ???

No, so being able to turn the trap for integer overflow on should
definitely be allowed. But that shouldn't be the default behavior.
Otherwise, programs like random number generators wouldn't work.

John Savard

That does not make sense. Code such as random number generators should
be written so that they are correct in the language they are written in.

In principle, yes.

In practice, people often used whatever "worked" on their systems.
Implementors have a certain right because they control what their
compiler does or does not do. But users did so, as well, with
Numerical Recipes a(n in)famous example.

And yes, this bites people. You can see this at https://gcc.gnu.org/gcc-13/porting_to.html :

# GCC 13 includes new optimizations which may change behavior
# on integer overflow. Traditional code, like linear congruential
# pseudo-random number generators in old programs and relying on
# a specific, non-standard behavior may now generate unexpected
# results. The option -fsanitize=undefined can be used to detect
# such code at runtime.

# It is recommended to use the intrinsic subroutine RANDOM_NUMBER for
# random number generators or, if the old behavior is desired, to use
# the -fwrapv option. Note that this option can impact performance.

If that is C, signed integer overflow is UB while unsigned integers
have wrapping behaviour - thus if your code depends on wrapping, and it
is written in C, it needs to use unsigned types or compiler-specific extensions, flags, etc. (Or C23 ckd_add and other checked arithmetic functions.)
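As a concrete C illustration, a linear congruential generator that depends on wrapping can be written with unsigned arithmetic so that the reduction modulo 2^32 is defined by the language. The function name is invented, and the multiplier and increment are the widely quoted "quick and dirty" constants associated with Numerical Recipes, used here only as an example.

```c
#include <assert.h>
#include <stdint.h>

/* One step of a 32-bit LCG: state * 1664525 + 1013904223 (mod 2^32).
   The unsigned constants force the arithmetic to be done in an unsigned
   type, so wrapping is well defined; the result is truncated to 32 bits
   on return. */
static uint32_t lcg_next(uint32_t state)
{
    return (uint32_t)(state * 1664525u + 1013904223u);
}

void lcg_example(void)
{
    assert(lcg_next(0) == 1013904223u);
    /* (2^32 - 1) * a + c wraps to c - a modulo 2^32. */
    assert(lcg_next(UINT32_MAX) == 1012239698u);
}
```

Written with plain `int`, the same multiplication would be undefined behaviour on overflow and would need -fwrapv or a similar option to be relied upon.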

If it is written in Zig, you need to use the specific modulo arithmetic functions even for unsigned arithmetic. If it is written in Java,
signed integer arithmetic is fine.

It all depends on the language and/or any options the language and tools might support - and code should be written to work correctly according
to the language rules.

Fortran has no standard way of implementing this unless you
restrict yourself to sizes which do not overflow a signed integer.
Implementing LCGRNGs was one reason why I pushed for unsigned
arithmetic (modulo 2**n) in Fortran. The attempt failed (not
taken up by WG5 after being endorsed by J3), but I implemented it
for gfortran anyway.

The hardware, of course, cannot always enable trapping on overflow if it
is going to efficiently support a range of programming languages. But
as an optional feature it can be helpful for catching a few bugs in
code, so it can be a good idea (both for signed and unsigned overflow).

Sanitizers are also fairly good now, but of course cost performance.
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From quadi@quadibloc@ca.invalid to comp.arch on Mon May 25 20:34:41 2026

From Newsgroup: comp.arch

On Mon, 25 May 2026 16:49:59 +0000, MitchAlsup wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

RISC-V: add ignores signed overflow, there is no add that traps on
signed overflow (and detecting signed overflow is pretty
involved if both operands are unknown to the compiler).

The worst of all possible semantic encodings

Although I thought that making trapping on fixed-point overflow the
default is a bad idea, I agree that making it impossible to do so, or even test for fixed-point overflow, is a much worse idea.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Mon May 25 20:45:20 2026

From Newsgroup: comp.arch

On Mon, 25 May 2026 16:45:07 +0000, MitchAlsup wrote:

My 66000 has an instruction bit that denotes the signedness of integer calculations {Signed, unSigned}. This bit is available as another OpCode
bit for non-integer calculation instructions.

That's nice. It's not an option I can consider, as having lots of
orthogonal modifiers on instructions would tend to increase their length.
A major goal of the Concertina II, III, and IV architectures is for instructions not to be longer than similar instructions on the Motorola
68020 or the IBM System/360 if at all possible.

Basically, the selling point is... "Your programs only get 10% bigger, if that, and yet you have 32 registers, so they run faster!".

Or they _would_, if the design didn't have so many extra transistors for
supporting both IBM-format and Intel-format Decimal Floating Point,
old-style IBM floats, simple floating (You too can work with numbers that
go around the world 2 1/2 times!), packed decimal, mixed-radix arithmetic...

But, hey, supporting these things in hardware is faster than doing them in software!

And are people even going to _read_ the part of the manual that
explains... as is noted in the description of the original Concertina architecture...

This chip has 8-way simultaneous multi-threading, but only for programs
which do not make use of extensions to the register set.

Only two programs per core may use the extended register banks with 128 elements.

Only one program per core may use the vector registers for long vector instructions. The 256-bit short vector registers, on the other hand, like
the integer and floating-point registers, are available to all
simultaneous threads.

John Savard

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon May 25 20:32:35 2026

From Newsgroup: comp.arch

MitchAlsup <user5857@newsgrouper.org.invalid> writes:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
What you do want is compiled code that can trap on overflow and avoid
trapping on overflow without code substitution or being re-compiled.
This way production code can avoid trapping but if the debugger is
turned on, you can trap.

Why do you consider that desirable?

long bar(long x, long y)
{
    return x/2+y/2;
}

...

Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
produces ILP32 code) produces a call to __addvsi3 instead of the
expected add instruction:

gcc -O3 -ftrapv         gcc -O3
lui gp,0x0              srl v0,a0,0x1f
addiu gp,gp,0           srl v1,a1,0x1f
addu gp,gp,t9           addu v0,v0,a0
srl v1,a0,0x1f          addu a1,v1,a1
lw t9,__addvsi3(gp)     sra v0,v0,0x1
srl v0,a1,0x1f          sra a1,a1,0x1
addiu sp,sp,-32         jr ra
addu a0,v1,a0           addu v0,v0,a1
addu a1,v0,a1
sra a0,a0,0x1
sw ra,28(sp)
sw gp,16(sp)
jalr t9
sra a1,a1,0x1
lw ra,28(sp)
jr ra
addiu sp,sp,32

The call costs a lot of overhead.

Architectures without overflow traps are notorious for excess instruction
count when overflow detection is desired or mandated.

MIPS' add traps on overflow. gcc could have emitted almost the same
code for gcc -O3 -ftrapv as for gcc -O3, except that the last
instruction would be an add, not an addu. But apparently nobody gives
a damn about the efficiency of -ftrapv, possibly rightly so.

If some overflow trapping, when it can be done without additional
instructions, were preferable to no overflow trapping, gcc would compile
signed adds that survive after optimization into add on MIPS rather
than addu, by default. Given that it does not, the GCC developers
probably found out that it is not preferable. I guess they would get
too many customer complaints, including for "relevant" code, i.e.,
code where the usual "it's UB, so your code is broken" excuse does not
work.

It is much harder than that. For example: does a signed shift left
overflow when significant bits are shifted out ??

-ftrapv specifies trapping on overflow only for additions,
subtractions, and multiplications.
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From BGB@cr88192@gmail.com to comp.arch on Mon May 25 16:34:50 2026

From Newsgroup: comp.arch

On 5/25/2026 9:28 AM, Anton Ertl wrote:

David Brown <david.brown@hesbynett.no> writes:

On 24/05/2026 23:39, quadi wrote:

On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:

quadi <quadibloc@ca.invalid> posted:

It makes sense to trap on a floating-point overflow, but trapping on an
integer overflow is usually a terrible idea.

Most programming environments I have had contact with don't trap on floating-point overflow.

Many just go Inf...

Division by zero is usually handled by going NaN.

Contrast with integer division by zero which does usually trap.

So, detecting something went wrong and you should inform the programmer
is a bad idea ???

The question is if an integer overflow means that something went
wrong. Despite their eagerness to "optimize" based on the assumption
that signed integer overflow does not happen, the GCC developers have
avoided making -ftrapv the default, even on platforms like MIPS and
Alpha where the implementation of -ftrapv just means to use different instructions (e.g., add instead of addu on MIPS, and addv instead of
add on Alpha).

Integer overflow happens far too often for trapping to be a good solution.

We almost need a separate "integer that should not overflow" type, with
more explicit "do something special if it does" semantics.

Though, more likely to be useful would be a "detect if an overflow had happened" mechanism.

errno_t ovfstate;
__int_no_overflow x, y, z;
...
__start_errsense(&ovfstate);
z=x+y;
__end_errsense(&ovfstate);
if(ovfstate&ERRSENSE_FLAG_OVERFLOW)
...

Which would be awkward, but probably more useful than, say, raising a
signal and/or terminating the program.
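Something close to this can already be approximated in C with the GCC/Clang overflow builtins. The sketch below uses an invented `ovf_state` type and `add_sense` function, not the hypothetical `__start_errsense` syntax above: the add wraps, and a sticky flag records that an overflow happened so the caller can test it afterwards.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical sticky overflow state for a region of arithmetic. */
typedef struct { bool overflowed; } ovf_state;

/* Wrapping add that records overflow in *st instead of trapping. */
static long add_sense(ovf_state *st, long a, long b)
{
    long r;
    if (__builtin_add_overflow(a, b, &r))  /* r holds the wrapped sum */
        st->overflowed = true;
    return r;
}

void add_sense_example(void)
{
    ovf_state st = { false };
    long z = add_sense(&st, 1, 2);
    assert(z == 3 && !st.overflowed);
    (void)add_sense(&st, LONG_MAX, 1);
    assert(st.overflowed);      /* set by the overflowing add */
    (void)add_sense(&st, 1, 1);
    assert(st.overflowed);      /* stays set until the caller clears it */
}
```

The flag is checked once after a sequence of operations, which keeps the common path free of signals or early exits.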

The hardware, of course, cannot always enable trapping on overflow if it
is going to efficiently support a range of programming languages. But
as an optional feature it can be helpful for catching a few bugs in
code, so it can be a good idea (both for signed and unsigned overflow).

This supposedly helpful feature has been neglected by C compiler
developers, and you see in the progression from MIPS (1986) to Alpha
(1992) and then RISC-V (2011) that the hardware architects have
accepted that:

MIPS: add traps on signed overflow, you need to write addu if you
don't want that.

Alpha: add ignores signed overflow, you need to write addv if you want
the trapping.

RISC-V: add ignores signed overflow, there is no add that traps on
signed overflow (and detecting signed overflow is pretty
involved if both operands are unknown to the compiler).

In practice, given:
We have instructions like ADDW, etc, whose behavior is explicitly to sign-extend the results of 32-bit ADD;
Behavior in practice is often to meticulously follow wrap-on-overflow semantics;
Exceptions to wrap-on-overflow usually exist as edge cases;
Various programs exist that will actively break if wrap-on-overflow is
not the observed behavior in C land;
...

The expectation that 'int' can meaningfully do something other than
wrap on overflow is more of a fantasy.

Or like some other "portability boogeymen":
Non two's complement integer arithmetic;
Big endian machines;
Machines that don't allow unaligned loads and stores;
Types with sizes other than the "usually accepted" set;
...

The argument has often been, "but, 64-bit machines might not provide
native 32-bit arithmetic".

But, often in 64-bit machines, a pattern emerges:
Most ops are full 64-bit;
A subset of instructions have variants that produce sign and/or zero
extended results;
The instructions which produce these results are typically the ones
needed to preserve the usual wrap-on-overflow semantics in those places
where something could otherwise happen that would deviate from the
expected semantics.

The ones that have zero-extension usually treating unsigned integers as zero-extended.

The reverse has also been done; treating unsigned as sign-extended, as
in the standard RISC-V ABI, but IMO this is stupid. Even in the absence
of a native zero-extension op (as in plain RV64G), the mess that results
from sign-extending unsigned is worse than the cost of explicit zero extension.

Best case here being to keep values using "native extension":
'int' : Always sign extended;
'unsigned int': Always zero extended.
Then 32-bit types are a strict subset of the 64-bit range, and
up-promotion becomes free. Not sure why some people don't see this as
obvious though. Well, and people keep making the choice of adding
garbage edge cases to RISC-V that would have been entirely unnecessary
if people weren't being stupid about the ABI rules.
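
A rough C model of the "native extension" idea, as a sketch only: the function names are made up, and the unsigned variant models a hypothetical ADDWU-style op rather than any instruction in a ratified spec. The 64-bit parameters stand in for registers. The narrowing cast to `int32_t` is implementation-defined before C23, but wraps on two's complement compilers such as GCC and Clang.

```c
#include <stdint.h>

/* 'int' result kept sign-extended in a 64-bit register, as an
 * ADDW-style instruction would leave it. */
static int64_t addw_model(int64_t ra, int64_t rb)
{
    uint32_t low = (uint32_t)((uint64_t)ra + (uint64_t)rb); /* low 32 bits */
    return (int64_t)(int32_t)low;                           /* sign-extend */
}

/* 'unsigned int' result kept zero-extended (hypothetical ADDWU-style op). */
static uint64_t addwu_model(uint64_t ra, uint64_t rb)
{
    return (uint64_t)(uint32_t)(ra + rb);                   /* zero-extend */
}
```

With both conventions in place, the register already holds the correct 64-bit value of the 32-bit variable, so widening to 'long' or 'unsigned long' needs no extra instruction.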

But yeah...

But, none of this would be expected to happen unless it were already
generally accepted that wrap-on-overflow for 'int' and similar is the
only really practical or viable solution here.

Otherwise, recently:
In my case I decided to live with a "breaking change" in XG3 and to
change some things that may matter later. Then ended up tweaking some
other things on my annoyance list (since I was already breaking existing binaries, better to cluster breakage to a singular event if doing it).

ADD, ADDS.L, and ADDU.L have all been changed from Imm10u/n to Imm10s.
The Imm10u cases are now Imm10s;
The Imm10n sub-case is now dropped/reserved.
May be reused later.
This reclaims 3 out of the 20 Imm10 spots.
Was mostly a case of it being harder to justify the encoding space.
Old behavior will need to remain for XG1 and XG2.
In this case, XG3 will explicitly deviate from XG1 and XG2 here.
Does mean that XG3 now has less ADD/SUB Imm range than XG2, but...
Only goes from 97.1% hit rate to 95.9%,
no significant effect on overall code density.
Could use the RV Imm12 ops (ADDI / ADDIW), but:
Hit rate for the RV ops here is negligible;
Much of these also happen to miss on one or both registers.

The MULS.L and MULU.L ops were also switched to Imm10s.
This means all of the Imm10 ALU ops are now unified on Imm10s.

Relocated TST and TSTN from the F0-8 block (with the XMOV instructions)
to the F0-9 block (with the other CMPxx 3R ops).

A few very rarely used instructions were demoted from 32-bit to 64-bit encodings.

Have experimentally added some 32-bit:
Bcc Rm, Imm6s, (PC, Disp6s)
instructions, where:
Imm6s: Hits ~ 80% of these cases;
Disp6s: Hits ~ 60% of these cases;
Imm5s + Disp7s would hit slightly better, but,
would have needed more new decoder logic...
Resulting in it hitting about half over the:
Bcc Rm, Imm17s, (PC, Disp10s)
Cases, for an overall code-density improvement of ~ 0.5%, ...
Dominant use-case: Final compare-and-branch in a short "for()" loop.
Secondary use-case: Short non-predicated "if()" branches.
But, is out-weighed by said predicated "if()" branches.
Would likely see more use here if not using predication.
If it would have hit for 100% of these, would have saved ~ 1%.
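
For reference, the range test behind "does this constant hit Imm6s or Imm10s" can be sketched as below. This is only an illustration: `fits_signed` is a made-up helper, not something taken from BGBCC, and it assumes the fields are plain two's complement immediates with no scaling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if v is representable as a 'bits'-wide
 * two's complement immediate (valid for 1 <= bits <= 63). */
static bool fits_signed(int64_t v, unsigned bits)
{
    int64_t lo = -((int64_t)1 << (bits - 1));      /* e.g. -32 for 6 bits */
    int64_t hi = ((int64_t)1 << (bits - 1)) - 1;   /* e.g. +31 for 6 bits */
    return v >= lo && v <= hi;
}
```

Under that assumption, an Imm6s field covers -32..31 and an Imm10s field covers -512..511.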

This is debatable.

This reused the encoding spots previously used for the Load-Disp5us ops,
which still exist for XG1 and XG2 (decoder special-case handling), but
were N/A in XG3 (they would be in effect entirely redundant with the
Disp10s forms in XG3; but had non-redundant edge-cases in XG1 and XG2).

Like with the Imm17s+Disp10s ops, these will still depend on the IMMB extension, as they still need the same basic mechanism.

Was a fairly low-priority feature, in any case.

Seemingly running low on obvious optimization paths.

- anton

--- Synchronet 3.22a-Linux NewsLink 1.2

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 22:49:58 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> posted:

On Mon, 25 May 2026 10:23:00 +0200, David Brown wrote:

The hardware, of course, cannot always enable trapping on overflow if it
is going to efficiently support a range of programming languages.

Yes. And I am used to FORTRAN, which did not trap on integer overflows.

WATfor and WATfive trapped on integer overflows.

John Savard


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 22:51:42 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> posted:

On Mon, 25 May 2026 19:20:01 +0000, MitchAlsup wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

David Brown <david.brown@hesbynett.no> writes:

On 25/05/2026 16:28, Anton Ertl wrote:

Despite their eagerness to "optimize" based on the assumption
that signed integer overflow does not happen, the GCC developers
have avoided making -ftrapv the default, even on platforms like MIPS
and Alpha where the implementation of -ftrapv just means to use
different instructions (e.g., add instead of addu on MIPS, and addv
instead of add on Alpha).

Both architectures got this one wrong--IMO--and so does RISC-V.

You may not have been replying to what Anton Ertl wrote above, since there was a lot in between that I snipped. But it does mention two architectures that took an approach to trapping on integer overflow... that I also tend
to disagree with.

What I'm used to is the System/360. While it made the mistake of having
two condition code bits instead of NZVC, the idea of having "trap on overflow" controlled by a bit in the PSW is... what I assumed to be normal and correct.

And what My 66000 does....

I purport that ANY Industrial quality ISA should provide a means to
trap on integer overflow.

I could be wrong, as I haven't examined that approach critically and given full consideration to the alternatives.

John Savard


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 22:59:10 2026

From Newsgroup: comp.arch

Thomas Koenig <tkoenig@netcologne.de> posted:

David Brown <david.brown@hesbynett.no> schrieb:

On 24/05/2026 23:39, quadi wrote:

On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:

quadi <quadibloc@ca.invalid> posted:

It makes sense to trap on a floating-point overflow, but trapping on an integer overflow is usually a terrible idea.

So, detecting something went wrong and you should inform the programmer is a bad idea ???

No, so being able to turn the trap for integer overflow on should
definitely be allowed. But that shouldn't be the default behavior.
Otherwise, programs like random number generators wouldn't work.

John Savard

That does not make sense. Code such as random number generators should
be written so that they are correct in the language they are written in.

In principle, yes.

Principle is better in theory than in practice.

In practice, people often used whatever "worked" on their systems.

Face it, the poor slug writing the code may not have the faintest
grasp of the system qualities we are discussing, and does not care
to learn as long as he can slug through the writing and his
program does not blow up catastrophically while it is under his purview.

That defines a lot of what is wrong with SW programming today.

Implementors have a certain right because they control what their
compiler does or does not do.

You would be surprised at how little influence implementors have
on compilers and other software.

But users did so, as well, with
Numerical Recipes a(n in)famous example.

And yes, this bites people. You can see this at https://gcc.gnu.org/gcc-13/porting_to.html :

# GCC 13 includes new optimizations which may change behavior
# on integer overflow. Traditional code, like linear congruential
# pseudo-random number generators in old programs and relying on
# a specific, non-standard behavior may now generate unexpected
# results. The option -fsanitize=undefined can be used to detect
# such code at runtime.

My VAX favorite was:

for( int i = 1; i; i+=i )

Traps instead of exiting the loop normally.

# It is recommended to use the intrinsic subroutine RANDOM_NUMBER for
# random number generators or, if the old behavior is desired, to use
# the -fwrapv option. Note that this option can impact performance.

If that is C, signed integer overflow is UB while unsigned integers
have wrapping behaviour - thus if your code depends on wrapping, and it
is written in C, it needs to use unsigned types or compiler-specific extensions, flags, etc. (Or C23 ckd_add and other checked arithmetic functions.)

If it is written in Zig, you need to use the specific modulo arithmetic functions even for unsigned arithmetic. If it is written in Java,
signed integer arithmetic is fine.

It all depends on the language and/or any options the language and tools might support - and code should be written to work correctly according
to the language rules.
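
As a small illustration of "use unsigned types" for a generator that depends on wrapping: the sketch below does one LCG step in `uint32_t`, where arithmetic is defined to be modulo 2^32. The constants are the widely published Numerical Recipes ones, picked only as an example; the function name is made up.

```c
#include <stdint.h>

/* One step of a 32-bit linear congruential generator.
 * The arithmetic is unsigned, so the wrap-around is defined behavior
 * and no compiler flag is needed. */
static uint32_t lcg_next(uint32_t state)
{
    return state * 1664525u + 1013904223u;
}
```

Writing the same recurrence with 'int' relies on signed overflow, which is exactly the kind of code the GCC 13 porting note warns about.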

Fortran has no standard way of implementing this unless you
restrict yourself to sizes which do not overflow a signed integer.

Old FORTRAN had no unSigned integer type and no way to avoid overflows.

Implementing LCGRNGs was one reason why I pushed for unsigned
arithmetic (modulo 2**n) in Fortran. The attempt failed (not
taken up by WG5 after being endorsed by J3), but I implemented it
for gfortran anyway.

The hardware, of course, cannot always enable trapping on overflow if it is going to efficiently support a range of programming languages. But
as an optional feature it can be helpful for catching a few bugs in
code, so it can be a good idea (both for signed and unsigned overflow).

Sanitizers are also fairly good now, but of course cost performance.


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 23:00:32 2026

From Newsgroup: comp.arch

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

MitchAlsup <user5857@newsgrouper.org.invalid> writes:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
What you do want is compiled code that can trap on overflow and avoid trapping on overflow without code substitution or being re-compiled.
This way production code can avoid trapping but if the debugger is
turned on, you can trap.

Why do you consider that desirable?

So you can debug production/released code to find subtle errors.

long bar(long x, long y)
{
  return x/2+y/2;
}

...

Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
produces ILP32 code) produces a call to __addvsi3 instead of the
expected add instruction:

gcc -O3 -ftrapv            gcc -O3
lui   gp,0x0               srl   v0,a0,0x1f
addiu gp,gp,0              srl   v1,a1,0x1f
addu  gp,gp,t9             addu  v0,v0,a0
srl   v1,a0,0x1f           addu  a1,v1,a1
lw    t9,__addvsi3(gp)     sra   v0,v0,0x1
srl   v0,a1,0x1f           sra   a1,a1,0x1
addiu sp,sp,-32            jr    ra
addu  a0,v1,a0             addu  v0,v0,a1
addu  a1,v0,a1
sra   a0,a0,0x1
sw    ra,28(sp)
sw    gp,16(sp)
jalr  t9
sra   a1,a1,0x1
lw    ra,28(sp)
jr    ra
addiu sp,sp,32

The call costs a lot of overhead.

Architectures without overflow traps are notorious for excess instruction count when overflow detection is desired or mandated.

MIPS' add traps on overflow. gcc could have emitted almost the same
code for gcc -O3 -ftrapv as for gcc -O3, except that the last
instruction would be an add, not an addu. But apparently nobody gives
a damn about the efficiency of -ftrapv, possibly rightly so.

If some overflow trapping when it can be done without additional
instructions would be preferable over no overflow trapping, gcc would compile
signed adds that survive after optimization into add on MIPS rather
than addu, by default. Given that it does not, the GCC developers
probably found out that it is not preferable. I guess they would get
too many customer complaints, including for "relevant" code, i.e.,
code where the usual "it's UB, so your code is broken" excuse does not
work.

It is much harder than that. For example: does a signed shift left
overflow when significant bits are shifted out ??

-ftrapv specifies trapping on overflow only for additions,
subtractions, and multiplications.


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 23:03:03 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> posted:

On Mon, 25 May 2026 16:45:07 +0000, MitchAlsup wrote:

My 66000 has an instruction bit that denotes the signedness of integer calculations {Signed, unSigned}. This bit is available as another OpCode bit for non-integer calculation instructions.

That's nice. It's not an option I can consider, as having lots of
orthogonal modifiers on instructions would tend to increase their length.

And harm instruction Entropy.

A major goal of the Concertina II, III, and IV architectures is for instructions not to be longer than similar instructions on the Motorola 68020 or the IBM System/360 if at all possible.

Basically, the selling point is... "Your programs only get 10% bigger, if that, and yet you have 32 registers, so they run faster!".

Mine are getting 30% smaller and needing fewer instructions at the same
time.

Or they _would_, if the design didn't have so many extra transistors for supporting both IBM-format and Intel-format Decimal Floating Point, old-style IBM floats, simple floating (You too can work with numbers that go around the world 2 1/2 times!), packed decimal, mixed-radix arithmetic...

But, hey, supporting these things in hardware is faster than doing them in software!

And are people even going to _read_ the part of the manual that
explains... as is noted in the description of the original Concertina architecture...

This chip has 8-way simultaneous multi-threading, but only for programs which do not make use of extensions to the register set.

Another One Bites the Dust.....

Only two programs per core may use the extended register banks with 128 elements.

Only one program per core may use the vector registers for long vector instructions. The 256-bit short vector registers, on the other hand, like the integer and floating-point registers, are available to all
simultaneous threads.

John Savard


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon May 25 23:05:06 2026

From Newsgroup: comp.arch

BGB <cr88192@gmail.com> posted:

On 5/25/2026 9:28 AM, Anton Ertl wrote:

--------------

Integer overflow happens far too often for trapping to be a good solution.

Even on 64-bit variables/machines ??

From BGB@cr88192@gmail.com to comp.arch on Mon May 25 20:02:52 2026

From Newsgroup: comp.arch

On 5/25/2026 3:34 PM, quadi wrote:

On Mon, 25 May 2026 16:49:59 +0000, MitchAlsup wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

RISC-V: add ignores signed overflow, there is no add that traps on
signed overflow (and detecting signed overflow is pretty
involved if both operands are unknown to the compiler).

The worst of all possible semantic encodings

Although I thought that making trapping on fixed-point overflow the
default is a bad idea, I agree that making it impossible to do so, or even test for fixed-point overflow, is a much worse idea.

Possibly true.

The lack of things like ADD-with-Carry or ADD-with-Overflow is an
annoyance point on RISC-V.

Though, it is less obvious what a useful behavior is at the language level:
"signal()" ? ...
Something like try/catch (mostly N/A to C)?
Something similar to FENV_ACCESS?
...

Well, and that if trapping were applied globally:
Overhead due to trap detection/handling code causing excessive bloat;
Overflow traps from any code that naively assumes wrap-on-overflow
semantics;
...

In some codebases, it is already enough of a pain to hunt and fix all
the out-of-bounds and uninitialized variables mess.
Signed integer overflows would likely "turn it up to 11";
Then, how does one fix it? Ask that people start adding a bunch of casts
to make it work?...

One might say:
Add "if()" cases to deal with the overflows, but, ... this only makes
sense for cases where the overflows are not the expected behavior.

Then again, could maybe classify code, say:
1, signed, value doesn't (or shouldn't) go out-of-range;
2, unsigned, value doesn't (or shouldn't) go out-of-range;
3, signed, value is expected to be modulo;
4, unsigned, value is expected to be modulo.

"nasal demons" types assume 1 and 4 as dominant.
Or, 1 as exclusive vs 3.

For compilers, we often need to assume 3 and 4.
Because, failure to uphold 3 results in misbehaving programs.
And, if 3 were uncommon, RISC-V's "ADDW"/etc would be pure stupidity.
Instead:
Something like plain ADD plus ADDWU would have made sense.
But, they dropped ADDWU instead (also stupid IMO).

While, granted, a lot of 1 code likely exists, 3 code tends to generate
the vast majority of overflows; and if there is any reasonable
expectation for 'int' to overflow, and it is not desired for int to
overflow.

We mostly ignore 2 vs 4, because standard specifies 4 making 2 to be
purely a programming error, in which case "2" becomes "should have used
a bigger signed type instead".

Then again, could maybe make sense to add a semantic distinction, say:
"int" (plain):
Maybe a case could be made that overflow be assumed unexpected.
"signed int":
Maybe make separate from plain case, explicitly modulo;
So, could be made distinct;
Explicitly like the "unsigned" case in being modulo.
"unsigned int":
Remains the same, no real controversy here.

Or, say:
char, short, int, long, long long:
For code, assume that overflow may be unexpected / undesirable;
signed char, signed int, signed long, signed long long:
Assume signed modulo;
Compiler should, ideally, always produce wrap-on-overflow semantics.
unsigned ...:
Unsigned modulo.

For a compiler, then:
-ftrapv:
May ideally trap on lack of "signed";
Explicit "signed", continues to wrap.
-fwrapv:
Both default and signed will wrap.
Neither:
Dunno, probably better for compiler to assume "-fwrapv" semantics;
Maybe assume UB opts are safe if no "signed".

Well, and for the programmer POV:
If assuming maximum portability:
Only unsigned overflow wrapping is "safe".
If assuming "any reasonable system":
Both will wrap in most cases;
Absent "-fwrapv", UB opts may occur in certain obscure edge cases.
Though usually in the form of "early" vs "late" type promotion;
In most cases, where it does occur, early promotion is benign.
Vs whatever "nasal demons" people may assert.
What else, that it late promotes?
(as "-fwrapv" semantics would dictate...)

Like, say:
int x;
long z;
...
z = 42 - x;
//Oh no! UB opt has turned this into a 64-bit RSUB instruction!

Yeah...
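
To spell out the "early" vs "late" promotion difference for `z = 42 - x`, here is a sketch that models both in defined-behavior C. The function names are made up; the late form wraps at 32 bits and then sign-extends (what -fwrapv semantics would dictate), the early form widens first and does a 64-bit subtract. The cast back to `int32_t` is implementation-defined before C23 but wraps on GCC and Clang.

```c
#include <stdint.h>

/* "Late" promotion: 32-bit wrapping subtract, then sign-extend. */
static int64_t sub_late(int32_t x)
{
    uint32_t d = 42u - (uint32_t)x;     /* wraps modulo 2^32 */
    return (int64_t)(int32_t)d;
}

/* "Early" promotion: widen first, then subtract at 64 bits. */
static int64_t sub_early(int32_t x)
{
    return (int64_t)42 - (int64_t)x;
}
```

The two only disagree when the 32-bit subtraction would have overflowed, for example x = INT32_MIN, where the early form gives the mathematically expected value and the late form gives the wrapped one.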

Granted, ATM, for BGBCC, wouldn't make much difference at present. Could
maybe make sense to add a distinction either to strengthen semantic
analysis, or if I decided to change away from my existing "assume wrap
on overflow semantics as sole option" policy. Or maybe adding an
"-fno-wrapv" option, with "wrapv" remaining default but allowing an
option to opt-out, sort of like how there is an "-fptropts" option to
"opt into" strict-aliasing / TBAA semantics, vs the default semantics of "assume every explicit store may alias" semantics. Though, may still
assume that loads may be cached and reordered, unless "volatile" is
used, which explicitly disallows caching and reordering loads, though at present is a little "shotgun" and will basically disable caching
throughout the whole basic block; which works as a detractor to the
"casually use volatile as a way to dispel TBAA" interpretation (works on
GCC, and is less adverse for performance than the "use memcpy" option on
some other compilers, ...).

Or, say:
Bare pointer cast and deref:
GCC: adverse (falls afoul of default semantics);
MSVC: benign;
BGBCC: benign.
Volatile pointer cast and deref:
GCC: benign (doesn't use TBAA on volatile pointers);
MSVC: benign;
BGBCC: detrimental, disables caching and ld/st reordering;
Using memcpy:
GCC: benign;
MSVC:
Old (15+ years):
Adverse (actually calls memcpy, significant impact);
Some intermediate versions would do an inline for "REP MOVSB".
Also kinda crap, but less bad vs calling "memcpy()".
Mostly only matters if still targeting WinXP or similar.
Newer: Mild detriment in some cases.
Inline loads/stores
may fail to optimize to plain register moves for locals.
BGBCC:
Mostly similar to newer MSVC here;
Works, just less efficient than plain "cast and deref".

...
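
For readers who have not seen the "use memcpy" option, a minimal sketch follows. It assumes 'float' is a 32-bit IEEE-754 binary32 value, which is common but not guaranteed by ISO C; the function name is made up.

```c
#include <stdint.h>
#include <string.h>

/* Reads the bit pattern of a float through memcpy, which is defined
 * behavior regardless of strict-aliasing / TBAA settings.
 * The "bare cast and deref" form would be: *(uint32_t *)&f */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* assumes sizeof(float) == sizeof(uint32_t) */
    return u;
}
```

Whether the copy turns into a plain register move or an actual call is the compiler-dependent cost discussed in the list above.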


From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon May 25 15:27:29 2026

From Newsgroup: comp.arch

An awkward thing about using trap on overflow is determining how
precisely it is defined.

Indeed, this is a nasty part of language design.

[ IMO, the only sane choice (beside wrapping and explicit `ckd_add`) is
to treat overflow not as an exception (in the sense of `try..catch`
thingies, not in the CPU hardware sense of the word) but as an
execution error comparable to memory exhaustion. ]

Luckily, for `comp.arch` the same problem doesn't plague ISAs because
it's accepted that a CPU should stick religiously to the literal
semantics of the machine code, no matter how far it is from what
really happens inside the machine.

=== Stefan

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Tue May 26 05:39:02 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> schrieb:

On Mon, 25 May 2026 10:23:00 +0200, David Brown wrote:

The hardware, of course, cannot always enable trapping on overflow if it
is going to efficiently support a range of programming languages.

Yes. And I am used to FORTRAN, which did not trap on integer overflows.

Incorrect.

Integer overflow is illegal in Fortran, so what the compiler then
does is not determined (see my post on random number generators).

Example:

$ cat overfl.f90
program main
integer :: a, b
a = 12345678
b = 2345678
print *,a*b
end program main
$ gfortran -fsanitize=undefined overfl.f90
$ ./a.out
overfl.f90:5:13: runtime error: signed integer overflow: 12345678 * 2345678 cannot be represented in type 'integer(kind=4)'
-1979197244
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From David Brown@david.brown@hesbynett.no to comp.arch on Tue May 26 08:18:17 2026

From Newsgroup: comp.arch

On 25/05/2026 18:43, Anton Ertl wrote:

David Brown <david.brown@hesbynett.no> writes:

On 25/05/2026 16:28, Anton Ertl wrote:

Despite their eagerness to "optimize" based on the assumption
that signed integer overflow does not happen, the GCC developers have
avoided making -ftrapv the default, even on platforms like MIPS and
Alpha where the implementation of -ftrapv just means to use different
instructions (e.g., add instead of addu on MIPS, and addv instead of
add on Alpha).

An awkward thing about using trap on overflow is determining how
precisely it is defined. Supposing you have the expression "a + b - a".
Perhaps "a + b" overflows. I would hope that when using debug-related
compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
would check for overflow on "a + b", and report it at runtime.
(Unfortunately, gcc does not do that unless the partial expression is
assigned to a variable.) But in "normal" usage, I'd expect the
expression to be simplified, resulting in just "b" and no overflow.

OTOH, cases like a+b+c where the result is in range, while an
intermediate result is out of range are one of the reasons why I
prefer -fwrapv over -ftrapv. As for your preference of nasal demons,
given enough information, the compiler might "optimize" "a+b-a" into,
e.g., 0.
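
A small sketch of why the intermediate overflow is harmless under wrapping semantics: with modular arithmetic, "a + b - a" always comes back to b, even when "a + b" wraps. The function name is made up, and the arithmetic is done in `uint32_t` so the sketch itself has no undefined behavior (the final cast wraps on two's complement compilers such as GCC and Clang).

```c
#include <stdint.h>

/* Evaluates a + b - a with 32-bit wrapping arithmetic. */
static int32_t add_sub_wrap(int32_t a, int32_t b)
{
    uint32_t t = (uint32_t)a + (uint32_t)b;   /* may wrap */
    return (int32_t)(t - (uint32_t)a);        /* wraps back to b */
}
```

So simplifying the expression to b agrees with -fwrapv, whereas a strictly trapping evaluation would have to stop at the first addition.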

Anyway, the definition of -ftrapv is not very precise; for gcc-12.2:

|'-ftrapv'
| This option generates traps for signed overflow on addition,
| subtraction, multiplication operations.

My understanding is that the GCC developers would rather deprecate
-ftrapv entirely, and encourage the use of -fsanitize instead as a way
to detect run-time errors. I don't know the details of the internals,
but I believe the GCC developers see the sanitize options as more
accurate and more likely to be further developed in the future.

As for what gcc-12.2 does for your example on AMD64:

long foo(long a, long b)
{
  return a+b-a;
}

is compiled with gcc -O3 -ftrapv to:

0: 48 89 f0 mov %rsi,%rax
3: c3 ret

If "trap on overflow" has precise semantics in the code, then this
disables a range of useful optimisations and re-arrangements. If it is
just "use trapping arithmetic instructions", then it will miss many
possible cases of actual overflow in the code, which we might want to
catch.

Which would you prefer by default?

I don't know for sure. A "by default" choice has to be suitable for a
wide variety of users and a wide variety of cases, and preferably err on
the side of caution. For my own personal use, I'm happy with UB
overflow and would have preferred that as the default even for unsigned arithmetic (but of course with a way to specify wrapping when I need
it). But that's for /my/ use - I don't think that should necessarily be
the default for others. Let those who are willing to spend the time and effort learning the details and the care needed use compiler flags to
get the highest efficiency from their code, and let the defaults help
others catch their bugs. However, the logical endpoint of that is that
C should only be used by those that have a detailed understanding of the language and need it for peak efficiency, while other programmers should
work with other languages that have more error handling.

The gcc developers apparently took the latter approach, even when you
ask for -ftrapv explicitly. So what, IYO, speaks against doing that
by default on machines like MIPS and Alpha?

And "trap on overflow" might either trigger when there is no
overflow in the original code, or hinder optimisations. (Consider the
expression "x / 2 + y / 2" - the compiler could implement that as a
combined "(x + y) / 2", but that might introduce overflow.)

x/2+y/2 produces a different result from (x+y)/2 when both x and y are
odd integers.

True. Can we pretend that is not the case, and still see my point? The
point is that the compiler can, during re-arrangements, introduce new overflows as long as it knows the final results are correct (since the compiler knows the details of how instructions are actually implemented).

gcc-12.2 compiles

long bar(long x, long y)
{
  return x/2+y/2;
}

on AMD64 to:

gcc -O3 -ftrapv            gcc -O3
mov  %rdi,%rax             mov  %rdi,%rax
sub  $0x8,%rsp             mov  %rsi,%rdx
shr  $0x3f,%rax            shr  $0x3f,%rax
add  %rax,%rdi             shr  $0x3f,%rdx
mov  %rsi,%rax             add  %rdi,%rax
shr  $0x3f,%rax            add  %rsi,%rdx
sar  %rdi                  sar  %rax
add  %rax,%rsi             sar  %rdx
sar  %rsi                  add  %rdx,%rax
call __addvdi3@PLT         ret
add  $0x8,%rsp
ret

so the -ftrapv introduces an additional mov and a call; I would have
expected that the + would be compiled to an ADD instruction followed
by a JO instruction.

Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
produces ILP32 code) produces a call to __addvsi3 instead of the
expected add instruction:

gcc -O3 -ftrapv            gcc -O3
lui   gp,0x0               srl   v0,a0,0x1f
addiu gp,gp,0              srl   v1,a1,0x1f
addu  gp,gp,t9             addu  v0,v0,a0
srl   v1,a0,0x1f           addu  a1,v1,a1
lw    t9,__addvsi3(gp)     sra   v0,v0,0x1
srl   v0,a1,0x1f           sra   a1,a1,0x1
addiu sp,sp,-32            jr    ra
addu  a0,v1,a0             addu  v0,v0,a1
addu  a1,v0,a1
sra   a0,a0,0x1
sw    ra,28(sp)
sw    gp,16(sp)
jalr  t9
sra   a1,a1,0x1
lw    ra,28(sp)
jr    ra
addiu sp,sp,32

The call costs a lot of overhead.

Agreed. I don't know why GCC uses a function call here. In my quick
godbolt testing, clang uses the "add, jump-on-overflow" sequence.

Using

-fsanitize=signed-integer-overflow -fsanitize-trap

gives an add followed by a jump-on-overflow sequence.

It is not easy to see how a tool can avoid false positives and false
negatives and also conveniently optimise and re-arrange code.

It can't. But it does not try to avoid false negatives even when
explicitly asked for trapping on overflow.

If some overflow trapping when it can be done without additional
instructions would be preferable over no overflow trapping, gcc would compile
signed adds that survive after optimization into add on MIPS rather
than addu, by default. Given that it does not, the GCC developers
probably found out that it is not preferable. I guess they would get
too many customer complaints, including for "relevant" code, i.e.,
code where the usual "it's UB, so your code is broken" excuse does not
work.

If "-ftrapv" is to have any use at all, then overflow is no longer UB -
it has to be defined to trap. But I have to conclude that in GCC,
-ftrapv is too vaguely defined and too inconsistently and inefficiently implemented to be of any use. This matches my understanding that the "-fsanitize=signed-integer-overflow -fsanitize-trap" flags are preferred
by the GCC developers.

The fact that they don't even try to make -ftrapv produce efficient
code indicates that there is no "relevant" interest in efficient
-ftrapv. It would be interesting to know who came up with the idea of
adding -ftrapv, and why they are still keeping it.

Compilers have not always been good at taking advantage of all the
features provided by hardware

GCC is pretty good at implementing -fwrapv. For the two examples
above, "gcc -O3 -fwrapv" produces the same code on AMD64 and MIPS as
"gcc -O3".

That is my experience too (though I expect your experience here vastly outweighs mine).

nor have languages been good at exposing
the possibilities in the language so that programmers can take advantage
of them.

Yes. But I leave that for another day.

Good idea :-)


From David Brown@david.brown@hesbynett.no to comp.arch on Tue May 26 08:27:28 2026

From Newsgroup: comp.arch

On 26/05/2026 01:00, MitchAlsup wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

MitchAlsup <user5857@newsgrouper.org.invalid> writes:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
What you do want is compiled code that can trap on overflow and avoid
trapping on overflow without code substitution or being re-compiled.
This way production code can avoid trapping but if the debugger is
turned on, you can trap.

Why do you consider that desirable?

So you can debug production/released code to find subtle errors.

I think that when an unexpected error is detected (whether it is with
hardware acceleration, like trap on overflow, or via explicit generated
code), the way to handle it depends strongly on the situation. If a
debugger is present, then it is most helpful to lead to a debugger break
so that the developer can figure out what went wrong. When not
debugging, there is no sensible default handling that works for jet
engine controllers and video game frame generators.

But I do support the aim of having the same generated code when
debugging and when shipping - I am not a fan of "release" builds and
"debug" builds. (Of course you might temporarily do builds with
different flags while chasing down a particular bug.)


From quadi@quadibloc@ca.invalid to comp.arch on Tue May 26 15:13:31 2026

From Newsgroup: comp.arch

On Sun, 24 May 2026 16:39:25 +0000, quadi wrote:

On Sun, 24 May 2026 15:24:22 +0000, John Levine wrote:

Sure they did. S/360 had separate unsigned versions of add and subtract
instructions. The results were the same but the condition codes were
different and the unsigned versions couldn't overflow.

Ah, I didn't remember that!

I just looked it up. It was, and is, the Add Logical instruction.

John Savard

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue May 26 18:02:51 2026

From Newsgroup: comp.arch

BGB <cr88192@gmail.com> posted:

On 5/25/2026 3:34 PM, quadi wrote:

On Mon, 25 May 2026 16:49:59 +0000, MitchAlsup wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

RISC-V: add ignores signed overflow, there is no add that traps on
signed overflow (and detecting signed overflow is pretty
involved if both operands are unknown to the compiler).

The worst of all possible semantic encodings

Although I thought that making trapping on fixed-point overflow the
default is a bad idea, I agree that making it impossible to do so, or even test for fixed-point overflow, is a much worse idea.

Possibly true.

The lack of things like ADD-with-Carry or ADD-with-Overflow are
annoyance points on RISC-V.
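
To make the cost concrete, here is a small C sketch of what code has to do when the ISA offers neither a carry flag nor an overflow flag. The helper names are made up for this example; the comparisons are the usual textbook idioms, not anything specific to one compiler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned add with carry-out: the carry is recovered by comparing the
   wrapped sum with one operand (an extra compare after the add). */
static uint64_t add_carry(uint64_t a, uint64_t b, bool *carry_out)
{
    uint64_t sum = a + b;
    *carry_out = sum < a;
    return sum;
}

/* Signed overflow with both operands unknown: it happened iff the sum's
   sign differs from the sign of both operands. Done in unsigned
   arithmetic so the check itself never overflows. */
static bool add_overflows(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t sum = ua + ub;
    return (((ua ^ sum) & (ub ^ sum)) >> 63) != 0;
}
```

The signed case needs an add, two XORs, an AND and a shift or branch, which is the "pretty involved" sequence mentioned earlier in the thread.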

Though, it is less obvious what a useful behavior is at the language level:
"signal()" ? ...
Something like try/catch (mostly N/A to C)?
Something similar to FENV_ACCESS?
...

The important property is that overflow is detected precisely.
Whether {trap, signal, throw} is performed is an environmental choice
not an ISA choice.

Well, and that if trapping were applied globally:
Overhead due to trap detection/handling code causing excessive bloat;
Overflow traps from any code that naively assumes wrap-on-overflow semantics;
...

In some codebases, it is already enough of a pain to hunt and fix all
the out-of-bounds and uninitialized variables mess.
Signed integer overflows would likely "turn it up to 11";
Then, how does one fix it? Ask that people start adding a bunch of casts
to make it work?...

One might say:
Add "if()" cases to deal with the overflows, but, ... this only makes
sense for cases where the overflows are not the expected behavior.

If(overflow(??)) requires some flag to carry overflow from point of
detection to if(()).

And what happens if there is more than 1 overflow ??

Then again, could maybe classify code, say:
1, signed, value doesn't (or shouldn't) go out-of-range;
2, unsigned, value doesn't (or shouldn't) go out-of-range;
3, signed, value is expected to be modulo;
4, unsigned, value is expected to be modulo.

5, a language hint about in-range, wrap, trap, signal, throw

"nasal demons" types assume 1 and 4 as dominant.
Or, 1 as exclusive vs 3.

For compilers, we often need to assume 3 and 4.
Because, failure to uphold 3 results in misbehaving programs.
And, if 3 were uncommon, RISC-V's "ADDW"/etc would be pure stupidity.

You would prefer::

AND R7,Rleft,#~(~0<<31)
AND R8,Rright,#~(~0<<31)
ADD Rd,R7,R8
AND Rd,Rd,#~(~0<<31)

That is ADDW range limits operands and performs a shorter ADD.
Matching C's int a,b; semantic. In general the integer instructions
ending with W apply C's int properties to the arithmetic. If compilers
were (WERE) really good at range determination those instructions would
be unnecessary--but they are not.
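
As a rough C rendering of what RV64's ADDW computes (add, keep the low 32 bits, sign-extend to 64), assuming a two's complement conversion from uint32_t to int32_t. The function name is only for illustration.

```c
#include <stdint.h>

/* ADDW-like operation on 64-bit register values. */
static int64_t addw(int64_t rs1, int64_t rs2)
{
    /* 32-bit wrapping add on the low halves; upper bits are ignored. */
    uint32_t low = (uint32_t)rs1 + (uint32_t)rs2;
    /* uint32_t -> int32_t is implementation-defined before C23 when the
       value exceeds INT32_MAX; mainstream compilers wrap. */
    return (int64_t)(int32_t)low;
}
```

This is the behavior a compiler wants for C's int on a 64-bit register file when it cannot prove the operands stay in range.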

I (My 66000) had to put in sized integer calculation reasons, and by
doing so, gained 2%-4% in code density and a bit more in latency.
-----------------------

From BGB@cr88192@gmail.com to comp.arch on Tue May 26 14:28:56 2026

From Newsgroup: comp.arch

On 5/26/2026 1:02 PM, MitchAlsup wrote:

BGB <cr88192@gmail.com> posted:

On 5/25/2026 3:34 PM, quadi wrote:

On Mon, 25 May 2026 16:49:59 +0000, MitchAlsup wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

RISC-V: add ignores signed overflow, there is no add that traps on
signed overflow (and detecting signed overflow is pretty
involved if both operands are unknown to the compiler).

The worst of all possible semantic encodings

Although I thought that making trapping on fixed-point overflow the
default is a bad idea, I agree that making it impossible to do so, or even
test for fixed-point overflow, is a much worse idea.

Possibly true.

The lack of things like ADD-with-Carry or ADD-with-Overflow are
annoyance points on RISC-V.

Though, it is less obvious what a useful behavior is at the language level:
"signal()" ? ...
Something like try/catch (mostly N/A to C)?
Something similar to FENV_ACCESS?
...

The important property is that overflow is detected precisely.
Whether {trap, signal, throw} is performed is an environmental choice
not an ISA choice.

Yeah.

Say:
ADDV Rs, Rt, Rd
BT __trap_overflow

Which is how I would assume doing it, if I were to re-add ADDV to my ISA
(this had existed in SuperH and BJX1, but got lost along the way, but
could re-add if needed; just it was less often needed than even ADC/ADDC).
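
At the C level, the ADDV plus branch-to-trap pattern might look like the sketch below. It assumes the GCC/Clang builtin __builtin_add_overflow; the handler hook, the counter and all names are hypothetical, chosen to show that what happens on overflow can be left to the environment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook: the environment decides what an overflow means. */
typedef void (*overflow_handler)(const char *where);

static overflow_handler on_overflow = NULL;   /* NULL: wrap silently */
static int overflow_count = 0;

/* Example handler that only counts; a real one might trap or longjmp. */
static void count_overflow(const char *where)
{
    (void)where;
    overflow_count++;
}

static int32_t add_checked(int32_t a, int32_t b, const char *where)
{
    int32_t sum;
    /* The builtin stores the wrapped sum and returns true on overflow. */
    if (__builtin_add_overflow(a, b, &sum) && on_overflow != NULL)
        on_overflow(where);                   /* the "BT __trap_overflow" step */
    return sum;
}
```

Setting on_overflow = count_overflow turns detection on; leaving it NULL gives plain wrapping with the same generated add.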

Well, and that if trapping were applied globally:
Overhead due to trap detection/handling code causing excessive bloat;
Overflow traps from any code that naively assumes wrap-on-overflow
semantics;
...

In some codebases, it is already enough of a pain to hunt and fix all
the out-of-bounds and uninitialized variables mess.
Signed integer overflows would likely "turn it up to 11";
Then, how does one fix it? Ask that people start adding a bunch of casts
to make it work?...

One might say:
Add "if()" cases to deal with the overflows, but, ... this only makes
sense for cases where the overflows are not the expected behavior.

If(overflow(??)) requires some flag to carry overflow from point of
detection to if(()).

And what happens if there is more than 1 overflow ??

Dunno.
You would need to set a start point and an end/detection point, and have
some way for the compiler to know to track overflows.

Say:
ADDV ...
OR?T Re, 0x100, Re

Then a way to feed Re back into C land to act upon.

There could maybe either be a 32-bit variant (ADDV.L), or some shorthand
way to detect that the value has gone outside of 32-bit range.
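
A sketch of the "sticky flag" idea in C, assuming the GCC/Clang overflow builtins. The ovf_scope struct stands in for the Re register above and is purely hypothetical; so are the function names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accumulator: set once by any overflowing operation. */
typedef struct { bool overflowed; } ovf_scope;

static int32_t add_sticky(ovf_scope *s, int32_t a, int32_t b)
{
    int32_t r;
    s->overflowed |= __builtin_add_overflow(a, b, &r);
    return r;
}

static int32_t mul_sticky(ovf_scope *s, int32_t a, int32_t b)
{
    int32_t r;
    s->overflowed |= __builtin_mul_overflow(a, b, &r);
    return r;
}

/* Start point: fresh scope. End point: one test after all operations. */
static bool dot2(const int32_t x[2], const int32_t y[2], int32_t *out)
{
    ovf_scope s = { false };
    int32_t p0 = mul_sticky(&s, x[0], y[0]);
    int32_t p1 = mul_sticky(&s, x[1], y[1]);
    *out = add_sticky(&s, p0, p1);
    return !s.overflowed;
}
```

This also shows the limitation raised earlier: with more than one overflow inside the scope, the flag only says that at least one happened, not which or how many.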

Then again, could maybe classify code, say:
1, signed, value doesn't (or shouldn't) go out-of-range;
2, unsigned, value doesn't (or shouldn't) go out-of-range;
3, signed, value is expected to be modulo;
4, unsigned, value is expected to be modulo.

5, a language hint about in-range, wrap, trap, signal, throw

Well, possible, but C doesn't have any hints here...

But, yeah:
Leaving plain 'int' as the "probably shouldn't overflow" and 'signed
int' and 'unsigned int' as "wrap on overflow expected" could make sense.

"nasal demons" types assume 1 and 4 as dominant.
Or, 1 as exclusive vs 3.

For compilers, we often need to assume 3 and 4.
Because, failure to uphold 3 results in misbehaving programs.
And, if 3 were uncommon, RISC-V's "ADDW"/etc would be pure stupidity.

You would prefer::

AND R7,Rleft,#~(~0<<31)
AND R8,Rright,#~(~0<<31)
ADD Rd,R7,R8
AND Rd,Rd,#~(~0<<31)

That is ADDW range limits operands and performs a shorter ADD.
Matching C's int a,b; semantic. In general the integer instructions
ending with W apply C's int properties to the arithmetic. If compilers
were (WERE) really good at range determination those instructions would
be unnecessary--but they are not.

I (My 66000) had to put in sized integer calculation reasons, and by
doing so, gained 2%-4% in code density and a bit more in latency.
-----------------------

OK.

Ironically, the 4-op sequence above would have been a single "ADDWU" instruction in the RV BitManip drafts, but ADDWU was dropped as arguably
it didn't make a big enough difference on SPEC scores. They decided to
keep a whole bunch of other random crap though that serves no real
purpose other than to micro-optimize the benchmarks...

I revived this for my own extensions, but left out ADDIWU as it was
still not common enough to justify the encoding space cost (if one has jumbo-prefixes, this could be handled well enough via
immediate-synthesis, and the 64-bit encoding wasn't too bad for
something that is comparably infrequent).

...


From George Neuner@gneuner2@comcast.net to comp.arch on Tue May 26 15:29:08 2026

From Newsgroup: comp.arch

On Mon, 25 May 2026 23:05:06 GMT, MitchAlsup
<user5857@newsgrouper.org.invalid> wrote:

BGB <cr88192@gmail.com> posted:

On 5/25/2026 9:28 AM, Anton Ertl wrote:

--------------

Integer overflow happens far too often for trapping to be a good solution.

Even on 64-bit variables/machines ??

Yes if there are options for 8/16/32 bit ops in 64 bit registers.

From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Tue May 26 22:09:28 2026

From Newsgroup: comp.arch

David Brown wrote:

On 26/05/2026 01:00, MitchAlsup wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

MitchAlsup <user5857@newsgrouper.org.invalid> writes:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
What you do want is compiled code that can trap on overflow and avoid
trapping on overflow without code substitution or being re-compiled.
This way production code can avoid trapping but if the debugger is
turned on, you can trap.

Why do you consider that desirable?

So you can debug production/released code to find subtle errors.

I think that when an unexpected error is detected (whether it is with hardware acceleration, like trap on overflow, or via explicit generated code), the way to handle it depends strongly on the situation. If a debugger is present, then it is most helpful to lead to a debugger break
so that the developer can figure out what went wrong. When not
debugging, there is no sensible default handling that works for jet
engine controllers and video game frame generators.

But I do support the aim of having the same generated code when
debugging and when shipping - I am not a fan of "release" builds and
"debug" builds. (Of course you might temporarily do builds with
different flags while chasing down a particular bug.)

I tend to like "Release with sometimes hard-to-grok debug info",
typically resulting in a separate file with a best effort debug map of
the executable.
Then I can at least get some help when running the debugger and trying
to binary search my way into the spot where the bug resides.
Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue May 26 20:54:30 2026

From Newsgroup: comp.arch

Terje Mathisen <terje.mathisen@tmsw.no> posted:

David Brown wrote:

On 26/05/2026 01:00, MitchAlsup wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

MitchAlsup <user5857@newsgrouper.org.invalid> writes:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
What you do want is compiled code that can trap on overflow and avoid
trapping on overflow without code substitution or being re-compiled.
This way production code can avoid trapping but if the debugger is
turned on, you can trap.

Why do you consider that desirable?

So you can debug production/released code to find subtle errors.

I think that when an unexpected error is detected (whether it is with hardware acceleration, like trap on overflow, or via explicit generated code), the way to handle it depends strongly on the situation. If a debugger is present, then it is most helpful to lead to a debugger break so that the developer can figure out what went wrong. When not debugging, there is no sensible default handling that works for jet
engine controllers and video game frame generators.

But I do support the aim of having the same generated code when
debugging and when shipping - I am not a fan of "release" builds and "debug" builds. (Of course you might temporarily do builds with different flags while chasing down a particular bug.)

I tend to like "Release with sometimes hard-to-grok debug info",
typically resulting in a separate file with a best effort debug map of
the executable.

Encrypt the debug information (and put it in a {1234-5678-9101-1121-...} folder) so that only the owner (not licensee) of the code can debug
it.

Then I can at least get some help when running the debugger and trying
to binary search my way into the spot where the bug resides.

Terje


From BGB@cr88192@gmail.com to comp.arch on Tue May 26 19:13:21 2026

From Newsgroup: comp.arch

On 5/26/2026 2:29 PM, George Neuner wrote:

On Mon, 25 May 2026 23:05:06 GMT, MitchAlsup <user5857@newsgrouper.org.invalid> wrote:

BGB <cr88192@gmail.com> posted:

On 5/25/2026 9:28 AM, Anton Ertl wrote:

--------------

Integer overflow happens far too often for trapping to be a good solution.

Even on 64-bit variables/machines ??

Yes if there are options for 8/16/32 bit ops in 64 bit registers.

32-bit overflow is the dominant scenario here.
While 8 and 16-bit ranges do overflow readily, the normal semantics are
for them to auto-promote to 32 bits before then being narrowed back down
to 8 or 16 bits, so they don't count.
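
A two-line C illustration of that promotion rule (function names are just for the example):

```c
#include <stdint.h>

/* Operands narrower than int are promoted to int before the add, so the
   add itself cannot overflow here. */
static int sum_promoted(uint8_t a, uint8_t b)
{
    return a + b;
}

/* Only the conversion back to 8 bits discards the high bits. */
static uint8_t sum_narrowed(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

For 200 + 100 the first returns 300 and the second 44; no arithmetic overflow occurs in either, only a narrowing conversion in the second.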

Ironically, for my BS2 language, the semantics were in cases like this
to instead auto-promote to 64 bits; but can't really do this for C as it
gives different results in some cases (and early promotion is itself a
bug, even if early promotion would often be the most natural semantics
for a 64-bit machine).
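
The difference in results can be shown with unsigned types, where wrapping is well defined. This sketch assumes int is no wider than 32 bits (so uint32_t is not promoted); the function names are invented.

```c
#include <stdint.h>

/* C semantics: the add is done in 32 bits and wraps, then the result
   is widened for the return. */
static uint64_t add_then_widen(uint32_t a, uint32_t b)
{
    return a + b;
}

/* "Promote first" semantics: widen the operands, then add in 64 bits. */
static uint64_t widen_then_add(uint32_t a, uint32_t b)
{
    return (uint64_t)a + b;
}
```

With a = 0xFFFFFFFF and b = 1 the first yields 0 and the second 0x100000000, so a compiler that promoted early would change the observable result.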

Well, and there is the usual thing that one can't usually allow a
variable to hold values outside the range of what would be allowed for
that variable.

Well, except for floating-point types, where typically code doesn't care
about out of ranges of values (if a value fails to go to 0 or Inf in a computation in local variables, typically no one cares).

For float, it isn't obvious because the dynamic range of Binary32 is
already quite large. A "short float" effectively having Binary64's
dynamic range when used in scalar computations is a bit incredible, but
given these smaller formats are non-standard anyways, it is reasonable to
be like "these formats are only necessarily confined to their formal
range when in-memory, otherwise all bets are off".

Or: precision and dynamic range >= requested format.

Code can't entirely rely on the higher precision though, as the format
may also revert to its defined precision without warning (even if
intermediate computations may potentially wildly exceed it).

But, then again, this would be analogous to if one has an FPU with
native Binary128, occasionally performing "double" calculations at
Binary128 precision even though "double" is stated as Binary64.

Well, or implementing some operations by widening temporarily to a higher-precision format before narrowing the result.
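
A small C sketch of why code cannot rely on either behavior, using float and double as stand-ins for the narrow and wide formats (names are made up; default floating-point options are assumed, not fast-math style flags):

```c
/* Intermediate forced to Binary32: volatile makes the store narrow the
   product even if the hardware evaluates wider. */
static float scaled_narrow(float a, float b)
{
    volatile float product = a * b;
    return product / b;
}

/* Same expression with the intermediate widened to Binary64 and only
   the final result narrowed. */
static float scaled_wide(float a, float b)
{
    double product = (double)a * (double)b;
    return (float)(product / (double)b);
}
```

For a = b = 1e30f the narrow version overflows to infinity in the intermediate, while the wide version returns 1e30f; same source expression, different answer depending on where the narrowing happens.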

Though, OTOH, the main use-case for things like scalar "short float" is
more for saving memory in structs and arrays, not for trying to rely on
its crappy range and precision.

So, floating point math is very different from integer math in this regard.

...


From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Wed May 27 10:59:31 2026

From Newsgroup: comp.arch

MitchAlsup [2026-05-26 20:54:30] wrote:

Encrypt the debug information (and put it in
a {1234-5678-9101-1121-...} folder) so that only the owner (not
licensee) of the code can debug it.

I resent that. All code should be Free Software.

=== Stefan

From quadi@quadibloc@ca.invalid to comp.arch on Wed May 27 18:19:49 2026

From Newsgroup: comp.arch

On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:

MitchAlsup [2026-05-26 20:54:30] wrote:

Encrypt the debug information (and put it in a
{1234-5678-9101-1121-...} folder) so that only the owner (not
licensee) of the code can debug it.

I resent that. All code should be Free Software.

It is wonderful that we have the open-source software movement.

However, people have the right to the fruit of their labors. To give them
away for free is generous, but it should remain a personal choice.

Of course, copyright has been misused, and deserves a critical
examination, not the sort of uncritical expansion given to it by
legislators in the United States - and imposed on the rest of the world by trade threats.

John Savard

From BGB@cr88192@gmail.com to comp.arch on Wed May 27 15:24:09 2026

From Newsgroup: comp.arch

On 5/25/2026 5:59 PM, MitchAlsup wrote:

Thomas Koenig <tkoenig@netcologne.de> posted:

David Brown <david.brown@hesbynett.no> schrieb:

On 24/05/2026 23:39, quadi wrote:

On Sun, 24 May 2026 17:32:10 +0000, MitchAlsup wrote:

quadi <quadibloc@ca.invalid> posted:

It makes sense to trap on a floating-point overflow, but trapping on an
integer overflow is usually a terrible idea.

So, detecting something went wrong and you should inform the programmer
is a bad idea ???

No, so being able to turn the trap for integer overflow on should
definitely be allowed. But that shouldn't be the default behavior.
Otherwise, programs like random number generators wouldn't work.

John Savard

That does not make sense. Code such as random number generators should
be written so that they are correct in the language they are written in.
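
As an example of what "correct in the language" can look like in C: a linear congruential step written with unsigned arithmetic, where wrap-around modulo 2^32 is defined behavior and no overflow trap could ever apply. The constants are the well-known Numerical Recipes ones; the sketch assumes int is no wider than 32 bits.

```c
#include <stdint.h>

/* One LCG step; uint32_t arithmetic wraps modulo 2^32 by definition. */
static uint32_t lcg_next(uint32_t state)
{
    return state * UINT32_C(1664525) + UINT32_C(1013904223);
}
```

Written with a signed int state instead, the same multiply would be undefined behavior in C and would trap under an overflow-trapping build.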

In principle, yes.

Principle is better in theory than in practice.

In practice, people often used whatever "worked" on their systems.

Face it, the poor slug writing the code may not have the faintest
grasp at the system qualities we are discussing, and does not care
to learn as long as he can slug through the writing and his
program not blow up catastrophically while it is under his purview.

That defines a lot of what is wrong with SW programming today.

Implementors have a certain right because they control what their
compiler does or does not do.

You would be surprised at how little influence implementors have
on compilers and other software.

Yeah.

You can design the ISA and compiler as one likes.
But, if existing C code breaks, well then this is not good.

One might think:
You know, wrap on overflow, and type promotion where it overflows and
wraps, and *then* promotes to the wider type on the final assignment, is
kinda stupid and sucks.

And, if one goes by "well, signed overflow is UB anyways", then they
should be able to turn it into a "promote first, then ADD" scenario (may
be both potentially faster, and less likely to lose information).

I would be inclined to agree.

But... there is old code around that will quietly break if the integer overflow and promotion doesn't follow the specific behavior that mimics
how it would have behaved on 32-bit systems.

I vaguely remember a case of this involving some robot enemies that
drive around in ROTT, where if the integer overflow failed to work in
just the right way, they would all miss their way-points and end up
crashing into walls or similar.

Where, the robot enemies followed a path defined as a series of
waypoints (in a grid world), and once the robot hits a particular spot
on the grid cell, it will change directions and head along the path.
But, the particular way the expression to handle this was written was sensitive to the type promotion and wrap-on-overflow semantics in C.

Also a similar case involving the "elevators", which were effectively
timed teleporters between different parts of the map (would close door,
play elevator sound, then right at the end as the door opens, it would teleport the player to the other location and initiate a screen shaking
effect at around the same time). If the overflow was wrong, the teleport
would fail and the player would still be in the original location.

One could fix this stuff with casts or similar, but, when does one draw
the line exactly?...

Easier sometimes to make it work, than to try to justify that the code was already broken due to reliance on UB.

Well, and to match the behavior of the other compilers, needed to
implement the behavior the way ROTT expected.

Where, as noted, ROTT uses fixed-point math with "fixed" as a signed
32-bit integer, and some cases involve calculations with coordinates
well outside the world bounds with the seeming intention that these
high-order components simply disappear into the ether (with the world essentially treated as a wrapping modulo space).
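
The "fix it with casts" option would look roughly like the sketch below. This is not ROTT's code; the 16.16 layout and all names are assumptions for illustration, and the conversion back to a signed type assumes two's complement wrap (implementation-defined before C23).

```c
#include <stdint.h>

typedef int32_t fixed;                 /* assumed 16.16 fixed-point */
#define FIXED_ONE ((fixed)1 << 16)

/* Wrapping add: performed in uint32_t, where wrap-around is defined,
   then converted back to the signed type. */
static fixed fixed_add_wrap(fixed a, fixed b)
{
    return (fixed)((uint32_t)a + (uint32_t)b);
}
```

Every add that is meant to wrap would need to go through something like this, which is the "when does one draw the line" problem.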

But, as noted, it differed from my BS2 language, where the default was effectively to auto-promote values to the widest reasonable integer type
in these cases and then drop down to the final range afterwards (to
avoid some integer overflows in cases they would happen in C).

Well, and within BGBCC, there was some non-zero bleed-over between C and
BS2 (where originally I had been implementing BS2 via BGBCC, with the intention that it would compile to an IL image that would then be run in
the VM).

The original VM however, while fast, ended up with horrible code-bloat.
Had gotten creative with the use of the C preprocessor in ways that were ultimately a terrible idea (errm, trying to use it sorta like a
poor-man's version of C++ templates). Binaries got huge, build times
sucked. This VM was a dead end.

Ironically, some of my current ISA projects were built on some of the groundwork left by this experiment, but also as a warning for something
not to do.

Or, when I learned the merit of actually writing all the opcode handler functions and similar by hand and not trying to do combinatorial stuff
via the preprocessor.

Also for the follow-up VM (for BS2), had gone back to ye-olde stack
machine (vs a Register IR model). But, some parts of this were relevant
to targeting an "actual CPU".

The way JX2VM works isn't too far removed from those VMs in some ways,
apart from JX2VM's general avoidance of getting too clever with the C preprocessor.

...


From quadi@quadibloc@ca.invalid to comp.arch on Sat May 30 04:02:45 2026

From Newsgroup: comp.arch

On Mon, 25 May 2026 23:03:03 +0000, MitchAlsup wrote:

Another One Bites the Dust.....

Yes, it certainly is true that Concertina IV retains a lot of baggage
which might be considered silly from even the original Concertina design.

And, since I have a "set flag" instruction still... I needed to have predicated instructions. So I added those in... giving an instruction
format which included either a predicated 32-bit instruction, or a
predicated pair of 16-bit short instructions... which now could have full register specifications! And with predicated instructions, I also brought
back the break bit.

So even without block structure, I brought back VLIW features!

I was so dismayed by how limited my 16-bit short instructions were, that
this was nice - but having two 16-bit short instructions inside a 48-bit instruction was not a gain on using 24-bit short instructions instead!

Well, I added a new 80-bit instruction format, which no longer allowed predication, but which allowed those short instructions to be used with
less overhead.

I felt I could do even better. I wanted to add 112-bit instructions, to
split the 16 bits of overhead between three pairs of these nicer short instructions. It was hard to find the opcode space for them, but I finally
did it.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Sat May 30 04:06:14 2026

From Newsgroup: comp.arch

On Mon, 25 May 2026 23:03:03 +0000, MitchAlsup wrote:

quadi <quadibloc@ca.invalid> posted:

A major goal of the Concertina II, III, and IV architectures is for
instructions not to be longer than similar instructions on the Motorola
68020 or the IBM System/360 if at all possible.

Basically, the selling point is... "Your programs only get 10% bigger,
if that, and yet you have 32 registers, so they run faster!".

Mine are getting 30% smaller and needing fewer instructions at the same
time

Well, then you're obviously doing something amazing with MY 68000, and I
don't have the experience to know which modifier bits, if added, would
save instructions often enough to more than pay for the space they take up.

I have to be content with doing the best I can, despite not being capable
of doing much more than slavishly copying existing commercial
architectures.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Sat May 30 15:47:00 2026

From Newsgroup: comp.arch

On Sat, 30 May 2026 04:02:45 +0000, quadi wrote:

So even without block structure, I brought back VLIW features!

I had a little opcode space remaining. So now I have made what is perhaps
my maddest addition to the Concertina IV architecture yet!

In the normal instruction set of the Concertina IV, it was necessary to
extend the 32-bit instruction set to intrude, ever so slightly, into the portion of the opcode space where instructions begin with 11.

This was because in the 3/4 of the opcode space initially allocated to 32-bit
instructions, there wasn't quite enough room for a Halfword Immediate instruction that was 32 bits long, but allowed all 32 registers to be used
as destination registers.

Well, for the primary instruction set, this was no real problem. It may
have made decoding the lengths of instructions less simple and elegant,
but there was still enough space for instructions longer than 32 bits and
for the short instructions, both 16-bit and 24-bit - which chopped that remaining space up into pieces anyways.

But in the 48-bit instructions with an instruction that can be predicated,
and the 80-bit and 112-bit instructions with two or three instructions
which can be indicated explicitly as parallelizable... there's a field
that can _only_ be used for a 32-bit instruction.

So in there, the opcode space of 32-bit instructions starting with 11 is almost completely unused... but I can't use it for paired 15-bit short instructions because of that Halfword Immediate instruction.

Well, now the Halfword Immediate instruction for that case has been
modified, so that paired short instructions including short instructions
other than register-to-register operate instructions can be used.

John Savard

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat May 30 19:03:18 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> posted:

On Mon, 25 May 2026 23:03:03 +0000, MitchAlsup wrote:

quadi <quadibloc@ca.invalid> posted:

A major goal of the Concertina II, III, and IV architectures is for
instructions not to be longer than similar instructions on the Motorola
68020 or the IBM System/360 if at all possible.

Basically, the selling point is... "Your programs only get 10% bigger,
if that, and yet you have 32 registers, so they run faster!".

Mine are getting 30% smaller and needing fewer instructions at the same time

Well, then you're obviously doing something amazing with MY 68000, and I

s/68/66/

don't have the experience to know which modifier bits, if added, would
save instructions often enough to more than pay for the space they take up.

1) never use instructions to paste constant bits together
2) never use LDs to fetch constants from data-memory
3) provide ENTER and EXIT to setup and tear-down stack frames
4) provide [Rbase + Rindex<<scale + Displacement] addressing
5) encode orthogonal features in a single encode field
6) spend years reading ASM code from your compiler

The rest (encoding) is the easy part.
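
For item 4, the address calculation a single memory instruction would perform can be written out in C as below. This is a generic sketch, not My 66000's definition; the function name and the assumption that scale is a small shift count (0 to 3) are mine.

```c
#include <stdint.h>

/* Effective address for [Rbase + Rindex<<scale + Displacement]. */
static uint64_t effective_address(uint64_t base, uint64_t index,
                                  unsigned scale, int64_t disp)
{
    /* All arithmetic is modulo 2^64; a negative displacement wraps
       around to a subtraction. */
    return base + (index << scale) + (uint64_t)disp;
}
```

An access like p->table[i] with 8-byte elements maps onto one such instruction (base = p, index = i, scale = 3, disp = offset of table) instead of a shift, an add and a load.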

I have to be content with doing the best I can, despite not being capable
of doing much more than slavishly copying existing commercial
architectures.

John Savard


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat May 30 19:15:56 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> posted:

On Sat, 30 May 2026 04:02:45 +0000, quadi wrote:

So even without block structure, I brought back VLIW features!

I had a little opcode space remaining. So now I have made what is perhaps
my maddest addition to the Concertina IV architecture yet!

In the normal instruction set of the Concertina IV, it was necessary to extend the 32-bit instruction set to intrude, ever so slightly, into the portion of the opcode space where instructions begin with 11.

This was because in the 3/4 of the opcode space initially allocated to 32-bit instructions, there wasn't quite enough room for a Halfword Immediate instruction that was 32 bits long, but allowed all 32 registers to be used as destination registers.

Yet, My 66000 only has 29 instructions that use 16-bit (or larger) in instruction constants (immediates and displacements)--this includes 2 instructions for Branch on Bit, 2 instructions for branch on condition,
2 26-bit branch instructions, 13 Disp16 memory references, {9 integer,
and 2 miscellaneous instructions} with 16-bit immediates.

Only 29 from an OpCode space of 64 slots with 6 permanently reserved to
prevent executing code. So, only 1/2 my Major OpCode space is used with immediates--with 16-slots available for the future (22 if you count the reserved slots).

Well, for the primary instruction set, this was no real problem. It may
have made decoding the lengths of instructions less simple and elegant,
but there was still enough space for instructions longer than 32 bits and for the short instructions, both 16-bit and 24-bit - which chopped that remaining space up into pieces anyways.

It costs me only 6 gates (2 gates of delay) to decode the length of an instruction--whereas it takes 4 gates to decode S/360 2-bit code for instruction length.
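
For reference, the S/360 length code mentioned here is the top two bits of the first (opcode) byte. A C sketch of that decode, with an invented function name:

```c
#include <stdint.h>

/* Instruction length in bytes from the opcode byte:
   00 -> 2 bytes, 01 or 10 -> 4 bytes, 11 -> 6 bytes. */
static unsigned s360_insn_length(uint8_t opcode)
{
    unsigned code = opcode >> 6;
    if (code == 0)
        return 2u;
    return (code == 3) ? 6u : 4u;
}
```

For example, opcode 0x1A (AR) decodes as 2 bytes, 0x5A (A) as 4 bytes and 0xD2 (MVC) as 6 bytes.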

But in the 48-bit instructions with an instruction that can be predicated, and the 80-bit and 112-bit instructions with two or three instructions
which can be indicated explicitly as parallelizable... there's a field
that can _only_ be used for a 32-bit instruction.

An architecture is just as much about what you leave out as what you
put in.

So in there, the opcode space of 32-bit instructions starting with 11 is almost completely unused... but I can't use it for paired 15-bit short instructions because of that Halfword Immediate instruction.

Based on my above: you should not need more than 1/2 OpCode space for instructions with 16-bit immediates.

Well, now the Halfword Immediate instruction for that case has been modified, so that paired short instructions including short instructions other than register-to-register operate instructions can be used.

John Savard


From quadi@quadibloc@ca.invalid to comp.arch on Sun May 31 01:22:51 2026

From Newsgroup: comp.arch

On Sat, 30 May 2026 15:47:00 +0000, quadi wrote:

On Sat, 30 May 2026 04:02:45 +0000, quadi wrote:

So even without block structure, I brought back VLIW features!

I had a little opcode space remaining. So now I have made what is
perhaps my maddest addition to the Concertina IV architecture yet!

At least this reminded me that embedding instructions inside long
instructions is, in one very important respect, very different from having
a block structure for program code. So I have now added a warning about
how branching to an embedded instruction will not work unless a number of strict conditions are met.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Sun May 31 02:57:08 2026

From Newsgroup: comp.arch

On Sat, 30 May 2026 15:47:00 +0000, quadi wrote:

So in there, the opcode space of 32-bit instructions starting with 11 is almost completely unused... but I can't use it for paired 15-bit short instructions because of that Halfword Immediate instruction.

Well, now the Halfword Immediate instruction for that case has been
modified, so that paired short instructions including short instructions other than register-to-register operate instructions can be used.

I felt that this, while tempting, was still a crazy idea. But now I see
what my subconscious motivation could have been.

Adding this additional, seemingly redundant, short instruction
capability... now makes it possible to think of removing the one feature
of Concertina IV that I dislike the most: the 24-bit short instructions.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Sun May 31 12:05:00 2026

From Newsgroup: comp.arch

On Sun, 31 May 2026 01:22:51 +0000, quadi wrote:

At least this reminded me that embedding instructions inside long instructions is, in one very important respect, very different from
having a block structure for program code. So I have now added a warning about how branching to an embedded instruction will not work unless a
number of strict conditions are met.

And now I've added the Branch to Embedded instruction, which points to the larger instruction, and then indicates which embedded instruction within
it to which control is to be transferred as a method of avoiding these restrictions, should anyone ever need such an instruction.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Sun May 31 17:26:41 2026

From Newsgroup: comp.arch

On Sun, 31 May 2026 12:05:00 +0000, quadi wrote:

And now I've added the Branch to Embedded instruction, which points to
the larger instruction, and then indicates which embedded instruction
within it to which control is to be transferred as a method of avoiding
these restrictions, should anyone ever need such an instruction.

And now a minor change: since the opcode space was available, the shift instructions, not only the operate instructions, among the 24-bit short instructions, may now affect the condition codes.

Oh yes, and I've added 144-bit instructions that provide four embedded
32-bit instructions with an explicit indication of parallelism.

John Savard


From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Sun May 31 11:05:37 2026

From Newsgroup: comp.arch

On 5/30/2026 12:15 PM, MitchAlsup wrote:

quadi <quadibloc@ca.invalid> posted:

snip

But in the 48-bit instructions with an instruction that can be predicated, and the 80-bit and 112-bit instructions with two or three instructions
which can be indicated explicitly as parallelizable... there's a field
that can _only_ be used for a 32-bit instruction.

An architecture is just as much about what you leave out as what you
put in.

John's answer - leave out as little as possible, preferably nothing! :-)
--
- Stephen Fuld
(e-mail address disguised to prevent spam)

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sun May 31 18:40:48 2026

From Newsgroup: comp.arch

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> posted:

On 5/30/2026 12:15 PM, MitchAlsup wrote:

quadi <quadibloc@ca.invalid> posted:

snip

But in the 48-bit instructions with an instruction that can be predicated, and the 80-bit and 112-bit instructions with two or three instructions
which can be indicated explicitly as parallelizable... there's a field
that can _only_ be used for a 32-bit instruction.

An architecture is just as much about what you leave out as what you
put in.

John's answer - leave out as little as possible, preferably nothing! :-)

Which is why his architecture is converging so rapidly.

NOT.

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 01:14:12 2026

From Newsgroup: comp.arch

On Wed, 20 May 2026 17:47:59 +0000, John Levine wrote:

Having looked into this in some detail, both when IBM used bigendian
order on S/360 and DEC used little-endian on the PDP-11, neither
documented the reasons for the byte order choice at all. Not even a
little bit.

I suppose that, at the time, it was something that nobody felt was
important enough to document.

But to people who were around back then, the reasons would have been
obvious.

IBM mainframes were designed to ooze quality! So here and there, an extra transistor or two was added if something seemed better. That's why the IBM 7090 used sign-magnitude arithmetic for integers.

And that's why the IBM 360 jumped ahead to the end of an integer and
worked backwards to add, because putting things in reverse order would
have shouted cheap.

Plus, the 360 came in a variety of bus widths. So when would you start
putting the small part first? (They didn't know the answer the PDP-11 came
up with. Nobody back then could even imagine it, it was so new.)

The original PDP-11 only came with a 16-bit bus. But its designers aspired
to the level of consistency that the 360 had, but they wanted to do it on
a rock-bottom minicomputer budget. DEC minis, in fact, were cheaper than
most other brands of minicomputer at the time.

So they were going to put the most significant 16-bit word of a 32-bit
integer last. But they got the brilliant idea - that more pedestrian
designers would never even have considered for a second, or even thought of as possible - of numbering the bytes in a word backwards too, so as to attain consistency.

The PDP-11 made little-endian a thing. It was so new that the people
designing the floating-point unit didn't get the memo. But the concept of making little-endian consistent, instead of something you did in one particular case, the case where something was twice the size of your
biggest register... that was only born with the PDP-11.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 01:40:59 2026

From Newsgroup: comp.arch

On Sun, 31 May 2026 11:05:37 -0700, Stephen Fuld wrote:

On 5/30/2026 12:15 PM, MitchAlsup wrote:

An architecture is just as much about what you leave out as what you
put in.

Those are words of wisdom, undoubtedly.

John's answer - leave out as little as possible, preferably nothing!

So why do I choose openly to defy good sense, and neglect them?

That's a fair question.

My answer, though, is a simple one. I've opened my eyes, and looked at the world around me.

When it comes to desktop computers, the ones people generally use when
trying to solve a problem more serious than could be dealt with on a smartphone... what processor is in them?

Well, there _is_ the Macintosh, which also used x86 for a time, but is now using ARM.

But in general, x86 is dominant. There's too much software written to run
on x86 Windows.

So what I've learned is that the world of computer architectures seems to
be like _Highlander_... "There can be only one".

And if that one leaves out a feature, then that means that feature is basically not available. I want everyone to have a chance to efficiently
solve their problems, whatever special instructions or data formats they
may need.

John Savard

From John Levine@johnl@taugh.com to comp.arch on Mon Jun 1 02:20:36 2026

From Newsgroup: comp.arch

It appears that quadi <quadibloc@ca.invalid> said:

On Wed, 20 May 2026 17:47:59 +0000, John Levine wrote:

Having looked into this in some detail, both when IBM used bigendian
order on S/360 and DEC used little-endian on the PDP-11, neither
documented the reasons for the byte order choice at all. Not even a
little bit.

I suppose that, at the time, it was something that nobody felt was
important enough to document.

Evidently.

But to people who were around back then, the reasons would have been obvious.

As I may have said once or twice before, we have plenty of guesses, but
since there is no documentation, the guesses are a waste of time.

IBM mainframes were designed to ooze quality! So here and there, an extra transistor or two was added if something seemed better. That's why the IBM 7090 used sign-magnitude arithmetic for integers.

The 7090 used sign-magnitude because the vacuum tube 709 used sign magnitude because the 704 used sign-magnitude and they quite reasonably wanted to keep them program compatible. The preceding 701 was also sign-magnitude but had a strange addressing scheme which let you treat memory (which was flaky Williams tubes) as either 36 bit full words or 18 bit half words. Full words were addressed by even negative addresses from -0000 to -4094 while half words were even and odd positive addresses from +0000 to +4095. The 704 did not do that, thank heavens.

I presume you are aware that the 704 and successors did indexing by two's complement subtraction, which is not sign-magnitude. There is no documentation for that either, and I have looked quite hard. Pretty please, do not guess unless you can cite sources.

And that's why the IBM 360 jumped ahead to the end of an integer and
worked backwards to add, because putting things in reverse order would
have shouted cheap.

IBM's 702, 705, and 7080 decimal mainframes addressed the low digit of a number and I can assure you they were not cheap.

The original PDP-11 only came with a 16-bit bus. But its designers aspired to the level of consistency that the 360 had, but they wanted to do it on
a rock-bottom minicomputer budget. DEC minis, in fact, were cheaper than most other brands of minicomputer at the time.

I am familiar with this guess, but having looked at a lot of contemporary DEC documentation, there is no reason to believe it's true. If they saved any transistors by making it little-endian, the difference was trivial.

You should look at the DG Nova, designed by some DEC renegades, really cheap due
to using then-new MSI chips, and word addressed with a bigendian feel.

The PDP-11 made little-endian a thing. It was so new that the people designing the floating-point unit didn't get the memo.

Nor did the people designing the extended multiplier, but they got it
mostly consistent in the Vax.
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Mon Jun 1 05:36:10 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> schrieb:

So what I've learned is that the world of computer architectures seems to
be like _Highlander_... "There can be only one".

That is what people thought about the /360 until the Minis came
along, where companies were content to serve new markets and
customers at lower margins, but higher volume.

And then RISC, and PCs... and the low end that PCs are being attacked
from right now is mobile devices, and ARM.

For this kind of cycle, I highly recommend reading https://en.wikipedia.org/wiki/The_Innovator%27s_Dilemma (the book,
not the Wikipedia article itself). It talks a lot about hard drives,
but parallels to computers are obvious.
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 1 07:47:42 2026

From Newsgroup: comp.arch

John Levine <johnl@taugh.com> writes:

I presume you are aware that the 704 and successors did indexing by two's complement subtraction, which is not sign-magnitude.

Looking at the 704 manual <https://ia802904.us.archive.org/12/items/bitsavers_ibm7042466_32932660/24-6661-2_704_Manual_1955_text.pdf>,
it says:

|Type A instructions use two 15-bit fields (decrement and address)
|containing numbers in the octal range 00000 to 77777.

I did not find other descriptions of addresses; given this
description, it seems that the addresses and the index registers are
unsigned. Appendix A discusses binary arithmetic, but explains
subtraction with borrows rather than addition of the 2s-complement
(borrows is probably easier to understand given the background of the
readers, but adding a 1s-complement and one is easier to implement).

In any case, I don't think that the IBM 704 manual documents
2s-complement representation of negative numbers for any purpose.

So why did the S/360 architects go for 2s-complement?

One speculation is that they wanted 32-bit (unsigned) addresses and
wanted to be able to use the same adder for the addresses as for the
integers. But the S/360 only has 24-bit addresses, so going for,
e.g., sign-magnitude and only declaring the positive numbers <2^24 to
be valid addresses would also have worked with one adder.

An alternative speculation is that they really wanted to extend the
range of the S/360 implementations as far as possible, also on the
lower end, and the 2s-complement representation for negative numbers
is cheaper to implement, in particular when you implement a
bit-serial, nybble-serial, or somesuch machine.

[quadi <quadibloc@ca.invalid> said:]

The PDP-11 made little-endian a thing. It was so new that the people designing the floating-point unit didn't get the memo.

Nor did the people designing the extended multiplier, but they got it
mostly consistent in the Vax.

This all indicates that byte-ordering decisions worked like in our
student group. The "right" choice seemed so obvious to everyone that
we did not communicate about it nor document it nor document the
reasons for it, and different contributors took different "right"
choices.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 1 08:36:22 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> writes:

But they got the brilliant idea - that more pedestrian
designers would never even have considered for a second, or even thought of as possible - of numbering the bytes in a word backwards too, so as to attain consistency.

The designers of the DataPoint 2200 did that, too, in their
instruction encoding, for no technical reason that I am aware of. And
the Datapoint 2200 came out within months of the PDP-11, so it is
unlikely that they took inspiration from the PDP-11 in this decision.

When you introduce byte addressing, you have to take the byte ordering decision. Some designers decide for big-endian, and some for
little-endian, and the decision is mostly arbitrary. And, as John
Levine writes, the designers of the S/360 and the PDP-11 did not
document their reasons for that.

For the 6502, the decision is not arbitrary when implementing the
addressing modes "ABS,X", "ABS,Y" and "(IND),Y". So for that they
decided to go for little-endian to simplify the implementation.
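
A minimal Python sketch of why the byte order matters for "ABS,X", assuming a simplified two-step model in which an 8-bit adder handles the low byte while the high byte is still being fetched. The function name is hypothetical and this is not a cycle-accurate description of the 6502:

```python
def abs_x_little_endian(operand_bytes, x):
    """Form the ABS,X address with an 8-bit adder, one operand byte at a time.

    operand_bytes is (low, high), the order in which a little-endian
    instruction encoding delivers the two address bytes.
    """
    low, high = operand_bytes
    # Step 1: the low byte has arrived, so X can be added to it right away,
    # overlapping with the fetch of the high byte.
    total = low + x
    result_low = total & 0xFF
    carry = total >> 8
    # Step 2: the high byte has arrived; only the carry has to be folded in.
    result_high = (high + carry) & 0xFF
    return (result_high << 8) | result_low


if __name__ == "__main__":
    # Address $12F0 indexed by X = $20 crosses a page boundary.
    print(hex(abs_x_little_endian((0xF0, 0x12), 0x20)))
```

With a big-endian encoding the high byte would arrive first, and the adder would have nothing useful to do until the second byte had been fetched.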

Its predecessor, the 6800, does not have any operations where 16-bit
numbers coming from memory are added to something else (at least I did
not find such operations), and therefore the decision could be made arbitrarily, and they decided for big-endian. But I think that the
6809 and the 68000 have addressing modes where the big-endian nature complicates the implementation.

I looked at how this turned out for the offspring of the Datapoint
2200: For the Z80, I did not find any instruction where the
little-endian byte order provided an advantage: when a 16-bit value is
accessed in memory, it is used directly instead of being added or
somesuch. For the 8088, in theory little-endian might provide an
advantage when it comes to addressing modes such as disp16[BX], but
AFAIK in practice the 8088 was internally mostly an 8086, with a
16-bit adder, so it loaded the whole 16-bit number anyway before doing
the full 16-bit add (am I wrong?). Likewise for the 386SX.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Mon Jun 1 16:04:26 2026

From Newsgroup: comp.arch

Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:

So why did the S/360 architects go for 2s-complement?

Brooks (who was program manager for /360) writes about this in
"The Design of Design". Unique zero and unified hardware were
his main points, IIRC.
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From Paul Clayton@paaronclayton@gmail.com to comp.arch on Sun May 31 18:38:28 2026

From Newsgroup: comp.arch

On 5/30/26 3:15 PM, MitchAlsup wrote:
[snip]

It costs me only 6 gates (2 gates of delay) to decode the length of an instruction--whereas it takes 4 gates to decode S/360 2-bit code for instruction length.

Does the current version of My 66000 have three instruction
lengths or four? You mentioned before dropping "large" constants
as store operands, but I am not certain what that means.

Earlier, if I understood correctly, the longest instruction was
a store of a 64-bit constant with a 64-bit displacement,
requiring five 32-bit words.

If My 66000 has the same variability in instruction length as
S/360 (three sizes), then presumably the extra length decode
effort provides some other advantage, perhaps more flexibility
in length allocation (with a 2-bit size indicator, major opcodes
can only be allocated at 25% granularity)?
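
For reference, the S/360 scheme mentioned above can be sketched in a few lines of Python. The table reflects the S/360 convention (top two opcode bits 00 for 2 bytes, 01 and 10 for 4 bytes, 11 for 6 bytes); the function names are only illustrative:

```python
# Instruction length in bytes, indexed by the top two bits of the first byte.
LENGTH_BY_TOP_BITS = (2, 4, 4, 6)


def s360_length(first_byte: int) -> int:
    """Return the instruction length in bytes for a given first opcode byte."""
    return LENGTH_BY_TOP_BITS[(first_byte >> 6) & 0b11]


def opcode_share(length: int) -> float:
    """Fraction of the 256 first-byte values that select the given length."""
    return sum(1 for b in range(256) if s360_length(b) == length) / 256


if __name__ == "__main__":
    for length in (2, 4, 6):
        print(length, opcode_share(length))
```

Each value of the 2-bit field owns exactly a quarter of the major opcode space, so the split between lengths can only be moved in 25% steps.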

There may be an advantage in having different lengths have
different detection speed.

Since My 66000 only uses the extra words for immediates, there
*may* even be an advantage to detecting some illegal opcodes and
speculating that such are from constant words. (An illegal
opcode field can indicate an immediate, a faulting instruction,
or a skipped instruction.) Such could introduce variable timing
for parsing a given fetch chunk, but that might be handled by
reducing the number of parsed instructions emitted and inserting
the slowly parsed instructions into the start of the next group
of parsed instructions.

My guess is that such would just be silly complexity even at
16-wide parsing, especially given the likely minuscule (typical)
timing benefit (if any!). Process variation probably would have
vastly more impact on frequency than trying to exploit a
statistical bias in encoding. (The concept just seemed
interesting.)

Given that register dependencies also "carry", there may be some
opportunity for "width pipelining" (like the staggered ALUs of
the Pentium 4) in parsing, extracting register names, renaming
(at least with RAT-based renaming), and even insertion into a
scheduler. If a dependency means it would not be useful to
insert the operation into a scheduler, this additional delay
might be exploited.

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 17:59:28 2026

From Newsgroup: comp.arch

On Mon, 01 Jun 2026 02:20:36 +0000, John Levine wrote:

As I may have said once or twice before, we have plenty of guesses, but
since there is no documentation, the guesses are a waste of time.

I understand that you would like to have actual documentation. But it
doesn't appear to exist.

But I don't think my "guesses" are wild. I'm familiar with the other
computers that existed in those years, with the milieu in which the
System/360 and the PDP-11 existed.

I presume you are aware that the 704 and successors did indexing by
two's complement subtraction, which is not sign-magnitude. There is no documentation for that either, and I have looked quite hard. Pretty
please, do not guess unless you can cite sources.

I admit that the fact that one subtracts the index on an IBM 704 seems
very weird to me. Since the IBM 704 was made out of vacuum tubes, saving
them, instead of mere discrete transistors, let alone transistors on a microchip with a billion of them, was probably more important.

My guess that sign-magnitude arithmetic was regarded as more prestigious, until IBM outgrew that notion with the 360, does have a source, although
not an IBM source.

A 24-bit computer was advertised as having sign-magnitude integer
arithmetic, unlike cheaper machines which either used one's complement
integer arithmetic, or, even worse, two's complement integer arithmetic.

I think it was the DDP-24, but offhand I'm not completely sure.

To guess - or to attempt to derive intelligence from the available
information - one might think that IBM considered indexing to be less important or less visible than ordinary integer arithmetic per se.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 18:08:38 2026

From Newsgroup: comp.arch

On Mon, 01 Jun 2026 07:47:42 +0000, Anton Ertl wrote:

This all indicates that byte-ordering decisions worked like in our
student group. The "right" choice seemed so obvious to everyone that we
did not communicate about it nor document it nor document the reasons
for it, and different contributors took different "right"
choices.

As has been argued many times, byte-ordering is completely arbitrary, and
so either choice is just as good. Given that widespread belief, that kind
of behavior is not surprising.

Some will think that of course a computer should be little-endian, because arithmetic is faster and simpler that way (if you're doing any multi-word arithmetic).

Some will think that of course a computer should be big-endian, because
that's just the natural way we write numbers, and anything else would be hopelessly confusing.

As I've pointed out, though, there is *one* particular case where there actually is a genuine difference between big-endian and little-endian.

If, like the System/360, your computer performs BCD arithmetic and not
just binary arithmetic, and if, unlike the System/360, you did your BCD arithmetic in the same registers you use for binary arithmetic...

Then, because binary arithmetic is done in the same registers as BCD arithmetic, they should both have the same endianness.

And because BCD numbers are directly related to character strings
representing numbers - just take the last four bits of each digit
character - they ought to have the same endianness. And character strings
that represent numbers _are_ big-endian.
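
As a rough illustration of the "last four bits" point, here is a small Python sketch that packs a string of ASCII digits into unsigned BCD, two digits per byte. The function name is made up, and sign handling (as in S/360 packed decimal) is deliberately left out:

```python
def ascii_digits_to_packed_bcd(text: str) -> bytes:
    """Pack a string of ASCII decimal digits into BCD, two digits per byte."""
    if not text.isascii() or not text.isdigit():
        raise ValueError("expected ASCII decimal digits only")
    if len(text) % 2:
        text = "0" + text  # pad on the left to a whole number of bytes
    # The low four bits of each digit character are the digit's value.
    nibbles = [ord(ch) & 0x0F for ch in text]
    # Pair the nibbles in string order: first character, highest digit.
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


if __name__ == "__main__":
    print(ascii_digits_to_packed_bcd("1234").hex())
```

The digits come out in the same order as the characters went in, most significant first, with no reversal step.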

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 18:10:37 2026

From Newsgroup: comp.arch

On Mon, 01 Jun 2026 05:36:10 +0000, Thomas Koenig wrote:

quadi <quadibloc@ca.invalid> schrieb:

So what I've learned is that the world of computer architectures seems
to be like _Highlander_... "There can be only one".

That is what people thought about the /360 until the Minis came along,
where companies were content with lower margins to serve new markets and customers at lower margins, but higher volume.

And then RISC, and PCs... and the low end that PCs are being attacked
from right now is mobile devices, and ARM.

For this kind of cycle, I highly recommend reading https://en.wikipedia.org/wiki/The_Innovator%27s_Dilemma (the book not
the Wikipedia article itself) It talks a lot about hard drives, but
parallels to computers are obvious.

This made me think of a different kind of cycle, called the "wheel of reincarnation", discussed in a book on interactive graphical displays.

John Savard


From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 1 18:01:36 2026

From Newsgroup: comp.arch

Thomas Koenig <tkoenig@netcologne.de> writes:

Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:

So why did the S/360 architects go for 2s-complement?

Brooks (who was program manager for /360) writes about this in
"The Design of Design". Unique zero and unified hardware were
his main points, IIRC.

This made me remember: G. M. Amdahl, G. A. Blaauw, F. P. Brooks, Jr.: "Architecture of the IBM System/360" <https://www.ece.ucdavis.edu/~vojin/CLASSES/EEC272/S2005/Papers/IBM360-Amdahl_april64.pdf>,
which John Levine pointed to. It says on page 92:

|Sign representations. For the fixed-point arithmetic system, which is
|binary, the two's complement representation for negative numbers was
|selected. The well-known virtues of this system are the unique
|representation of zero and the absence of recomplementation. These
|substantial advantages are augmented by several properties especially
|useful in address arithmetic, particularly in the large models, where
|address arithmetic has its own hardware. With two's complement
|notation, this indexing hardware requires no true/complement gates
|and thus works faster. In the smaller, serial models, the fact that
|high-order bits of address arithmetic can be elided without changing
|the low-order bits also permits a gain in speed. The same truncation
|property simplifies double-precision calculations. Furthermore, for
|table calculation, rounding or truncation to an integer changes all
|variables in the same direction, thus giving a more acceptable
|distribution than does an absolute-value-plus-sign representation.
|
|The established commercial rounding convention made the use of
|complement notation awkward for decimal data; therefore,
|absolute-value-plus-sign is used here.
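
Two of the properties in that passage are easy to check with a short Python sketch. It relies on Python integers behaving like two's complement under `&` and `>>`; the function names are illustrative only:

```python
def low_bits_of_sum(a: int, b: int, width: int) -> int:
    """Low `width` bits of a + b, computed from the low bits of a and b only."""
    mask = (1 << width) - 1
    return ((a & mask) + (b & mask)) & mask


def truncate_twos_complement(value: int, shift: int) -> int:
    """Drop `shift` low-order bits; an arithmetic shift rounds toward minus infinity."""
    return value >> shift


def truncate_sign_magnitude(value: int, shift: int) -> int:
    """Drop `shift` low-order magnitude bits; this rounds toward zero."""
    magnitude = abs(value) >> shift
    return -magnitude if value < 0 else magnitude


if __name__ == "__main__":
    # Eliding the high-order bits does not change the low-order result.
    print(low_bits_of_sum(-5, 300, 8), (-5 + 300) & 0xFF)
    # Truncation direction for a positive and a negative value.
    print(truncate_twos_complement(7, 1), truncate_twos_complement(-7, 1))
    print(truncate_sign_magnitude(7, 1), truncate_sign_magnitude(-7, 1))
```

In two's complement both 7 and -7 move downward when truncated, while in sign-magnitude they move in opposite directions, toward zero.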

What is "recomplementation"?

As an aside: When listing authors in alphabetic order, choose your
co-authors wisely: You have a name like "Brooks", and yet only get the
last spot out of three:-).
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From John Levine@johnl@taugh.com to comp.arch on Mon Jun 1 18:13:34 2026

From Newsgroup: comp.arch

According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:

John Levine <johnl@taugh.com> writes:

I presume you are aware that the 704 and successors did indexing by two's complement subtraction, which is not sign-magnitude.

Looking at the 704 manual <https://ia802904.us.archive.org/12/items/bitsavers_ibm7042466_32932660/24-6661-2_704_Manual_1955_text.pdf>,
In any case, I don't think that the IBM 704 manual documents
2s-complement representation of negative numbers for any purpose.

The documentation was a bit sparse, but see item 7 on page 17.

The manual for the 7090 which had a superset of the 704's instruction set
is more complete. See "Complement Arithmetic" on page 10 where it says

Effective addresses are always formed in the computer by the addition
of the 2's complement of the contents of the index register.

https://bitsavers.org/pdf/ibm/7090/22-6528-4_7090Manual.pdf
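
In modern terms the quoted rule amounts to something like the following Python sketch. It assumes the 15-bit address field mentioned in the 704 manual quote earlier in the thread; the names are hypothetical:

```python
ADDRESS_BITS = 15
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def effective_address(address: int, index: int) -> int:
    """Add the 2's complement of the index register to the address field."""
    twos_complement_of_index = (-index) & ADDRESS_MASK
    return (address + twos_complement_of_index) & ADDRESS_MASK


if __name__ == "__main__":
    print(oct(effective_address(0o1000, 0o10)))  # address minus index
    print(oct(effective_address(5, 7)))          # wraps around modulo 2**15
```

The net effect is that the index is subtracted from the address, modulo 2^15.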

So why did the S/360 architects go for 2s-complement?

One speculation ...

We don't have to guess, because they told us in the Amdahl et al article
in 1964 in the IBMSJ.

Sign representations. For the fixed-point arithmetic
system, which is binary, the two’s complement representation
for negative numbers was selected. The well-known
virtues of this system are the unique representation
of zero and the absence of recomplementation. These
substantial advantages are augmented by several properties
especially useful in address arithmetic, particularly in the
large models, where address arithmetic has its own hardware.
With two’s complement notation, this indexing
hardware requires no true/complement gates and thus
works faster. In the smaller, serial models, the fact that
high-order bits of address arithmetic can be elided without
changing the low-order bits also permits a gain in
speed. The same truncation property simplifies double-precision
calculations. Furthermore, for table calculation,
rounding or truncation to an integer changes all variables
in the same direction, thus giving a more acceptable
distribution than does an absolute-value-plus-sign representation.

They go on to explain why decimal numbers are still sign magnitude,
mostly because it made rounding easier, and float because it made
normalizing easier.

Nor did the people designing the extended multiplier, but they got it mostly consistent in the Vax.

This all indicates that byte-ordering decisions worked like in our
student group. The "right" choice seemed so obvious to everyone that
we did not communicate about it nor document it nor document the
reasons for it, and different contributors took different "right"
choices.

That would seem to be the case. Sometimes things are obscure at the
time and obvious in retrospect, sometimes the converse.
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From John Levine@johnl@taugh.com to comp.arch on Mon Jun 1 18:26:58 2026

From Newsgroup: comp.arch

According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:

|Sign representations. For the fixed-point arithmetic system, which is
|binary, the two's complement representation for negative numbers was
|selected. The well-known virtues of this system are the unique
|representation of zero and the absence of recomplementation.

What is "recomplementation"?

To do sign magnitude arithmetic, you basically do it in one's
complement: bit flip negative operands to make them one's complement,
do the arithmetic, then bit flip the result if it's negative. That
last bit flip is recomplementation.

Straight one's complement doesn't have the recomplementation but does
have end around carry if there's a carry out of the high bit, and
shares with sign-magnitude the question of how you handle +0 and -0
which are different bit patterns but mathematically equal.
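
A minimal Python sketch of the procedure described above, using an 8-bit word (one sign bit, seven magnitude bits) and ignoring overflow. The word size and function names are assumptions for illustration, not a description of any particular machine:

```python
WIDTH = 8                    # one sign bit plus seven magnitude bits
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)
MAGNITUDE = MASK >> 1


def ones_complement_add(a: int, b: int) -> int:
    """Add two one's complement words, with end-around carry."""
    total = a + b
    if total > MASK:                  # carry out of the high bit
        total = (total & MASK) + 1    # wrap it around into the low bit
    return total & MASK


def sign_magnitude_add(a: int, b: int) -> int:
    """Add two sign-magnitude words by way of one's complement."""
    # Bit flip the magnitude of negative operands to get one's complement.
    a1 = a ^ MAGNITUDE if a & SIGN else a
    b1 = b ^ MAGNITUDE if b & SIGN else b
    total = ones_complement_add(a1, b1)
    # Recomplementation: flip a negative result back to sign-magnitude.
    return total ^ MAGNITUDE if total & SIGN else total


if __name__ == "__main__":
    print(hex(sign_magnitude_add(0x05, 0x83)))  # +5 plus -3
    print(hex(sign_magnitude_add(0x03, 0x83)))  # +3 plus -3 gives a zero
```

Note that +3 plus -3 comes out here as the negative-zero bit pattern, which is exactly the +0 versus -0 question mentioned above.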
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 19:56:38 2026

From Newsgroup: comp.arch

On Sun, 31 May 2026 18:40:48 +0000, MitchAlsup wrote:

Which is why his architecture is converging so rapidly.

NOT.

Indeed, it's not converging as rapidly as I'd like.

I decided that one of my 32-bit instructions really needed to be allocated twice as much opcode space as I had originally given it.

Even if that meant dropping the 24-bit short instructions to make the
room! (Now that I have paired 15-bit short instructions, which also
include short shift instructions, and short branch instructions, I felt I didn't need them as badly, and I had disliked having instructions that
were an odd number of bytes long.)

Well, after making the changes, I still had room - 1/4 as much as I had
before - for 24-bit short instructions.

I wasn't happy. So I noticed that I actually had some unused space that I could squeeze out. So now the 24-bit short instructions have 1/2 as much
space as they used to, which meant the only thing I had to give up was the ability to change the condition codes. Fine, when you want to do that, use
a full 32-bit operate instruction. So I was happy.

John Savard


From John Levine@johnl@taugh.com to comp.arch on Mon Jun 1 20:00:51 2026

From Newsgroup: comp.arch

According to quadi <quadibloc@ca.invalid>:

I admit that the fact that one subtracts the index on an IBM 704 seems
very weird to me. Since the IBM 704 was made out of vacuum tubes, saving them, instead of mere discrete transistors, let alone transistors on a microchip with a billion of them, was probably more important.

We can guess that someone thought that counting down indexes was important
but they turned out to be wrong. Fortran stored arrays in reverse order to make indexing easier.

My guess that sign-magnitude arithmetic was regarded as more prestigious, until IBM outgrew that notion with the 360, does have a source, although
not an IBM source.

My equally uninformed guess is that their tab machines and their commercial computers were decimal sign magnitude, so binary sign magnitude was a
short step away. It evidently took a while to realize that while the
two's complement negative representation seemed less intuitive, the logic
was a lot simpler.

A 24-bit computer was advertised as having sign-magnitude integer arithmetic, unlike cheaper machines which either used one's complement integer arithmetic, or, even worse, two's complement integer arithmetic.

I think it was the DDP-24, but offhand I'm not completely sure.

To guess - or to attempt to derive intelligence from the available information - one might think that IBM considered indexing to be less important or less visible than ordinary integer arithmetic per se.

John Savard

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 21:57:21 2026

From Newsgroup: comp.arch

On Mon, 01 Jun 2026 20:00:51 +0000, John Levine wrote:

My equally uninformed guess is that their tab machines and their
commercial computers were decimal sign magnitude, so binary sign
magnitude was a short step away. It evidently took a while to realize
that while the two's complement negative representation seemed less
intuitive, the logic was a lot simpler.

I agree with that. Remember, IBM made tab machines long before they got
into computers, and commercial computers, not scientific ones, were their
core business later.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 1 23:40:45 2026

From Newsgroup: comp.arch

On Mon, 01 Jun 2026 19:56:38 +0000, quadi wrote:

Well, after making the changes, I still had room - 1/4 as much as I had before - for 24-bit short instructions.

I wasn't happy. So I noticed that I actually had some unused space that
I could squeeze out. So now the 24-bit short instructions have 1/2 as
much space as they used to, which meant the only thing I had to give up
was the ability to change the condition codes.

When it was 1/4 as much, I was no longer able to fit in a modified form of
the Halfword Immediate instruction as an embedded 32-bit instruction
strictly confined to the opcode space of 32-bit instructions that don't
begin with 11.

But when it was 1/2 as much, I didn't realize that I had enough space to
put that back in. So I've made the fix.

John Savard

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 01:22:37 2026

From Newsgroup: comp.arch

John Levine <johnl@taugh.com> posted:

According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:

|Sign representations. For the fixed-point arithmetic system, which is
|binary, the two's complement representation for negative numbers was
|selected. The well-known virtues of this system are the unique
|representation of zero and the absence of recomplementation.

What is "recomplementation"?

To do sign magnitude arithmetic, you basically do it in one's
complement: bit flip negative operands to make them one's complement,
do the arithmetic, then bit flip the result if it's negative. That
last bit flip is recomplementation.

In microarchitecture, you can make the registers 2^(3+n)+1 bits long.
Then simply record that the mantissa is complemented (or not) when
used as an operand. We do this all the time in microarchitecture to
save gates/time/... depending on the implementation technology
constraints.

Straight one's complement doesn't have the recomplementation but does
have end around carry if there's a carry out of the high bit, and
shares with sign-magnitude the question of how you handle +0 and -0
which are different bit patterns but mathematically equal.
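
The recomplementation step described above can be sketched roughly as follows. This assumes 8-bit sign-magnitude words (bit 7 is the sign, bits 6..0 the magnitude) and ignores overflow. All function names are hypothetical, and the code is an illustration of the idea, not a description of any particular machine:

```c
#include <stdint.h>

/* Sign-magnitude to one's complement: flip the bits of negative operands. */
uint8_t sm_to_oc(uint8_t sm)
{
    return (sm & 0x80) ? (uint8_t)~(sm & 0x7F) : sm;
}

/* One's complement add with end-around carry. */
uint8_t oc_add(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    sum = (sum & 0xFFu) + (sum >> 8);   /* carry out of the high bit wraps around */
    return (uint8_t)sum;
}

/* One's complement back to sign-magnitude: the flip of a negative
   result is the recomplementation step. */
uint8_t oc_to_sm(uint8_t oc)
{
    return (oc & 0x80) ? (uint8_t)(0x80 | (~oc & 0x7F)) : oc;
}

/* Sign-magnitude add built from the three steps above (overflow ignored). */
uint8_t sm_add(uint8_t a, uint8_t b)
{
    return oc_to_sm(oc_add(sm_to_oc(a), sm_to_oc(b)));
}
```

In this sketch 3 + (-3) comes out as 0x80, that is, negative zero, which is the +0 / -0 question mentioned above.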


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 01:52:28 2026

From Newsgroup: comp.arch

Paul Clayton <paaronclayton@gmail.com> posted:

On 5/30/26 3:15 PM, MitchAlsup wrote:
[snip]

It costs me only 6 gates (2 gates of delay) to decode the length of an instruction--whereas it takes 4 gates to decode S/360 2-bit code for instruction length.

Does the current version of My 66000 have three instruction
lengths or four? You mentioned before dropping "large" constants
as store operands, but I am not certain what that means.

1-word, 2-words, and 3-words.

Earlier, if I understood correctly, the longest instruction was
a store of a 64-bit constant with a 64-bit displacement,
requiring five 32-bit words.

Yes, it was. We measured its use at 0.2%.

If My 66000 has the same variability in instruction length as
S/360 (three sizes), then presumably the extra length decode
effort provides some other advantage, perhaps more flexibility
in length allocation (with a 2-bit size indicator, major opcodes
can only be allocated at 25% granularity)?

There are 64-slots in the Major Opcode, 42 are in use, 6 permanently
reserved and 16 free for the future.
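
For comparison, the S/360 2-bit length code mentioned above lives in the top two bits of the first opcode byte: 00 means one halfword, 01 and 10 mean two, 11 means three. A sketch of that decode follows. The function name is made up, and nothing here says anything about how My 66000 does it:

```c
#include <stdint.h>

/* Instruction length in bytes from the first opcode byte of an
   S/360 instruction, using only its two most significant bits. */
unsigned s360_length_bytes(uint8_t first_opcode_byte)
{
    switch (first_opcode_byte >> 6) {
    case 0:  return 2;   /* 00: RR format */
    case 3:  return 6;   /* 11: SS format */
    default: return 4;   /* 01 and 10: RX, RS, SI formats */
    }
}
```

Because the length is tied to two fixed bits, each length class owns a fixed quarter (or half) of the opcode byte, which is the 25% granularity noted in the quoted question.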

There may be an advantage in having different lengths have
different detection speed.

Only 1/8th of the Major group is allowed to have VLI. And all of
these have the same 4-bit encoding--which is called operand routing
and is responsible for {inversion, negation, constant substitution}

Since My 66000 only uses the extra words for immediates, there
*may* even be an advantage to detecting some illegal opcodes and
speculating that such are from constant words.

One of the reasons for the 6 permanently reserved slots is to prevent
that.

(An illegal
opcode field can indicate an immediate, a faulting instruction,
or a skipped instruction.) Such could introduce variable timing
for parsing a given fetch chunk, but that might be handled by
reducing the number of parsed instructions emitted and inserting
the slowly parsed instructions into the start of the next group
of parsed instructions.

My 66000 is specified such that ALL unspecified patterns must be
detected and raise UNIMPLEMENTED. And not just on Major OpCodes,
every unimplemented pattern must be detected. It is better to
prevent mayhem than to allow it to damage all future implementations
{no Carry when shift-count == 0 on x86 comes to mind}.

When performing LL/SC sequences--some sequences are not allowed
and will also raise UNIMPLEMENTED. Silently doing unexpected stuff
is worse than doing nothing.

My guess is that such would just be silly complexity even at 16-
wide parsing, especially given the likely minuscule (typical)
timing benefit (if any!). Process variation probably would have
vastly more impact on frequency than trying to exploit a
statistical bias in encoding. (The concept just seemed
interesting.)

Given that register dependencies also "carry", there may be some
opportunity for "width pipelining" (like the staggered ALUs of
the Pentium 4) in parsing, extracting register names, renaming
(at least with RAT-based renaming), and even insertion into a
scheduler. If a dependency means it would not be useful to
insert the operation into a scheduler, this additional delay
might be exploited.


From John Levine@johnl@taugh.com to comp.arch on Tue Jun 2 01:57:26 2026

From Newsgroup: comp.arch

According to MitchAlsup <user5857@newsgrouper.org.invalid>:

What is "recomplementation"?

To do sign magnitude arithmetic, you basically do it in one's
complement: bit flip negative operands to make them one's complement,
do the arithmetic, then bit flip the result if it's negative. That
last bit flip is recomplementation.

In microarchitecture, you can make the registers 2^(3+n)+1 bits long.
Then simply record that the mantissa is complemented (or not) when
used as an operand. We do this all the time in microarchitecture to
save gates/time/... depending on the implementation technology
constraints.

You can do that now, not so much when building computers out of vacuum
tubes in the 1950s.

Also, that works OK for registers, but at some point you need to
store values in memory at which point I'd think you'd need to do
the recomplementing.
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon Jun 1 14:51:14 2026

From Newsgroup: comp.arch

quadi [2026-05-27 18:19:49] wrote:

On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:

MitchAlsup [2026-05-26 20:54:30] wrote:

Encrypt the debug information (and put it in a
{1234-5678-9101-1121-...} folder) so that only the owner (not
licensee) of the code can debug it.

I resent that. All code should be Free Software.

[...]

However, people have the right to the fruit of their labors. To give them away for free is generous, but it should remain a personal choice.

You don't need to encrypt the debug information of your programs in
order to earn a decent living.

=== Stefan

From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 06:19:05 2026

From Newsgroup: comp.arch

On Mon, 01 Jun 2026 14:51:14 -0400, Stefan Monnier wrote:

quadi [2026-05-27 18:19:49] wrote:

However, people have the right to the fruit of their labors. To give
them away for free is generous, but it should remain a personal choice.

You don't need to encrypt the debug information of your programs in
order to earn a decent living.

Perhaps. But if someone can write a program that is so useful that it
could make him wealthy beyond the dreams of avarice, who am I to judge him
for seeking to maximize its revenue potential?

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 06:20:53 2026

From Newsgroup: comp.arch

On Mon, 01 Jun 2026 19:56:38 +0000, quadi wrote:

I decided that one of my 32-bit instructions really needed to be
allocated twice as much opcode space as I had originally given it.

There was another 32-bit instruction that was also short of opcode space -
but this time, I didn't even have to extensively reorganize the opcodes of other instructions in order to remedy that.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 07:47:36 2026

From Newsgroup: comp.arch

On Tue, 02 Jun 2026 06:19:05 +0000, quadi wrote:

On Mon, 01 Jun 2026 14:51:14 -0400, Stefan Monnier wrote:

quadi [2026-05-27 18:19:49] wrote:

However, people have the right to the fruit of their labors. To give
them away for free is generous, but it should remain a personal
choice.

You don't need to encrypt the debug information of your programs in
order to earn a decent living.

Perhaps. But if someone can write a program that is so useful that it
could make him wealthy beyond the dreams of avarice, who am I to judge
him for seeking to maximize its revenue potential?

Perhaps this answer was too casual, and a more detailed and serious answer
is needed.

To say that one doesn't "need" to encrypt debug information "to earn a
decent living" is true enough, but you're also implying that this is all anyone has the right to expect.

To me, this implies a mindset that says that everyone should remain a
laborer, and that it's wrong to transition to rent-seeking.

I don't share that view. While there are excesses in the free-enterprise system as we have it now, I have no quarrel with its basic principles. I
see the ownership of property, including intellectual property, and
including capital property, as fully legitimate.

So a person can write a program once and make a living from selling copies
of it, instead of just from providing services to its users. If the
program is good, there's nothing illegitimate about that. And to defend
the program against piracy and reverse-engineering is also basically legitimate.

However, to encrypt debug information is strange. Why would a copy of the debug information in any form be included with distributed copies of
software? I suppose it could be there in an encrypted form to be used in conjunction with remote diagnostic tools, in the case of software which
has to be maintained on customer premises, unlike mass-market applications.

John Savard

From jgd@jgd@cix.co.uk (John Dallman) to comp.arch on Tue Jun 2 14:01:40 2026

From Newsgroup: comp.arch

In article <10uks4f$1dqo$1@gal.iecc.com>, johnl@taugh.com (John Levine)
wrote:

Having looked into this in some detail, both when IBM used
big-endian order on S/360 and DEC used little-endian on the
PDP-11, neither documented the reasons for the byte order
choice at all. Not even a little bit.

Brooks and Blaauw, two of the S/360 architects, consider the subject in
their much later book _Computer Architecture_, on p. 99:

"The more logical convention, the Big Endian, considers the whole
storage space as one stream of bits. Bits, bytes and words are
numbered from left to right, following the convention of writing
in Western culture."

That explains why IBM mainframes number the most significant bit as zero,
the opposite way around to all the platforms I've worked on, which number
the least significant bit as zero.

I've always found the latter convention helpful for doing hex arithmetic
in my head or on paper. I _think_ big-endian SPARC, MIPS and POWER all
regard the least significant bit as bit zero, but I can no longer easily
check that.

John

From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 14:54:24 2026

From Newsgroup: comp.arch

On Tue, 02 Jun 2026 14:00:00 +0100, John Dallman wrote:

Brooks and Blaauw, two of the S/360 architects, consider the subject in
their much later book _Computer Architecture_, on p. 99:

"The more logical convention, the Big Endian, considers the whole
storage space as one stream of bits. Bits, bytes and words are numbered
from left to right, following the convention of writing in Western
culture."

On the other hand, Arabic is written from right to left, and yet the Arabs also write numbers with the most significant digit on the left. Hence, little-endian would seem more logical to them for the same reason.

Since this is, therefore, a cultural matter, and not something universal,
like the laws of physics or mathematics, we can't tell what a man from
Mars would prefer.

So, while they can call it "the more logical convention", this isn't
something everyone would agree with. The famous article on the subject,
"On Holy Wars and a Plea for Peace", by Danny Cohen from 1981 thus termed
it as being much less important which standard was chosen than for
everyone to choose the same one for compatibility, but he wasn't shy about expressing his personal preference for little-endian, referring to those
who practiced big-endian as "outlaws".

The case for big-endian is...

It makes computers easier to understand for most people in Western
societies.
It makes core dumps easier to read.
Multi-precision compare is faster.

The case for little-endian is...

People don't need to poke around in core dumps or even program in
assembler very much these days. We have compilers.
Multi-precision add and subtract is faster, and it's much more common than compare.

At least, those are the usual arguments, and from that set of arguments,
it does seem like there's little difference and it's just a personal preference.
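
The add-versus-compare argument above can be sketched like this, assuming a number is held as an array of 32-bit limbs with limbs[0] least significant (little-endian limb order). The function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Add two n-limb numbers; limb 0 is least significant.
   Walks upward through memory because the carry flows that way.
   Returns the carry out of the top limb. */
uint32_t mp_add(uint32_t *dst, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = (uint64_t)a[i] + b[i] + carry;
        dst[i] = (uint32_t)s;
        carry = (uint32_t)(s >> 32);
    }
    return carry;
}

/* Compare two n-limb numbers: returns -1, 0 or 1.
   Has to start at the most significant limb, i.e. the highest address
   in this layout, and can stop at the first difference. */
int mp_cmp(const uint32_t *a, const uint32_t *b, size_t n)
{
    for (size_t i = n; i-- > 0; ) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}
```

Add wants to start at the least significant end and compare at the most significant end, so whichever byte order is chosen, one of the two starts at the far end of the operand.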

But, as I've noted, I've finally come up with a more compelling
justification for big-endian. It still assumes that, if you're processing
text data, that text data will be from a society that writes from left to right.

Think of text records that include words and numbers in character format.
Like

00134700 John Smith
00250000 Richard Roe

and so on.

The numerical portion has the most significant digit on the left, the alphabetic portion has the first character on the left. Thus, these
characters will be stored in memory at succeeding addresses from left to right; the most significant digit is stored at the lower address.

So numbers as text strings are stored in big-endian order.

That means that it's simplest to convert a text number to a packed decimal number that's in the same order.
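
As a rough illustration of that conversion, here is a sketch that packs a string of decimal characters into two-digits-per-byte form while keeping the same left-to-right order. It assumes an even digit count and leaves out the sign nibble that real packed decimal formats carry. The function name is made up:

```c
#include <stddef.h>
#include <stdint.h>

/* Pack ndigits ASCII digits into BCD, two per byte, most significant
   digit first. Returns the number of bytes written, or 0 if the digit
   count is odd or a non-digit character is found. */
size_t pack_digits(const char *text, size_t ndigits, uint8_t *out)
{
    if (ndigits % 2 != 0)
        return 0;
    for (size_t i = 0; i < ndigits; i += 2) {
        char hi = text[i], lo = text[i + 1];
        if (hi < '0' || hi > '9' || lo < '0' || lo > '9')
            return 0;
        out[i / 2] = (uint8_t)(((hi - '0') << 4) | (lo - '0'));
    }
    return ndigits / 2;
}
```

The loop reads and writes at increasing addresses with no reversal step, which is the convenience being claimed for big-endian order.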

And an ALU that performs binary arithmetic can be modified to also perform decimal arithmetic by changing when carries take place out of each group
of four bits. If that's done, binary and decimal numbers ought to have the same endianness, so that one doesn't need two load and store instructions
for the accumulator or the registers.
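
The shared-ALU idea can be sketched as a nibble-serial adder in which the only difference between binary and decimal mode is the value at which a 4-bit group carries (16 versus 10). This is an illustration with made-up names, not a model of any real ALU, and decimal mode assumes both inputs hold valid BCD digits:

```c
#include <stdint.h>

/* Add two 32-bit words as eight 4-bit groups.
   decimal == 0: each group carries at 16 (plain binary add).
   decimal != 0: each group carries at 10 (packed BCD add). */
uint32_t add_nibbles(uint32_t a, uint32_t b, int decimal)
{
    const unsigned limit = decimal ? 10u : 16u;
    uint32_t result = 0;
    unsigned carry = 0;

    for (unsigned i = 0; i < 8; i++) {
        unsigned s = ((a >> (4 * i)) & 0xFu) + ((b >> (4 * i)) & 0xFu) + carry;
        carry = (s >= limit);
        if (carry)
            s -= limit;
        result |= (uint32_t)s << (4 * i);
    }
    return result;   /* carry out of the top group is dropped */
}
```

Both modes walk the groups in the same direction, which is why it helps if binary and decimal operands use the same byte order in memory.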

I know some computers, regardless of endianness, number the least
significant bit as one instead of zero. Either way, this convention is considered to make sense for wiring a 12-bit DAC to a 16-bit data bus,
since now each number corresponds to the same power of two no matter how
wide your bus is.

Of course, one can argue for considering fixed-point numbers as fractions
in [-1,1), but that is needed far less often than using them as integers.
A few computers were designed this way; it meant that integers had to be represented with a wasted bit, or that a shift was usually needed after a multiply, so it did not get popular.

The IBM 360 made bit numbering consistent not just to make reading the
manuals easier, but out of habit - since their most recent previous
computer with a 64-bit word was the STRETCH (or 7030)... which had the
ability to do bit-addressing as a prominent feature. There, consistency actually mattered.

John Savard

From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Tue Jun 2 17:50:38 2026

From Newsgroup: comp.arch

Stefan Monnier wrote:

quadi [2026-05-27 18:19:49] wrote:

On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:

MitchAlsup [2026-05-26 20:54:30] wrote:

Encrypt the debug information (and put it in a
{1234-5678-9101-1121-...} folder) so that only the owner (not
licensee) of the code can debug it.

I resent that. All code should be Free Software.

[...]

However, people have the right to the fruit of their labors. To give them away for free is generous, but it should remain a personal choice.

You don't need to encrypt the debug information of your programs in
order to earn a decent living.

I'd say rather the opposite!

In the current environment where every language is expected to be
compatible with a generic IDE like Visual Studio Code, via open source interface specifications, having a proprietary debug format seems like a
good way to strongly limit your potential customer base.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Tue Jun 2 16:13:28 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> writes:

The case for big-endian is...

<snip>

It makes core dumps easier to read.

Actually the program that analyzes the core dump can handle
endianness without the programmer even being aware of it.

It's been more than half a century since programmers looked at raw
memory dumps.....
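
A minimal sketch of how a dump-reading tool might hide the byte order, assuming it knows the target's endianness from somewhere such as the dump header. The function name and the flag are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode a 32-bit value from four raw dump bytes, given the byte
   order of the machine that produced the dump. */
uint32_t read_u32(const uint8_t raw[4], bool big_endian)
{
    if (big_endian)
        return ((uint32_t)raw[0] << 24) | ((uint32_t)raw[1] << 16) |
               ((uint32_t)raw[2] << 8)  |  (uint32_t)raw[3];
    return ((uint32_t)raw[3] << 24) | ((uint32_t)raw[2] << 16) |
           ((uint32_t)raw[1] << 8)  |  (uint32_t)raw[0];
}
```

The person using the tool only ever sees the decoded value, whichever way round the bytes were stored.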


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 17:04:55 2026

From Newsgroup: comp.arch

John Levine <johnl@taugh.com> posted:

According to MitchAlsup <user5857@newsgrouper.org.invalid>:

What is "recomplementation"?

To do sign magnitude arithmetic, you basically do it in one's
complement: bit flip negative operands to make them one's complement,
do the arithmetic, then bit flip the result if it's negative. That
last bit flip is recomplementation.

In microarchitecture, you can make the registers 2^(3+n)+1 bits long.
Then simply record that the mantissa is complemented (or not) when
used as an operand. We do this all the time in microarchitecture to
save gates/time/... depending on the implementation technology constraints.

You can do that now, not so much when building computers out of vacuum
tubes in the 1950s.

Also, that works OK for registers, but at some point you need to
store values in memory at which point I'd think you'd need to do
the recomplementing.

Sure, but there is plenty of time to re-complement when storing
the value.
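
One way to picture the "extra bit" scheme is a register that carries a flag saying its contents are currently held complemented, with the flip resolved only when the value is consumed or stored. The sketch below is a software analogy with hypothetical names and an 8-bit payload, not a description of any real datapath:

```c
#include <stdbool.h>
#include <stdint.h>

/* A register with one extra bit recording whether the latched bits
   are the complement of the logical value. */
typedef struct {
    uint8_t bits;          /* raw latch contents */
    bool    complemented;  /* true: logical value is ~bits */
} lazy_reg;

/* Value as seen by a consumer: the inversion is applied on the way out. */
uint8_t lazy_read(lazy_reg r)
{
    return r.complemented ? (uint8_t)~r.bits : r.bits;
}

/* Complementing costs nothing here: only the flag toggles. */
lazy_reg lazy_invert(lazy_reg r)
{
    r.complemented = !r.complemented;
    return r;
}

/* Storing to memory is where the recomplement finally happens. */
void lazy_store(lazy_reg r, uint8_t *mem)
{
    *mem = lazy_read(r);
}
```

Memory never sees the flag, so the stored format stays the ordinary one.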


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 17:09:26 2026

From Newsgroup: comp.arch

jgd@cix.co.uk (John Dallman) posted:

In article <10uks4f$1dqo$1@gal.iecc.com>, johnl@taugh.com (John Levine) wrote:

Having looked into this in some detail, both when IBM used
big-endian order on S/360 and DEC used little-endian on the
PDP-11, neither documented the reasons for the byte order
choice at all. Not even a little bit.

Brooks and Blaauw, two of the S/360 architects, consider the subject in
their much later book _Computer Architecture_, on p. 99:

"The more logical convention, the Big Endian, considers the whole
storage space as one stream of bits. Bits, bytes and words are
numbered from left to right, following the convention of writing
in Western culture."

That explains why IBM mainframes number the most significant bit as zero,
the opposite way around to all the platforms I've worked on, which number
the least significant bit as zero.

I've always found the latter convention helpful for doing hex arithmetic
in my head or on paper. I _think_ big-endian SPARC, MIPS and POWER all
regard the least significant bit as bit zero, but I can no longer easily check that.

Do you want to isolate the register bit as::

bit = ((reg) >> (register_bits - 1 - bit_number)) & 1;

or

bit = ((reg) >> bit_number) & 1;
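
Spelled out as compilable C for a 64-bit register, the two conventions look roughly like this. The function names are made up, and the MSB-zero version assumes bits are numbered 0..63 from the left:

```c
#include <assert.h>
#include <stdint.h>

/* LSB-zero numbering: bit n has weight 2^n. */
unsigned bit_lsb0(uint64_t reg, unsigned n)
{
    assert(n < 64);
    return (unsigned)((reg >> n) & 1u);
}

/* MSB-zero numbering: bit 0 is the leftmost bit, so the shift count
   depends on the register width (63 - n for 64 bits). */
unsigned bit_msb0(uint64_t reg, unsigned n)
{
    assert(n < 64);
    return (unsigned)((reg >> (63u - n)) & 1u);
}
```

The MSB-zero form needs the width baked in, which is the extra step the comparison above is pointing at.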

John


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 17:13:40 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> posted:

On Tue, 02 Jun 2026 14:00:00 +0100, John Dallman wrote:

Brooks and Blaauw, two of the S/360 architects, consider the subject in their much later book _Computer Architecture_, on p. 99:

"The more logical convention, the Big Endian, considers the whole
storage space as one stream of bits. Bits, bytes and words are numbered
from left to right, following the convention of writing in Western
culture."

On the other hand, Arabic is written from right to left, and yet the Arabs also write numbers with the most significant digit on the left. Hence, little-endian would seem more logical to them for the same reason.

Chinese and Japanese are written top to bottom ...

Since this is, therefore, a cultural matter, and not something universal, like the laws of physics or mathematics, we can't tell what a man from
Mars would prefer.

Middle endian!! Start in the middle and then one step left followed by
one step right--more or less like PDP-11 FP.

So, while they can call it "the more logical convention", this isn't something everyone would agree with. The famous article on the subject,
"On Holy Wars and a Plea for Peace", by Danny Cohen from 1981 thus termed
it as being much less important which standard was chosen than for
everyone to choose the same one for compatibility, but he wasn't shy about expressing his personal preference for little-endian, referring to those
who practiced big-endian as "outlaws".

The case for big-endian is...

It makes computers easier to understand for most people in Western societies.

Just core dumps--they can be read without dumping hex on one side and characters on the other.

It makes core dumps easier to read.
Multi-precision compare is faster.

Not up to 256-bits.

The case for little-endian is...

It won.

John Savard


From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Tue Jun 2 15:59:33 2026

From Newsgroup: comp.arch

jgd@cix.co.uk (John Dallman) writes:

Brooks and Blaauw, two of the S/360 architects, consider the subject in
their much later book _Computer Architecture_, on p. 99:

"The more logical convention, the Big Endian, considers the whole
storage space as one stream of bits. Bits, bytes and words are
numbered from left to right, following the convention of writing
in Western culture."

That explains why IBM mainframes number the most significant bit as zero,
the opposite way around to all the platforms I've worked on, which number
the least significant bit as zero.

I've always found the latter convention helpful for doing hex arithmetic
in my head or on paper. I _think_ big-endian SPARC, MIPS and POWER all
regard the least significant bit as bit zero, but I can no longer easily check that.

Power(PC) gives the MSB bit of GPRs the number 0 and the LSB bit
number 63. It's not clear how that works in 32-bit implementations,
and if it plays a role at all. AFAICS, it plays no role (no
instructions refer to the bit number as defined in the manual).

The 68020 is bit-little-endian and byte-big-endian, and it has
bitfield instructions, and from what I have read, this has led to
problems (e.g., consider what to do if you have an array of 17-bit
fields: how do you access the nth element of the array?).
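
To make the 17-bit array question concrete, here is a sketch of field extraction when bit order and byte order agree, in both directions. With a consistent convention the nth field simply starts at bit offset n*17 and the byte index is offset/8. The function names are hypothetical and this does not model the 68020's actual bit-field instructions:

```c
#include <stddef.h>
#include <stdint.h>

/* Consistently big-endian: bit 0 is the MSB of byte 0.
   Collects 'width' bits (at most 32) starting at bit_off. */
uint32_t field_be(const uint8_t *mem, size_t bit_off, unsigned width)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < width; i++) {
        size_t b = bit_off + i;
        unsigned bit = (mem[b / 8] >> (7u - (unsigned)(b % 8))) & 1u;
        v = (v << 1) | bit;            /* first bit read is most significant */
    }
    return v;
}

/* Consistently little-endian: bit 0 is the LSB of byte 0. */
uint32_t field_le(const uint8_t *mem, size_t bit_off, unsigned width)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < width; i++) {
        size_t b = bit_off + i;
        uint32_t bit = (mem[b / 8] >> (unsigned)(b % 8)) & 1u;
        v |= bit << i;                 /* first bit read is least significant */
    }
    return v;
}
```

Element n of a packed array of 17-bit fields is then field_be(mem, n * 17, 17) or field_le(mem, n * 17, 17). With mixed bit and byte order no single bit_off / 8 rule covers a field that crosses a byte boundary.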

The 88000 is bit-little-endian and byte-big-endian (Section 2.2.3 of
the manual is quite clear about this at the start, and then discusses
the byte-little-endian option; AFAIK all 88000 machines are
byte-big-endian). It has bit-field instructions that specify the
bitfield as an offset from the LSB of the register and a width. Given
that the bitfield instructions work on registers, and the load
instructions require alignment, I don't expect the difference in order
to cause many problems; maybe confusion if you try to deal with bit
fields that cross words.

MIPS also is bit-little-endian; there are byte-big-endian and byte-little-endian machines with MIPS CPUs. MIPS64r2 has bit-field instructions that use little-endian bit order, and before MIPS64r6 it
also required alignment (MIPS64r6 allows either unaligned support or
trapping on unaligned access). With unaligned support and big-endian
byte order, problems like on the 68020 may arise.

SPARCv9 <https://www.cs.utexas.edu/~novak/sparcv9.pdf> is
bit-little-endian, and "uses big-endian byte order by default"
(3.2.1.2) and I am not aware of any little-endian SPARC machine.
AFAICS SPARC does not have instructions that use bit numbers, so the
numbering of bits in the manual does not have any effect on the
instruction set and programming.

My take is, that in a world with different access widths (e.g.,
accessing a register for a 32-bit value or a 64-bit value),
bit-big-endian is a bad idea. And we already see that in the IBM 704
manual which gives its most significant bits in its 38-bit accumulator
the names (starting from the most significant)

S (sign, maybe out of contest, but shown to the left of Q)
Q (not present in memory)
P (not present in memory; this is the carry bit for the ACL instruction)
1

If they had used bit-little-endian (and started at 0 instead of 1),
they could have called P 35 and Q 36, and S could be called 37 (but given
that it is a sign/magnitude machine, S is ok).
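
The width problem can be shown with two small helpers giving the numeric weight of a numbered bit under each convention. The names are made up and the widths are just examples:

```c
#include <stdint.h>

/* LSB-zero: bit n always has weight 2^n, whatever the access width. */
uint64_t weight_lsb0(unsigned n)
{
    return 1ULL << n;
}

/* MSB-zero: the weight of bit n depends on how wide the access is
   (width must be in 1..64 and n < width). */
uint64_t weight_msb0(unsigned n, unsigned width)
{
    return 1ULL << (width - 1u - n);
}
```

Under MSB-zero numbering "bit 0" means 2^31 in a 32-bit access and 2^63 in a 64-bit access. Under LSB-zero numbering extra high-order bits just get the next numbers up, as with P = 35 and Q = 36 in the 704 example.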

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Tue Jun 2 18:25:12 2026

From Newsgroup: comp.arch

MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

quadi <quadibloc@ca.invalid> posted:

On Tue, 02 Jun 2026 14:00:00 +0100, John Dallman wrote:

Brooks and Blaauw, two of the S/360 architects, consider the subject in
their much later book _Computer Architecture_, on p. 99:

"The more logical convention, the Big Endian, considers the whole
storage space as one stream of bits. Bits, bytes and words are numbered from left to right, following the convention of writing in Western
culture."

On the other hand, Arabic is written from right to left, and yet the Arabs also write numbers with the most significant digit on the left. Hence,
little-endian would seem more logical to them for the same reason.

Chinese and Japanese are written top to bottom ...

In classical times, yes, but modern texts are written left to right.

Since this is, therefore, a cultural matter, and not something universal, like the laws of physics or mathematics, we can't tell what a man from
Mars would prefer.

Middle endian!! Start in the middle and then one step left followed by
one step right--more or less like PDP-11 FP.

Very Turing machine-like.
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Tue Jun 2 11:44:46 2026

From Newsgroup: comp.arch

On 6/2/2026 10:13 AM, MitchAlsup wrote:

Since this is, therefore, a cultural matter, and not something universal,
like the laws of physics or mathematics, we can't tell what a man from
Mars would prefer.

Middle endian!! Start in the middle and then one step left followed by
one step right--more or less like PDP-11 FP.

But then you have the "discussion" with those who want to start with a
step to the right, followed by one to the left! :-). And that doesn't
even address (pun intended), the issue of when you have an even number
of bits/bytes/words, do you start with the one to the right of the
"middle" or the left. :-)
--
- Stephen Fuld
(e-mail address disguised to prevent spam)

From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 19:07:38 2026

From Newsgroup: comp.arch

On Tue, 02 Jun 2026 18:25:12 +0000, Thomas Koenig wrote:

MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

Chinese and Japanese are written top to bottom ...

In classical times, yes, but modern texts are written left to right.

In Taiwan and Hong Kong, books written top to bottom, and then right to
left, and bound like books written right to left, were still being printed
in the 1960s.

As far as endianness is concerned, however, the Chinese wrote numbers with
the most significant digit on the top, so for purposes of this discussion, Chinese was big-endian even traditionally.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Tue Jun 2 19:10:51 2026

From Newsgroup: comp.arch

On Tue, 02 Jun 2026 17:50:38 +0200, Terje Mathisen wrote:

In the current environment where every language is expected to be
compatible with a generic IDE like Visual Studio Code, via open source interface specifications, having a proprietary debug format seems like a
good way to strongly limit your potential customer base.

You appear to have understood his post in a different way than I did.

I wasn't thinking of the kind of debug information provided by a compiler.

I was thinking of leaving debug information in when one was distributing software to customers.

John Savard

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Jun 2 19:37:17 2026

From Newsgroup: comp.arch

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> posted:

On 6/2/2026 10:13 AM, MitchAlsup wrote:

Since this is, therefore, a cultural matter, and not something universal, like the laws of physics or mathematics, we can't tell what a man from
Mars would prefer.

Middle endian!! Start in the middle and then one step left followed by
one step right--more or less like PDP-11 FP.

But then you have the "discussion" with those who want to start with a
step to the right, followed by one to the left! :-). And that doesn't
even address (pun intended), the issue of when you have an even number
of bits/bytes/words, do you start with the one to the right of the
"middle" or the left. :-)

You could do random endian where an LFSR based on the address sequence determines MEL or MER.


From John Levine@johnl@taugh.com to comp.arch on Tue Jun 2 22:00:06 2026

From Newsgroup: comp.arch

According to John Dallman <jgd@cix.co.uk>:

In article <10uks4f$1dqo$1@gal.iecc.com>, johnl@taugh.com (John Levine) wrote:

Having looked into this in some detail, both when IBM used
big-endian order on S/360 and DEC used little-endian on the
PDP-11, neither documented the reasons for the byte order
choice at all. Not even a little bit.

Brooks and Blaauw, two of the S/360 architects, consider the subject in
their much later book _Computer Architecture_, on p. 99:

"The more logical convention, the Big Endian, considers the whole
storage space as one stream of bits. Bits, bytes and words are
numbered from left to right, following the convention of writing
in Western culture."

I'd forgotten about that. Given who they were it's not surprising they found their preconceptions to be "more logical".

On the next page they said (written in 1997):

"Unlike Swift's, the computer Endian controversy is not pointless. The Little Endian design has many complications in use; we much prefer the
Big Endian. Having two active conventions is very painful. Several recent
Big Endian RISC computers, including the MIPS, the Motorola 88000, and
the Intel i860 provide a data-movement operation that can perform the Big Endian-Little Endian permutation [Hennessy and Patterson, 1990]. We predict
that Little Endian addressing will die out, just as decimal addressing did."

Uh huh.

A few years later IBM added LOAD REVERSED and STORE REVERSED to z/Architecture and retroactively to S/390 mode on Z machines.
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 00:21:03 2026

From Newsgroup: comp.arch

On Tue, 02 Jun 2026 22:00:06 +0000, John Levine wrote:

"We predict that Little Endian addressing will die out, just as
decimal addressing did."

Uh huh.

A few years later IBM added LOAD REVERSED and STORE REVERSED to z/Architecture and retroactively to S/390 mode on Z machines.

I certainly would not hazard such a bold prediction.

The prediction, though, is not hard to understand. If big-endian is more straightforward and easier to understand, but just costs an extra
transistor here and there, then in the age of billion-transistor chips,
why wouldn't it die out?

However, just because something is going to die out _eventually_ doesn't
mean it will do so any time soon. Interoperating and communicating with
that little-endian monster IBM created in 1981 is going to be important
for generating revenue for decades to come.

So the existence of load reversed and store reversed instructions doesn't prove they were wrong... even though I still would not dare to say they
are definitely right. I just think it's not unreasonable to think as they
did, provided you account for a sufficiently long timeframe.

Of course, given a sufficiently long timeframe, we might all be speaking Arabic, in which case little-endian would be the logical choice. Although
that would require fossil fuels being important for longer than the
climate could sustain it...

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 00:37:06 2026

From Newsgroup: comp.arch

On Tue, 02 Jun 2026 15:59:33 +0000, Anton Ertl wrote:

My take is, that in a world with different access widths (e.g.,
accessing a register for a 32-bit value or a 64-bit value),
bit-big-endian is a bad idea.

There is an argument for that.

But if a computer does have bit-field instructions, I tend to consider it insane for it to number bits in the opposite direction of its endianness.

Even though the problem isn't necessarily all that bad; as long as the bit fields are genuinely contiguous, then only the names of the bits in a byte
are encoded funny.

So if a 32 bit number is stored in bytes 5001, 5000, 4999, and 4998, from
most significant byte to least significant, and you specify a 9-bit field starting in bit 6 of byte 4999... and the bits are numbered in big-endian order... the same thing should happen as if you specified bit 1 of byte
4999 on a little-endian machine with little-endian bit numbering. You get
nine bits, the seven least significant of which are bits 0 through 6 of
byte 4999, and the remaining two of which are bits 6 and 7 of byte 5000.

In the more common case, where the machine is big-endian, and it is the
bit numbering that's little-endian, specifying a nine-bit field starting
in bit 6 of byte 4999 would give you bits 6 through 0 of byte 4999,
followed by bits 7 and 6 of byte 5000. Here, though, you're going from
most significant to least significant, but in both cases you're moving
forward to higher addresses, just as you do when accessing multi-byte
numbers with a byte address.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 00:41:43 2026

From Newsgroup: comp.arch

On Wed, 20 May 2026 05:38:07 +0000, Anton Ertl wrote:

* The last descendent of the PDP-11 was canceled long before the most
prominent big-endian architecture (SPARC) was canceled, and long
before Power switched its Linux support to little-endian, so the
PDP-11 had little, if any, influence on the outcome.

The long decline of big-endian happened later.

But there wouldn't have _been_ little-endian architectures to out-compete big-endian if it hadn't been for the PDP-11. That was where the idea of little-endian got started.

It wasn't the first machine to store two-word numbers least-significant-
word first. But it was the first machine to be little-endian in any other
way but that. Little-endian, as something more than an ad-hoc way to
handle one case of double-precision integers, wasn't a thing until the
PDP-11 came along.

John Savard

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed Jun 3 00:55:35 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> posted:

On Tue, 02 Jun 2026 17:50:38 +0200, Terje Mathisen wrote:

In the current environment where every language is expected to be compatible with a generic IDE like Visual Studio Code, via open source interface specifications, having a proprietary debug format seems like a good way to strongly limit your potential customer base.

You appear to have understood his post in a different way than I did.

I wasn't thinking of the kind of debug information provided by a compiler.

I was thinking of leaving debug information in when one was distributing software to customers.

Yes, you the vendor do not want random customer debugging the code,
however, you want the ability to debug the code that was distributed
on whatever medium on customer's system(s)--

AND you want to debug one copy of the running code while others are using
other processes running the code under normal use.

John Savard


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed Jun 3 01:03:26 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> posted:

On Tue, 02 Jun 2026 22:00:06 +0000, John Levine wrote:

"We predict that Little Endian addressing will die out, just as
decimal addressing did."

Uh huh.

A few years later IBM added LOAD REVERSED and STORE REVERSED to z/Architecture and retroactively to S/390 mode on Z machines.

I certainly would not hazard such a bold prediction.

The prediction, though, is not hard to understand. If big-endian is more straightforward and easier to understand, but just costs an extra
transistor here and there, then in the age of billion-transistor chips,
why wouldn't it die out?

Linux has gone all in on LE. So, if you want to start a HW company,
you are forced to either choose LE or develop your own Operating
System (with all the accoutrement involved.)

However, just because something is going to die out _eventually_ doesn't mean it will do so any time soon. Interoperating and communicating with
that little-endian monster IBM created in 1981 is going to be important
for generating revenue for decades to come.

The whole internet is Dual-endian !! With part LE and other parts BE.

So the existence of load reversed and store reversed instructions doesn't prove they were wrong... even though I still would not dare to say they
are definitely right. I just think it's not unreasonable to think as they did, provided you account for a sufficiently long timeframe.

A byte reverse instruction will also work.
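
For illustration, a byte reversal of a 32-bit value can be sketched in portable C as follows. The name bswap32 is made up for this sketch and is not an instruction or API mentioned in the thread; compilers often recognize this pattern and emit a single byte-swap instruction where the target has one.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value (hypothetical helper name). */
static uint32_t bswap32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) <<  8) |
           ((x & 0x00FF0000u) >>  8) |
           ((x & 0xFF000000u) >> 24);
}
```

Applying it to 0x11223344 yields 0x44332211, and applying it twice returns the original value, which is why one instruction serves for both the load and the store direction.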

Of course, given a sufficiently long timeframe, we might all be speaking Arabic, in which case little-endian would be the logical choice. Although that would require fossil fuels being important for longer than the
climate could sustain it...

John Savard


From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 01:52:13 2026

From Newsgroup: comp.arch

On Wed, 03 Jun 2026 01:03:26 +0000, MitchAlsup wrote:

Linux has gone all in on LE.

It's true that Linux doesn't support the big-endian version of RISC-V. But
it runs on other big-endian architectures.

John Savard

From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Wed Jun 3 08:29:27 2026

From Newsgroup: comp.arch

On 2026-Jun-02 13:13, MitchAlsup wrote:

Middle endian!! Start in the middle and then one step left followed by
one step right--more or less like PDP-11 FP.

We're doing the time warp... again!


From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Wed Jun 3 13:54:01 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> schrieb:

On Tue, 02 Jun 2026 22:00:06 +0000, John Levine wrote:

"We predict that Little Endian addressing will die out, just as
decimal addressing did."

Uh huh.

A few years later IBM added LOAD REVERSED and STORE REVERSED to
z/Architecture and retroactively to S/390 mode on Z machines.

I certainly would not hazard such a bold prediction.

The prediction, though, is not hard to understand. If big-endian is more straightforward and easier to understand, but just costs an extra
transistor here and there, then in the age of billion-transistor chips,
why wouldn't it die out?

It causes problems with badly-written software.

Consider the following test program:

#include <stdio.h>

void printit(void *p)
{
    char *c = p;
    printf ("Value is: %d\n", *c);
}

int main()
{
    int i = 42;
    printit (&i);
    return 0;
}

On a little-endian system, this prints
Value is: 42

On a big-endian system, this prints
Value is: 0

If software designers play games with this sort of thing
(knowingly or unknowingly), then software that will run
on a little-endian system will not run on a big-endian
system.
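
As a sketch of the difference, assuming only standard C: taking the low-order byte arithmetically gives the same answer on any byte order, while reading the first byte in memory does not. The helper names low_byte and first_byte_in_memory are made up for this illustration.

```c
#include <string.h>

/* Low-order 8 bits of the value: same result on any byte order. */
static unsigned low_byte(int i)
{
    return (unsigned)i & 0xFFu;
}

/* First byte of the object in memory: for i = 42 this is 42 on a
   little-endian system and 0 on a big-endian one (assuming int is
   wider than one byte). */
static unsigned first_byte_in_memory(int i)
{
    unsigned char c;
    memcpy(&c, &i, 1);
    return c;
}
```

Code that means "the low byte of the value" can use the first form and stay portable; the second form is what the test program above effectively does.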
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From quadi@quadibloc@ca.invalid to comp.arch on Wed Jun 3 15:33:53 2026

From Newsgroup: comp.arch

On Wed, 03 Jun 2026 13:54:01 +0000, Thomas Koenig wrote:

It causes problems with badly-written software.

I don't see that as a fault of big-endian.

One has to exert oneself to write a program equivalent to

      INTEGER*2 IP
      EQUIVALENCE (I, IP)
      I = 42
      WRITE(6,11) IP
      STOP
   11 FORMAT(' ', 'VALUE IS: ', I3)
      END

and so the fact that it will print

VALUE IS: 0

is not a bug, it's exactly what one should expect.

John Savard

From John Levine@johnl@taugh.com to comp.arch on Wed Jun 3 17:36:45 2026

From Newsgroup: comp.arch

According to quadi <quadibloc@ca.invalid>:

On Wed, 03 Jun 2026 13:54:01 +0000, Thomas Koenig wrote:

It causes problems with badly-written software.

I don't see that as a fault of big-endian.

Agreed. There were plenty of bugs porting BSD software from
the little-endian Vax to big-endian 68000 series. Buggy software
is buggy software.
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Wed Jun 3 17:13:08 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> writes:

But there wouldn't have _been_ little-endian architectures to out-compete big-endian if it hadn't been for the PDP-11. That was where the idea of little-endian got started.

The Datapoint 2200 would most likely have had the little-endian
encoding if the PDP-11 had been big-endian, because it was developed independently and in parallel. The 8080's call and ret instructions
were the first where little-endian extended beyond instruction
encoding, but I expect that, once you have jump instructions with
little-endian targets, you also want return addresses to be stored in little-endian byte order (simplifies implementation). And from there
it goes to the 8086 which has 16-bit data memory accesses, and where
you also stick with little-endian if you already have done so for jump
targets and return addresses. And following the
8086, IA-32 and AMD64 would have been little-endian, too.

The 6502 would have been little-endian if the PDP-11 had been
big-endian, for technical reasons. They ignored even the big-endian
byte order of its predecessor, the 6800. And following the 6502, the
ARM would have been little-endian even if the PDP-11 had been
big-endian.

So the architectures that dominate now would be little-endian even if
the PDP-11 had been big-endian. Would they have been less successful
if the PDP-11 had been big-endian? I doubt it. At a point around
1990, most of the Unix market was big-endian, based on the 68000 being big-endian, and it seemed that if any byte order would win, it would
be big-endian. IA-32 and VAX were expected to die because they were
CISCs, and ARM was just a minor player in the RISC market at the time.

And yet, IA-32/AMD64 and ARM's instruction sets outlived all the
highly successful RISCs of the time. This would also have happened if
the PDP-11 and the VAX had been big-endian.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Wed Jun 3 17:44:20 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> writes:

On Tue, 02 Jun 2026 15:59:33 +0000, Anton Ertl wrote:

My take is, that in a world with different access widths (e.g.,
accessing a register for a 32-bit value or a 64-bit value),
bit-big-endian is a bad idea.

There is an argument for that.

But if a computer does have bit-field instructions, I tend to consider it insane for it to number bits in the opposite direction of its endianness.

So if big-endian bit numbering is a bad idea (and it is), big endian
byte order is a bad idea, too.

In the more common case, where the machine is big-endian, and it is the
bit numbering that's little-endian, specifying a nine-bit field starting
in bit 6 of byte 4999 would give you bits 6 through 0 of byte 4999,
followed by bits 7 and 6 of byte 5000.

This does not make sense. You have the 32-bit word with address 4998.
If you access a 9-bit field at bit 6, it extends to bit 14, and these
bits will be in the bytes at addresses 5001 and 5000 on your
byte-big-endian machine. As long as you only access this bit field
through a 32-bit access at this address, the difference does not play
a role. But once you want to access it through a 32-bit access at
4999 (now it's bit 14 through 22), a 16-bit access at 5000 (bit 6
through 14 again), or a 32-bit access at 5000 (bit 22 through 30), the different orders become confusing.
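
The arithmetic can be checked with a small sketch. It assumes big-endian byte order with little-endian bit numbering (bit 0 is the least significant bit of whatever is accessed); the function name field_lsb is made up for this illustration.

```c
/* Bit number, within an access of `width` bytes at address `addr`, of
   bit `bit` (0 = least significant) of the byte at address `byte_addr`.
   Assumes big-endian byte order and little-endian bit numbering. */
static unsigned field_lsb(unsigned addr, unsigned width,
                          unsigned byte_addr, unsigned bit)
{
    /* On a big-endian machine the last byte of the access is the
       least significant one, so bytes are counted from the end. */
    return 8u * (addr + width - 1u - byte_addr) + bit;
}
```

The field starts at bit 6 of byte 5001, so field_lsb(4998, 4, 5001, 6) gives 6, field_lsb(4999, 4, 5001, 6) gives 14, field_lsb(5000, 2, 5001, 6) gives 6, and field_lsb(5000, 4, 5001, 6) gives 22, matching the numbers above. The starting bit number depends on both the address and the width of the access.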

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Wed Jun 3 10:42:09 2026

From Newsgroup: comp.arch

Yes, you the vendor do not want random customer debugging the code,

I also want a pony, but that doesn't make it right.

The customer will usually not want to debug your code, but sometimes
they will have to (e.g. because you the vendor don't exist any more or
don't find that product of commercial value any more, ...).

The customer deserves to be able to debug the code it's paid for.

=== Stefan

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Wed Jun 3 18:36:51 2026

From Newsgroup: comp.arch

Stefan Monnier <monnier@iro.umontreal.ca> writes:

<snip discussion of including debug data in distributed software>

Yes, you the vendor do not want random customer debugging the code,

I also want a pony, but that doesn't make it right.

The customer will usually not want to debug your code, but sometimes
they will have to (e.g. because you the vendor don't exist any more or
don't find that product of commercial value any more, ...).

The customer deserves to be able to debug the code it's paid for.

There are several reasons that a vendor may wish to refrain from
distributing the DWARF (or Windows equivalent) data with a software
package. For example, program identifiers may inadvertently
identify other customers, internal proprietary
information or internal codenames.

Being able to debug code without the source code doesn't seem
a particularly common use case, nor would it be a viable way
to continue to use orphaned software, other than to, perhaps,
get it working sufficiently to export any application
data in an interchange form (e.g. csv or xml if supported
by the application). I would certainly not recommend
that it be used for production.

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Wed Jun 3 19:22:49 2026

From Newsgroup: comp.arch

quadi <quadibloc@ca.invalid> schrieb:

On Wed, 03 Jun 2026 13:54:01 +0000, Thomas Koenig wrote:

[big-endian]

It causes problems with badly-written software.

I don't see that as a fault of big-endian.

Neither do I. I am glad I still have access to a few big-endian
machines. For example:

$ lscpu
Architecture: ppc64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Big Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Model name: POWER10 (architected), altivec supported
[...]

(which shows that big-endian Linux is still supported).

But if the software you are targeting is primarily written on,
and for, little-endian systems like x86, then the little-endian
assumption will tend to creep in - certain things like writing
an int to memory and reading back a char will "just work", and
programmers may not know or care that they are violating language
standards; they very rarely do.

So what to do? Submit bug reports and patches and hope they are
integrated, or just bite the bullet and offer a little-endian
version as well? IBM chose the latter.

And referring to the point above, the code

One has to exert oneself to write a program equivalent to

      INTEGER*2 IP
      EQUIVALENCE (I, IP)
      I = 42
      WRITE(6,11) IP
      STOP
   11 FORMAT(' ', 'VALUE IS: ', I3)
      END

also violates the FORTRAN standard going back to Fortran 66.
After the "I = 42" statement, IP becomes undefined according to
the language definition.

and so the fact that it will print

VALUE IS: 0

is not a bug, it's exactly what one should expect.

It could also launch World War III, provided the right operational
hardware has been installed.
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed Jun 3 21:24:09 2026

From Newsgroup: comp.arch

Stefan Monnier <monnier@iro.umontreal.ca> posted:

Yes, you the vendor do not want random customer debugging the code,

I also want a pony, but that doesn't make it right.

The customer will usually not want to debug your code, but sometimes
they will have to (e.g. because you the vendor don't exist any more or
don't find that product of commercial value any more, ...).

The customer deserves to be able to debug the code it's paid for.

Does MS allow you to debug W11 or Office ...
Does Corel allow you to debug Draw ...
Does Adobe allow you to debug PDF reader ...

Which is why, sooner or later, open source should win.

=== Stefan


From George Neuner@gneuner2@comcast.net to comp.arch on Wed Jun 3 19:05:54 2026

From Newsgroup: comp.arch

On Tue, 02 Jun 2026 15:59:33 GMT, anton@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:

The 68020 is bit-little-endian and byte-big-endian, and it has
bitfield instructions, and from what I have read, this has led to
problems (e.g., consider what to do if you have an array of 17-bit
fields: how do you access the nth element of the array?

<array>[n] ?

In C the default would be an aligned array of 32-bit containers each
of which stored a 17-bit field.

If you mean a /packed/ array in which the 17-bit fields are stored bit contiguously ... well that could get interesting.

From quadi@quadibloc@ca.invalid to comp.arch on Thu Jun 4 04:06:45 2026

From Newsgroup: comp.arch

On Mon, 01 Jun 2026 19:56:38 +0000, quadi wrote:

I wasn't happy. So I noticed that I actually had some unused space that
I could squeeze out. So now the 24-bit short instructions have 1/2 as
much space as they used to, which meant the only thing I had to give up
was the ability to change the condition codes.

I found that I had some unused space within the 80-bit instructions, and
that was enough to let me restore the 24-bit short instructions to their former glory.

John Savard

From quadi@quadibloc@ca.invalid to comp.arch on Thu Jun 4 04:44:33 2026

From Newsgroup: comp.arch

On Thu, 04 Jun 2026 04:06:45 +0000, quadi wrote:

I found that I had some unused space within the 80-bit instructions, and
that was enough to let me restore the 24-bit short instructions to their former glory.

Then another crazy idea came into my head. The 16-bit short instructions
are limited to operating on the first eight registers. Some believe that
this limitation will make them essentially useless.

They have a lot more opcode space allocated to them than the 24-bit short instructions. If I took that space, and gave it to the 24-bit short instructions, I could perhaps add a 24-bit memory-reference instruction!

Well, I tried, and found out that I could indeed almost do that, with, of course, a restriction to 12-bit displacements... but, at best, I could
only use two registers as destination registers for those memory-reference instructions!

So that idea had to be discarded.

John Savard

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Thu Jun 4 12:45:39 2026

From Newsgroup: comp.arch

George Neuner <gneuner2@comcast.net> writes:

On Tue, 02 Jun 2026 15:59:33 GMT, anton@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:

The 68020 is bit-little-endian and byte-big-endian, and it has
bitfield instructions, and from what I have read, this has led to
problems (e.g., consider what to do if you have an array of 17-bit
fields: how do you access the nth element of the array?

[...]

If you mean a /packed/ array in which the 17-bit fields are stored bit contiguously ... well that could get interesting.

For a consistently little-endian architecture that has no alignment requirements, the access is relatively simple:

nbit = n*17
nbyte = nbit/8
bitoffset = nbit%8
tmp = load32b(array+nbyte)
element = ext(tmp,bitoffset,17)

(ext extracts the bitfield with length 17 at bitoffset from tmp; 88000
and MIPS64r2 have such instructions).
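
A sketch of that sequence in C, assuming a consistently little-endian layout of the packed array. To stay independent of the host byte order, the 32-bit load is assembled byte by byte. The names load32le and get17 are made up here, and the buffer is assumed to carry at least three bytes of padding after the last element so that the 4-byte load never runs past the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Unaligned 32-bit load, least significant byte first. */
static uint32_t load32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fetch element n of a packed array of 17-bit fields. */
static uint32_t get17(const uint8_t *array, size_t n)
{
    size_t nbit = n * 17;
    size_t nbyte = nbit / 8;
    unsigned bitoffset = (unsigned)(nbit % 8);
    uint32_t tmp = load32le(array + nbyte);

    /* Shift and mask play the role of ext(tmp, bitoffset, 17). */
    return (tmp >> bitoffset) & 0x1FFFFu;
}
```

Since bitoffset is at most 7, the field always ends at or below bit 23 of tmp, so one 32-bit load is enough for any element.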

I leave the consistently big-endian version and the byte-big-endian bit-little-endian version as exercise to those who think that these
are good ideas. I guess, for consistent big-endian, given an
appropriate definition of ext, it's pretty similar, if not the same as
above. The inconsistent variants (e.g., 68020, 88000, MIPS64r2) are
not so easy, however.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From Brian G. Lucas@bagel99@gmail.com to comp.arch on Thu Jun 4 11:53:31 2026

From Newsgroup: comp.arch

On 6/3/26 12:36 PM, John Levine wrote:

According to quadi <quadibloc@ca.invalid>:

On Wed, 03 Jun 2026 13:54:01 +0000, Thomas Koenig wrote:

It causes problems with badly-written software.

I don't see that as a fault of big-endian.

Agreed. There were plenty of bugs porting BSD software from
the little-endian Vax to big-endian 68000 series. Buggy software
is buggy software.

That followed the port to the Interdata 7/32 which was big-endian,
so BSD must not have learned from that.


From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Thu Jun 4 23:46:11 2026

From Newsgroup: comp.arch

Scott Lurndal [2026-06-03 18:36:51] wrote:

Being able to debug code without the source code doesn't seem
a particularly common use case,

Indeed, the source code should also be available, of course.
I started this thread by mentioning Free Software. 🙂

=== Stefan

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Fri Jun 5 08:36:53 2026

From Newsgroup: comp.arch

Stefan Monnier <monnier@iro.umontreal.ca> schrieb:

Scott Lurndal [2026-06-03 18:36:51] wrote:

Being able to debug code without the source code doesn't seem
a particularly common use case,

Indeed, the source code should also be available, of course.
I started this thread by mentioning Free Software. 🙂

I am a big proponent of free software, but it has a basic problem:
Getting developers paid is not easy.

An example is OpenFOAM. This is a very widely used CFD package,
both in academia (because it costs nothing, and ANSYS is very
expensive, also for universities) and also now in industry because
people who come in from university have learned this during their
PhDs (you need quite some time to learn).

Funding? They want 500 k€ in 2026, which is far from excessive,
see https://openfoam.org/news/funding-2026/ , both compared to
commercial CFD companies and the value that OpenFOAM provides.
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From George Neuner@gneuner2@comcast.net to comp.arch on Fri Jun 5 15:07:25 2026

From Newsgroup: comp.arch

On Wed, 03 Jun 2026 00:55:35 GMT, MitchAlsup
<user5857@newsgrouper.org.invalid> wrote:

quadi <quadibloc@ca.invalid> posted:

On Tue, 02 Jun 2026 17:50:38 +0200, Terje Mathisen wrote:

In the current environment where every language is expected to be
compatible with a generic IDE like Visual Studio Code, via open source
interface specifications, having a proprietary debug format seems like a good way to strongly limit your potential customer base.

You appear to have understood his post in a different way than I did.

I wasn't thinking of the kind of debug information provided by a compiler.
I was thinking of leaving debug information in when one was distributing
software to customers.

Yes, you the vendor do not want random customer debugging the code,
however, you want the ability to debug the code that was distributed
on whatever medium on customer's system(s)--

AND you want to debug one copy of the running code while others are using other processes running the code under normal use.

That is true, but the issue at hand is how to achieve that. Leaving
debug information /in/ the executable, I think, is a bad idea.

However, many (most?) toolchains provide a way to separate debug
symbols from the executable - either by generating a separate symbol
database in the 1st place, or by allowing debug data to be stripped
from the executables. If you have to debug at the client site, you
simply take the symbol database with you.

Another useful method is to write out debug information as the program
executes and arrange that it either is suppressed or (alternatively)
goes to /dev/null unless some undocumented flag is given.
[Obviously where speed is paramount you can't be generating
unnecessary output, so the utility of this method is situation
dependent.]
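
The second method might be sketched like this. It is a minimal illustration, not the code described in the post: the names debug_set and debug_log are hypothetical, and how the switch gets turned on (an undocumented command-line flag, an environment variable, ...) is left to the program.

```c
#include <stdarg.h>
#include <stdio.h>

/* Off by default; the program turns it on once at startup when the
   (hypothetical) undocumented flag is present. */
static int debug_on = 0;

static void debug_set(int on)
{
    debug_on = on;
}

/* printf-style debug message to stderr. Returns the number of
   characters written, or 0 when debug output is suppressed. */
static int debug_log(const char *fmt, ...)
{
    va_list ap;
    int n;

    if (!debug_on)
        return 0;
    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}
```

When the switch is off, each call costs a test of one variable plus the evaluation of its arguments, which is the overhead the bracketed caveat above is about.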

I have used both of these methods in the past.
YMMV.

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Sat Jun 6 01:37:13 2026

From Newsgroup: comp.arch

Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:

MitchAlsup <user5857@newsgrouper.org.invalid> writes:

anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

long bar(long x, long y)
{
return x/2+y/2;
}

...

Trying the same on a MIPS64 machine with gcc-8.3 (which apparently
produces ILP32 code) produces a call to __addvsi3 instead of the
expected add instruction:

gcc -O3 -ftrapv         gcc -O3
lui gp,0x0              srl v0,a0,0x1f
addiu gp,gp,0           srl v1,a1,0x1f
addu gp,gp,t9           addu v0,v0,a0
srl v1,a0,0x1f          addu a1,v1,a1
lw t9,__addvsi3(gp)     sra v0,v0,0x1
srl v0,a1,0x1f          sra a1,a1,0x1
addiu sp,sp,-32         jr ra
addu a0,v1,a0           addu v0,v0,a1
addu a1,v0,a1
sra a0,a0,0x1
sw ra,28(sp)
sw gp,16(sp)
jalr t9
sra a1,a1,0x1
lw ra,28(sp)
jr ra
addiu sp,sp,32

The call costs a lot of overhead.

Architectures without overflow traps are notorious for excess instruction count when overflow detection is desired or mandated.

MIPS' add traps on overflow. gcc could have emitted almost the same
code for gcc -O3 -ftrapv as for gcc -O3, except that the last
instruction would be an add, not an addu. But apparently nobody gives
a damn about the efficiency of -ftrapv, possibly rightly so.

My guess is that GCC developers care more about -ftrapv than about
MIPS. AFAICS several architectures officially supported by GCC
struggle to work at all. I suspect that maintainers of MIPS
backend are happy that -ftrapv works and do not have resources
to make it efficient.
--
Waldek Hebisch

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Sat Jun 6 07:57:46 2026

From Newsgroup: comp.arch

Waldek Hebisch <antispam@fricas.org> schrieb:

Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:

MIPS' add traps on overflow. gcc could have emitted almost the same
code for gcc -O3 -ftrapv as for gcc -O3, except that the last
instruction would be an add, not an addu. But apparently nobody gives
a damn about the efficiency of -ftrapv, possibly rightly so.

My guess is that GCC developers care more about -ftrapv than about
MIPS.

It is a common misconception to treat GCC developers as a
monolithic group. There are hobbyists (such as myself) but
I would guess only a small minority of work is done by them,
with the notable exception of some front ends such as Fortran or
(the most recent example) Algol 68. There are employees of
different companies: Linux distributors like RedHat or Suse,
large software companies like Google, hardware vendors like
IBM, Intel or Qualcomm, ...

For MIPS, there are not so many active people and commits.
mips64-linux-gnu is a secondary platform, so if it fails
bootstrap, a release would be held up, but a wrong-code
regression will not.

Counting changes since 2025-01-01 in the gcc/config directories
can give a good idea of the relative activity for different
subdirectories; I cut this off below 7, where the PDP-11 is (note
that architecture names are often historical, so i386 includes
x86_64, s390 includes Z, rs6000 includes POWER and so on).

539 ./riscv
435 ./i386
432 ./aarch64
177 ./loongarch
100 ./arm
85 ./s390
75 ./avr
72 ./xtensa
60 ./rs6000
51 ./gcn
39 ./nvptx
25 ./sparc
25 ./mips
20 ./arc
19 ./pa
19 ./bpf
19 ./alpha
18 ./pru
15 ./sh
13 ./rx
13 ./cris
12 ./or1k
12 ./microblaze
12 ./m68k
11 ./lm32
11 ./ia64
11 ./h8300
9 ./vax
9 ./nds32
8 ./mcore
8 ./epiphany
8 ./c6x
7 ./visium
7 ./rl78
7 ./pdp11
7 ./frv
7 ./csky

AFAICS several architectures officially supported by GCC
struggle to work at all. I suspect that maintainers of MIPS
backend are happy that -ftrapv works and do not have resources
to make it efficient.

First, they would need to know about this, which requires a PR,
but resources may well be lacking.

There are currently 28 open "missed-optimization" bugs with mips
in their target field. Looking at a few architectures above,
RISC-V has 118, x86 has 943, aarch64 has 305, power has 133.
(Some bugs affect more than one architecture, of course).

But it is worth submitting a PR nonetheless, if anybody cares enough :-)
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.
--- Synchronet 3.22a-Linux NewsLink 1.2

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Sat Jun 6 08:57:09 2026

From Newsgroup: comp.arch

George Neuner <gneuner2@comcast.net> writes:

That is true, but the issue at hand is how to achieve that. Leaving
debug information /in/ the executable, I think, is a bad idea.

On the contrary, it's an excellent idea. It means that the debug
information goes with the code. No chance of confusing yourself by
inadvertently associating the wrong debugging information with the
code, and much less chance of not finding the correct debug
information.

Best of all, of course, is to deliver the source code.

Another useful method is to write out debug information as the program
executes and arrange that it either is suppressed or (alternatively)
goes to /dev/null unless some undocumented flag is given.

Undocumented features are forgotten and reimplemented. There's the
story of Microsoft embedding some watermark into Microsoft BASIC
twice, the second time apparently because they had forgotten about the
first time.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Sun Jun 7 16:42:25 2026

From Newsgroup: comp.arch

George Neuner <gneuner2@comcast.net> writes:

On Wed, 03 Jun 2026 00:55:35 GMT, MitchAlsup <user5857@newsgrouper.org.invalid> wrote:

quadi <quadibloc@ca.invalid> posted:

On Tue, 02 Jun 2026 17:50:38 +0200, Terje Mathisen wrote:

In the current environment where every language is expected to be
compatible with a generic IDE like Visual Studio Code, via open source
interface specifications, having a proprietary debug format seems like a
good way to strongly limit your potential customer base.

You appear to have understood his post in a different way than I did.

I wasn't thinking of the kind of debug information provided by a compiler.

I was thinking of leaving debug information in when one was distributing
software to customers.

Yes, you the vendor do not want random customer debugging the code,
however, you want the ability to debug the code that was distributed
on whatever medium on customer's system(s)--

AND you want to debug one copy of the running code while others are using
other processes running the code under normal use.

That is true, but the issue at hand is how to achieve that. Leaving
debug information /in/ the executable, I think, is a bad idea.

However, many (most?) toolchains provide a way to separate debug
symbols from the executable - either by generating a separate symbol
database in the 1st place, or by allowing debug data to be stripped
from the executables. If you have to debug at the client site, you
simply take the symbol database with you.

Indeed, and that's been the common paradigm at my employers.

I'll also note that many linux distributions include the debug
symbols for the distribution in optionally loaded packages.

Another useful method is to write out debug information as the program
executes and arrange that it either is suppressed or (alternatively)
goes to /dev/null unless some undocumented flag is given.

We arrange for the application to be able to be configured
(both statically before startup and dynamically during
runtime) to produce additional debug logging. Generally
arranged in the code to avoid significant impact to non-debug
performance (e.g. using __builtin_expect with GCC toolchains).
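
A minimal sketch of that kind of arrangement, assuming a single global on/off switch. Everything named here (APP_DEBUG, debug_set, DEBUG_LOG and so on) is hypothetical and only illustrates the pattern; __builtin_expect is the real GCC/Clang builtin mentioned above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* All names below are invented for this sketch. */
static bool debug_enabled = false;     /* off by default */
static unsigned long debug_lines = 0;  /* messages actually written */

/* Hint to the compiler that the debug branch is rarely taken. */
#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

/* Static configuration: consult an (assumed) environment variable at startup. */
void debug_init_from_env(void)
{
    const char *v = getenv("APP_DEBUG");
    debug_enabled = (v != NULL && v[0] == '1');
}

/* Dynamic configuration: flip the switch while the program runs,
 * e.g. when an administrative command arrives. */
void debug_set(bool on)
{
    debug_enabled = on;
}

/* Slow path: format and write one message to stderr. */
void debug_write(const char *fmt, ...)
{
    va_list ap;
    assert(fmt != NULL);
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    debug_lines++;
}

/* Fast path: one predictable test when logging is off; the arguments
 * are not even evaluated unless logging is on. */
#define DEBUG_LOG(...)                    \
    do {                                  \
        if (UNLIKELY(debug_enabled))      \
            debug_write(__VA_ARGS__);     \
    } while (0)
```

A call site would look like DEBUG_LOG("queue depth %d", depth); with logging off, the cost is a load and a branch that the compiler lays out as the unlikely path. A real multi-threaded application would likely want an atomic flag and per-subsystem levels rather than one plain bool.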

--- Synchronet 3.22a-Linux NewsLink 1.2

From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Sun Jun 7 15:05:24 2026

From Newsgroup: comp.arch

On 6/4/2026 8:46 PM, Stefan Monnier wrote:

Scott Lurndal [2026-06-03 18:36:51] wrote:

Being able to debug code without the source code doesn't seem
a particularly common use case,

Indeed, the source code should also be available, of course.
I started this thread by mentioning Free Software. 🙂

Note that free does not equal open source. There is a fair amount of
software that is freely available for which the source is not. Many of
these are reduced functionality versions of paid for software, e.g.
Adobe PDF reader, but there are others.
--
- Stephen Fuld
(e-mail address disguised to prevent spam)
--- Synchronet 3.22a-Linux NewsLink 1.2

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 8 01:19:17 2026

From Newsgroup: comp.arch

On Sun, 07 Jun 2026 15:05:24 -0700, Stephen Fuld wrote:

Note that free does not equal open source. There is a fair amount of software that is freely available for which the source is not. Many of
these are reduced functionality versions of paid for software, e.g.
Adobe PDF reader, but there are others.

Commonly, when this distinction is discussed in the open-source community,
the phrases "free as in beer" and "free as in freedom" are used to
distinguish between freeware that remains proprietary versus true open-
source software under the GPL.

John Savard

--- Synchronet 3.22a-Linux NewsLink 1.2

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 8 06:05:59 2026

From Newsgroup: comp.arch

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/4/2026 8:46 PM, Stefan Monnier wrote:

I started this thread by mentioning Free Software. 🙂

Note that free does not equal open source. There is a fair amount of
software that is freely available for which the source is not. Many of
these are reduced functionality versions of paid for software, e.g.
Adobe PDF reader, but there are others.

The Adobe PDF reader is chained software (aka proprietary software),
not free software.

In the appendix of "1984" George Orwell wrote:

|To give a single example, the word free still existed in Newspeak, but
|could only be used in such statements as "The dog is free from lice"
|or "This field is free from weeds." It could not be used in its old
|sense of "politically free" or "intellectually free," since political
|and intellectual freedom no longer existed even as concepts, and were
|therefore of necessity nameless.

Some of us obviously already write and think in Newspeak.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
--- Synchronet 3.22a-Linux NewsLink 1.2

From Michael S@already5chosen@yahoo.com to comp.arch on Mon Jun 8 09:25:32 2026

From Newsgroup: comp.arch

On Mon, 8 Jun 2026 01:19:17 -0000 (UTC)
quadi <quadibloc@ca.invalid> wrote:

On Sun, 07 Jun 2026 15:05:24 -0700, Stephen Fuld wrote:

Note that free does not equal open source. There is a fair amount
of software that is freely available for which the source is not.
Many of these are reduced functionality versions of paid for
software, e.g. Adobe PDF reader, but there are others.

Commonly, when this distinction is discussed in the open-source
community, the phrases "free as in beer" and "free as in freedom" are
used to distinguish between freeware that remains proprietary versus
true open-source software under the GPL.

John Savard

I strongly disagree with statement that true open source software is
equivalent of GPL.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Mon Jun 8 11:45:51 2026

From Newsgroup: comp.arch

Michael S wrote:

On Mon, 8 Jun 2026 01:19:17 -0000 (UTC)
quadi <quadibloc@ca.invalid> wrote:

On Sun, 07 Jun 2026 15:05:24 -0700, Stephen Fuld wrote:

Note that free does not equal open source. There is a fair amount
of software that is freely available for which the source is not.
Many of these are reduced functionality versions of paid for
software, e.g. Adobe PDF reader, but there are others.

Commonly, when this distinction is discussed in the open-source
community, the phrases "free as in beer" and "free as in freedom" are
used to distinguish between freeware that remains proprietary versus
true open-source software under the GPL.

John Savard

I strongly disagree with statement that true open source software is equivalent of GPL.

The obviously "most free" sw must be public domain, right?

Followed by free use but attribution required/requested?

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
--- Synchronet 3.22a-Linux NewsLink 1.2

From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Mon Jun 8 07:30:46 2026

From Newsgroup: comp.arch

On 6/7/2026 11:05 PM, Anton Ertl wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/4/2026 8:46 PM, Stefan Monnier wrote:

I started this thread by mentioning Free Software. 🙂

Note that free does not equal open source. There is a fair amount of
software that is freely available for which the source is not. Many of
these are reduced functionality versions of paid for software, e.g.
Adobe PDF reader, but there are others.

The Adobe PDF reader is chained software (aka proprietary software),
not free software.

I don't want to get into a semantic argument here. I don't know what
you mean by the term "chained software". I only meant that anyone
could use it without paying anything to anyone. In the sense that John
talked about, it is free beer.

If I misinterpreted Stefan's use of the word free, then I apologize.

In the appendix of "1984" George Orwell wrote:

|To give a single example, the word free still existed in Newspeak, but
|could only be used in such statements as "The dog is free from lice"
|or "This field is free from weeds." It could not be used in its old
|sense of "politically free" or "intellectually free," since political
|and intellectual freedom no longer existed even as concepts, and were
|therefore of necessity nameless.

Some of us obviously already write and think in Newspeak.

I hardly think that using the word free to mean "you don't have to pay
for it" is Newspeak.
--
- Stephen Fuld
(e-mail address disguised to prevent spam)
--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Mon Jun 8 15:19:57 2026

From Newsgroup: comp.arch

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/7/2026 11:05 PM, Anton Ertl wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/4/2026 8:46 PM, Stefan Monnier wrote:

I started this thread by mentioning Free Software. 🙂

Note that free does not equal open source. There is a fair amount of
software that is freely available for which the source is not. Many of
these are reduced functionality versions of paid for software, e.g.
Adobe PDF reader, but there are others.

The Adobe PDF reader is chained software (aka proprietary software),
not free software.

I don't want to get into a semantic argument here. I don't know what
you mean by the term "chained software". I only meant that anyone
could use it without paying anything to anyone. In the sense that John
talked about, it is free beer.

Acroread sends basic telemetry to Adobe every time you use it,
so in a sense, it's not exactly free.

xpdf on the other hand....

--- Synchronet 3.22a-Linux NewsLink 1.2

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Mon Jun 8 16:18:37 2026

From Newsgroup: comp.arch

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/7/2026 11:05 PM, Anton Ertl wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/4/2026 8:46 PM, Stefan Monnier wrote:

I started this thread by mentioning Free Software. 🙂

Note that free does not equal open source. There is a fair amount of
software that is freely available for which the source is not. Many of
these are reduced functionality versions of paid for software, e.g.
Adobe PDF reader, but there are others.

The Adobe PDF reader is chained software (aka proprietary software),
not free software.

I don't want to get into a semantic argument here. I don't know what
you mean by the term "chained software".

Non-free software. I put the more commonly used term in parentheses.

I only meant that anyone
could use it without paying anything to anyone.

That's not what "free software" means. The four essential freedoms of
software are
<https://www.gnu.org/philosophy/free-sw.en.html#fs-definition>:

|* The freedom to run the program as you wish, for any purpose (freedom 0).
|
|* The freedom to study how the program works, and change it so it does
| your computing as you wish (freedom 1). Access to the source code is
| a precondition for this.
|
|* The freedom to redistribute copies so you can help others (freedom 2).
|
|* The freedom to distribute copies of your modified versions to others
| (freedom 3). By doing this you can give the whole community a chance
| to benefit from your changes. Access to the source code is a
| precondition for this.
|
|A program is free software if it gives users adequately all of these
|freedoms. Otherwise, it is nonfree.

In the appendix of "1984" George Orwell wrote:

|To give a single example, the word free still existed in Newspeak, but
|could only be used in such statements as "The dog is free from lice"
|or "This field is free from weeds." It could not be used in its old
|sense of "politically free" or "intellectually free," since political
|and intellectual freedom no longer existed even as concepts, and were
|therefore of necessity nameless.

Some of us obviously already write and think in Newspeak.

I hardly think that using the word free to mean "you don't have to pay
for it" is Newspeak.

Orwell did not think about that meaning when he gave an example of
Newspeak use of "free", so if the meaning "gratis" for "free" existed
when he wrote the book in 1949, it was not widely-enough used to make
it into the book. In any case, the meaning "free from lice" existed
when Orwell wrote the book and still exists in Newspeak. Newspeak
does not introduce new meanings, but eliminates the "freedom"
meaning. And in your case, Newspeak obviously has been successful
(not the Ingsoc variant ("free from lice"), but the surveillance
capitalism variant ("you don't pay [money] for it")).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
--- Synchronet 3.22a-Linux NewsLink 1.2

From quadi@quadibloc@ca.invalid to comp.arch on Mon Jun 8 17:11:35 2026

From Newsgroup: comp.arch

On Mon, 08 Jun 2026 09:25:32 +0300, Michael S wrote:

On Mon, 8 Jun 2026 01:19:17 -0000 (UTC)
quadi <quadibloc@ca.invalid> wrote:

On Sun, 07 Jun 2026 15:05:24 -0700, Stephen Fuld wrote:

Note that free does not equal open source. There is a fair amount of
software that is freely available for which the source is not. Many
of these are reduced functionality versions of paid for software,
e.g. Adobe PDF reader, but there are others.

Commonly, when this distinction is discussed in the open-source
community, the phrases "free as in beer" and "free as in freedom" are
used to distinguish between freeware that remains proprietary versus
true open-source software under the GPL.

I strongly disagree with statement that true open source software is equivalent of GPL.

I did not mean to imply that _only_ GPL-licensed software is truly open source. The GPL license is only the most common example. There is, as
another reply has already noted, also MIT-licensed software and public
domain software.

And the term "open source", of course, is broader than this, as well. It
isn't incorrect to use that term for any software the source code of which
is open to inspection, even if the software itself is proprietary.

John Savard

--- Synchronet 3.22a-Linux NewsLink 1.2

From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Mon Jun 8 10:40:17 2026

From Newsgroup: comp.arch

On 6/8/2026 9:18 AM, Anton Ertl wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/7/2026 11:05 PM, Anton Ertl wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/4/2026 8:46 PM, Stefan Monnier wrote:

I started this thread by mentioning Free Software. 🙂

Note that free does not equal open source. There is a fair amount of
software that is freely available for which the source is not. Many of
these are reduced functionality versions of paid for software, e.g.
Adobe PDF reader, but there are others.

The Adobe PDF reader is chained software (aka proprietary software),
not free software.

I don't want to get into a semantic argument here. I don't know what
you mean by the term "chained software".

Non-free software. I put the more commonly used term in parentheses.

OK.

I only meant that anyone
could use it without paying anything to anyone.

That's not what "free software" means. The four essential freedoms of software are
<https://www.gnu.org/philosophy/free-sw.en.html#fs-definition>:

|* The freedom to run the program as you wish, for any purpose (freedom 0).
|
|* The freedom to study how the program works, and change it so it does
| your computing as you wish (freedom 1). Access to the source code is
| a precondition for this.
|
|* The freedom to redistribute copies so you can help others (freedom 2).
|
|* The freedom to distribute copies of your modified versions to others
| (freedom 3). By doing this you can give the whole community a chance
| to benefit from your changes. Access to the source code is a
| precondition for this.
|
|A program is free software if it gives users adequately all of these
|freedoms. Otherwise, it is nonfree.

That is certainly *a* definition. It is obviously your preferred
definition. But there are others.

snipped the Orwell quotation.

Can you accept that others might have a different definition without
insulting them? (I take the assertion that I am using Newspeak as an
insult.)
--
- Stephen Fuld
(e-mail address disguised to prevent spam)
--- Synchronet 3.22a-Linux NewsLink 1.2

From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon Jun 8 09:18:02 2026

From Newsgroup: comp.arch

Indeed, the source code should also be available, of course.
I started this thread by mentioning Free Software. 🙂

Note that free does not equal open source. There is a fair amount of software that is freely available for which the source is not. Many of
these are reduced functionality versions of paid for software, e.g. Adobe
PDF reader, but there are others.

You may want to check on Wikipedia what is [Free Software](https://en.wikipedia.org/wiki/Free_software) before jumping
to conclusions.

I capitalized "Free Software" for a reason.

=== Stefan
--- Synchronet 3.22a-Linux NewsLink 1.2

From Michael S@already5chosen@yahoo.com to comp.arch on Mon Jun 8 22:43:40 2026

From Newsgroup: comp.arch

On Mon, 08 Jun 2026 16:18:37 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/7/2026 11:05 PM, Anton Ertl wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/4/2026 8:46 PM, Stefan Monnier wrote:

I started this thread by mentioning Free Software. 🙂

Note that free does not equal open source. There is a fair
amount of software that is freely available for which the source
is not. Many of these are reduced functionality versions of paid
for software, e.g. Adobe PDF reader, but there are others.

The Adobe PDF reader is chained software (aka proprietary
software), not free software.

I don't want to get into a semantic argument here. I don't know
what you mean by the term "chained software".

Non-free software. I put the more commonly used term in parentheses.

I only meant that anyone
could use it without paying anything to anyone.

That's not what "free software" means. The four essential freedoms of software are
<https://www.gnu.org/philosophy/free-sw.en.html#fs-definition>:

|* The freedom to run the program as you wish, for any purpose (freedom 0).
|
|* The freedom to study how the program works, and change it so it does
| your computing as you wish (freedom 1). Access to the source code is
| a precondition for this.
|
|* The freedom to redistribute copies so you can help others (freedom 2).
|
|* The freedom to distribute copies of your modified versions to others
| (freedom 3). By doing this you can give the whole community a chance
| to benefit from your changes. Access to the source code is a
| precondition for this.
|
|A program is free software if it gives users adequately all of these
|freedoms. Otherwise, it is nonfree.

In the appendix of "1984" George Orwell wrote:

|To give a single example, the word free still existed in Newspeak, but
|could only be used in such statements as "The dog is free from lice"
|or "This field is free from weeds." It could not be used in its old
|sense of "politically free" or "intellectually free," since political
|and intellectual freedom no longer existed even as concepts, and were
|therefore of necessity nameless.

Some of us obviously already write and think in Newspeak.

I hardly think that using the word free to mean "you don't have to
pay for it" is Newspeak.

Orwell did not think about that meaning when he gave an example of
Newspeak use of "free", so if the meaning "gratis" for "free" existed
when he wrote the book in 1949, it was not widely-enough used to make
it into the book. In any case, the meaning "free from lice" existed
when Orwell wrote the book and still exists in Newspeak. Newspeak
does not introduce new meanings, but eliminates the "freedom"
meaning. And in your case, Newspeak obviously has been successful
(not the Ingsoc variant ("free from lice"), but the surveillance
capitalism variant ("you don't pay [money] for it")).

- anton

I tend to think that it was the other way around.
RMS invented a new meaning of the term "free software" and then he and
his devotees started to insist that it is the only correct meaning.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Michael S@already5chosen@yahoo.com to comp.arch on Mon Jun 8 22:51:52 2026

From Newsgroup: comp.arch

On Mon, 08 Jun 2026 15:19:57 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/7/2026 11:05 PM, Anton Ertl wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

On 6/4/2026 8:46 PM, Stefan Monnier wrote:

I started this thread by mentioning Free Software. 🙂

Note that free does not equal open source. There is a fair
amount of software that is freely available for which the source
is not. Many of these are reduced functionality versions of paid
for software, e.g. Adobe PDF reader, but there are others.

The Adobe PDF reader is chained software (aka proprietary
software), not free software.

I don't want to get into a semantic argument here. I don't know
what you mean by the term "chained software". I only meant that
anyone could use it without paying anything to anyone. In the sense
that John talked about, it is free beer.

Acroread sends basic telemetry to Adobe every time you use it,
so in a sense, it's not exactly free.

xpdf on the other hand....

I prefer SumatraPDF. GPL3, but not available outside Windows.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon Jun 8 18:03:43 2026

From Newsgroup: comp.arch

Anton Ertl [2026-06-08 16:18:37] wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

I hardly think that using the word free to mean "you don't have to pay
for it" is Newspeak.

Orwell did not think about that meaning when he gave an example of
Newspeak use of "free", so if the meaning "gratis" for "free" existed
when he wrote the book in 1949, it was not widely-enough used to make
it into the book. In any case, the meaning "free from lice" existed
when Orwell wrote the book and still exists in Newspeak. Newspeak
does not introduce new meanings, but eliminates the "freedom"
meaning. And in your case, Newspeak obviously has been successful
(not the Ingsoc variant ("free from lice"), but the surveillance
capitalism variant ("you don't pay [money] for it")).

Well, Stephen is hardly using a recent meaning of the word "free".
According to the OED, "free" as in "free of charge" traces back to the
13th century, so it clearly existed in Orwell's time.

But yes, I find it demoralizing that people within the computer world
are still making this mistake, after more than 40 years of FSF.

=== Stefan
--- Synchronet 3.22a-Linux NewsLink 1.2

From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Mon Jun 8 17:44:59 2026

From Newsgroup: comp.arch

Terje Mathisen [2026-06-08 11:45:51] wrote:

Michael S wrote:
The obviously "most free" sw must be public domain, right?

As with most things related to freedom ... it depends.

Public domain offers "more freedom" when you consider the point of view
of the developers, who can use that software any way they want with no restrictions at all.

But not when you consider the point of view of the end-users who may
receive code compiled/derived from that public domain source code with
no way to recover that public domain source code, or to change or fix
it. It may even be illegal to try to recover it (since the DMCA
disallows several forms of reverse engineering).

From that end-user point of view, the GPL arguably ensures "more
freedom" than public domain.

=== Stefan
--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.arch on Tue Jun 9 10:06:48 2026

From Newsgroup: comp.arch

On 09/06/2026 00:03, Stefan Monnier wrote:

Anton Ertl [2026-06-08 16:18:37] wrote:

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:

I hardly think that using the word free to mean "you don't have to pay
for it" is Newspeak.

Orwell did not think about that meaning when he gave an example of
Newspeak use of "free", so if the meaning "gratis" for "free" existed
when he wrote the book in 1949, it was not widely-enough used to make
it into the book. In any case, the meaning "free from lice" existed
when Orwell wrote the book and still exists in Newspeak. Newspeak
does not introduce new meanings, but eliminates the "freedom"
meaning. And in your case, Newspeak obviously has been successful
(not the Ingsoc variant ("free from lice"), but the surveillance
capitalism variant ("you don't pay [money] for it")).

Well, Stephen is hardly using a recent meaning of the word "free".
According to the OED, "free" as in "free of charge" traces back to the
13th century, so it clearly existed in Orwell's time.

But yes, I find it demoralizing that people within the computer world
are still making this mistake, after more than 40 years of FSF.

It is not a mistake - it is merely a different but perfectly reasonable
use of the same word. The FSF has done (and continues to do) wonderful
things that are of huge benefit to the computing world, and I am a big
fan of what they term "Free Software". But they do not own the word
"free", nor do they have rights to determine the definition of the
phrase "free software". People can, and do, use the phrase meaning
"gratis software". In any discussion on the topic, it is good to be
entirely clear on the intended meanings - but that applies equally to
those who write "free software" meaning "libre software" and "free
software" meaning "gratis software". Neither are "mistaken", and both
can cause confusion. (In longer phrases, acronyms, or proper names of organisations, there should be no confusion - FOSS or FSF should be
clear to all.)

(As for the discussion about what is the most "free", or "libre",
licensing model - you can argue about it until you are blue in the face,
but no conclusion can be reached because it depends on the point of
view. Freedoms are always a balance and a tradeoff to some extent.)

--- Synchronet 3.22a-Linux NewsLink 1.2

From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Tue Jun 9 17:24:24 2026

From Newsgroup: comp.arch

Michael S <already5chosen@yahoo.com> writes:

On Mon, 08 Jun 2026 16:18:37 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

Orwell did not think about that meaning when he gave an example of
Newspeak use of "free", so if the meaning "gratis" for "free" existed
when he wrote the book in 1949, it was not widely-enough used to make
it into the book. In any case, the meaning "free from lice" existed
when Orwell wrote the book and still exists in Newspeak. Newspeak
does not introduce new meanings, but eliminates the "freedom"
meaning. And in your case, Newspeak obviously has been successful
(not the Ingsoc variant ("free from lice"), but the surveillance
capitalism variant ("you don't pay [money] for it")).

- anton

I tend to think that it was the other way around.
RMS invented a new meaning of the term "free software"

The story that I read was that all software originally was free in the
FSF sense (i.e., provided the four freedoms).[1] Then some people
removed some or all freedoms from some software, typically with the
goal of making money from the software. Removing the freedoms and yet
not asking for money is a later development; this has often been
called shareware or freeware, but Stephen Fuld is the first one I have
seen who has called it "free software", and actually misunderstood a
reference to "Free Software" (capitalized).

[1] As an example, <https://en.wikipedia.org/wiki/SHARE_(computing)>
states:

|Originally, IBM distributed what software it provided in source
|form[2][3][4] and systems programmers commonly made small local
|additions or modifications and exchanged them with other users.

All four freedoms were exercised here, more than two decades before
the Free Software Foundation.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
--- Synchronet 3.22a-Linux NewsLink 1.2

From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Tue Jun 9 21:15:41 2026

From Newsgroup: comp.arch

Anton Ertl wrote:

Michael S <already5chosen@yahoo.com> writes:

On Mon, 08 Jun 2026 16:18:37 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

Orwell did not think about that meaning when he gave an example of
Newspeak use of "free", so if the meaning "gratis" for "free" existed
when he wrote the book in 1949, it was not widely-enough used to make
it into the book. In any case, the meaning "free from lice" existed
when Orwell wrote the book and still exists in Newspeak. Newspeak
does not introduce new meanings, but eliminates the "freedom"
meaning. And in your case, Newspeak obviously has been successful
(not the Ingsoc variant ("free from lice"), but the surveillance
capitalism variant ("you don't pay [money] for it")).

- anton

I tend to think that it was the other way around.
RMS invented a new meaning of the term "free software"

The story that I read was that all software originally was free in the
FSF sense (i.e., provided the four freedoms).[1] Then some people
removed some or all freedoms from some software, typically with the
goal of making money from the software. Removing the freedoms and yet
not asking for money is a later development; this has often been
called shareware or freeware, but Stephen Fuld is the first one I have
seen who has called it "free software", and actually misunderstood a reference to "Free Software" (capitalized).

[1] As an example, <https://en.wikipedia.org/wiki/SHARE_(computing)>
states:

|Originally, IBM distributed what software it provided in source
|form[2][3][4] and systems programmers commonly made small local
|additions or modifications and exchanged them with other users.

All four freedoms were exercised here, more than two decades before
the Free Software Foundation.

Yes, with one important restriction:

The software was free, but you could not use it except on IBM hardware,
which was quite expensive.

When clones started to appear (Amdahl?) I believe the free sw
disappeared, now it was explicitly licensed to only run on "real" IBM hardware?

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Tue Jun 9 16:29:01 2026

From Newsgroup: comp.arch

On 6/9/2026 12:15 PM, Terje Mathisen wrote:

Anton Ertl wrote:

Michael S <already5chosen@yahoo.com> writes:

On Mon, 08 Jun 2026 16:18:37 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

Orwell did not think about that meaning when he gave an example of
Newspeak use of "free", so if the meaning "gratis" for "free" existed
when he wrote the book in 1949, it was not widely-enough used to make
it into the book. In any case, the meaning "free from lice" existed
when Orwell wrote the book and still exists in Newspeak. Newspeak
does not introduce new meanings, but eliminates the "freedom"
meaning. And in your case, Newspeak obviously has been successful
(not the Ingsoc variant ("free from lice"), but the surveillance
capitalism variant ("you don't pay [money] for it")).
- anton

I tend to think that it was the other way around.
RMS invented a new meaning of the term "free software"

The story that I read was that all software originally was free in the
FSF sense (i.e., provided the four freedoms).[1] Then some people
removed some or all freedoms from some software, typically with the
goal of making money from the software. Removing the freedoms and yet
not asking for money is a later development; this has often been
called shareware or freeware, but Stephen Fuld is the first one I have
seen who has called it "free software", and actually misunderstood a
reference to "Free Software" (capitalized).

[1] As an example, <https://en.wikipedia.org/wiki/SHARE_(computing)>
states:

|Originally, IBM distributed what software it provided in source
|form[2][3][4] and systems programmers commonly made small local
|additions or modifications and exchanged them with other users.

All four freedoms were exercised here, more than two decades before
the Free Software Foundation.

Yes, with one important restriction:

The software was free, but you could not use it except on IBM hardware, which was quite expensive.

When clones started to appear (Amdahl?) I believe the free sw
disappeared, now it was explicitly licensed to only run on "real" IBM hardware?

Sort of, but it was more complicated than that. In the 1960s (and
before), IBM "bundled" (i.e. it was freely included) all software with
the hardware. In 1969, the US government filed an anti-trust case
against IBM, claiming, among other things, monopolization of the
software market. One of the Government's goals was to support an
independent software market (which couldn't exist if IBM gave everything
away for free). The suit dragged on for years and was ultimately
withdrawn, but IBM was scared about what the suit could do. So it
initiated "unbundling" of software (and other things like education
classes), now charging separately for each software product the customer wanted. This was ultimately successful for the government, leading to
the success of companies like Syncsort (1971), and later several
competitive database systems (e.g., Total, IDMS, etc.). But it also
allowed Amdahl (in 1971), and later other PCM hardware companies, to
sell competitive CPUs with the assurance that they could license the OS,
etc. from IBM.
--
- Stephen Fuld
(e-mail address disguised to prevent spam)

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Wed Jun 10 06:01:19 2026

From Newsgroup: comp.arch

Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:

The story that I read was that all software originally was free in the
FSF sense (i.e., provided the four freedoms).[1]

There is an anecdote in "Abstracting Away the Machine".
IBM supplied a customer with its Fortran compiler (Fortran I at
the time). The customer noted that tape use was inefficient,
leading to longer than necessary compile times, and asked for
the source to improve it. Somebody at IBM refused, quipping "IBM
does not supply source code". So the customer went ahead, reverse
engineered the compiler and added the improvements anyway (which
were huge). When IBM noticed that, they asked for the improvement,
and the customer quipped back "$COMPANY does not supply object code",
and refused.

I'd have to search for the anecdote in the book to get the details
exactly right.
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From John Levine@johnl@taugh.com to comp.arch on Thu Jun 11 15:14:26 2026

From Newsgroup: comp.arch

According to Stefan Monnier <monnier@iro.umontreal.ca>:

But not when you consider the point of view of the end-users who may
receive code compiled/derived from that public domain source code with
no way to recover that public domain source code, or to change or fix
it. It may even be illegal to try to recover it (since the DMCA
disallows several forms of reverse engineering).

If it's really public domain, there is no bar to reverse engineering
since there is nobody who can complain about it. I agree there are
other kinds of software where the executable is freely available but
the authors choose not to provide source and could use the DMCA
against people who reverse engineer.

From that end-user point of view, the GPL arguably ensures "more
freedom" than public domain.

I definitely agree with "arguably".

Speaking of the DMCA, the US Copyright Office is starting its tenth proceeding to update the list of exemptions to the DMCA for research, analysis and other non-infringing uses. Worth a look if you're interested in the topic.

https://www.copyright.gov/1201/2027/
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From George Neuner@gneuner2@comcast.net to comp.arch on Thu Jun 11 13:43:29 2026

From Newsgroup: comp.arch

On Mon, 08 Jun 2026 17:44:59 -0400, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:

Terje Mathisen [2026-06-08 11:45:51] wrote:

Michael S wrote:
The obviously "most free" sw must be public domain, right?

As with most things related to freedom ... it depends.

Public domain offers "more freedom" when you consider the point of view
of the developers, who can use that software any way they want with no
restrictions at all.

But not when you consider the point of view of the end-users who may
receive code compiled/derived from that public domain source code with
no way to recover that public domain source code, or to change or fix
it. It may even be illegal to try to recover it (since the DMCA
disallows several forms of reverse engineering).

From that end-user point of view, the GPL arguably ensures "more
freedom" than public domain.

=== Stefan

Not to mention that there are many countries that do not recognize
public domain. And even where it technically is recognized, some
countries have legal procedures that must be followed to relinquish
your rights and so complicate actually putting something into public
domain.

Putting <whatever> under some kind of license - regardless of how
permissive it is - actually is easier to do in many places, and is
recognized in more places.

YMMV.

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Fri Jun 12 01:04:53 2026

From Newsgroup: comp.arch

Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:

David Brown <david.brown@hesbynett.no> writes:

On 25/05/2026 16:28, Anton Ertl wrote:

Despite their eagerness to "optimize" based on the assumption
that signed integer overflow does not happen, the GCC developers have
avoided making -ftrapv the default, even on platforms like MIPS and
Alpha where the implementation of -ftrapv just means to use different
instructions (e.g., add instead of addu on MIPS, and addv instead of
add on Alpha).

An awkward thing about using trap on overflow is determining how
precisely it is defined. Supposing you have the expression "a + b - a".
Perhaps "a + b" overflows. I would hope that when using debug-related
compiler flags such as "-fsanitize=signed-integer-overflow", a compiler
would check for overflow on "a + b", and report it at runtime.
(Unfortunately, gcc does not do that unless the partial expression is
assigned to a variable.) But in "normal" usage, I'd expect the
expression to be simplified, resulting in just "b" and no overflow.

OTOH, cases like a+b+c where the result is in range, while an
intermediate result is out of range are one of the reasons why I
prefer -fwrapv over -ftrapv. As for your preference of nasal demons,
given enough information, the compiler might "optimize" "a+b-a" into,
e.g., 0.
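
A minimal sketch of the a+b+c case, assuming gcc or clang (the
__builtin_add_overflow builtin is a compiler extension, and the function
names are made up for illustration). The first function mimics wraparound
as with -fwrapv; the second reports overflow at the first addition that
overflows, which is roughly what the -ftrapv documentation describes:

```c
#include <limits.h>

/* Sum three longs with wraparound semantics, similar to what -fwrapv gives.
   Unsigned arithmetic wraps modulo 2^N in standard C, so the additions are
   done on unsigned long and converted back at the end. Converting a value
   above LONG_MAX back to long is implementation-defined; gcc documents it
   as reduction modulo 2^N. */
long sum3_wrapping(long a, long b, long c)
{
    unsigned long s = (unsigned long)a + (unsigned long)b;
    s += (unsigned long)c;
    return (long)s;
}

/* Sum three longs, reporting overflow as soon as any single addition
   overflows. Returns 1 on overflow, 0 on success (result in *out).
   Hypothetical helper, not a gcc interface. */
int sum3_checked(long a, long b, long c, long *out)
{
    long t;
    if (__builtin_add_overflow(a, b, &t))
        return 1;                       /* intermediate a+b is out of range */
    return __builtin_add_overflow(t, c, out);
}
```

With a = LONG_MAX, b = 1, c = -2, the wrapping version returns
LONG_MAX - 1, which is the mathematically correct sum, while the checked
version reports overflow on the intermediate a+b.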

Anyway, the definition of -ftrapv is not very precise; for gcc-12.2:

|'-ftrapv'
| This option generates traps for signed overflow on addition,
| subtraction, multiplication operations.

As for what gcc-12.2 does for your example on AMD64:

long foo(long a, long b)
{
  return a+b-a;
}

is compiled with gcc -O3 -ftrapv to:

   0:   48 89 f0                mov    %rsi,%rax
   3:   c3                      ret

That is what I expect from '-ftrapv': running code should deliver
the result as if using infinite-precision arithmetic, or else trap on
overflow. A tighter specification could be that optimized code should
not generate an overflow trap in cases where computing naively using C
semantics does not lead to overflow. Since the result above
agrees with the result obtained using infinite-precision arithmetic,
the code is fine and there is no need for runtime checks.
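
To make the two readings concrete, here is a rough sketch on 32-bit
operands, again assuming gcc or clang for the overflow builtins.
eval_stepwise and eval_ideal are hypothetical names, not anything gcc
provides; they only model "trap at every step" versus "trap only if the
exact result does not fit":

```c
#include <stdint.h>

/* Naive reading: evaluate a+b-a one operation at a time in 32 bits and
   report overflow at the first step that overflows.
   Returns 1 on overflow, 0 on success (result in *out). */
int eval_stepwise(int32_t a, int32_t b, int32_t *out)
{
    int32_t t;
    if (__builtin_add_overflow(a, b, &t))
        return 1;                       /* a+b does not fit in 32 bits */
    return __builtin_sub_overflow(t, a, out);
}

/* "Infinite precision or trap" reading: compute in a type wide enough to
   hold every intermediate exactly (64 bits suffice for 32-bit operands),
   and report overflow only if the final value does not fit.
   For this particular expression the exact result is always b, so the
   range check can never fail; it is kept to show the general shape. */
int eval_ideal(int32_t a, int32_t b, int32_t *out)
{
    int64_t wide = (int64_t)a + (int64_t)b - (int64_t)a;
    if (wide < INT32_MIN || wide > INT32_MAX)
        return 1;
    *out = (int32_t)wide;
    return 0;
}
```

For a = INT32_MAX and b = 1, eval_stepwise reports overflow while
eval_ideal returns 1, which is the same answer as the two-instruction
code above.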

Of course, languages like C++ which turn traps into exceptions
and allow them to be used as part of computations may have a problem
here. More precisely, if they specify that an overflow exception
must happen at a given computational step, then the optimizer may be
forced to generate code which is there only to generate the trap
and serves no other purpose. But the original intent of the
overflow trap is to signal that a real machine using fixed-size
numbers cannot deliver the same result as an ideal machine
using infinite precision. If a language respects that intent,
then the compiler can do a lot of optimizations.
--
Waldek Hebisch

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Fri Jun 12 01:57:06 2026

From Newsgroup: comp.arch

Thomas Koenig <tkoenig@netcologne.de> wrote:

Waldek Hebisch <antispam@fricas.org> schrieb:

Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:

MIPS' add traps on overflow. gcc could have emitted almost the same
code for gcc -O3 -ftrapv as for gcc -O3, except that the last
instruction would be an add, not an addu. But apparently nobody gives
a damn about the efficiency of -ftrapv, possibly rightly so.

My guess is that GCC developers care more about -ftrapv than about
MIPS.

It is a common misconception to treat GCC developers as a
monolithic group. There are hobbyists (such as myself) but
I would guess only a small minority of work is done by them,
with the notable exception of some front ends such as Fortran or
(the most recent example) Algol 68. There are employees of
different companies: Linux distributors like RedHat or Suse,
large software companies like Google, hardware vendors like
IBM, Intel or Qualcomm, ...

Well, normal employees care about the things that their employer
tells them to do. Whatever the reason, GCC developers
(that is, people contributing to GCC) each have their own agenda,
and care more about some things and less about others.

For MIPS, there are not so many active people and commits.
mips64-linux-gnu is a secondary platform, so if it fails
bootstrap, a release would be held up, but a wrong-code
regression will not.

Counting changes since 2025-01-01 in the gcc/config directories
can give a good idea of the relative activity for different
subdirectories; I cut this off below 7, where the PDP-11 is (note
that architecture names are often historical, so i386 includes
x86_64, s390 includes Z, rs6000 includes POWER and so on).

539 ./riscv
435 ./i386
432 ./aarch64
177 ./loongarch
100 ./arm
 85 ./s390
 75 ./avr
 72 ./xtensa
 60 ./rs6000
 51 ./gcn
 39 ./nvptx
 25 ./sparc
 25 ./mips
 20 ./arc
 19 ./pa
 19 ./bpf
 19 ./alpha
 18 ./pru
 15 ./sh
 13 ./rx
 13 ./cris
 12 ./or1k
 12 ./microblaze
 12 ./m68k
 11 ./lm32
 11 ./ia64
 11 ./h8300
  9 ./vax
  9 ./nds32
  8 ./mcore
  8 ./epiphany
  8 ./c6x
  7 ./visium
  7 ./rl78
  7 ./pdp11
  7 ./frv
  7 ./csky

AFAICS several architectures officially supported by GCC
struggle to work at all. I suspect that the maintainers of the MIPS
backend are happy that -ftrapv works and do not have the resources
to make it efficient.

First, they would need to know about this, which requires a PR,
but resources may well be lacking.

There are currently 28 open "missed-optimization" bugs with mips
in their target field. Looking at a few architectures above,
RISC-V has 118, x86 has 943, aarch64 has 305, power has 133.
(Some bugs affect more than one architecture, of course).

But it is worth submitting a PR nonetheless, if anybody cares enough :-)

Frankly, I do not care enough. I mean, I like the fact that GCC
supports several architectures. But I have use for x86_64, arm (both
32-bit and 64-bit), RISC-V and a few embedded processors, that is,
I have the processors and can run GCC output on them. I even have some
use for s390/z, that is, I have an emulator (Hercules) and have some
interest in software running inside the emulator. But I have essentially
no use for MIPS.

And in a slightly different spirit, IIRC there were cases when bug
reports caused a reaction of the sort "Yes, it is buggy. It would be
too much effort to fix it, so we will just remove support".
If support is removed, then the work needed to revive an architecture
is likely to be significantly larger than in the case of a bitrotten
but still in-tree architecture.
--
Waldek Hebisch

From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Thu Jun 11 15:21:14 2026

From Newsgroup: comp.arch

David Brown [2026-06-09 10:06:48] wrote:

On 09/06/2026 00:03, Stefan Monnier wrote:

Well, Stephen is hardly using a recent meaning of the word "free".
According to the OED, "free" as in "free of charge" traces back to the
13th century, so it clearly existed in Orwell's time.
But yes, I find it demoralizing that people within the computer world
are still making this mistake, after more than 40 years of FSF.

It is not a mistake - it is merely a different but perfectly reasonable use of the same word.

In an arbitrary context, I could agree, but here we're talking about
a subthread that started with:

On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:
> MitchAlsup [2026-05-26 20:54:30] wrote:
>> Encrypt the debug information (and put it in a
>> {1234-5678-9101-1121-...} folder) so that only the owner (not
>> licensee) of the code can debug it.
> I resent that. All code should be Free Software.

I think there is no ambiguity here.

Treating this "Free Software" to refer to price rather than to freedom
is an error that can be explained only by a lack of familiarity with the
idea of software freedom.

=== Stefan

From David Brown@david.brown@hesbynett.no to comp.arch on Fri Jun 12 13:02:41 2026

From Newsgroup: comp.arch

On 11/06/2026 21:21, Stefan Monnier wrote:

David Brown [2026-06-09 10:06:48] wrote:

On 09/06/2026 00:03, Stefan Monnier wrote:

Well, Stephen is hardly using a recent meaning of the word "free".
According to the OED, "free" as in "free of charge" traces back to the
13th century, so it clearly existed in Orwell's time.
But yes, I find it demoralizing that people within the computer world
are still making this mistake, after more than 40 years of FSF.

It is not a mistake - it is merely a different but perfectly reasonable use
of the same word.

In an arbitrary context, I could agree, but here we're talking about
a subthread that started with:

On Wed, 27 May 2026 10:59:31 -0400, Stefan Monnier wrote:
> MitchAlsup [2026-05-26 20:54:30] wrote:
>> Encrypt the debug information (and put it in a
>> {1234-5678-9101-1121-...} folder) so that only the owner (not
>> licensee) of the code can debug it.
> I resent that. All code should be Free Software.

I think there is no ambiguity here.

Treating this "Free Software" to refer to price rather than to freedom
is an error that can be explained only by a lack of familiarity with the
idea of software freedom.

Fair enough - I agree that in that context, the term "Free Software" is unambiguous. The capitalisation is important, and that was lost by the
post to which I replied. (There are a /lot/ of posts in this thread,
and when switching between two computers I have undoubtedly skipped many
of them.)


From John Levine@johnl@taugh.com to comp.arch on Sat Jun 13 03:01:00 2026

From Newsgroup: comp.arch

According to George Neuner <gneuner2@comcast.net>:

As with most things related to freedom ... it depends.

Not to mention that there are many countries that do not recognize
public domain. And even where it technically is recognized, some
countries have legal procedures that must be followed to relinquish
your rights and so complicate actually putting something into public
domain.

I am not aware of any countries that do not have the public domain for
material whose copyright has expired, or for whatever reason was not
eligible for copyright in the first place. But you're right, in some
places it is impossible or at least impractical to relinquish your
rights and put something in the P.D. before it would get there anyway.

Putting <whatever> under some kind of license - regardless of how
permissive it is - actually is easier to do in many places, and is
recognized in more places.

Agreed. There are lots of licenses other than the GPL that are
used successfully for open source software.

R's,
John
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From George Neuner@gneuner2@comcast.net to comp.arch on Sat Jun 13 05:49:19 2026

From Newsgroup: comp.arch

On Sat, 13 Jun 2026 03:01:00 -0000 (UTC), John Levine
<johnl@taugh.com> wrote:

According to George Neuner <gneuner2@comcast.net>:

As with most things related to freedom ... it depends.

Not to mention that there are many countries that do not recognize
public domain. And even where it technically is recognized, some
countries have legal procedures that must be followed to relinquish
your rights and so complicate actually putting something into public
domain.

I am not aware of any countries that do not have the public domain for
material whose copyright has expired, or for whatever reason was not
eligible for copyright in the first place. But you're right, in some
places it is impossible or at least impractical to relinquish your
rights and put something in the P.D. before it would get there anyway.

The Berne convention defined an implicit copyright that exists by
virtue of authorship and persists until the author's death. Though
the US does not recognize or enforce these implicit copyrights, most signatories to either Berne (1886) or UCC (1952) conventions do
recognize and enforce Berne copyrights.

Explicit copyrights - filed with Copyright offices - can be
voluntarily surrendered at any time. It is giving up the implicit
copyright that is the problem with public domain.

Putting <whatever> under some kind of license - regardless of how
permissive it is - actually is easier to do in many places, and is
recognized in more places.

Agreed. There are lots of licenses other than the GPL that are
used successfully for open source software.

R's,
John


From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Sat Jun 13 10:52:08 2026

From Newsgroup: comp.arch

George Neuner <gneuner2@comcast.net> writes:

The Berne convention defined an implicit copyright that exists by
virtue of authorship and persists until the author's death. Though
the US does not recognize or enforce these implicit copyrights, most
signatories to either Berne (1886) or UCC (1952) conventions do
recognize and enforce Berne copyrights.

According to <https://en.wikipedia.org/wiki/Berne_convention>:

|The United States acceded to the convention on 16 November 1988, and
|the convention entered into force for the United States on 1 March
|1989.

How can the convention have entered into force in the US without the
US recognizing or enforcing implicit copyrights?

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

From quadibloc@quadibloc@invalid.com (John Savard) to comp.arch on Sat Jun 13 14:20:57 2026

From Newsgroup: comp.arch

On Sat, 13 Jun 2026 10:52:08 GMT, anton@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:

George Neuner <gneuner2@comcast.net> writes:

The Berne convention defined an implicit copyright that exists by
virtue of authorship and persists until the author's death. Though
the US does not recognize or enforce these implicit copyrights, most
signatories to either Berne (1886) or UCC (1952) conventions do
recognize and enforce Berne copyrights.

According to <https://en.wikipedia.org/wiki/Berne_convention>:

|The United States acceded to the convention on 16 November 1988, and
|the convention entered into force for the United States on 1 March
|1989.

How can the convention have entered into force in the US without the
US recognizing or enforcing implicit copyrights?

Implicit copyrights do exist now in the U.S. because of its
ratification of the Berne convention.

But there are some limitations.

Nothing that entered the public domain prior to this ratification
became copyrighted again; there was no retroactive effect.

Also, U.S. parties are still incentivized to register their
copyrights, because this is necessary to receive statutory damages and
attorney's fees from a copyright lawsuit.

John Savard

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Sat Jun 13 15:07:00 2026

From Newsgroup: comp.arch

John Levine <johnl@taugh.com> wrote:

According to George Neuner <gneuner2@comcast.net>:

As with most things related to freedom ... it depends.

Not to mention that there are many countries that do not recognize
public domain. And even where it technically is recognized, some
countries have legal procedures that must be followed to relinquish
your rights and so complicate actually putting something into public
domain.

I am not aware of any countries that do not have the public domain for material whose copyright has expired,

My country (Poland) has a rule that once copyright has expired,
the distributor of a work should pay royalties to the state. I
am not sure how it works in "interesting" cases, but clearly
this is quite different from the US/UK meaning of public domain.

Also, the law of my country declares some authors' rights as
non-transferable. Basically, an author can sue if he/she/it
thinks that the artistic integrity of the work is violated.
Theoretically, one could imagine some old, no longer commercially
viable program being released as public domain, new people fixing
old bugs, and the original developers suing on the grounds that the
bug fixes deprive users of the original experience and hence violate
artistic integrity. Probably not going to work in court, but we had
a case where sensible improvements to buildings were blocked by
architects.

BTW, our copyright law has a notion of "area of exploration" and
states that a copyright transfer is effective only for explicitly
transferred rights. All rights in "areas of exploration" which
are not explicitly transferred stay with the authors. I guess that
training an LLM would count as a new "area of exploration"...
--
Waldek Hebisch

From John Levine@johnl@taugh.com to comp.arch on Sat Jun 13 17:25:27 2026

From Newsgroup: comp.arch

According to Waldek Hebisch <antispam@fricas.org>:

My country (Poland) has a rule that once copyright has expired,
the distributor of a work should pay royalties to the state. I
am not sure how it works in "interesting" cases, but clearly
this is quite different from the US/UK meaning of public domain.

Do distributors pay state royalties on works of Shakespeare?
The Bible? Wow.

Also, the law of my country declares some authors' rights as
non-transferable. Basically, an author can sue if he/she/it
thinks that the artistic integrity of the work is violated.

Those are moral rights, introduced into Berne in the 1920s by
everyone's favorite copyright advocate, Benito Mussolini.

The US only recognizes moral rights for visual works like
paintings and sculpture. There was an interesting case in
2013 where the owner of a building containing artists' studios
who had allowed elaborate graffiti on the outside of the building
decided to tear it down, first whitewashing over the art. The
artists sued and won $6.7 million. https://en.wikipedia.org/wiki/5_Pointz

So don't let people doodle on your circuit boards, I guess.
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

From quadibloc@quadibloc@invalid.com (John Savard) to comp.arch on Sun Jun 14 19:14:39 2026

From Newsgroup: comp.arch

On Sat, 13 Jun 2026 15:07:00 -0000 (UTC), antispam@fricas.org (Waldek
Hebisch) wrote:

My country (Poland) has a rule that once copyright has expired,
the distributor of a work should pay royalties to the state. I
am not sure how it works in "interesting" cases, but clearly
this is quite different from the US/UK meaning of public domain.

I once read a science-fiction story in which, after Earth joined an interplanetary confederation, works that were in the public domain
became works for which the United Nations could charge royalties to
people from other planets.

The story was a detective story. It wasn't Shakespeare, but instead
Bollywood movies that a criminal was modifying and re-selling to a
planet to the culture of which those movies were well suited.

Also, the law of my country declares some authors' rights as
non-transferable. Basically, an author can sue if he/she/it
thinks that the artistic integrity of the work is violated.

Most European countries recognize the moral rights of authors.

John Savard

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Mon Jun 15 16:32:47 2026

From Newsgroup: comp.arch

John Levine <johnl@taugh.com> wrote:

According to Waldek Hebisch <antispam@fricas.org>:

My country (Poland) has a rule that once copyright has expired,
the distributor of a work should pay royalties to the state. I
am not sure how it works in "interesting" cases, but clearly
this is quite different from the US/UK meaning of public domain.

Do distributors pay state royalties on works of Shakespeare?
The Bible? Wow.

I am not sure how they handle what in the English-speaking world is
called "derived works". If they reproduce the first edition of a
Shakespeare work or, say, the Gutenberg Bible, distributors are supposed
to pay (and I think that they do pay). Copyright to the currently sold
Bible is attributed to translators.
--
Waldek Hebisch
