Forum: War Ensemble BBS

Re: Base-Index Addressing in the Concertina II

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat Aug 23 18:16:57 2025

From Newsgroup: comp.arch

Thomas Koenig <tkoenig@netcologne.de> posted:

John Savard <quadibloc@invalid.invalid> schrieb:

1) 16-bit displacements are _important_. Pretty well *all* microprocessors use 16-bit displacements, rather than anything shorter like 12 bits.

That is a bit of an exaggeration - SPARC has 13-bit constants, RISC-V
has 12-bit constants.

Even though this meant, in most cases, they had to give up indexing.

SPARC has both Ra+Rb and Ra+immediate, but not combined. The use
case for Ra+Rb+13..16 bit is extremely limited.

2) Index registers were hailed as a great advancement in computers
when they were added to them, as they allowed avoiding self-modifying
code.

GP registers were an even greater achievement.

Of course, if one just has base registers, one can still have
a special base register, used for array accessing, where the base
address of a segment has had the array offset added to it through
separate arithmetic instructions.

The usual method is Ra+Rb. Mitch also has Ra+Rb<<n+32-bit or Ra+Rb<<n+64-bit, with n from 0 to 3. This is useful for addressing
global data encoded in the constant, which a 12 to 16 bit-offset
is not.

Array accesses are common, and not needing extra instructions for them is therefore beneficial.

Yes, and indexing without offset can do that particular job just
fine.

3) At least one major microprocessor manufacturer, Motorola, did have base-index addressing with 16-bit displacements, starting with the 68020.

Mitch recently explained that they had microarchitectural reasons.

'020 address modes have scaling with base and displacement. They did
not cut too soon.

(3) may not be much of an argument, but it seems to me that (1) and (2)
can reasonably be considered fairly strong arguments. But what about the drawbacks?

I have not seen a strong argument for Ra+Rb+16 bit in what you wrote
above.

[Ra+Rb+Disp16] has the same gate delay as [Rb+Ri<<3+Disp64] and is not
as powerful, nor can it reach all of the virtual address space.

The only thing in this corner of my ISA I regret is not having more bits
for the scale {to cover complex double, and Quaternions}
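As a rough illustration of the point about scale bits, here is a minimal C sketch of a base + (index << scale) + displacement address calculation. The names effective_address and scale_for_size are made up for this example and are not taken from any ISA manual. The only facts assumed from the thread are a 64-bit displacement and a scale field limited to 0..3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of [Rb + Ri<<scale + Disp64]; scale is a 2-bit field. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale, int64_t disp)
{
    assert(scale <= 3);                 /* only shifts of 0..3 are encodable */
    return base + (index << scale) + (uint64_t)disp;
}

/* Shift needed to scale an index by an element size,
 * or -1 when the size is not a power of two. */
int scale_for_size(size_t size)
{
    int s = 0;
    if (size == 0 || (size & (size - 1)) != 0)
        return -1;
    while ((size >> s) != 1)
        s++;
    return s;
}
```

With 8-byte doubles the required shift is 3, which fits. A 16-byte complex double needs 4 and a 32-byte quaternion of doubles needs 5, so neither fits in a two-bit scale field.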
--- Synchronet 3.21a-Linux NewsLink 1.2

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat Aug 23 18:25:09 2025

From Newsgroup: comp.arch

Robert Finch <robfi680@gmail.com> posted:
<snip>

My current design fuses a max of one memory op into instructions instead
of having a load followed by the instruction (or an instruction followed
by a store). Address modes available without adding instruction words are
Rn, (Rn), (Rn)+, -(Rn). After that, 32-bit instruction words are added to support 32- and 64-bit displacements or addresses.

The instructions with the extra displacement words are larger but there
are fewer instructions to execute.
    LOAD Rd,[Rb+Disp16]
    ADD Rd,Ra,Rd
Requiring two instruction words, and executing as two instructions, gets replaced with:
    ADD Rd,Ra,[Rb+Disp32]
Which also takes two instruction words, but only one instruction.

I have been using the term "instruction-specifier" for the first word of
an instruction, and "instruction" for all of the words of an instruction. Instruction-specifier contains everything about the instruction except
for the constants.

Since you and I are the only "RISCs" with VLE we (WE) should get our terminology aligned.

Immediate operands and memory operands are routed according to two
two-bit routing fields. I may be able to compress this to a single
three-bit field.

Typical instruction encoding:
ADD: oooooo ss xx ii ww mmm rrrrr rrrrr ddddd
oooooo: is the opcode
ss: is the operation size
xx: is two extra opcode bits
ii: indicates which register field represents an immediate value
ww: indicates which register field is a memory operand
mmm: is the addressing mode, similar to a 68k
rrrrr: source register spec (or 4+ bit immediate)
ddddd: destination register spec (or 4+ bit immediate)
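The field widths listed above add up to exactly 32 bits (6+2+2+2+2+3+5+5+5). A minimal C sketch of pulling the fields out of one instruction word follows. It assumes the fields are packed from the most significant bit down in the order they are listed, which the post does not actually state. The struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The listed field widths must fill one 32-bit instruction word. */
static_assert(6 + 2 + 2 + 2 + 2 + 3 + 5 + 5 + 5 == 32,
              "fields fill one 32-bit word");

typedef struct {
    unsigned opcode;   /* oooooo */
    unsigned size;     /* ss     */
    unsigned xop;      /* xx     */
    unsigned imm_sel;  /* ii     */
    unsigned mem_sel;  /* ww     */
    unsigned mode;     /* mmm    */
    unsigned src1;     /* rrrrr  */
    unsigned src2;     /* rrrrr  */
    unsigned dst;      /* ddddd  */
} fields_t;

/* Extract len bits starting at bit position lo. */
static unsigned bits(uint32_t w, unsigned lo, unsigned len)
{
    return (w >> lo) & ((1u << len) - 1u);
}

/* ASSUMPTION: MSB-first packing in the order the post lists the fields. */
fields_t decode(uint32_t w)
{
    fields_t f;
    f.opcode  = bits(w, 26, 6);
    f.size    = bits(w, 24, 2);
    f.xop     = bits(w, 22, 2);
    f.imm_sel = bits(w, 20, 2);
    f.mem_sel = bits(w, 18, 2);
    f.mode    = bits(w, 15, 3);
    f.src1    = bits(w, 10, 5);
    f.src2    = bits(w, 5, 5);
    f.dst     = bits(w, 0, 5);
    return f;
}
```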

A 36-bit opcode would work great, allowing operand sign control.

I came to the same realization ...

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat Aug 23 18:27:57 2025

From Newsgroup: comp.arch

Robert Finch <robfi680@gmail.com> posted:

On 2025-07-28 1:29 a.m., Stephen Fuld wrote:

On 7/27/2025 3:50 AM, Robert Finch wrote:

big snip

First, thanks for posting this. I don't recall you posting much about your design. Can you talk about its goals, why you are doing it, its status, etc.?

Just started the design. Lots of details to work out. I like some
features of the 68k and 66k. I have some doubt as to starting a new
design. I would prefer to use something existing. I am not terribly fond
of RISC designs though.

Specific comments below

My current design fuses a max of one memory op into instructions
instead of having a load followed by the instruction (or an
instruction followed by a store). Address modes available without
adding instruction words are Rn, (Rn), (Rn)+, -(Rn). After that, 32-bit
instruction words are added to support 32- and 64-bit displacements or
addresses.

The combined mem-op instructions used to be popular, but since the RISC revolution, are now out of fashion. Their advantages are, as you state, often eliminating an instruction. The disadvantages include that they preclude scheduling the load earlier in the instruction stream. Do you "crack" the instruction into two micro-ops in the decode stage? What drove your decision to "buck" the trend? I am not saying you are wrong.
I just want to understand your reasoning.

Instructions will be cracked into micro-ops. My compiler does not do instruction scheduling (yet). Relying on the processor to schedule instructions. There are explicit load and store instructions which
should allow scheduling earlier in the instruction stream.
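To make the cracking step concrete, here is a small C sketch of how a fused ADD Rd,Ra,[Rb+Disp32] might be split into a load micro-op and an add micro-op at decode. Everything in it (the uop and add_insn structs, the TMP_REG scratch register, the crack_add function) is hypothetical and is not taken from the actual design being discussed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TMP_REG = 32 };  /* hypothetical scratch register, invisible to the ISA */

typedef enum { UOP_LOAD, UOP_ADD } uop_kind;

typedef struct {
    uop_kind kind;
    int rd, ra, rb;     /* rb is -1 when unused */
    int32_t disp;
} uop;

typedef struct {
    int rd, ra, rb;
    int32_t disp;
    int has_mem_operand;  /* nonzero for ADD Rd,Ra,[Rb+disp] */
} add_insn;

/* Crack one ADD into one or two micro-ops; returns the number emitted. */
int crack_add(const add_insn *in, uop out[2])
{
    int n = 0;
    int src2;

    assert(in != NULL && out != NULL);
    src2 = in->rb;
    if (in->has_mem_operand) {
        /* LOAD tmp,[Rb+disp] feeds the add through the scratch register. */
        out[n++] = (uop){ UOP_LOAD, TMP_REG, in->rb, -1, in->disp };
        src2 = TMP_REG;
    }
    out[n++] = (uop){ UOP_ADD, in->rd, in->ra, src2, 0 };
    return n;
}
```

Once the load is a separate micro-op, an out-of-order scheduler can issue it as early as its address register is ready, which addresses the scheduling concern raised above.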

Once the Fetch-Issue width is greater than 2, compiler scheduling is
an anathema--just let the GBOoO FU schedulers do it.

I am under the impression that with a micro-op based processor the ISA (RISC/CISC) becomes somewhat less relevant allowing more flexibility in
the ISA design.

There is always the complexity budget ...

From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Sat Aug 23 18:40:10 2025

From Newsgroup: comp.arch

MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

The only thing in this corner of my ISA I regret is not having more bits
for the scale {to cover complex double, and Quaternions}

There is a bit of inconvenience, but strength reduction can go a
long way to bridge that gap. Consider something like

void foo (__complex double *c, double *d, long int n)
{
  for (long int i=0; i<n; i++)
    c[i] += d[i];
}

which could be something like (translated by hand, so errors
are likely)

foo:
        ble0    r3,.L_end
        mov     r4,#0
        sll     r3,r3,#3
        vec     r5,{}
        ldd     r6,[r2,r4,0]
        ldd     r7,[r1,r4<<2,0]
        fadd    r7,r7,r6
        std     r7,[r1,r4<<2,0]
        loop1   ne,r4,#4,r3
.L_end:
        ret

which is as close to optimum (just a single sll instruction) as
not to matter.
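For readers who prefer to see the idea at the C level, here is a sketch of what strength reduction does to this loop. It is an independent illustration, not a translation of the assembly above. A single byte offset steps by sizeof(double), and the 16-byte stride of the complex array is recovered by shifting that offset left by one. The function name foo_reduced and the 8-byte and 16-byte sizes are assumptions of the example.

```c
#include <assert.h>
#include <complex.h>

/* ASSUMPTION: 8-byte double, so complex double is 16 bytes. */
static_assert(sizeof(double) == 8 && sizeof(double _Complex) == 16,
              "sketch assumes 8-byte double");

void foo_reduced(double _Complex *c, const double *d, long n)
{
    char *cb = (char *)c;
    const char *db = (const char *)d;
    long limit = n << 3;                /* the one extra shift: n * 8 bytes */

    for (long off = 0; off < limit; off += 8) {
        /* d[i] lives at d + off, c[i] at c + 2*off: shift instead of multiply */
        const double *dp = (const double *)(db + off);
        double _Complex *cp = (double _Complex *)(cb + (off << 1));
        *cp += *dp;
    }
}
```

Both accesses use one induction variable, so a scale of at most 3 on the index is enough even though the complex elements are 16 bytes wide.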
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.

From Robert Finch@robfi680@gmail.com to comp.arch on Sun Aug 24 21:47:17 2025

From Newsgroup: comp.arch

On 2025-08-23 2:25 p.m., MitchAlsup wrote:

Robert Finch <robfi680@gmail.com> posted:
<snip>

My current design fuses a max of one memory op into instructions instead
of having a load followed by the instruction (or an instruction followed
by a store). Address modes available without adding instruction words are
Rn, (Rn), (Rn)+, -(Rn). After that, 32-bit instruction words are added to
support 32- and 64-bit displacements or addresses.

The instructions with the extra displacement words are larger but there
are fewer instructions to execute.
    LOAD Rd,[Rb+Disp16]
    ADD Rd,Ra,Rd
Requiring two instruction words, and executing as two instructions, gets
replaced with:
    ADD Rd,Ra,[Rb+Disp32]
Which also takes two instruction words, but only one instruction.

I have been using the term "instruction-specifier" for the first word of
an instruction, and "instruction" for all of the words of an instruction. Instruction-specifier contains everything about the instruction except
for the constants.

Since you and I are the only "RISCs" with VLE we (WE) should get our terminology aligned.

I will keep that in mind. Reminds me of compiler terminology, specifiers
and declarators.

I have scrapped the latest architecture. Independent load and store instructions look better.

I have been working on an OS implemented for the 68k primarily because
the core is reasonably stable. It can be ported to the latest ISA at a
later date.

Immediate operands and memory operands are routed according to two
two-bit routing fields. I may be able to compress this to a single
three-bit field.

Typical instruction encoding:
ADD: oooooo ss xx ii ww mmm rrrrr rrrrr ddddd
oooooo: is the opcode
ss: is the operation size
xx: is two extra opcode bits
ii: indicates which register field represents an immediate value
ww: indicates which register field is a memory operand
mmm: is the addressing mode, similar to a 68k
rrrrr: source register spec (or 4+ bit immediate)
ddddd: destination register spec (or 4+ bit immediate)

A 36-bit opcode would work great, allowing operand sign control.

I came to the same realization ...


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed Aug 27 00:19:02 2025

From Newsgroup: comp.arch

Thomas Koenig <tkoenig@netcologne.de> posted:

MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

The only thing in this corner of my ISA I regret is not having more bits for the scale {to cover complex double, and Quaternions}

There is a bit of inconvenience, but strength reduction can go a
long way to bridge that gap. Consider something like

void foo (__complex double *c, double *d, long int n)
{
  for (long int i=0; i<n; i++)
    c[i] += d[i];
}

Wondering why "c[i] += d[i];" did not get a type mismatch.

Should be "c[i].real += d[i];"

which could be something like (translated by hand, so errors
are likely)

foo:
        ble0    r3,.L_end
        mov     r4,#0
        sll     r3,r3,#3
        vec     r5,{}
        ldd     r6,[r2,r4,0]
        ldd     r7,[r1,r4<<2,0]
        fadd    r7,r7,r6
        std     r7,[r1,r4<<2,0]
        loop1   ne,r4,#4,r3
.L_end:
        ret

which is as close to optimum (just a single sll instruction) as
not to matter.


From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Wed Aug 27 05:08:08 2025

From Newsgroup: comp.arch

MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

Thomas Koenig <tkoenig@netcologne.de> posted:

MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

The only thing in this corner of my ISA I regret is not having more bits
for the scale {to cover complex double, and Quaternions}

There is a bit of inconvenience, but strength reduction can go a
long way to bridge that gap. Consider something like

void foo (__complex double *c, double *d, long int n)
{
  for (long int i=0; i<n; i++)
    c[i] += d[i];
}

Wondering why "c[i] += d[i];" did not get a type mismatch.

Should be "c[i].real += d[i];"

C's implicit conversion rules.
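A small example of the rule in question, written with the standard <complex.h> spelling rather than the GCC __complex keyword used above. When a real operand meets a complex one in C, the arithmetic is done in the complex domain, so adding a double changes only the real part. Note that C complex types have no .real member; the real part is read with creal() (or GCC's __real__ extension). The function name add_real is made up for the example.

```c
#include <assert.h>
#include <complex.h>

/* C guarantees a complex type is laid out like an array of two reals. */
static_assert(sizeof(double _Complex) == 2 * sizeof(double),
              "complex double is a pair of doubles");

/* Mixed real/complex arithmetic: no cast is needed and no diagnostic is
 * required; the imaginary part of z is left unchanged. */
double _Complex add_real(double _Complex z, double x)
{
    z += x;
    return z;
}
```

So c[i] += d[i] is well-formed C and has the effect that the proposed "c[i].real += d[i]" was meant to express.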
--
This USENET posting was made without artificial intelligence,
artificial impertinence, artificial arrogance, artificial stupidity,
artificial flavorings or artificial colorants.
