• Electronics Magazine

    From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Tue Dec 16 11:51:19 2025
    From Newsgroup: comp.arch

    If anyone is interested...

    I stumbled over a stash of old Electronics magazine PDFs dating back
    to the 1930s up to the 1990s. Electronics magazine was where many new
    products were announced along with technical design articles.
    All of the dozens of microprocessors available after 1971 would
    have had articles describing them in some issue of Electronics.
    Unfortunately it was bought out by someone around 1985 who shifted
    the magazine's focus away from technical content, and eventually it closed.
    But for 55 or so years it was a very good source.

    https://www.worldradiohistory.com/Archive-Electronics/

    There are indexes for the 1930s and '40s, but unfortunately not after.

    Electronics articles were written by EE's for EE's.
    For example, the first issue Apr-1930 has articles on
    "The power pentode Its characteristics and applications",
    "Tuned radio-frequency amplifiers", and "Industrial uses of vacuum devices".

    https://www.worldradiohistory.com/Archive-Electronics/30s/Electronics-1930-04-Original.pdf

    In the 1973-09-13 issue, on page 118, is an article by Texas Instruments
    announcing the first single-transistor-per-cell 4096b DRAM,
    with an explanation of how it works at the semiconductor level,
    plus how to design a 16 kB ECC memory board using their chip.
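    An ECC memory board of that era would presumably have used a
    single-error-correcting Hamming code. A minimal software sketch of the
    idea (the concept, not TI's actual circuit), shown for 8 data bits:

```python
# Hamming SEC code: parity bits sit at power-of-two positions; each parity
# bit covers every position whose index has that bit set.
def hamming_encode(data: int) -> list[int]:
    """Encode 8 data bits into a 12-bit Hamming codeword (positions 1..12)."""
    code = [0] * 13                      # index 0 unused; positions 1..12
    data_positions = [3, 5, 6, 7, 9, 10, 11, 12]
    for i, pos in enumerate(data_positions):
        code[pos] = (data >> i) & 1
    for p in (1, 2, 4, 8):               # parity over positions with bit p set
        parity = 0
        for pos in range(1, 13):
            if pos != p and (pos & p):
                parity ^= code[pos]
        code[p] = parity
    return code[1:]

def hamming_correct(bits: list[int]) -> int:
    """Correct a single-bit error and return the 8 data bits."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 13):
        if code[pos]:
            syndrome ^= pos
    if syndrome:                          # nonzero syndrome = flipped position
        code[syndrome] ^= 1
    data_positions = [3, 5, 6, 7, 9, 10, 11, 12]
    return sum(code[pos] << i for i, pos in enumerate(data_positions))
```

    The syndrome is the XOR of the positions of all set bits; when a single
    bit flips, the syndrome points directly at it.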

    That is followed by an article on various microcomputer bus designs.

    https://www.worldradiohistory.com/Archive-Electronics/70s/73/Electronics-1973-09-13.pdf


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Tue Dec 16 16:57:16 2025
    From Newsgroup: comp.arch

    EricP <ThatWouldBeTelling@thevillage.com> writes:
    If anyone is interested...


    In the 1973-09-13 issue, on page 118, is an article by Texas Instruments
    announcing the first single-transistor-per-cell 4096b DRAM,
    with an explanation of how it works at the semiconductor level,
    plus how to design a 16 kB ECC memory board using their chip.

    That is followed by an article on various microcomputer bus designs.

    https://www.worldradiohistory.com/Archive-Electronics/70s/73/Electronics-1973-09-13.pdf

    200 pages. Those were the days.

    There is an interesting article on the AT&T Picturephone in that same issue.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Al Kossow@aek@bitsavers.org to comp.arch on Tue Dec 16 09:52:39 2025
    From Newsgroup: comp.arch

    On 12/16/25 8:51 AM, EricP wrote:
    If anyone is interested...

    I stumbled over a stash of old Electronics magazine PDFs dating back
    to the 1930s up to the 1990s.

    their scans are all crap

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Wed Dec 17 15:32:56 2025
    From Newsgroup: comp.arch

    EricP <ThatWouldBeTelling@thevillage.com> schrieb:
    In the 1973-09-13 issue, on page 118, is an article by Texas Instruments
    announcing the first single-transistor-per-cell 4096b DRAM,
    with an explanation of how it works at the semiconductor level,
    plus how to design a 16 kB ECC memory board using their chip.

    That is followed by an article on various microcomputer bus designs.

    https://www.worldradiohistory.com/Archive-Electronics/70s/73/Electronics-1973-09-13.pdf

    Interesting read, thanks!

    Clearly, semiconductor memory was the new hot thing (pun intended)
    at the time; there was also one from Intel in the same issue,
    promising less than 0.1 mW per bit.

    Referring to our previous discussions about an early
    RISC, which would certainly have required a cache:
    I took a look at the TI memory handbook from 1975, at
    https://www.synfo.nl/datasheets/TI_1975_Memory-Data-Book.pdf which
    had 1024x1 memory chips with 150 ns write cycle time (which is
    what would have been used for cache, presumably). They also had
    programmable ROMs of 4096 bits with 55 ns access time. From this
    figure alone, anybody can be forgiven for thinking that microcode
    is the way to go...
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Wed Dec 17 21:55:50 2025
    From Newsgroup: comp.arch

    Thomas Koenig wrote:
    EricP <ThatWouldBeTelling@thevillage.com> schrieb:
    In the 1973-09-13 issue on page 118 is an article by Texas Instruments
    announcing the first single transistor per cell 4096b DRAM,
    with an explanation of how it works at the semiconductor level,
    plus how to design a 16 kB ECC memory board using their chip.

    That is followed by an article on various microcomputer bus designs.

    https://www.worldradiohistory.com/Archive-Electronics/70s/73/Electronics-1973-09-13.pdf

    Interesting read, thanks!

    I like the 1930's ones - art deco boxes and rheostats on everything.

    Clearly, semiconductor memory was the new hot thing (pun intended)
    at the time; there was also one from Intel in the same issue,
    promising less than 0.1 mW per bit.

    Initially it was replacing core memories in minis and mainframes.
    By 1975 DEC was selling semiconductor memory boards for LSI-11 and 11/45.

    In 1973 microprocessors were just coming into use and were yet to
    start making a demand on the MOS memory market.
    Most engineers did not know what a microprocessor was or how they worked.
    If you look at the hardware manuals from all the processor manufacturers,
    Intel, Motorola, Fairchild, dozens of others, they all begin with an
    explanation of what a microprocessor is.

    Referring to our previous discussions about an early
    RISC, which would certainly have required a cache:
    I took a look at the TI memory handbook from 1975, at
    https://www.synfo.nl/datasheets/TI_1975_Memory-Data-Book.pdf which
    had 1024x1 memory chips with 150 ns write cycle time (which is
    what would have been used for cache, presumably). They also had
    programmable ROMs of 4096 bits with 55 ns access time. From this
    figure alone, anybody can be forgiven for thinking that microcode
    is the way to go...

    Yes, a RISC needs a cache or it just wastes all its potential concurrency
    in stalls.

    It looks like you are looking at a 74S209/74S309 TTL SRAM.
    That has a 100 ns read access time, which isn't fast enough if you want,
    say, a 200 ns CPU clock time, when you consider the overheads involved
    for TTL in the time needed to get signals from core to the cache and back.
    It would really need the cache read value at the core by about 150 ns
    so you can route it to registers by 200 ns (170 ns with 15% slack time).

    Looking at VAX 780 I&D cache schematics, it used Motorola 93425-1 1024x1b
    TTL SRAMs with a max access time of what looks like either 30, 45 or 60 ns
    (not sure, as I can't find the exact "-1" part number).
    30 ns SRAM should be fast enough.


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.arch on Wed Dec 17 23:21:45 2025
    From Newsgroup: comp.arch

    Yes, a RISC needs a cache or it just wastes all its potential
    concurrency in stalls.

    FWIW, the original ARM did not have a cache.


    Stefan
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Thu Dec 18 07:45:16 2025
    From Newsgroup: comp.arch

    Stefan Monnier <monnier@iro.umontreal.ca> writes:
    Yes, a RISC needs a cache or it just wastes all its potential
    concurrency in stalls.

    FWIW, the original ARM did not have a cache,

    Indeed, the ARM2 used in the Archimedes does not have a cache and runs
    rings around contemporary CISCs (including 386 and 68020, with a small
    I-cache on the 68020).

    It runs at 8MHz, the same speed as the first HPPA implementation
    (TS-1, a board, not a chip), which does have 64K+64K cache. However,
    the ARM2 does not have an MMU, while the 386 and the TS-1 have one,
    and the 68020 was usually used with an MMU.

    It seems to me that ARM made this clock work with DRAM without cache
    by making good use of staying in the same row: In particular,
    consecutive instructions usually are from the same row. In addition,
    ARM includes load-multiple and store-multiple instructions that access
    consecutive data that usually are in the same row.
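    A toy model of the locality argument above: assuming (hypothetically)
    256 words per DRAM row, a straight-line instruction stream leaves the
    current row only once every 256 fetches, so almost every access can use
    the fast same-row cycle:

```python
# Toy model: fraction of back-to-back accesses that stay in the same DRAM row.
ROW_WORDS = 256          # assumed words per row, not any real chip's geometry

def row_of(addr: int) -> int:
    return addr // ROW_WORDS

def same_row_fraction(addresses: list[int]) -> float:
    hits = sum(1 for a, b in zip(addresses, addresses[1:])
               if row_of(a) == row_of(b))
    return hits / (len(addresses) - 1)

sequential = list(range(1024))            # straight-line code
print(same_row_fraction(sequential))      # ~0.997: nearly always in the row
```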

    By contrast, note that the VAX 11/780 has a 5MHz clock (and about
    10 CPI) and a cache. Even if the DRAM at the time of the VAX was
    somewhat slower than at the time of the Archimedes, and the VAX has an
    MMU, I am sure that an ARM-like RISC with an MMU, FPU and just DRAM
    would have required less implementation effort and performed better
    than the VAX 11/780 if implemented with the same technology as the VAX
    11/780. If you add a cache to the RISC (as the VAX 11/780 has), even
    better. If you convert the VAX 11/780 microcode store into a cache,
    even better. And, to combat code size, use something like ARM T32
    instead of A32; the decoder and instruction buffering for that
    would still fit in the implementation budget (the VAX 11/780 also has
    instruction buffering and a decoder for variable-length instructions).

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Thu Dec 18 11:44:56 2025
    From Newsgroup: comp.arch

    Anton Ertl wrote:
    Stefan Monnier <monnier@iro.umontreal.ca> writes:
    Yes, a RISC needs a cache or it just wastes all its potential
    concurrency in stalls.
    FWIW, the original ARM did not have a cache,

    Indeed, the ARM2 used in the Archimedes does not have a cache and runs
    rings around contemporary CISCs (including 386 and 68020, with a small I-cache on the 68020).

    It runs at 8MHz, the same speed as the first HPPA implementation
    (TS-1, a board, not a chip), which does have 64K+64K cache. However,
    the ARM2 does not have an MMU, while the 386 and the TS-1 have one,
    and the 68020 was usually used with an MMU.

    It seems to me that ARM made this clock work with DRAM without cache
    by making good use of staying in the same row: In particular,
    consecutive instructions usually are from the same row. In addition,
    ARM includes load-multiple and store-multiple instructions that access
    consecutive data that usually are in the same row.

    By contrast, note that the VAX 11/780 has a 5MHz clock (and about
    10 CPI) and a cache. Even if the DRAM at the time of the VAX was
    somewhat slower than at the time of the Archimedes, and the VAX has an
    MMU, I am sure that an ARM-like RISC with an MMU, FPU and just DRAM
    would have required less implementation effort and performed better
    than the VAX 11/780 if implemented with the same technology as the VAX
    11/780. If you add a cache to the RISC (as the VAX 11/780 has), even
    better. If you convert the VAX 11/780 microcode store into a cache,
    even better. And, to combat code size, use something like ARM T32
    instead of A32; the decoder and instruction buffering for that
    would still fit in the implementation budget (the VAX 11/780 also has
    instruction buffering and a decoder for variable-length instructions).

    - anton

    There is a copy of the ARM-1 Hardware Reference Manual from 1986 here

    http://chrisacorns.computinghistory.org.uk/docs/Acorn/OEM/OEM.html

    (1) The MMU (if any) was external to the cpu (ie "not their problem")
    (2) It looks like the RAS and CAS DRAM signals came directly from the
    ARM cpu chip which was designed to work synchronously with DRAM.
    (3) There was only 1 memory bank.
    (4) There was no cache
    (5) There was no standard system bus to plug in IO adapters

    Compared to a VAX-780, VAX had an MMU and a system bus,
    the Synchronous Backplane Interface (SBI) onto which all memory and
    IO adapter boards were hung. There were multiple memory boards.

    For a VAX memory read it had to (roughly speaking):
    (1) translate virtual->physical address
    (2) go through the cache (read miss)
    (3) get to the SBI (there is a 1 entry store buffer on cache output)
    (4) negotiate for SBI
    (5) SBI take 2 cycles to transmit control and read address
    (6) memory controller does its thing
    (7) memory controller negotiates for SBI
    (8) memory controller transmits 32B cache line (1 control + 8*4B data clocks)
    (9) cache receives and saves 32B cache line
    (10) cache returns 4B value to 780 core
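    The read-miss path above can be sketched as a back-of-the-envelope
    latency model; the per-step cycle counts below are placeholders for
    illustration, not measured 11/780 figures:

```python
# Hypothetical cycle costs for each step of the VAX read-miss path.
READ_MISS_STEPS = [
    ("translate virtual->physical", 1),
    ("cache lookup (miss)", 1),
    ("reach SBI via store buffer", 1),
    ("arbitrate for SBI", 1),
    ("transmit control + read address", 2),
    ("memory controller does its thing", 8),
    ("memory controller arbitrates for SBI", 1),
    ("transmit line (1 control + data clocks)", 3),
    ("cache fills line", 1),
    ("return 4B value to core", 1),
]

def miss_latency(steps) -> int:
    """Total cycles for a read miss, given per-step costs."""
    return sum(cycles for _, cycles in steps)

print(miss_latency(READ_MISS_STEPS))   # total cycles under these assumptions
```

    The point is structural: even with fast DRAM, a shared-bus read miss
    pays arbitration and transfer overhead at both ends.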



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Thu Dec 18 17:47:42 2025
    From Newsgroup: comp.arch

    EricP <ThatWouldBeTelling@thevillage.com> schrieb:

    For a VAX memory read it had to (roughly speaking):
    (1) translate virtual->physical address
    (2) go through the cache (read miss)
    (3) get to the SBI (there is a 1 entry store buffer on cache output)
    (4) negotiate for SBI
    (5) SBI take 2 cycles to transmit control and read address
    (6) memory controller does its thing
    (7) memory controller negotiates for SBI
    (8) memory controller transmits 32B cache line (1 control + 8*4B data clocks)
    (9) cache receives and saves 32B cache line
    (10) cache returns 4B value to 780 core

    The VAX cache line was 8 bytes according to
    https://dl.acm.org/doi/pdf/10.1145/357353.357356
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Thu Dec 18 13:10:32 2025
    From Newsgroup: comp.arch

    Thomas Koenig wrote:
    EricP <ThatWouldBeTelling@thevillage.com> schrieb:

    For a VAX memory read it had to (roughly speaking):
    (1) translate virtual->physical address
    (2) go through the cache (read miss)
    (3) get to the SBI (there is a 1 entry store buffer on cache output)
    (4) negotiate for SBI
    (5) SBI take 2 cycles to transmit control and read address
    (6) memory controller does its thing
    (7) memory controller negotiates for SBI
    (8) memory controller transmits 32B cache line (1 control + 8*4B data clocks)
    (9) cache receives and saves 32B cache line
    (10) cache returns 4B value to 780 core

    The VAX cache line was 8 bytes according to
    https://dl.acm.org/doi/pdf/10.1145/357353.357356

    You are correct. I was thinking it was 8 words.
    (That's a lot of tag overhead for such a small data cache.)
    So 2*4B data clocks.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Thu Dec 18 17:22:44 2025
    From Newsgroup: comp.arch

    EricP <ThatWouldBeTelling@thevillage.com> writes:
    Anton Ertl wrote:
    It seems to me that ARM made this clock work with DRAM without cache
    by making good use of staying in the same row: In particular,
    consecutive instructions usually are from the same row.

    This is called page mode; not sure if that was available when the VAX
    was designed. Later (supposedly starting in 1986) fast page mode was
    introduced (not sure how that affected performance). The ARM1
    hardware reference manual says "150 nanoseconds row access DRAM" and
    "8 MIPS peak". Not sure how 150 nanoseconds and 8 MIPS go together,
    given that every instruction needs a memory access.

    There is a copy of the ARM-1 Hardware Reference Manual from 1986 here

    http://chrisacorns.computinghistory.org.uk/docs/Acorn/OEM/OEM.html

    (1) The MMU (if any) was external to the cpu (ie "not their problem")
    (2) It looks like the RAS and CAS DRAM signals came directly from the
    ARM cpu chip which was designed to work synchronously with DRAM.

    I see no RAS and CAS pins in the ARM pinout. What I see is 26 address
    lines, while 13 would have been enough if the memory controller was in
    the ARM1. They apparently were not very worried about pin count for
    the chip, so they did not even multiplex the address bus (26 pins) and
    data bus (32 pins) the way many others did.

    The ARM1 has a "translate" signal for telling the MMU (not existing on
    the first systems) that this is a virtual address.

    The ARM1 also has a "seq" output signal that indicates a sequential
    memory access: "It may be used, in combination with the low-order
    address lines, to indicate that the next cycle can use a fast memory
    mode (for example DRAM page mode) and/or to bypass the address
    translation system."

    (3) There was only 1 memory bank.

    Page 23 says that the ARM co-processor board (it's a co-processor to
    the BBC Model B, not a co-processor to the ARM) carries "2MBytes DRAM,
    a bootstrap ROM, and an additional 2MBytes of DRAM on a daughter
    board". On page 24 it says "IC23 to IC150 [...] ICs that make up the
    4MBytes of RAM". I.e., 128 chips; sounds like at least 4 banks of RAM
    to me.

    Page 27 says: "RAS0..RAS3 [...] There are four banks of RAMs"

    (4) There was no cache
    (5) There was no standard system bus to plug in IO adapters

    The co-processor board also accesses the host system through
    memory-mapped I/O (the ARM1 has no I/O pins), and the BBC Micro has a
    system bus with I/O.

    The Archimedes A400 series (with an ARM2) has a 4-slot backplane.

    Compared to a VAX-780, VAX had an MMU and a system bus,
    the Synchronous Backplane Interface (SBI) onto which all memory and
    IO adapter boards were hung. There were multiple memory boards.

    For a VAX memory read it had to (roughly speaking):
    (1) translate virtual->physical address

    For sequential accesses, the same translation can be used in the usual
    case, just as the same DRAM row can be used in the usual case. This
    means that the usual case can be as fast as without translation.

    In the unusual case, the translation will add latency, yes. But
    that's true even with caches, unless you have a virtually addressed
    cache (which is the common case for L1 these days).

    One thing that's possible if you are willing to pay for a more complex
    system (as the VAX 11/780 microarchitects clearly were) is to have
    separate control, address and data lines for the different banks, and
    use that to access the banks alternatingly, with a bandwidth advantage.

    But yes, in general bigger memory subsystems tend to be slower. The
    VAX 11/780 compensated that partly with its cache.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Thu Dec 18 14:12:29 2025
    From Newsgroup: comp.arch

    EricP wrote:
    Thomas Koenig wrote:
    EricP <ThatWouldBeTelling@thevillage.com> schrieb:

    For a VAX memory read it had to (roughly speaking):
    (1) translate virtual->physical address
    (2) go through the cache (read miss)
    (3) get to the SBI (there is a 1 entry store buffer on cache output)
    (4) negotiate for SBI
    (5) SBI take 2 cycles to transmit control and read address
    (6) memory controller does its thing
    (7) memory controller negotiates for SBI
    (8) memory controller transmits 32B cache line (1 control + 8*4B data
    clocks)
    (9) cache receives and saves 32B cache line
    (10) cache returns 4B value to 780 core

    The VAX cache line was 8 bytes according to
    https://dl.acm.org/doi/pdf/10.1145/357353.357356

    You are correct. I was thinking it was 8 words.
    (That's a lot of tag overhead for such a small data cache.)
    So 2*4B data clocks.

    Ah, but 8B lines makes sense when you think about TTL packages.
    An 8B line can use 4*8 8:1 muxes to align (rotate) any byte into position,
    and an 8:1 mux can be had in a 14-pin package, with about 20 ns delay.
    If it had a larger 16B cache line then it would need 4*8 16:1 muxes
    which are in 24-pin DIPs and take about 6x the board space each,
    and a >16B line needs multiple layers of muxes, adding even more
    board space and delay.
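    In software terms, the byte-alignment network above is one 8:1 mux per
    output byte lane, each lane selecting line byte (lane + shift) mod 8, so
    any byte of the 8B line reaches any position in a single mux delay:

```python
# Byte rotator: each output lane k is an 8:1 mux selecting line[(k+shift) % 8].
def rotate_line(line: list[int], shift: int) -> list[int]:
    """Rotate an 8-byte cache line so that byte `shift` lands in lane 0."""
    assert len(line) == 8
    return [line[(k + shift) % 8] for k in range(8)]   # one 8:1 mux per lane

line = [0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17]
print([hex(b) for b in rotate_line(line, 3)])          # byte 3 now in lane 0
```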


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.arch on Fri Dec 19 00:56:17 2025
    From Newsgroup: comp.arch

    On Wed, 17 Dec 2025 23:21:45 -0500, Stefan Monnier wrote:

    FWIW, the original ARM did not have a cache,

    Some good background on various clever features of the design at
    RetroBytes <https://www.youtube.com/watch?v=t59EtDxpYmM>, and how they
    kept costs down. E.g.

    * Low heat dissipation meant the chip could be put in a plastic
    package, instead of a ceramic one like the Intel 80386.
    * Only 26-bit addressing, so the unused 6 address bits could be used
    to save processor state, needing only a single 32-bit value to be
    pushed on the stack during an interrupt.
    * The way the MEMC and VIDC support chips divided up tasks between
    them: one could access the address pins but not the data pins; while
    the other could see data but not the address. How did that work?
    Very cleverly.
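    The combined PC/status word mentioned above can be sketched as follows.
    This assumes the documented ARM2-style R15 layout (flags in the top
    bits, word-aligned 26-bit PC in bits 2-25, mode in the low two bits);
    it is an illustration, not production code:

```python
# Pack ARM2-style R15: N,Z,C,V flags in bits 31-28, IRQ/FIQ disable in
# bits 27-26, word-aligned PC in bits 25-2, processor mode in bits 1-0.
PC_MASK = 0x03FF_FFFC          # bits 2..25: the 26-bit, word-aligned PC

def pack_r15(pc: int, n: int, z: int, c: int, v: int,
             irq_dis: int, fiq_dis: int, mode: int) -> int:
    assert pc % 4 == 0 and pc < (1 << 26)
    return ((n << 31) | (z << 30) | (c << 29) | (v << 28) |
            (irq_dis << 27) | (fiq_dis << 26) | (pc & PC_MASK) | (mode & 3))

def unpack_pc(r15: int) -> int:
    return r15 & PC_MASK

# One 32-bit push saves PC and all status on interrupt entry.
w = pack_r15(0x8000, n=0, z=1, c=0, v=0, irq_dis=1, fiq_dis=0, mode=3)
print(hex(unpack_pc(w)))       # the PC recovered by masking the flags off
```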
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.arch on Fri Dec 19 01:04:07 2025
    From Newsgroup: comp.arch

    On Thu, 18 Dec 2025 17:22:44 GMT, Anton Ertl wrote:

    They apparently were not very worried about pin count for the
    chip, so they did not even multiplex address bus (26 pins) and data bus
    (32 pins) the way many others did.

    May have been part of the tradeoff that saw one support chip connected to
    the address lines while the other one attached to the data lines. Might
    have lowered the cost overall to do things this way.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Fri Dec 19 12:28:42 2025
    From Newsgroup: comp.arch

    Anton Ertl wrote:
    EricP <ThatWouldBeTelling@thevillage.com> writes:
    Anton Ertl wrote:
    It seems to me that ARM made this clock work with DRAM without cache
    by making good use of staying in the same row: In particular,
    consecutive instructions usually are from the same row.

    This is called page mode; not sure if that was available when the VAX
    was designed. Later (supposedly starting in 1986) fast page mode was
    introduced (not sure how that affected performance). The ARM1
    hardware reference manual says "150 nanoseconds row access DRAM" and
    "8 MIPS peak". Not sure how 150 nanoseconds and 8 MIPS go together,
    given that every instruction needs a memory access.

    Page mode does not appear in the 1975 memory catalogs I looked at,
    but does appear in the 1980s ones.

    Note that you only need to time multiplex the address when the DRAM size
    gets so large that not doing so would mean a much larger pin package.
    And package size matters because you need a large matrix of these chips.

    In 1975 the 4kb chip was in an 18-pin package, so it did not need RAS and CAS.
    By 1979 the 4kb chip was in a 16-pin package and did have RAS and CAS,
    but the 16kb DRAM was also in a 16-pin package and would fit on the same
    PCB design. If they had not multiplexed the address lines, it would have
    forced customers to completely redesign their memory boards.
    This way it was a simple upgrade from 4kb to 16kb.
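    The address multiplexing described above, in miniature: a 16kb x1 DRAM
    accepts its 14-bit address over 7 pins in two strobes (row bits with
    RAS, column bits with CAS), which is how it fits the same 16-pin
    package as the 4kb part:

```python
# Split a DRAM address into row and column halves, as presented over the
# shared address pins with the RAS and CAS strobes.
def multiplex(addr: int, pin_bits: int = 7):
    """Return (row, col) for a 2*pin_bits-wide address over pin_bits pins."""
    mask = (1 << pin_bits) - 1
    row = (addr >> pin_bits) & mask      # latched on RAS
    col = addr & mask                    # latched on CAS
    return row, col

def demultiplex(row: int, col: int, pin_bits: int = 7) -> int:
    return (row << pin_bits) | col

print(multiplex(0x2ABC))                 # 14-bit address as (row, col)
```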

    There is a copy of the ARM-1 Hardware Reference Manual from 1986 here

    http://chrisacorns.computinghistory.org.uk/docs/Acorn/OEM/OEM.html

    (1) The MMU (if any) was external to the cpu (ie "not their problem")
    (2) It looks like the RAS and CAS DRAM signals came directly from the
    ARM cpu chip which was designed to work synchronously with DRAM.

    I see no RAS and CAS pins in the ARM pinout. What I see is 26 address
    lines, while 13 would have been enough if the memory controller was in
    the ARM1. They apparently were not very worried about pin count for
    the chip, so they did not even multiplex address bus (26 pins) and
    data bus (32 pins) the way many others did.

    Ok, I saw the references to RAS and CAS and thought they came from the CPU.
    Yes, the RAS and CAS timing is indirectly generated by two PALs, IC1 and
    IC2, that combine the CPU signals and system timing generator signals.

    Yes, the time multiplexing meant that most microprocessors required large
    numbers of support chips to decode states and latch addresses.

    Here the RAS, CAS and DRAM are *almost* directly driven by the ARM cpu.
    That is unusual - this must have been designed to be a
    minimal-parts-count, single-board processor.

    The ARM1 has a "translate" signal for telling the MMU (not existing on
    the first systems) that this is a virtual address.

    The ARM1 also has a "seq" output signal that indicates a sequential
    memory access: "It may be used, in combination with the low-order
    address lines, to indicate that the next cycle can use a fast memory
    mode (for example DRAM page mode) and/or to bypass the address
    translation system."

    (3) There was only 1 memory bank.

    Page 23 says that the ARM co-processor board (it's a co-processor to
    the BBC Model B, not a co-processor to the ARM) carries "2MBytes DRAM,
    a bootstrap ROM, and an additional 2MBytes of DRAM on a daughter
    board". On page 24 it says "IC23 to IC150 [...] ICs that make up the
    4MBytes of RAM". I.e., 128 chips; sounds like at least 4 banks of RAM
    to me.

    My point being that this 1-board ARM system does not have all the overhead
    a general purpose CPU with expandable plug-in memory boards would have.
    To get plug-in memory boards needs a shared bus, which needs interfaces,
    arbitration, blah, blah.

    So any performance numbers are highly skewed in ARM's favor.

    Page 27 says: "RAS0..RAS3 [...] There are four banks of RAMs"

    (4) There was no cache
    (5) There was no standard system bus to plug in IO adapters

    The co-processor board also accesses the host system through
    memory-mapped I/O (the ARM1 has no I/O pins), and the BBC Micro has a
    system bus with I/O.

    The Archimedes A400 series (with an ARM2) has a 4-slot backplane.

    Compared to a VAX-780, VAX had an MMU and a system bus,
    the Synchronous Backplane Interface (SBI) onto which all memory and
    IO adapter boards were hung. There were multiple memory boards.

    For a VAX memory read it had to (roughly speaking):
    (1) translate virtual->physical address

    For sequential accesses, the same translation can be used in the usual
    case, just as the same DRAM row can be used in the usual case. This
    means that the usual case can be as fast as without translation.

    In the unusual case, the translation will add latency, yes. But
    that's true even with caches, unless you have a virtually addressed
    cache (which is the common case for L1 these days).

    But again my point is that this is a single board implementation
    so doesn't have the overheads most computers do.
    To add translation or cache you'd have to use multiple boards
    connected by buses and all that cuts into its performance.

    One thing that's possible if you are willing to pay for a more complex
    system (as the VAX 11/780 microarchitects clearly were) is to have
    separate control, address and data lines for the different banks, and
    use that to access the banks alternatingly, with a bandwidth advantage.

    Yes, but as I discovered playing around with my "TTL pipelined risc-VAX"
    design that adds parallel buses, and buses with plug-in boards need
    connectors, and you quickly run out of PCB edge pins.

    I did not anticipate that an in-order pipeline design needs many more
    connector pins on each board, because it is not time-multiplexing all
    the signals on a single system bus but doing communications concurrently.

    But yes, in general bigger memory subsystems tend to be slower. The
    VAX 11/780 compensated that partly with its cache.

    - anton



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Fri Dec 19 19:26:12 2025
    From Newsgroup: comp.arch

    EricP <ThatWouldBeTelling@thevillage.com> writes:
    Anton Ertl wrote:
    EricP <ThatWouldBeTelling@thevillage.com> writes:


    Yes, but as I discovered playing around with my "TTL pipelined risc-VAX"
    design that adds parallel buses, and buses with plug-in boards need
    connectors, and you quickly run out of PCB edge pins.

    The Burroughs B4900 processor cards plugged into a backplane (the cards
    were about 30" tall, about 18" deep). The fetch module was split across
    two cards, and used front-edge connectors for internal signals
    (card edge connectors with ribbon cable linkage). Similarly
    for the XM (eXecute Module).

    The remaining cards (Memory Write, Memory Read, DRAM and I/O processor)
    were single cards each just sharing the backplane bus.

    I/O controllers were plugged into an I/O "Base" module with its own
    backplane, and the more complicated multicard I/O controllers used
    front-edge connectors for internal signals. Somewhat smaller cards,
    perhaps 20" tall and 16" deep.

    This was a late-70's design that started shipping circa 1982.

    http://bitsavers.org/pdf/burroughs/MediumSystems/B4900/1987-1193_B4900_Architecture_198306.pdf
    --- Synchronet 3.21a-Linux NewsLink 1.2