• Re: $0.03 microcontroller

    From Niklas Holsti@niklas.holsti@tidorum.invalid to comp.arch.embedded on Wed Oct 17 23:07:12 2018
    From Newsgroup: comp.arch.embedded

    On 18-10-17 20:04 , gnuarm.deletethisbit@gmail.com wrote:
    On Wednesday, October 17, 2018 at 11:37:14 AM UTC-4, Niklas Holsti
    wrote:
    On 18-10-17 17:08 , gnuarm.deletethisbit@gmail.com wrote:
    On Wednesday, October 17, 2018 at 2:35:46 AM UTC-4, Niklas
    Holsti wrote:
    On 18-10-17 01:46 , David Brown wrote: ...
    When I am faced with someone else's code to examine or
    maintain, I often run it through Doxygen with "generate
    documentation for /everything/ - caller graphs, callee
    graphs, cross-linked source, etc." It can make it quick to
    jump around in the code. And recursive (or re-entrant,
    whichever you prefer) code stands out like a sore thumb, as
    long as the code is single-threaded - you get loops in the
    call graphs.
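
    A minimal sketch of why recursion shows up as a loop in the call graph, assuming the graph has already been extracted into a plain dictionary mapping each caller to its callees. The function name `find_call_cycle` and the example graph are hypothetical, and calls through function pointers are assumed to be absent from the graph, so this is only an illustration and not what Doxygen or any WCET tool actually does:

```python
from typing import Dict, List, Optional


def find_call_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in the call graph as a list of names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {f: WHITE for f in graph}
    path: List[str] = []

    def visit(f: str) -> Optional[List[str]]:
        colour[f] = GREY
        path.append(f)
        for callee in graph.get(f, []):
            state = colour.get(callee, WHITE)
            if state == GREY:
                # Back edge: the callee is already on the current call path.
                return path[path.index(callee):] + [callee]
            if state == WHITE:
                found = visit(callee)
                if found:
                    return found
        path.pop()
        colour[f] = BLACK
        return None

    for f in list(graph):
        if colour[f] == WHITE:
            found = visit(f)
            if found:
                return found
    return None


# Hypothetical call graph with a self-recursive alarm routine.
calls = {
    "main": ["send_alarm", "poll"],
    "send_alarm": ["store", "send_alarm"],
    "poll": [],
    "store": [],
}
print(find_call_cycle(calls))
```

    A depth-first search that meets a function already on the current path has found a cycle, which is how direct or mutual recursion can be detected regardless of how the diagram happens to be drawn.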

    Anecdote: some years ago, when I was applying a WCET analysis
    tool to someone else's program, the tool found recursion. This
    surprised the people I was working with, because they had
    generated call graphs for the program, analysed them visually,
    and found no recursive, looping paths.

    Turned out that they had asked the call-graph tool to optimize
    the size of the window used to display the call-graphs. The
    tool did as it was told, with the result that the line segments
    on the path for the recursive call went down to the bottom edge
    of the diagram, then *merged* with the lower border line of the
    diagram, followed that lower border, went up one side of the
    diagram -- still merged with the border line -- and then
    reentered the diagram to point at the source of the recursive
    call, effectively making the loop very hard to see...

    (It turned out that this recursion was intentional. At this
    point, the program was sending an alarm message, but the alarm
    buffer was full, so the alarm routine called itself to send an
    alarm about the full buffer -- and that worked, because one
    buffer slot was reserved, by design, for this "buffer full"
    alarm.)
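
    The reserved-slot idea can be sketched roughly as follows. This is only an illustration under assumptions: the class name, the alarm codes and the list-based buffer are hypothetical and are not taken from the program in the anecdote.

```python
from typing import List


class AlarmBuffer:
    """Bounded alarm queue whose last slot is reserved for a 'buffer full' alarm."""

    BUFFER_FULL = "BUFFER_FULL"  # hypothetical alarm code

    def __init__(self, capacity: int) -> None:
        if capacity < 2:
            raise ValueError("need at least one normal slot plus the reserved slot")
        self.capacity = capacity
        self.slots: List[str] = []

    def send_alarm(self, code: str) -> bool:
        """Queue an alarm; return True if it was stored, False if it was dropped."""
        if code == self.BUFFER_FULL:
            # Base case: this branch never recurses, so the depth is at most 2.
            if len(self.slots) < self.capacity:
                self.slots.append(code)
                return True
            return False
        if len(self.slots) < self.capacity - 1:
            self.slots.append(code)
            return True
        # Normal slots are exhausted: report that through the routine itself.
        self.send_alarm(self.BUFFER_FULL)
        return False


buf = AlarmBuffer(3)
for code in ("OVERTEMP", "UNDERVOLT", "WATCHDOG"):
    buf.send_alarm(code)
print(buf.slots)
```

    Because the "buffer full" branch never calls itself again, the recursion depth is bounded, which is the property a WCET or stack analysis would need to establish.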

    Seems to me what actually failed was that they knew they had
    recursion in the design but didn't realize that not seeing the
    recursion in the call graphs was an error that should have been
    caught.

    The guys creating and viewing the call-graphs were not the
    designers of the program, either, so they didn't know, but for sure
    it was something they should have discovered and remarked on as
    part of their work.

    Do you know the intended purpose of the call graphs?

    IIRC they were doing independent SW verification & validation of the
    program (and the WCET analysis was also a part of that). But it was many
    years ago, and I don't remember the details well enough to say much
    more, nor can I say why the program was recursive in this way, or if it
    could as easily have been made non-recursive.
    --
    Niklas Holsti
    Tidorum Ltd
    niklas holsti tidorum fi
    . @ .
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From upsidedown@upsidedown@downunder.com to comp.arch.embedded on Sun Oct 21 16:27:31 2018
    From Newsgroup: comp.arch.embedded

    On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
    <no.email@nospam.invalid> wrote:

    Clifford Heath <no.spam@please.net> writes:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
    OTP, no SPI, UART or I²C, but still...

    That is impressive! Seems to be an 8-bit RISC with no registers, just
    an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
    enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
    I like that it's in a 6-pin SOT23 package since there aren't many other
    MCUs that small.

    Slightly OT, but I have often wondered how primitive a computer
    architecture can be and still do some useful work. In the
    tube/discrete/SSI times, there were quite a lot of 1-bit processors.
    There were at least two types, the PLC (Programmable Logic Controller)
    type replacing relay logic. These typically had at least AND, OR, NOT,
    (XOR) instructions. The other group was used as truly serial computers
    with the same instructions as the PLC but also at least a 1-bit SUB
    (and ADD) instruction to implement all mathematical functions.
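
    A minimal sketch of how a truly serial machine can build multi-bit arithmetic from 1-bit operations, assuming operands are fed least significant bit first and a single carry bit is kept between steps. The function names are hypothetical and no particular historical machine is being modelled:

```python
from typing import List, Tuple


def full_add(a: int, b: int, carry: int) -> Tuple[int, int]:
    """One 1-bit step: returns (sum_bit, carry_out) using only AND, OR, XOR."""
    partial = a ^ b
    return partial ^ carry, (a & b) | (partial & carry)


def to_bits(value: int, width: int) -> List[int]:
    """Split a value into `width` bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: List[int]) -> int:
    """Reassemble an integer from bits given least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))


def serial_add(x: int, y: int, width: int) -> int:
    """Add two values one bit per step; the result wraps modulo 2**width."""
    carry = 0
    out = []
    for a, b in zip(to_bits(x, width), to_bits(y, width)):
        s, carry = full_add(a, b, carry)
        out.append(s)
    return from_bits(out)


def serial_sub(x: int, y: int, width: int) -> int:
    """Subtract as x + NOT(y) + 1 (two's complement), again one bit per step."""
    carry = 1
    out = []
    for a, b in zip(to_bits(x, width), to_bits(y, width)):
        s, carry = full_add(a, b ^ 1, carry)
        out.append(s)
    return from_bits(out)


print(serial_add(5, 9, 8), serial_sub(9, 5, 8))
```

    The cost is time rather than hardware: an N-bit addition takes N steps through the same 1-bit adder.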

    However, in the LSI era, there don't seem to be many implementations.

    One that immediately comes to mind is the MC14500B PLC building block,
    from the 1970's, which requires quite a lot of support chips (code
    memory, PC, I/O chips) to do some useful work.
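
    To illustrate the relay-replacement style of 1-bit processing, here is a small sketch of a machine with a single 1-bit result register that evaluates one ladder rung per scan. The mnemonics (LD, LDC, AND, ANDC, OR, STO) and the dictionary of named bits are assumptions made for this example; they are only loosely in the spirit of such parts and are not copied from any datasheet:

```python
from typing import Dict, List, Tuple

Program = List[Tuple[str, str]]


def run_scan(program: Program, bits: Dict[str, int]) -> Dict[str, int]:
    """Execute one scan of the program over named 1-bit inputs and outputs."""
    rr = 0  # the single 1-bit result register
    for op, name in program:
        if op == "LD":        # load bit
            rr = bits[name]
        elif op == "LDC":     # load complemented bit
            rr = bits[name] ^ 1
        elif op == "AND":
            rr &= bits[name]
        elif op == "ANDC":    # AND with complemented bit
            rr &= bits[name] ^ 1
        elif op == "OR":
            rr |= bits[name]
        elif op == "STO":     # store result register to an output bit
            bits[name] = rr
        else:
            raise ValueError(f"unknown op {op}")
    return bits


# Classic start/stop latch: motor = (start OR motor) AND NOT stop
LATCH: Program = [
    ("LD", "start"),
    ("OR", "motor"),
    ("ANDC", "stop"),
    ("STO", "motor"),
]

state = {"start": 1, "stop": 0, "motor": 0}
run_scan(LATCH, state)
print(state["motor"])
```

    The program counter, code memory and I/O selection are all hidden inside the Python loop here, which hints at why the real part needed external support chips for them.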

    After much searching, I found the (NI) National Instruments SBA
    (Serial Boolean Analyser)
    http://www.wass.net/othermanuals/GI%20SBA.pdf
    from the same era, with 1024 word instructions (8 bit) ROM and four
    banks of 30 _bits_ data memory and 30 I/O pins in a 40 pin package.
    For the re-entrance enthusiasts, it contains stack pointer relative
    addressing :-). THe I/O pins are 5 V TTL compatible, so a few ULN2803 Darlington buffers may be needed to drive loads typically found in PLC environment.

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

  • From jim.brakefield@jim.brakefield@ieee.org to comp.arch.embedded on Sun Oct 21 07:47:21 2018
    From Newsgroup: comp.arch.embedded

    On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:
    On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
    <no.email@nospam.invalid> wrote:

    Clifford Heath <no.spam@please.net> writes:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
    OTP, no SPI, UART or I²C, but still...

    That is impressive! Seems to be an 8-bit RISC with no registers, just
    an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
    enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
    I like that it's in a 6-pin SOT23 package since there aren't many other
    MCUs that small.

    Slightly OT, but I have often wondered how primitive a computer
    architecture can be and still do some useful work. In the
    tube/discrete/SSI times, there were quite a lot of 1-bit processors.
    There were at least two types, the PLC (Programmable Logic Controller)
    type replacing relay logic. These typically had at least AND, OR, NOT,
    (XOR) instructions. The other group was used as truly serial computers
    with the same instructions as the PLC but also at least a 1-bit SUB
    (and ADD) instruction to implement all mathematical functions.

    However, in the LSI era, there don't seem to be many implementations.

    One that immediately comes to mind is the MC14500B PLC building block,
    from the 1970's, which requires quite a lot of support chips (code
    memory, PC, I/O chips) to do some useful work.

    After much searching, I found the (NI) National Instruments SBA
    (Serial Boolean Analyser)
    http://www.wass.net/othermanuals/GI%20SBA.pdf
    from the same era, with 1024 word instructions (8 bit) ROM and four
    banks of 30 _bits_ data memory and 30 I/O pins in a 40 pin package.
    For the re-entrance enthusiasts, it contains stack pointer relative
    addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
    Darlington buffers may be needed to drive loads typically found in a
    PLC environment.

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?
    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?
    LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose
    (Logic Emulation Machine) https://opencores.org/project/lem1_9min
    Jim Brakefield
  • From Phil Martel@pomartel@comcast.net to comp.arch.embedded on Sun Oct 21 11:03:18 2018
    From Newsgroup: comp.arch.embedded

    On 10/21/2018 09:27, upsidedown@downunder.com wrote:
    On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
    <no.email@nospam.invalid> wrote:

    Clifford Heath <no.spam@please.net> writes:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
    OTP, no SPI, UART or I²C, but still...

    That is impressive! Seems to be an 8-bit RISC with no registers, just
    an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
    enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
    I like that it's in a 6-pin SOT23 package since there aren't many other
    MCUs that small.

    Slightly OT, but I have often wondered how primitive a computer
    architecture can be and still do some useful work. In the
    tube/discrete/SSI times, there were quite a lot of 1-bit processors.
    There were at least two types, the PLC (Programmable Logic Controller)
    type replacing relay logic. These typically had at least AND, OR, NOT,
    (XOR) instructions. The other group was used as truly serial computers
    with the same instructions as the PLC but also at least a 1-bit SUB
    (and ADD) instruction to implement all mathematical functions.

    However, in the LSI era, there don't seem to be many implementations.

    One that immediately comes to mind is the MC14500B PLC building block,
    from the 1970's, which requires quite a lot of support chips (code
    memory, PC, I/O chips) to do some useful work.

    After much searching, I found the (NI) National Instruments SBA
    (Serial Boolean Analyser)
    http://www.wass.net/othermanuals/GI%20SBA.pdf
    from the same era, with 1024 word instructions (8 bit) ROM and four
    banks of 30 _bits_ data memory and 30 I/O pins in a 40 pin package.
    For the re-entrance enthusiasts, it contains stack pointer relative
    addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
    Darlington buffers may be needed to drive loads typically found in a
    PLC environment.

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

    I have a memory of a 1-bit GPU from the late 70's, but can't pin it
    down. There is an article on Wikipedia https://en.wikipedia.org/wiki/1-bit_architecture
    --
    Best wishes,
    --Phil
    pomartel At Comcast(ignore_this) dot net
  • From gnuarm.deletethisbit@gnuarm.deletethisbit@gmail.com to comp.arch.embedded on Sun Oct 21 08:08:02 2018
    From Newsgroup: comp.arch.embedded

    On Sunday, October 21, 2018 at 10:47:26 AM UTC-4, jim.bra...@ieee.org wrote:
    On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:
    On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
    <no.email@nospam.invalid> wrote:

    Clifford Heath <no.spam@please.net> writes:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
    OTP, no SPI, UART or I²C, but still...

    That is impressive! Seems to be an 8-bit RISC with no registers, just
    an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
    enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
    I like that it's in a 6-pin SOT23 package since there aren't many other
    MCUs that small.

    Slightly OT, but I have often wondered how primitive a computer
    architecture can be and still do some useful work. In the
    tube/discrete/SSI times, there were quite a lot of 1-bit processors.
    There were at least two types, the PLC (Programmable Logic Controller)
    type replacing relay logic. These typically had at least AND, OR, NOT,
    (XOR) instructions. The other group was used as truly serial computers
    with the same instructions as the PLC but also at least a 1-bit SUB
    (and ADD) instruction to implement all mathematical functions.

    However, in the LSI era, there don't seem to be many implementations.

    One that immediately comes to mind is the MC14500B PLC building block,
    from the 1970's, which requires quite a lot of support chips (code
    memory, PC, I/O chips) to do some useful work.

    After much searching, I found the (NI) National Instruments SBA
    (Serial Boolean Analyser)
    http://www.wass.net/othermanuals/GI%20SBA.pdf
    from the same era, with 1024 word instructions (8 bit) ROM and four
    banks of 30 _bits_ data memory and 30 I/O pins in a 40 pin package.
    For the re-entrance enthusiasts, it contains stack pointer relative
    addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
    Darlington buffers may be needed to drive loads typically found in a
    PLC environment.

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

    LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose (Logic Emulation Machine) https://opencores.org/project/lem1_9min

    Jim Brakefield
    It is hard for me to imagine applications where a 1 bit processor would be useful. A useful N bit processor can be built in a small number of LUTs. I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.
    I discussed this with someone once and he imagined apps where the processing speed requirement was quite low and you can save LUTs with a bit serial processor. I just don't know how many or why it would matter. Even the smallest FPGAs have thousands of LUTs. It's hard to picture an application where you couldn't spare a few hundred LUTs.
    Rick C.
  • From jim.brakefield@jim.brakefield@ieee.org to comp.arch.embedded on Sun Oct 21 09:31:29 2018
    From Newsgroup: comp.arch.embedded

    On Sunday, October 21, 2018 at 10:08:06 AM UTC-5, gnuarm.del...@gmail.com wrote:
    On Sunday, October 21, 2018 at 10:47:26 AM UTC-4, jim.bra...@ieee.org wrote:
    On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:
    On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
    <no.email@nospam.invalid> wrote:

    Clifford Heath <no.spam@please.net> writes:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
    OTP, no SPI, UART or I²C, but still...

    That is impressive! Seems to be an 8-bit RISC with no registers, just
    an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
    enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
    I like that it's in a 6-pin SOT23 package since there aren't many other
    MCUs that small.

    Slightly OT, but I have often wondered how primitive a computer
    architecture can be and still do some useful work. In the
    tube/discrete/SSI times, there were quite a lot of 1-bit processors.
    There were at least two types, the PLC (Programmable Logic Controller)
    type replacing relay logic. These typically had at least AND, OR, NOT,
    (XOR) instructions. The other group was used as truly serial computers
    with the same instructions as the PLC but also at least a 1-bit SUB
    (and ADD) instruction to implement all mathematical functions.

    However, in the LSI era, there don't seem to be many implementations.

    One that immediately comes to mind is the MC14500B PLC building block,
    from the 1970's, which requires quite a lot of support chips (code
    memory, PC, I/O chips) to do some useful work.

    After much searching, I found the (NI) National Instruments SBA
    (Serial Boolean Analyser)
    http://www.wass.net/othermanuals/GI%20SBA.pdf
    from the same era, with 1024 word instructions (8 bit) ROM and four
    banks of 30 _bits_ data memory and 30 I/O pins in a 40 pin package.
    For the re-entrance enthusiasts, it contains stack pointer relative
    addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
    Darlington buffers may be needed to drive loads typically found in a
    PLC environment.

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

    LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose (Logic Emulation Machine) https://opencores.org/project/lem1_9min

    Jim Brakefield

    It is hard for me to imagine applications where a 1 bit processor would be useful. A useful N bit processor can be built in a small number of LUTs. I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

    I discussed this with someone once and he imagined apps where the processing speed requirement was quite low and you can save LUTs with a bit serial processor. I just don't know how many or why it would matter. Even the smallest FPGAs have thousands of LUTs. It's hard to picture an application where you couldn't spare a few hundred LUTs.

    Rick C.
    It's hard to picture an application where you couldn't spare a few hundred LUTs.
    There are advantages to using several soft core processors, each sized and customized to the need.
    I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.
    There are many under 600 LUTs, including 32-bit. Had hoped the full featured LEM design would be under 100 LUTs.
    Have done some rough research of what's available for under 600 LUTs: https://opencores.org/project/up_core_list/downloads
    select: "By Performance Metric"
    A big rationale for small soft core processors is that they replace LUTs (slow speed logic) with block RAM (instructions). And they are completely deterministic as opposed to doing the same by time slicing an ASIC (ARM) processor.
    Jim Brakefield
  • From gnuarm.deletethisbit@gnuarm.deletethisbit@gmail.com to comp.arch.embedded on Sun Oct 21 10:51:29 2018
    From Newsgroup: comp.arch.embedded

    On Sunday, October 21, 2018 at 12:31:34 PM UTC-4, jim.bra...@ieee.org wrote:
    On Sunday, October 21, 2018 at 10:08:06 AM UTC-5, gnuarm.del...@gmail.com wrote:
    On Sunday, October 21, 2018 at 10:47:26 AM UTC-4, jim.bra...@ieee.org wrote:
    On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:
    On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin <no.email@nospam.invalid> wrote:

    Clifford Heath <no.spam@please.net> writes:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
    OTP, no SPI, UART or I²C, but still...

    That is impressive! Seems to be an 8-bit RISC with no registers, just
    an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
    enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
    I like that it's in a 6-pin SOT23 package since there aren't many other
    MCUs that small.

    Slightly OT, but I have often wondered how primitive a computer
    architecture can be and still do some useful work. In the
    tube/discrete/SSI times, there were quite a lot of 1-bit processors.
    There were at least two types, the PLC (Programmable Logic Controller)
    type replacing relay logic. These typically had at least AND, OR, NOT,
    (XOR) instructions. The other group was used as truly serial computers
    with the same instructions as the PLC but also at least a 1-bit SUB
    (and ADD) instruction to implement all mathematical functions.

    However, in the LSI era, there don't seem to be many implementations.

    One that immediately comes to mind is the MC14500B PLC building block,
    from the 1970's, which requires quite a lot of support chips (code
    memory, PC, I/O chips) to do some useful work.

    After much searching, I found the (NI) National Instruments SBA
    (Serial Boolean Analyser)
    http://www.wass.net/othermanuals/GI%20SBA.pdf
    from the same era, with 1024 word instructions (8 bit) ROM and four
    banks of 30 _bits_ data memory and 30 I/O pins in a 40 pin package.
    For the re-entrance enthusiasts, it contains stack pointer relative
    addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
    Darlington buffers may be needed to drive loads typically found in a
    PLC environment.

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

    Anyone seen more modern 1 bit chips either for relay replacement or for truly serial computers ?

    LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose (Logic Emulation Machine) https://opencores.org/project/lem1_9min

    Jim Brakefield

    It is hard for me to imagine applications where a 1 bit processor would be useful. A useful N bit processor can be built in a small number of LUTs. I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

    I discussed this with someone once and he imagined apps where the processing speed requirement was quite low and you can save LUTs with a bit serial processor. I just don't know how many or why it would matter. Even the smallest FPGAs have thousands of LUTs. It's hard to picture an application where you couldn't spare a few hundred LUTs.

    Rick C.

    It's hard to picture an application where you couldn't spare a few hundred LUTs.

    There are advantages to using several soft core processors, each sized and customized to the need.

    I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

    There are many under 600 LUTs, including 32-bit. Had hoped the full featured LEM design would be under 100 LUTs.
    Have done some rough research of what's available for under 600 LUTs: https://opencores.org/project/up_core_list/downloads
    select: "By Performance Metric"

    A big rationale for small soft core processors is that they replace LUTs (slow speed logic) with block RAM (instructions). And they are completely deterministic as opposed to doing the same by time slicing an ASIC (ARM) processor.
    I won't argue a bit that softcores and especially *customizable* softcore CPUs aren't useful. I was talking about there being at best a very tiny region of utility for 1-bit processors.
    My 600 LUT processor didn't trade off much for performance. It would run pretty fast and was pretty capable. In addition the word size was independent of the instruction set. That said, there are apps where a much less powerful processor would do fine and saving a few more LUTs would be useful.
    Rick C.
  • From David Brown@david.brown@hesbynett.no to comp.arch.embedded on Sun Oct 21 21:43:43 2018
    From Newsgroup: comp.arch.embedded

    On 21/10/2018 17:08, gnuarm.deletethisbit@gmail.com wrote:


    It is hard for me to imagine applications where a 1 bit processor
    would be useful. A useful N bit processor can be built in a small
    number of LUTs. I've built a 16 bit processor in just 600 LUTs and
    I've seen processors in a bit less.

    I discussed this with someone once and he imagined apps where the
    processing speed requirement was quite low and you can save LUTs with
    a bit serial processor. I just don't know how many or why it would
    matter. Even the smallest FPGAs have thousands of LUTs. It's hard
    to picture an application where you couldn't spare a few hundred
    LUTs.


    There is not much point in 1-bit processing with modern architectures
    and FPGAs. But it used to be more useful, for cheap and scalable
    solutions. You got systems that scaled in parallel, using bit-slice
    processors to make cpus as wide as you want. And you got serial
    scaling, giving you practical numbers of bits with minimal die area
    (like the COP8 microcontrollers).
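
    A rough sketch of the parallel (bit-slice) kind of scaling, assuming each slice is a small adder with carry-in and carry-out and that wider words are made by chaining slices through the carry. The function names and the 4-bit slice width are assumptions for illustration and do not model any specific bit-slice part:

```python
from typing import Tuple


def slice_add(a: int, b: int, carry_in: int, slice_bits: int = 4) -> Tuple[int, int]:
    """One ALU slice: add two slice-wide operands plus carry-in."""
    total = a + b + carry_in
    mask = (1 << slice_bits) - 1
    return total & mask, total >> slice_bits


def sliced_add(x: int, y: int, n_slices: int, slice_bits: int = 4) -> Tuple[int, int]:
    """Chain n_slices identical slices, rippling the carry from low to high."""
    mask = (1 << slice_bits) - 1
    carry = 0
    result = 0
    for i in range(n_slices):
        shift = i * slice_bits
        part, carry = slice_add((x >> shift) & mask, (y >> shift) & mask,
                                carry, slice_bits)
        result |= part << shift
    return result, carry


# Four 4-bit slices behave as one 16-bit adder.
print(sliced_add(0x1234, 0x0FFF, 4))
```

    Adding more slices widens the word without changing the slice itself, which is the parallel counterpart to the serial approach of reusing one narrow unit over several steps.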

  • From jim.brakefield@jim.brakefield@ieee.org to comp.arch.embedded on Sun Oct 21 12:44:39 2018
    From Newsgroup: comp.arch.embedded

    On Sunday, October 21, 2018 at 12:51:34 PM UTC-5, gnuarm.del...@gmail.com wrote:
    On Sunday, October 21, 2018 at 12:31:34 PM UTC-4, jim.bra...@ieee.org wrote:
    On Sunday, October 21, 2018 at 10:08:06 AM UTC-5, gnuarm.del...@gmail.com wrote:
    On Sunday, October 21, 2018 at 10:47:26 AM UTC-4, jim.bra...@ieee.org wrote:
    On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:
    On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin <no.email@nospam.invalid> wrote:

    Clifford Heath <no.spam@please.net> writes:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
    OTP, no SPI, UART or I²C, but still...

    That is impressive! Seems to be an 8-bit RISC with no registers, just
    an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
    enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
    I like that it's in a 6-pin SOT23 package since there aren't many other
    MCUs that small.

    Slightly OT, but I have often wondered how primitive a computer
    architecture can be and still do some useful work. In the
    tube/discrete/SSI times, there were quite a lot of 1-bit processors.
    There were at least two types, the PLC (Programmable Logic Controller)
    type replacing relay logic. These typically had at least AND, OR, NOT,
    (XOR) instructions. The other group was used as truly serial computers
    with the same instructions as the PLC but also at least a 1-bit SUB
    (and ADD) instruction to implement all mathematical functions.

    However, in the LSI era, there don't seem to be many implementations.

    One that immediately comes to mind is the MC14500B PLC building block,
    from the 1970's, which requires quite a lot of support chips (code
    memory, PC, I/O chips) to do some useful work.

    After much searching, I found the (NI) National Instruments SBA
    (Serial Boolean Analyser)
    http://www.wass.net/othermanuals/GI%20SBA.pdf
    from the same era, with 1024 word instructions (8 bit) ROM and four
    banks of 30 _bits_ data memory and 30 I/O pins in a 40 pin package.
    For the re-entrance enthusiasts, it contains stack pointer relative
    addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
    Darlington buffers may be needed to drive loads typically found in a
    PLC environment.

    Anyone seen more modern 1 bit chips either for relay replacement or for truly serial computers ?

    Anyone seen more modern 1 bit chips either for relay replacement or for truly serial computers ?

    LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose
    (Logic Emulation Machine) https://opencores.org/project/lem1_9min

    Jim Brakefield

    It is hard for me to imagine applications where a 1 bit processor would be useful. A useful N bit processor can be built in a small number of LUTs. I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

    I discussed this with someone once and he imagined apps where the processing speed requirement was quite low and you can save LUTs with a bit serial processor. I just don't know how many or why it would matter. Even the smallest FPGAs have thousands of LUTs. It's hard to picture an application where you couldn't spare a few hundred LUTs.

    Rick C.

    It's hard to picture an application where you couldn't spare a few hundred LUTs.

    There are advantages to using several soft core processors, each sized and customized to the need.

    I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

    There are many under 600 LUTs, including 32-bit. Had hoped the full featured LEM design would be under 100 LUTs.
    Have done some rough research of what's available for under 600 LUTs: https://opencores.org/project/up_core_list/downloads
    select: "By Performance Metric"

    A big rationale for small soft core processors is that they replace LUTs (slow speed logic) with block RAM (instructions). And they are completely deterministic as opposed to doing the same by time slicing an ASIC (ARM) processor.

    I won't argue a bit that softcores and especially *customizable* softcore CPUs aren't useful. I was talking about there being at best a very tiny region of utility for 1-bit processors.

    My 600 LUT processor didn't trade off much for performance. It would run pretty fast and was pretty capable. In addition the word size was independent of the instruction set. That said, there are apps where a much less powerful processor would do fine and saving a few more LUTs would be useful.

    Rick C.
    there being at best a very tiny region of utility for 1-bit processors
    There are a small number of examples:
    Bit serial processors such as DEC PDP-8/S, early vacuum tube & drum machines, for example Bendix G-15.
    Bit serial CORDIC
    Also telling, is that 4-bit processors for calculators have been replaced by 8-bit processors.
    My inspiration was EDIF, which was/is output from VHDL & Verilog compilers. E.g. use EDIF as a machine language. In the context of logic simulation, greater FPGA capacity possible for slow logic.
    This effort also led to a theoretical insight for brain modelling: There is greater information content in the wiring than in the logic. The human brain has 2^36+ neurons, requiring 36 bits of information for each connection and only 16 or so bits for the state/configuration of each synapse. Also an FPGA requires 60+ bits to route each LUT input (assuming all LUT inputs in use) whereas each possible input can be specified by 20 bits or less (1M LUT FPGA).
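
    A quick check of the arithmetic above, assuming "bits per connection" means the number of bits needed to name one source out of N candidates, that is ceil(log2 N), and reading the neuron count as roughly 2**36. The function name is hypothetical; the counts are the ones given in the post:

```python
def address_bits(n_sources: int) -> int:
    """Bits needed to select one of n_sources distinct sources: ceil(log2(n))."""
    if n_sources < 1:
        raise ValueError("need at least one source")
    return (n_sources - 1).bit_length()


neuron_bits = address_bits(2**36)   # assumes roughly 2**36 neurons
lut_bits = address_bits(1_000_000)  # 1M-LUT FPGA from the post
print(neuron_bits, lut_bits)        # prints "36 20"
```

    Both figures match the post: naming the source of a connection takes more bits than the 16 or so bits of per-synapse state, and about 20 bits would identify any LUT output in a 1M-LUT device.
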
    Of course optimizing simulators convert the EDIF to an existing machine language. Likewise for industrial automation (ladder logic, ...).
    Jim Brakefield
  • From Brett@ggtgp@yahoo.com to comp.arch.embedded on Mon Oct 22 00:28:51 2018
    From Newsgroup: comp.arch.embedded

    <jim.brakefield@ieee.org> wrote:
    On Sunday, October 21, 2018 at 12:51:34 PM UTC-5, gnuarm.del...@gmail.com wrote:
    On Sunday, October 21, 2018 at 12:31:34 PM UTC-4, jim.bra...@ieee.org wrote:
    On Sunday, October 21, 2018 at 10:08:06 AM UTC-5, gnuarm.del...@gmail.com wrote:
    On Sunday, October 21, 2018 at 10:47:26 AM UTC-4, jim.bra...@ieee.org wrote:
    On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:
    On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
    <no.email@nospam.invalid> wrote:

    Clifford Heath <no.spam@please.net> writes:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
    OTP, no SPI, UART or Iæ¶Ž, but still...

    That is impressive! Seems to be an 8-bit RISC with no registers, just >>>>>>> an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram, >>>>>>> enough for plenty of MCU things. Didn't check if it has an ADC or PWM. >>>>>>> I like that it's in a 6-pin SOT23 package since there aren't many other >>>>>>> MCUs that small.

    Slightly OT, but I have often wondered how primitive a computer
    architecture can be and still do some useful work. In the
    tube/discrete/SSI times, there were quite a lot of 1-bit processors.
    There were at least two types. One was the PLC (Programmable Logic Controller)
    type replacing relay logic. These typically had at least AND, OR, NOT,
    (XOR) instructions. The other group was used as truly serial computers
    with the same instructions as the PLC but also at least 1-bit SUB
    (and ADD) instructions to implement all mathematical functions.

    However, in the LSI era, there don't seem to be many implementations.

    One that immediately comes to mind is the MC14500B PLC building block,
    from the 1970s, which requires quite a lot of support chips (code
    memory, PC, I/O chips) to do some useful work.

    After much searching, I found the (NI) National Instruments SBA
    (Serial Boolean Analyser)
    http://www.wass.net/othermanuals/GI%20SBA.pdf
    from the same era, with a 1024-word (8-bit) instruction ROM and four
    banks of 30 _bits_ of data memory and 30 I/O pins in a 40-pin package.
    For the re-entrance enthusiasts, it contains stack pointer relative
    addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
    Darlington buffers may be needed to drive loads typically found in a PLC
    environment.

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

    LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose
    (Logic Emulation Machine) https://opencores.org/project/lem1_9min

    Jim Brakefield

    It is hard for me to imagine applications where a 1 bit processor
    would be useful. A useful N bit processor can be built in a small
    number of LUTs. I've built a 16 bit processor in just 600 LUTs and
    I've seen processors in a bit less.

    I discussed this with someone once and he imagined apps where the
    processing speed requirement was quite low and you can save LUTs with
    a bit serial processor. I just don't know how many or why it would
    matter. Even the smallest FPGAs have thousands of LUTs. It's hard to
    picture an application where you couldn't spare a few hundred LUTs.

    Rick C.

    It's hard to picture an application where you couldn't spare a few hundred LUTs.

    There are advantages to using several soft core processors, each sized
    and customized to the need.

    I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

    There are many under 600 LUTs, including 32-bit ones. I had hoped the
    full-featured LEM design would be under 100 LUTs.
    I have done some rough research on what's available for under 600 LUTs:
    https://opencores.org/project/up_core_list/downloads
    select: "By Performance Metric"

    A big rationale for small soft core processors is that they replace LUTs
    (slow speed logic) with block RAM (instructions). And they are
    completely deterministic as opposed to doing the same by time slicing an
    ASIC (ARM) processor.

    I won't argue a bit that softcores and especially *customizable*
    softcore CPUs aren't useful. I was talking about there being at best a
    very tiny region of utility for 1-bit processors.

    My 600 LUT processor didn't trade off much for performance. It would
    run pretty fast and was pretty capable. In addition the word size was
    independent of the instruction set. That said, there are apps where a
    much less powerful processor would do fine and saving a few more LUTs would be useful.

    Rick C.

    there being at best a very tiny region of utility for 1-bit processors

    There are a small number of examples:
    Bit serial processors such as DEC PDP8L, early vacuum tube & drum
    machines, for example Bendix G-15.
    Bit serial Cordic

    Also telling is that 4-bit processors for calculators have been replaced
    by 8-bit processors.

    My inspiration was EDIF, which was/is output from VHDL & Verilog
    compilers, i.e. use EDIF as a machine language. In the context of logic
    simulation, this makes greater FPGA capacity possible for slow logic.

    This effort also led to a theoretical insight for brain modelling: There
    is greater information content in the wiring than in the logic. The
    human brain has 2^36+ neurons, requiring 36 bits of information for each
    connection and only 16 or so bits for the state/configuration of each
    synapse. Also, an FPGA requires 60+ bits to route each LUT input (assuming
    all LUT inputs are in use) whereas each possible input can be specified by
    20 bits or less (1M LUT FPGA).

    The clock speed is quite low, 2 Hz?
    So the wetware is not quite impossible to emulate with current tech.
    Raising a baby and training the resultant adult to do a task is still many orders of magnitude cheaper.
    ;)

    Of course optimizing simulators convert the EDIF to an existing machine language. Likewise for industrial automation (ladder logic, ...).

    Jim Brakefield




    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From George Neuner@gneuner2@comcast.net to comp.arch.embedded on Sun Oct 21 20:59:55 2018
    From Newsgroup: comp.arch.embedded

    On Sun, 21 Oct 2018 16:27:31 +0300, upsidedown@downunder.com wrote:

    Slightly OT, but I have often wondered how primitive a computer
    architecture can be and still do some useful work. In the
    tube/discrete/SSI times, there were quite a lot of 1-bit processors.
    There were at least two types. One was the PLC (Programmable Logic Controller)
    type replacing relay logic. These typically had at least AND, OR, NOT,
    (XOR) instructions. The other group was used as truly serial computers
    with the same instructions as the PLC but also at least 1-bit SUB
    (and ADD) instructions to implement all mathematical functions.

    However, in the LSI era, there don't seem to be many implementations.

    One that immediately comes to mind is the MC14500B PLC building block,
    from the 1970s, which requires quite a lot of support chips (code
    memory, PC, I/O chips) to do some useful work.

    After much searching, I found the (NI) National Instruments SBA
    (Serial Boolean Analyser)
    http://www.wass.net/othermanuals/GI%20SBA.pdf
    from the same era, with a 1024-word (8-bit) instruction ROM and four
    banks of 30 _bits_ of data memory and 30 I/O pins in a 40-pin package.
    For the re-entrance enthusiasts, it contains stack pointer relative
    addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
    Darlington buffers may be needed to drive loads typically found in a PLC
    environment.

    Anyone seen more modern 1 bit chips either for relay replacement or
    for truly serial computers ?

    Circa 1985-1993, Thinking Machines Connection Machine.
    Circa 1987-1996, MasPar MP series.

    The CM-1, 2, 2a, and 200 all were SIMD parallel using 1-bit serial
    integer-only CPUs. Sizes ranged from 8K CPUs at the low end to 64K
    CPUs at the high end. Each CPU had 4K *bits* of private RAM, and the
    CPUs were connected in a multidimensional hypercube network.
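
    To make the "1-bit serial" idea concrete, here is a minimal sketch of how a
    1-bit ALU with a carry flag can add multi-bit words one bit per step. It is
    not the CM instruction set; the function names add1 and serial_add are
    illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* One step of a 1-bit ALU: add bits a and b plus the carry flag,
 * return the sum bit and update the carry flag in place. */
static unsigned add1(unsigned a, unsigned b, unsigned *carry)
{
    unsigned sum = a ^ b ^ *carry;
    *carry = (a & b) | (*carry & (a ^ b));
    return sum;
}

/* Add two words of `width` bits, least significant bit first,
 * one bit per iteration, as a bit-serial machine would. */
uint32_t serial_add(uint32_t x, uint32_t y, unsigned width)
{
    assert(width >= 1 && width <= 32);
    uint32_t result = 0;
    unsigned carry = 0;
    for (unsigned i = 0; i < width; i++) {
        unsigned bit = add1((x >> i) & 1u, (y >> i) & 1u, &carry);
        result |= (uint32_t)bit << i;
    }
    return result; /* carry out of the top bit is discarded */
}
```

    An n-bit add costs n steps on one such CPU, which is why these machines
    only pay off when many thousands of them work in parallel.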

    The CM-2, 2a, and 200 were augmented with 32-bit FPUs (1 per 32 CPUs),
    and the 200 featured a higher clock speed.


    The MP-1 was SIMD parallel using 4-bit serial integer-only CPUs in
    sizes from 1K to 16K CPUs. It also had 32-bit FPUs, but I don't
    remember how many / what ratio. I remember that it had an accumulator
    register rather than going memory->memory like the CM.

    [I can't find much information now about the MP-1 ... unfortunately
    MasPar didn't last very long in the marketplace. The Wikipedia
    article has some information about the MP-2, but the MP-2 was a later
    full 32-bit design, very different from the MP-1.]


    My college had both an 8K CM-2 and a 1K MP-1, accessible to those who
    took various parallel processing electives. I never got to use the
    MP-1 much - it was new at the end of my time and I only ever played
    with it a bit. But I spent 2 semesters working with the CM-2.

    Even though the CM's clock speed was only ~8MHz, the performance was
    amazing IF the problem was a good fit to the architecture. E.g., at
    that time, I owned a 66MHz (dx2) i486. Converted for the CM-2
    architecture, O(n^4) array processing on the i486 became O(n) on the
    CM-2. I had a physics simulation that took over 3 hours on my i486
    that ran in ~10 minutes on the CM.

    George
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Wed Oct 24 15:57:55 2018
    From Newsgroup: comp.arch.embedded

    Am 14.10.2018 um 11:55 schrieb Theo:
    Tim <cpldcpu+usenet@gmail.com> wrote:
    This is quite curious. I wonder

    - Has anyone actually received the devices they ordered? The cheaper
    variants seem to be sold out.

    I think they've sold out since they went viral. EEVblog did a video showing 550 in stock - that's only $16 worth of parts, not hard to imagine they've been bought up.

    The other option is they're some kind of EOL part and 3c is the 'reduced to clear' price - which they have done, very successfully.

    Theo


    They're back in stock, though the price rose by 21% to 0.046$.
    Also, LCSC seems to now be stocking more Padauk parts, including more
    dual-core devices. Unfortunately, the programmer seems to be out of
    stock, and they have neither the flash nor the DIP variants.

    Philipp
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Mon Nov 5 12:41:27 2018
    From Newsgroup: comp.arch.embedded

    Am 12.10.2018 um 09:44 schrieb David Brown:
    On 12/10/18 08:50, Philipp Klaus Krause wrote:
    Am 12.10.2018 um 01:08 schrieb Paul Rubin:
    upsidedown@downunder.com writes:
    There are a lot of operations that will update memory locations, so why
    would you need a lot of CPU registers?

    Being able to (say) add register to register saves traffic through the
    accumulator and therefore instructions.

    1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
    assembly program listing.

    It would be nice to have a C compiler, and registers help with that.


    Looking at the instruction set, it should be possible to make a backend
    for this in SDCC; the architecture looks more C-friendly than the
    existing pic14 and pic16 backends. But it surely isn't as nice as stm8
    or z80.
    Reentrant functions will be inefficient: no registers, and no sp-relative
    addressing mode. One would want to reserve a few memory locations as
    pseudo-registers to help with that, but that only goes so far.


    It looks like the lowest 16 memory addresses could be considered pseudo-registers - they are the ones that can be used for direct memory access rather than needing indirect access.


    Considering the multi-core variants of the Padauk µCs:
    Those addresses are shared across all cores. Each core only has its own
    A, SP, F, PC.
    How do we handle local variables?

    Option 1: Make functions non-reentrant. Requires duplication of code (we
    need per-thread copies of functions), and link-time analysis to ensure
    that each thread only calls the function implementation meant for it.
    Function pointers get complicated.

    Option 2: Use an inefficient combination of thread-local storage and stack.

    Since this is a small µC, we need a lot of support functions, which the compiler inserts (e.g. for multiplication); of course those are affected
    by the same problems.

    Philipp
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Thu Nov 8 13:53:48 2018
    From Newsgroup: comp.arch.embedded

    Am 12.10.18 um 20:39 schrieb upsidedown@downunder.com:
    On Fri, 12 Oct 2018 10:18:56 +0200, Philipp Klaus Krause <pkk@spth.de>
    wrote:

    Am 10.10.2018 um 03:05 schrieb Clifford Heath:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>


    OTP, no SPI, UART or I²C, but still...

    Clifford Heath

    They even make dual-core variants (the part where the first digit in the
    part number is '2'). It seems program counter, stack pointer, flag
    register and accumulator are per-core, while the rest, including the ALU
    is shared. In particular, the I/O registers are also shared, which means
    some multiplier registers would also be - but currently all variants
    with integrated multiplier are single-core.
    Use of the ALU is shared by the two cores, alternating by clock cycle.

    Philipp


    Interesting, that would make it easy to run a multitasking RTOS
    (foreground/background) monitor, which might justify the use of some
    reentrant library routines :-). But in reality, the available memory
    (ROM/RAM) is so small that you could easily manage this with static
    memory allocations.



    But static memory allocation would require one copy of each function per thread. And the linker would have to analyze the call graph to always
    call the correct function for each thread. Function pointers get
    complicated.

    Unfortunately, reentrancy becomes even harder with
    hardware-multithreading: To access the stack, one has to construct a
    pointer to the stack location in a memory location. That memory location
    (as any pseudo-registers) is then shared among all running instances of
    the function. So it needs to be protected (e.g. with a spinlock), making
    access even more inefficient. And that spinlock will cause issues with interrupts (a solution might be to heavily restrict interrupt routines, essentially allowing not much more than setting some global variables).
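
    A minimal sketch of the locking described above, written in portable C11
    rather than Padauk assembly. The names shared_ptr_scratch, scratch_lock and
    read_stack_slot are hypothetical; the point is only that every stack access
    has to take and release a lock around the shared scratch location.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared scratch location ("pseudo-register") that every
 * hardware thread uses to build a pointer into its own stack. */
static uint8_t *shared_ptr_scratch;
static atomic_flag scratch_lock = ATOMIC_FLAG_INIT;

static void scratch_acquire(void)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&scratch_lock,
                                             memory_order_acquire)) {
        /* busy-wait */
    }
}

static void scratch_release(void)
{
    atomic_flag_clear_explicit(&scratch_lock, memory_order_release);
}

/* Read the stack slot `offset` bytes above `stack_base`, going through
 * the shared scratch location while holding the lock. */
uint8_t read_stack_slot(uint8_t *stack_base, uint8_t offset)
{
    assert(stack_base != NULL);
    scratch_acquire();
    shared_ptr_scratch = stack_base + offset; /* construct the pointer in memory */
    uint8_t value = *shared_ptr_scratch;      /* indirect access through it */
    scratch_release();
    return value;
}
```

    If an interrupt routine on the same thread tried to take scratch_lock while
    the interrupted code held it, it would spin forever, which is the interrupt
    issue mentioned above.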

    Then there is the trade-off of using one such memory location per
    function vs. per program (the latter reducing memory usage, but
    resulting in less parallelism).

    The pseudo-registers one would want to use are not so much a problem for interrupt routines (they would just need saving and thus increase
    interrupt overhead a bit), but for hardware parallelism. Essentially all
    access to them would again have to be protected by a spinlock.

    All these problems could have relatively easily been avoided by
    providing an efficient stack-pointer-relative addressing mode. Having a
    few general-purpose or index registers would have somewhat helped as well.

    Philipp
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Tauno Voipio@tauno.voipio@notused.fi.invalid to comp.arch.embedded on Thu Nov 8 15:08:24 2018
    From Newsgroup: comp.arch.embedded

    On 8.11.18 14:53, Philipp Klaus Krause wrote:
    Am 12.10.18 um 20:39 schrieb upsidedown@downunder.com:
    On Fri, 12 Oct 2018 10:18:56 +0200, Philipp Klaus Krause <pkk@spth.de>
    wrote:

    Am 10.10.2018 um 03:05 schrieb Clifford Heath:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>


    OTP, no SPI, UART or I²C, but still...

    Clifford Heath

    They even make dual-core variants (the part where the first digit in the
    part number is '2'). It seems program counter, stack pointer, flag
    register and accumulator are per-core, while the rest, including the ALU
    is shared. In particular, the I/O registers are also shared, which means
    some multiplier registers would also be - but currently all variants
    with integrated multiplier are single-core.
    Use of the ALU is shared by the two cores, alternating by clock cycle.

    Philipp


    Interesting, that would make it easy to run a multitasking RTOS
    (foreground/background) monitor, which might justify the use of some
    reentrant library routines :-). But in reality, the available memory
    (ROM/RAM) is so small that you could easily manage this with static
    memory allocations.



    But static memory allocation would require one copy of each function per thread. And the linker would have to analyze the call graph to always
    call the correct function for each thread. Function pointers get
    complicated.

    Unfortunately, reentrancy becomes even harder with
    hardware-multithreading: To access the stack, one has to construct a
    pointer to the stack location in a memory location. That memory location
    (as any pseudo-registers) is then shared among all running instances of
    the function. So it needs to be protected (e.g. with a spinlock), making access even more inefficient. And that spinlock will cause issues with interrupts (a solution might be to heavily restrict interrupt routines, essentially allowing not much more than setting some global variables).

    Then there is the trade-off of using one such memory location per
    function vs. per program (the latter reducing memory usage, but
    resulting in less parallelism).

    The pseudo-registers one would want to use are not so much a problem for interrupt routines (they would just need saving and thus increase
    interrupt overhead a bit), but for hardware parallelism. Essentially all access to them would again have to be protected by a spinlock.

    All these problems could have relatively easily been avoided by
    providing an efficient stack-pointer-relative addressing mode. Having a
    few general-purpose or index registers would have somewhat helped as well.

    Philipp


    And you'll end up with a low-end Cortex ...
    --

    -TV

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Thu Nov 8 14:34:44 2018
    From Newsgroup: comp.arch.embedded

    Am 08.11.18 um 14:08 schrieb Tauno Voipio:


    And you'll end up with a low-end Cortex ...


    A low-end Cortex would still be far heavier than a Padauk variant with
    an sp-relative addressing mode or a few registers added.
    I think a more multithreading-friendly variant of the Padauk would even
    still be simpler than an STM8.
    But one could surely create a nice STM8-like (with a few STM8 weaknesses
    fixed) processor with hardware multithreading.

    Philipp
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From upsidedown@upsidedown@downunder.com to comp.arch.embedded on Thu Nov 8 21:52:49 2018
    From Newsgroup: comp.arch.embedded

    On Thu, 8 Nov 2018 13:53:48 +0100, Philipp Klaus Krause <pkk@spth.de>
    wrote:

    Am 12.10.18 um 20:39 schrieb upsidedown@downunder.com:
    On Fri, 12 Oct 2018 10:18:56 +0200, Philipp Klaus Krause <pkk@spth.de>
    wrote:

    Am 10.10.2018 um 03:05 schrieb Clifford Heath:
    <https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
    <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>


    OTP, no SPI, UART or I²C, but still...

    Clifford Heath

    They even make dual-core variants (the part where the first digit in the
    part number is '2'). It seems program counter, stack pointer, flag
    register and accumulator are per-core, while the rest, including the ALU
    is shared. In particular, the I/O registers are also shared, which means
    some multiplier registers would also be - but currently all variants
    with integrated multiplier are single-core.
    Use of the ALU is shared by the two cores, alternating by clock cycle.

    Philipp


    Interesting, that would make it easy to run a multitasking RTOS
    (foreground/background) monitor, which might justify the use of some
    reentrant library routines :-). But in reality, the available memory
    (ROM/RAM) is so small that you could easily manage this with static
    memory allocations.



    But static memory allocation would require one copy of each function per
    thread.

    For a foreground/background monitor, the worst case would be two
    copies of static data, if both threads use the same subroutine.

    And the linker would have to analyze the call graph to always
    call the correct function for each thread.

    Linker for such small target ?

    With such small processor, just track any dependencies manually.

    Function pointers get complicated.

    Do you really insist on using function pointers with such small
    targets?


    Unfortunately, reentrancy becomes even harder with
    hardware-multithreading:

    With two hardware threads, you would need at most two copies of static
    data.

    To access the stack, one has to construct a
    pointer to the stack location in a memory location.

    Why would you want to access the stack ?

    The stack is usable for handling return addresses, but I guess that a
    hardware thread must have its own return address stack pointer.

    In fact many minicomputers from the 1960's did not even have a stack
    at all. The calling program just stored the return address in the
    first word of the subroutine and then, at the end of the subroutine,
    performed an indirect jump through the first word of the subroutine to
    return to the calling program. Of course, this is not re-entrant and
    in those days one did not have to worry about multiple CPUs accessing
    the same routines:-).
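
    The stackless calling convention described above can be modelled in a few
    lines. This is a toy sketch on a made-up word-addressed memory; mem,
    call_sub and return_from_sub are illustrative names, not any real
    machine's instructions.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_WORDS = 64 };
static uint16_t mem[MEM_WORDS]; /* toy word-addressed memory */

/* "Call": store the return address in the subroutine's first word and
 * return the address where execution of its body starts. */
uint16_t call_sub(uint16_t sub, uint16_t return_addr)
{
    assert(sub + 1 < MEM_WORDS);
    mem[sub] = return_addr;
    return (uint16_t)(sub + 1);
}

/* "Return": indirect jump through the subroutine's first word. */
uint16_t return_from_sub(uint16_t sub)
{
    assert(sub < MEM_WORDS);
    return mem[sub];
}
```

    Calling the same subroutine a second time before the first call has
    returned overwrites the single saved return address, which is exactly why
    this scheme is not re-entrant.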

    BTW, who needs a program counter (PC), many microprograms run without
    a PC, with the next instruction address stored at the end of the long instruction word :-)


    That memory location
    (as any pseudo-registers) is then shared among all running instances of
    the function. So it needs to be protected (e.g. with a spinlock), making
    access even more inefficient. And that spinlock will cause issues with
    interrupts (a solution might be to heavily restrict interrupt routines,
    essentially allowing not much more than setting some global variables).

    Disabling all interrupts for the duration of some critical operations
    is often enough, but of course, the number of instructions executed
    while interrupts are disabled should be minimized. In MACRO-11 assembler,
    the standard practice was to start the comment field with one semicolon
    normally, with two semicolons when task switching was disabled, and with
    three semicolons when interrupts were disabled. This made it visually easy
    to detect when interrupts were disabled and not mess too much with such
    code sections.


    Then there is the trade-off of using one such memory location per
    function vs. per program (the latter reducing memory usage, but
    resulting in less parallelism).

    The pseudo-registers one would want to use are not so much a problem for
    interrupt routines (they would just need saving and thus increase
    interrupt overhead a bit), but for hardware parallelism. Essentially all
    access to them would again have to be protected by a spinlock.

    All these problems could have relatively easily been avoided by
    providing an efficient stack-pointer-relative addressing mode. Having a
    few general-purpose or index registers would have somewhat helped as well.

    Philipp

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Thu Nov 8 21:56:16 2018
    From Newsgroup: comp.arch.embedded

    Am 08.11.18 um 20:52 schrieb upsidedown@downunder.com:

    But static memory allocation would require one copy of each function per
    thread.

    For a foreground/background monitor, the worst case would be two
    copies of static data, if both threads use the same subroutine.

    And the linker would have to analyze the call graph to always
    call the correct function for each thread.

    Linker for such small target ?

    Of course. The support routines the compiler uses reside in some
    library, the linker links them in if necessary. Also, the larger
    variants are not that small, with up to 256 B of RAM and 8 KB of ROM.
    One might want to e.g. have one .c file for handling I²C, one for the
    soft UART, etc.


    With such small processor, just track any dependencies manually.

    See above.


    Function pointers get complicated.

    Do you really insist on using function pointers with such small
    targets?


    I want to have C, function pointers are part of it.


    Unfortunately, reentrancy becomes even harder with
    hardware-multithreading:

    With two hardware threads, you would need at most two copies of static
    data.

    Padauk still makes one chip with 8 hardware threads (and it looks to me
    as if there were more in the past, though they are not currently listed
    on their website, one can find them e.g. in their IDE).


    To access the stack, one has to construct a
    pointer to the stack location in a memory location.

    Why would you want to access the stack ?

    For reentrancy, so I can use one function implementation for all
    threads. It would also be useful to dynamically assign threads to
    hardware threads (so no thread is tied to specific hardware, and some OS schedules them).


    The stack is usable for handling return addresses, but I guess that a hardware thread must have its own return address stack pointer.

    Each hardware thread has its own flag register (4 bits), accumulator (8
    bits), pc (12 bits) and stack pointer (8 bits).
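
    As a rough illustration, the per-thread state listed above fits in 32 bits.
    The struct and field names below are assumptions made for this sketch, not
    Padauk's own naming.

```c
#include <assert.h>
#include <stddef.h>

/* Per-hardware-thread state with the widths given in the post. */
struct hw_thread_state {
    unsigned flags : 4;  /* flag register */
    unsigned acc   : 8;  /* accumulator */
    unsigned pc    : 12; /* program counter */
    unsigned sp    : 8;  /* stack pointer */
};

/* Advance the program counter, wrapping at the 12-bit boundary. */
void step_pc(struct hw_thread_state *t)
{
    assert(t != NULL);
    t->pc = (t->pc + 1u) & 0xFFFu;
}
```

    Everything outside this struct (RAM, I/O registers, the ALU) is shared
    between hardware threads, which is the source of the locking problems
    discussed in this thread.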


    That memory location
    (as any pseudo-registers) is then shared among all running instances of
    the function. So it needs to be protected (e.g. with a spinlock), making
    access even more inefficient. And that spinlock will cause issues with
    interrupts (a solution might be to heavily restrict interrupt routines,
    essentially allowing not much more than setting some global variables).

    Disabling all interrupts for the duration of some critical operations
    is often enough, but of course, the number of instructions executed
    during interrupt disabled should be minimized.

    Disabling interrupts any time a spinlock is held or a thread is waiting
    for one might be too much, especially if there are many threads, so the spinlock is held often.

    Philipp
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From upsidedown@upsidedown@downunder.com to comp.arch.embedded on Fri Nov 9 00:35:55 2018
    From Newsgroup: comp.arch.embedded

    On Thu, 8 Nov 2018 21:56:16 +0100, Philipp Klaus Krause <pkk@spth.de>
    wrote:

    Am 08.11.18 um 20:52 schrieb upsidedown@downunder.com:

    But static memory allocation would require one copy of each function per
    thread.

    For a foreground/background monitor, the worst case would be two
    copies of static data, if both threads use the same subroutine.

    And the linker would have to analyze the call graph to always
    call the correct function for each thread.

    Linker for such small target ?

    Of course. The support routines the compiler uses reside in some
    library, the linker links them in if necessary. Also, the larger
    variants are not that small, with up to 256 B of RAM and 8 KB of ROM.
    One might want to e.g. have one .c file for handling I²C, one for the
    soft UART, etc.

    A linker is required, if the libraries are (for copyright reasons)
    delivered as binary object code only.

    However, if the libraries are delivered as source files and the
    compiler/assembler has even a rudimentary #include mechanism, just
    include those library files you need. With an include or macro
    processor with parameter passing, just invoke the same include file or
    macro twice with different parameters for different static variable
    instances.
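
    The macro approach can be sketched in C as follows: one macro expands to a
    complete non-reentrant routine with its own static working variables, and
    is invoked once per thread. DEFINE_MUL8 and the fg/bg suffixes are
    hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Expands to one copy of a non-reentrant 8x8 -> 16 bit shift-and-add
 * multiply whose working variables are static and private to that copy. */
#define DEFINE_MUL8(suffix)                                              \
    static uint16_t acc_##suffix;                                        \
    static uint8_t count_##suffix;                                       \
    uint16_t mul8_##suffix(uint8_t a, uint8_t b)                         \
    {                                                                    \
        acc_##suffix = 0;                                                \
        for (count_##suffix = 0; count_##suffix < 8; count_##suffix++) { \
            if (b & (1u << count_##suffix))                              \
                acc_##suffix += (uint16_t)((uint16_t)a << count_##suffix); \
        }                                                                \
        return acc_##suffix;                                             \
    }

DEFINE_MUL8(fg) /* copy for the foreground thread */
DEFINE_MUL8(bg) /* copy for the background thread */

/* Usage: each thread calls only its own copy. */
void mul8_demo(void)
{
    assert(mul8_fg(12, 11) == 132);
    assert(mul8_bg(12, 11) == 132);
}
```

    The cost is one copy of the code and data per thread, and the programmer
    (or a tool) must make sure each thread only ever calls its own copy.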

    Of course, linkers are also needed, if very primitive compilation
    machines are used, such as floppy based Intellecs or Exorcisers. It
    could take a day to compile a large program all the way from sources,
    with multiple floppy changes to get the final absolute file to a
    single floppy, ready to be burnt into EPROMS for an additional hour or
    two. In such environment compiling, linking and burning only the
    source file changed would speed up program development a lot.

    When using a modern PC for compilation, there are no such issues.

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Fri Nov 9 09:00:41 2018
    From Newsgroup: comp.arch.embedded

    Am 08.11.18 um 23:35 schrieb upsidedown@downunder.com:
    And the linker would have to analyze the call graph to always
    call the correct function for each thread.

    Linker for such small target ?

    Of course. The support routines the compiler uses reside in some
    library, the linker links them in if necessary. Also, the larger
    variants are not that small, with up to 256 B of RAM and 8 KB of ROM.
    One might want to e.g. have one .c file for handling I²C, one for the
    soft UART, etc.

    A linker is required, if the libraries are (for copyright reasons)
    delivered as binary object code only.

    However, if the libraries are delivered as source files and the
    compiler/assembler has even a rudimentary #include mechanism, just
    include those library files you need. With an include or macro
    processor with parameter passing, just invoke the same include file or
    macro twice with different parameters for different static variable
    instances.

    Of course, linkers are also needed, if very primitive compilation
    machines are used, such as floppy based Intellecs or Exorcisers. It
    could take a day to compile a large program all the way from sources,
    with multiple floppy changes to get the final absolute file to a
    single floppy, ready to be burnt into EPROMS for an additional hour or
    two. In such environment compiling, linking and burning only the
    source file changed would speed up program development a lot.

    When using a modern PC for compilation, there are no such issues.


    Separate compilation and then linking is the normal thing to do, and a
    common workflow for small devices. This is e.g. how most people use
    SDCC, a mainstream free compiler targeting various 8-bit architectures.

    That doesn't mean it is the only way (and since SDCC does not have
    link-time optimization it might not be the optimal way either). But it
    is something people use and expect to work reasonably well.

    So for anyone designing an architecture it would be wise to not put too
    many obstacles into that workflow.

    Philipp
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Sun Nov 11 09:27:20 2018
    From Newsgroup: comp.arch.embedded

    Am 12.10.18 um 22:45 schrieb upsidedown@downunder.com:
    On Fri, 12 Oct 2018 22:06:02 +0200, Philipp Klaus Krause <pkk@spth.de>
    wrote:

    Am 12.10.2018 um 20:30 schrieb upsidedown@downunder.com:

    The real issue would be the small RAM size.

    Devices with this architecture go up to 256 B of RAM (but they then cost
    a few cents more).

    Philipp

    Did you find the binary encoding of the various instruction formats, i.e.
    how many bits are allocated to the operation code and how many to the
    address field ?

    My initial guess was that the instruction word is simple 8 bit opcode
    + 8 bit address, but the bit and word address limits for the smaller
    models would suggest that for some op-codes, the op-code field might
    be wider than 8 bits and address fields narrower than 8 bits (e.g. bit
    and word addressing).


    It is more complicated. Apparently the encoding changed from a 16-bit instruction word used by older types (https://www.mikrocontroller.net/topic/461002#5616813) to a 14-bit
    instruction word used by newer types (https://www.mikrocontroller.net/topic/461002#5616603).

    Padauk also dropped and added various instructions at some points (e.g.
    ldtabh, ldtabl, mul, pushw, popw).

    Philipp
    --- Synchronet 3.20a-Linux NewsLink 1.114