Forum: War Ensemble BBS

Re: $0.03 microcontroller

From Niklas Holsti@niklas.holsti@tidorum.invalid to comp.arch.embedded on Wed Oct 17 23:07:12 2018

From Newsgroup: comp.arch.embedded

On 18-10-17 20:04 , gnuarm.deletethisbit@gmail.com wrote:

On Wednesday, October 17, 2018 at 11:37:14 AM UTC-4, Niklas Holsti
wrote:

On 18-10-17 17:08 , gnuarm.deletethisbit@gmail.com wrote:

On Wednesday, October 17, 2018 at 2:35:46 AM UTC-4, Niklas
Holsti wrote:

On 18-10-17 01:46 , David Brown wrote: ...

When I am faced with someone else's code to examine or
maintain, I often run it through Doxygen with "generate
documentation for /everything/ - caller graphs, callee
graphs, cross-linked source, etc." It can make it quick to
jump around in the code. And recursive (or re-entrant,
whichever you prefer) code stands out like a sore thumb, as
long as the code is single-threaded - you get loops in the
call graphs.
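As a rough illustration of what "loops in the call graphs" means mechanically, here is a minimal sketch of a depth-first search over a caller-to-callee map. It is not how Doxygen or any WCET tool works internally; the dictionary format and the function names are made up for the example, and indirect calls through function pointers are ignored.

```python
def find_recursion(call_graph):
    """Return one call cycle as a list of names (first == last), or None.

    call_graph maps each function name to the functions it calls.
    This input format is an assumption made for illustration only.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    colour = {name: WHITE for name in call_graph}
    path = []                             # current chain of active calls

    def visit(name):
        colour[name] = GREY
        path.append(name)
        for callee in call_graph.get(name, ()):
            state = colour.get(callee, WHITE)
            if state == GREY:             # callee is already on the path: a loop
                return path[path.index(callee):] + [callee]
            if state == WHITE:
                cycle = visit(callee)
                if cycle:
                    return cycle
        path.pop()
        colour[name] = BLACK
        return None

    for name in list(call_graph):
        if colour[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


# Hypothetical program: an alarm routine that calls itself.
graph = {
    "main": ["send_alarm"],
    "send_alarm": ["store", "send_alarm"],
    "store": [],
}
cycle = find_recursion(graph)
```

A textual check like this does not depend on how a diagram happens to be laid out on screen.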

Anecdote: some years ago, when I was applying a WCET analysis
tool to someone else's program, the tool found recursion. This
surprised the people I was working with, because they had
generated call graphs for the program, analysed them visually,
and found no recursive, looping paths.

Turned out that they had asked the call-graph tool to optimize
the size of the window used to display the call-graphs. The
tool did as it was told, with the result that the line segments
on the path for the recursive call went down to the bottom edge
of the diagram, then *merged* with the lower border line of the
diagram, followed that lower border, went up one side of the
diagram -- still merged with the border line -- and then
reentered the diagram to point at the source of the recursive
call, effectively making the loop very hard to see...

(It turned out that this recursion was intentional. At this
point, the program was sending an alarm message, but the alarm
buffer was full, so the alarm routine called itself to send an
alarm about the full buffer -- and that worked, because one
buffer slot was reserved, by design, for this "buffer full"
alarm.)
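The reserved-slot idea can be sketched as follows. This is only a guess at the general shape, not the actual program, which is not shown in the thread; the class name, the message text, and the `_is_overflow` flag are all hypothetical. The point is that the recursion is bounded at depth one because the nested call is the only one allowed to use the last slot.

```python
class AlarmBuffer:
    """Fixed-size alarm buffer with one slot reserved for a 'buffer full' alarm."""

    def __init__(self, capacity):
        if capacity < 2:
            raise ValueError("need at least one normal slot plus the reserved one")
        self.capacity = capacity
        self.slots = []

    def send_alarm(self, message, _is_overflow=False):
        # Normal alarms may use all slots except the last one;
        # only the overflow alarm may take the reserved slot.
        limit = self.capacity if _is_overflow else self.capacity - 1
        if len(self.slots) < limit:
            self.slots.append(message)
            return True
        if not _is_overflow:
            # Recursive call, at most one level deep.
            self.send_alarm("alarm buffer full", _is_overflow=True)
        return False


buf = AlarmBuffer(4)
results = [buf.send_alarm(f"a{i}") for i in range(5)]
```

With a capacity of 4, the first three alarms are stored, the fourth triggers the "buffer full" alarm, and the fifth is dropped without recursing any further.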

Seems to me what actually failed was that they knew they had
recursion in the design but didn't realize the fact that they
didn't see the recursion in the call graphs was an error that
should have been caught.

The guys creating and viewing the call-graphs were not the
designers of the program, either, so they didn't know, but for sure
it was something they should have discovered and remarked on as
part of their work.

Do you know the intended purpose of the call graphs?

IIRC they were doing independent SW verification & validation of the
program (and the WCET analysis was also a part of that). But it was many
years ago, and I don't remember the details well enough to say much
more, nor can I say why the program was recursive in this way, or if it
could as easily have been made non-recursive.
--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .
--- Synchronet 3.20a-Linux NewsLink 1.114

From upsidedown@upsidedown@downunder.com to comp.arch.embedded on Sun Oct 21 16:27:31 2018

On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
<no.email@nospam.invalid> wrote:

Clifford Heath <no.spam@please.net> writes:

<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...

That is impressive! Seems to be an 8-bit RISC with no registers, just
an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
I like that it's in a 6-pin SOT23 package since there aren't many other
MCUs that small.

Slightly OT, but I have often wondered how primitive a computer
architecture can be and still do some useful work. In the
tube/discrete/SSI times, there were quite a lot of 1-bit processors.
There were at least two types. One was the PLC (Programmable Logic
Controller) type replacing relay logic; these typically had at least
AND, OR, NOT, (XOR) instructions. The other group was used as truly
serial computers, with the same instructions as the PLC type but also
at least 1-bit SUB (and ADD) instructions to implement all
mathematical functions.
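To make the "truly serial" idea concrete, here is a small sketch of multi-bit addition built only from 1-bit AND, OR and XOR steps, one bit per iteration with a single carry flag. It does not model any particular historical machine; the helper names and the LSB-first list representation are assumptions made for the example.

```python
def serial_add(a_bits, b_bits):
    """Add two equal-length bit lists (least significant bit first).

    Each loop iteration is what a 1-bit ALU would do in one step:
    a full adder made of XOR, AND and OR, plus one carry flip-flop.
    """
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        half = a ^ b
        out.append(half ^ carry)
        carry = (a & b) | (half & carry)
    return out, carry


def to_bits(value, width):
    """Integer to LSB-first bit list of the given width."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """LSB-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


total_bits, carry_out = serial_add(to_bits(13, 8), to_bits(200, 8))
total = from_bits(total_bits)
```

The word width is set only by how many steps are run, which is why such machines trade speed for hardware.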

However, in the LSI era, there don't seem to be many implementations.

One that immediately comes to mind is the MC14500B PLC building block,
from the 1970's, which requires quite a lot of support chips (code
memory, PC, I/O chips) to do some useful work.

After much searching, I found the (NI) National Instruments SBA
(Serial Boolean Analyser)
http://www.wass.net/othermanuals/GI%20SBA.pdf
from the same era, with a 1024-word (8-bit) instruction ROM and four
banks of 30 _bits_ data memory and 30 I/O pins in a 40 pin package.
For the re-entrance enthusiasts, it contains stack pointer relative
addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
Darlington buffers may be needed to drive loads typically found in a
PLC environment.

Anyone seen more modern 1 bit chips either for relay replacement or
for truly serial computers ?

From jim.brakefield@jim.brakefield@ieee.org to comp.arch.embedded on Sun Oct 21 07:47:21 2018

On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:

Anyone seen more modern 1 bit chips either for relay replacement or
for truly serial computers ?

LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose
(Logic Emulation Machine) https://opencores.org/project/lem1_9min
Jim Brakefield

From Phil Martel@pomartel@comcast.net to comp.arch.embedded on Sun Oct 21 11:03:18 2018

On 10/21/2018 09:27, upsidedown@downunder.com wrote:

Anyone seen more modern 1 bit chips either for relay replacement or
for truly serial computers ?

I have a memory of a 1-bit GPU from the late 70's, but can't pin it
down. There is an article on Wikipedia https://en.wikipedia.org/wiki/1-bit_architecture
--
Best wishes,
--Phil
pomartel At Comcast(ignore_this) dot net

From gnuarm.deletethisbit@gnuarm.deletethisbit@gmail.com to comp.arch.embedded on Sun Oct 21 08:08:02 2018

On Sunday, October 21, 2018 at 10:47:26 AM UTC-4, jim.bra...@ieee.org wrote:

Anyone seen more modern 1 bit chips either for relay replacement or
for truly serial computers ?

LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose (Logic Emulation Machine) https://opencores.org/project/lem1_9min

Jim Brakefield

It is hard for me to imagine applications where a 1 bit processor would be useful. A useful N bit processor can be built in a small number of LUTs. I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

I discussed this with someone once and he imagined apps where the processing speed requirement was quite low and you can save LUTs with a bit serial processor. I just don't know how many or why it would matter. Even the smallest FPGAs have thousands of LUTs. It's hard to picture an application where you couldn't spare a few hundred LUTs.

Rick C.

From jim.brakefield@jim.brakefield@ieee.org to comp.arch.embedded on Sun Oct 21 09:31:29 2018

On Sunday, October 21, 2018 at 10:08:06 AM UTC-5, gnuarm.del...@gmail.com wrote:

It is hard for me to imagine applications where a 1 bit processor would be useful. A useful N bit processor can be built in a small number of LUTs. I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

I discussed this with someone once and he imagined apps where the processing speed requirement was quite low and you can save LUTs with a bit serial processor. I just don't know how many or why it would matter. Even the smallest FPGAs have thousands of LUTs. It's hard to picture an application where you couldn't spare a few hundred LUTs.

Rick C.

It's hard to picture an application where you couldn't spare a few hundred LUTs.

There are advantages to using several soft core processors, each sized and customized to the need.

I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

There are many under 600 LUTs, including 32-bit. Had hoped the full featured LEM design would be under 100 LUTs.
Have done some rough research of what's available for under 600 LUTs: https://opencores.org/project/up_core_list/downloads
select: "By Performance Metric"
A big rationale for small soft core processors is that they replace LUTs (slow speed logic) with block RAM (instructions). And they are completely deterministic as opposed to doing the same by time slicing an ASIC (ARM) processor.
Jim Brakefield

From gnuarm.deletethisbit@gnuarm.deletethisbit@gmail.com to comp.arch.embedded on Sun Oct 21 10:51:29 2018

On Sunday, October 21, 2018 at 12:31:34 PM UTC-4, jim.bra...@ieee.org wrote:

It's hard to picture an application where you couldn't spare a few hundred LUTs.

There are advantages to using several soft core processors, each sized and customized to the need.

I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

There are many under 600 LUTs, including 32-bit. Had hoped the full featured LEM design would be under 100 LUTs.
Have done some rough research of what's available for under 600 LUTs: https://opencores.org/project/up_core_list/downloads
select: "By Performance Metric"

A big rationale for small soft core processors is that they replace LUTs (slow speed logic) with block RAM (instructions). And they are completely deterministic as opposed to doing the same by time slicing an ASIC (ARM) processor.

I won't argue a bit that softcores and especially *customizable* softcore CPUs aren't useful. I was talking about there being at best a very tiny region of utility for 1-bit processors.

My 600 LUT processor didn't trade off much for performance. It would run pretty fast and was pretty capable. In addition the word size was independent of the instruction set. That said, there are apps where a much less powerful processor would do fine and saving a few more LUTs would be useful.

Rick C.

From David Brown@david.brown@hesbynett.no to comp.arch.embedded on Sun Oct 21 21:43:43 2018

On 21/10/2018 17:08, gnuarm.deletethisbit@gmail.com wrote:

It is hard for me to imagine applications where a 1 bit processor
would be useful. A useful N bit processor can be built in a small
number of LUTs. I've built a 16 bit processor in just 600 LUTs and
I've seen processors in a bit less.

I discussed this with someone once and he imagined apps where the
processing speed requirement was quite low and you can save LUTs with
a bit serial processor. I just don't know how many or why it would
matter. Even the smallest FPGAs have thousands of LUTs. It's hard
to picture an application where you couldn't spare a few hundred
LUTs.

There is not much point in 1-bit processing with modern architectures
and FPGAs. But it used to be more useful, for cheap and scalable
solutions. You got systems that scaled in parallel, using bit-slice
processors to make CPUs as wide as you want. And you got serial
scaling, giving you practical numbers of bits with minimal die area
(like the COP8 microcontrollers).

From jim.brakefield@jim.brakefield@ieee.org to comp.arch.embedded on Sun Oct 21 12:44:39 2018

On Sunday, October 21, 2018 at 12:51:34 PM UTC-5, gnuarm.del...@gmail.com wrote:

I won't argue a bit that softcores and especially *customizable* softcore CPUs aren't useful. I was talking about there being at best a very tiny region of utility for 1-bit processors.

My 600 LUT processor didn't trade off much for performance. It would run pretty fast and was pretty capable. In addition the word size was independent of the instruction set. That said, there are apps where a much less powerful processor would do fine and saving a few more LUTs would be useful.

Rick C.

there being at best a very tiny region of utility for 1-bit processors

There are a small number of examples:
Bit serial processors such as DEC PDP8L, early vacuum tube & drum machines, for example Bendix G-15.
Bit serial CORDIC
Also telling is that 4-bit processors for calculators have been replaced by 8-bit processors.
My inspiration was EDIF, which was/is output from VHDL & Verilog compilers, e.g. use EDIF as a machine language. In the context of logic simulation, greater FPGA capacity is possible for slow logic.
This effort also led to a theoretical insight for brain modelling: there is greater information content in the wiring than in the logic. The human brain has 2^36+ neurons, requiring 36 bits of information for each connection and only 16 or so bits for the state/configuration of each synapse. Likewise, an FPGA requires 60+ bits to route each LUT input (assuming all LUT inputs are in use), whereas each possible input can be specified by 20 bits or less (1M LUT FPGA).
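The address-width figures in that paragraph are just the number of bits needed to name one item out of N, i.e. ceil(log2(N)). A quick check, as a sketch; the function name is made up, the neuron count is taken as the round figure 2^36, and the 60+ bit routing figure and the 16-bit synapse state are not derived here:

```python
def bits_to_address(count):
    """Smallest number of bits that can distinguish `count` items."""
    if count < 1:
        raise ValueError("count must be positive")
    return (count - 1).bit_length()   # integer form of ceil(log2(count))


neuron_bits = bits_to_address(2 ** 36)    # one connection target among 2^36 neurons
lut_bits = bits_to_address(1_000_000)     # one signal source in a 1M LUT FPGA
```

Under these assumptions the connection address (36 bits) is more than twice the size of the roughly 16 bits of per-synapse state quoted above.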
Of course optimizing simulators convert the EDIF to an existing machine language. Likewise for industrial automation (ladder logic, ...).
Jim Brakefield

From Brett@ggtgp@yahoo.com to comp.arch.embedded on Mon Oct 22 00:28:51 2018

<jim.brakefield@ieee.org> wrote:

On Sunday, October 21, 2018 at 12:51:34 PM UTC-5, gnuarm.del...@gmail.com wrote:

On Sunday, October 21, 2018 at 12:31:34 PM UTC-4, jim.bra...@ieee.org wrote: >>> On Sunday, October 21, 2018 at 10:08:06 AM UTC-5, gnuarm.del...@gmail.com wrote:

On Sunday, October 21, 2018 at 10:47:26 AM UTC-4, jim.bra...@ieee.org wrote:

On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:

On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
<no.email@nospam.invalid> wrote:

Clifford Heath <no.spam@please.net> writes:

<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I涎, but still...

That is impressive! Seems to be an 8-bit RISC with no registers, just >>>>>>> an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram, >>>>>>> enough for plenty of MCU things. Didn't check if it has an ADC or PWM. >>>>>>> I like that it's in a 6-pin SOT23 package since there aren't many other >>>>>>> MCUs that small.

Slightly OT, but I have often wonder how primitive a computer
architecture can be and still do some useful work. In the
tube/discrete/SSI times, there were quite a lot 1 bit processors.
There were at least two types, the PLC (programmable Logic Controller) >>>>>> type replacing relay logic. These had typically at least AND, OR, NOT, >>>>>> (XOR) instructions.The other group was used as truly serial computers >>>>>> with the same instructions as the PLC but also at least a 1 bit SUB >>>>>> (and ADD) instructions to implement all mathematical functions.

However, in the LSI era, there down't seem to be many implement ions. >>>>>>
One that immediately comes in mind is the MC14500B PLC building block, >>>>>> from the 1970's, which requires quite lot of support chips (code
memory, PC, /O chips) to do some useful work.

After much searching, I found the (NI) National Instruments SBA
(Serial Boolean Analyser)
http://www.wass.net/othermanuals/GI%20SBA.pdf
from the same era, with 1024 word instructions (8 bit) ROM and four >>>>>> banks of 30 _bits_ data memory and 30 I/O pins in a 40 pin package. >>>>>> For the re-entrance enthusiasts, it contains stack pointer relative >>>>>> addressing :-). THe I/O pins are 5 V TTL compatible, so a few ULN2803 >>>>>> Darlington buffers may be needed to drive loads typically found in PLC >>>>>> environment.

Anyone seen more modern 1 bit chips either for relay replacement or >>>>>> for truly serial computers ?

Anyone seen more modern 1 bit chips either for relay replacement or >>>>> ]> for truly serial computers ?

LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose >>>>> (Logic Emulation Machine) https://opencores.org/project/lem1_9min

Jim Brakefield

It is hard for me to imagine applications where a 1 bit processor
would be useful. A useful N bit processor can be built in a small
number of LUTs. I've built a 16 bit processor in just 600 LUTs and
I've seen processors in a bit less.

I discussed this with someone once and he imagined apps where the
processing speed requirement was quite low and you can save LUTs with
a bit serial processor. I just don't know how many or why it would
matter. Even the smallest FPGAs have thousands of LUTs. It's hard to
picture an application where you couldn't spare a few hundred LUTs.

Rick C.

It's hard to picture an application where you couldn't spare a few hundred LUTs.

There are advantages to using several soft core processors, each sized
and customized to the need.

I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.

There are many under 600 LUTs, including 32-bit ones. I had hoped the
full-featured LEM design would come in under 100 LUTs.
I have done some rough research on what's available for under 600 LUTs:
https://opencores.org/project/up_core_list/downloads
(select "By Performance Metric").

A big rationale for small soft core processors is that they replace LUTs
(slow speed logic) with block RAM (instructions). And they are
completely deterministic, as opposed to doing the same by time-slicing an
ASIC (ARM) processor.

I won't argue at all against softcores, and especially *customizable*
softcore CPUs, being useful. I was talking about there being at best a
very tiny region of utility for 1-bit processors.

My 600 LUT processor didn't trade off much for performance. It would
run pretty fast and was pretty capable. In addition the word size was
independent of the instruction set. That said, there are apps where a
much less powerful processor would do fine and saving a few more LUTs would be useful.

Rick C.

there being at best a very tiny region of utility for 1-bit processors

There are a small number of examples:
Bit serial processors such as DEC PDP8L, early vacuum tube & drum
machines, for example Bendix G-15.
Bit serial Cordic

Also telling, is that 4-bit processors for calculators have been replaced
by 8-bit processors.

My inspiration was EDIF, which was/is output from VHDL & Verilog
compilers. E.g. use EDIF as a machine language. In the context of logic simulation, greater FPGA capacity possible for slow logic.

This effort also led to a theoretical insight for brain modelling: there
is greater information content in the wiring than in the logic. The
human brain has 2^36+ neurons, requiring 36 bits of information for each
connection and only 16 or so bits for the state/configuration of each
synapse. Also, an FPGA requires 60+ bits to route each LUT input (assuming
all LUT inputs are in use), whereas each possible input can be specified by
20 bits or less (1M LUT FPGA).

The clock speed is quite low, 2 Hz?
So the wetware is not quite impossible to emulate with current tech.
Raising a baby and training the resultant adult to do a task is still many orders of magnitude cheaper.
;)

Of course optimizing simulators convert the EDIF to an existing machine language. Likewise for industrial automation (ladder logic, ...).

Jim Brakefield

--- Synchronet 3.20a-Linux NewsLink 1.114

From George Neuner@gneuner2@comcast.net to comp.arch.embedded on Sun Oct 21 20:59:55 2018

From Newsgroup: comp.arch.embedded

On Sun, 21 Oct 2018 16:27:31 +0300, upsidedown@downunder.com wrote:

Slightly OT, but I have often wondered how primitive a computer
architecture can be and still do some useful work. In the
tube/discrete/SSI times, there were quite a lot of 1-bit processors.
There were at least two types. The PLC (Programmable Logic Controller)
type replaced relay logic; these typically had at least AND, OR, NOT
and (XOR) instructions. The other group was used as truly serial computers
with the same instructions as the PLC, plus at least a 1-bit SUB
(and ADD) instruction to implement all mathematical functions.

However, in the LSI era, there don't seem to be many implementations.

One that immediately comes to mind is the MC14500B PLC building block
from the 1970s, which requires quite a lot of support chips (code
memory, PC, I/O chips) to do some useful work.

After much searching, I found the (NI) National Instruments SBA
(Serial Boolean Analyser)
http://www.wass.net/othermanuals/GI%20SBA.pdf
from the same era, with a 1024-word (8-bit) instruction ROM, four
banks of 30 _bits_ of data memory and 30 I/O pins in a 40-pin package.
For the re-entrance enthusiasts, it contains stack pointer relative
addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
Darlington buffers may be needed to drive loads typically found in a PLC
environment.

Anyone seen more modern 1 bit chips either for relay replacement or
for truly serial computers ?

Circa 1985-1993, Thinking Machines Connection Machine.
Circa 1987-1996, MasPar MP series.

The CM-1, 2, 2a, and 200 all were SIMD parallel using 1-bit serial
integer-only CPUs. Sizes ranged from 8K CPUs at the low end to 64K
CPUs at the high end. Each CPU had 4K *bits* of private RAM, and the
CPUs were connected in a multidimensional hypercube network.
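
[Editorial sketch, not from the original post.] To make "1-bit serial" concrete, here is a minimal C model of how a processor with a 1-bit ALU can still add multi-bit integers: it handles one bit per step and keeps a carry flag between steps. The names `full_add_bit` and `serial_add` are made up for illustration, and this is not a description of the actual CM processor logic.

```c
#include <stdint.h>

/* One step of a 1-bit full adder: takes one bit from each operand plus
   the current carry, returns the sum bit and updates the carry. */
static unsigned full_add_bit(unsigned a, unsigned b, unsigned *carry)
{
    unsigned sum = a ^ b ^ *carry;
    *carry = (a & b) | (*carry & (a ^ b));
    return sum;
}

/* Add two `width`-bit integers (width <= 32) one bit per step, least
   significant bit first. The final carry out is discarded, so the
   result wraps modulo 2^width. */
uint32_t serial_add(uint32_t x, uint32_t y, unsigned width)
{
    uint32_t result = 0;
    unsigned carry = 0;

    for (unsigned i = 0; i < width; i++) {
        unsigned s = full_add_bit((x >> i) & 1u, (y >> i) & 1u, &carry);
        result |= (uint32_t)s << i;
    }
    return result;
}
```

An N-bit add costs N steps on such a machine, which is why the approach only pays off when many of these tiny processors run in parallel.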

The CM-2, 2a, and 200 were augmented with 32-bit FPUs (1 per 32 CPUs),
and the 200 featured a higher clock speed.

The MP-1 was SIMD parallel using 4-bit serial integer-only CPUs in
sizes from 1K to 16K CPUs. It also had 32-bit FPUs, but I don't
remember how many / what ratio. I remember that it had an accumulator
register rather than going memory->memory like the CM.

[I can't find much information now about the MP-1 ... unfortunately
MasPar didn't last very long in the marketplace. The Wikipedia
article has some information about the MP-2, but the MP-2 was a later
full 32-bit design, very different from the MP-1.]

My college had both an 8K CM-2 and a 1K MP-1, accessible to those who
took various parallel processing electives. I never got to use the
MP-1 much - it was new at the end of my time and I only ever played
with it a bit. But I spent 2 semesters working with the CM-2.

Even though the CM's clock speed was only ~8MHz, the performance was
amazing IF the problem was a good fit to the architecture. E.g., at
that time, I owned a 66MHz (dx2) i486. Converted for the CM-2
architecture, O(n^4) array processing on the i486 became O(n) on the
CM-2. I had a physics simulation that took over 3 hours on my i486
that ran in ~10 minutes on the CM.

George

From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Wed Oct 24 15:57:55 2018

From Newsgroup: comp.arch.embedded

Am 14.10.2018 um 11:55 schrieb Theo:

Tim <cpldcpu+usenet@gmail.com> wrote:

This is quite curious. I wonder

- Has anyone actually received the devices they ordered? The cheaper
variants seem to be sold out.

I think they've sold out since they went viral. EEVblog did a video showing 550 in stock - that's only $16 worth of parts, not hard to imagine they've been bought up.

The other option is they're some kind of EOL part and 3c is the 'reduced to clear' price - which they have done, very successfully.

Theo

They're back in stock, though the price rose by 21% to 0.046$.
Also, LCSC seems to now be stocking more Padauk parts, including more
dual-core devices. Unfortunately, the programmer seems to be out of
stock, and they have neither the flash nor the DIP variants.

Philipp

From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Mon Nov 5 12:41:27 2018

From Newsgroup: comp.arch.embedded

Am 12.10.2018 um 09:44 schrieb David Brown:

On 12/10/18 08:50, Philipp Klaus Krause wrote:

Am 12.10.2018 um 01:08 schrieb Paul Rubin:

upsidedown@downunder.com writes:

There are a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.

Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.

1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.

It would be nice to have a C compiler, and registers help with that.

Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
Reentrant functions will be inefficient: no registers, and no sp-relative
addressing mode. One would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.

It looks like the lowest 16 memory addresses could be considered pseudo-registers - they are the ones that can be used for direct memory access rather than needing indirect access.

Considering the multi-core variants of the Padauk µCs:
Those addresses are shared across all cores. Each core only has its own
A, SP, F, PC.
How do we handle local variables?

Option 1: Make functions non-reentrant. Requires duplication of code (we
need per-thread copies of functions), and link-time analysis to ensure
that each thread only calls the function implementation meant for it.
Function pointers get complicated.

Option 2: Use an inefficient combination of thread-local storage and stack.

Since this is a small µC, we need a lot of support functions, which the compiler inserts (e.g. for multiplication); of course those are affected
by the same problems.

Philipp

From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Thu Nov 8 13:53:48 2018

From Newsgroup: comp.arch.embedded

Am 12.10.18 um 20:39 schrieb upsidedown@downunder.com:

On Fri, 12 Oct 2018 10:18:56 +0200, Philipp Klaus Krause <pkk@spth.de>
wrote:

Am 10.10.2018 um 03:05 schrieb Clifford Heath:

<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>

OTP, no SPI, UART or I²C, but still...

Clifford Heath

They even make dual-core variants (the part where the first digit in the
part number is '2'). It seems program counter, stack pointer, flag
register and accumulator are per-core, while the rest, including the ALU
is shared. In particular, the I/O registers are also shared, which means
some multiplier registers would also be - but currently all variants
with integrated multiplier are single-core.
Use of the ALU is shared by the two cores, alternating by clock cycle.

Philipp

Interesting, that would make it easy to run a multitasking RTOS (foreground/background) monitor, which might justify the use of some reentrant library routines :-). But in reality, the available memory (ROM/RAM) is so small so that you could easily manage this with static
memory allocations.

But static memory allocation would require one copy of each function per thread. And the linker would have to analyze the call graph to always
call the correct function for each thread. Function pointers get
complicated.

Unfortunately, reentrancy becomes even harder with
hardware multithreading: to access the stack, one has to construct a
pointer to the stack location in a memory location. That memory location
(like any pseudo-register) is then shared among all running instances of
the function. So it needs to be protected (e.g. with a spinlock), making
access even more inefficient. And that spinlock will cause issues with
interrupts (a solution might be to heavily restrict interrupt routines,
essentially allowing not much more than setting some global variables).
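
[Editorial sketch, not from the original post.] The cost described above can be modelled in portable C11: a single shared memory location holds the constructed stack pointer, so every stack access has to take a lock around "build pointer, then dereference". The names `scratch_lock`, `scratch_ptr` and `read_stack_slot` are hypothetical, and `<stdatomic.h>` stands in for whatever test-and-set idiom a real Padauk code generator would have to use.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Lock guarding the one memory location used to build stack pointers. */
static atomic_flag scratch_lock = ATOMIC_FLAG_INIT;

/* The shared location itself; all hardware threads would see the same one. */
static uint8_t *scratch_ptr;

/* Read one byte at `offset` from `stack_base` via the shared pointer
   location. Without the lock, another thread could overwrite
   scratch_ptr between the store and the dereference. */
uint8_t read_stack_slot(uint8_t *stack_base, uint8_t offset)
{
    while (atomic_flag_test_and_set_explicit(&scratch_lock,
                                             memory_order_acquire)) {
        /* spin until the other thread releases the lock */
    }

    scratch_ptr = stack_base + offset;   /* construct the pointer in memory */
    uint8_t value = *scratch_ptr;        /* indirect access through it */

    atomic_flag_clear_explicit(&scratch_lock, memory_order_release);
    return value;
}
```

Every access to a stack-resident local pays for a lock acquire and release on top of the indirect load. An interrupt routine that needs the same lock while its own thread holds it would spin forever, which is the interrupt problem mentioned above.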

Then there is the trade-off of using one such memory location per
function vs. per program (the latter reducing memory usage, but
resulting in less parallelism).

The pseudo-registers one would want to use are not so much a problem for interrupt routines (they would just need saving and thus increase
interrupt overhead a bit), but for hardware parallelism. Essentially all
access to them would again have to be protected by a spinlock.

All these problems could have relatively easily been avoided by
providing an efficient stack-pointer-relative addressing mode. Having a
few general-purpose or index registers would have somewhat helped as well.

Philipp

From Tauno Voipio@tauno.voipio@notused.fi.invalid to comp.arch.embedded on Thu Nov 8 15:08:24 2018

From Newsgroup: comp.arch.embedded

On 8.11.18 14:53, Philipp Klaus Krause wrote:

All these problems could have relatively easily been avoided by
providing an efficient stack-pointer-relative addressing mode. Having a
few general-purpose or index registers would have somewhat helped as well.

Philipp

And you'll end up with a low-end Cortex ...
--

-TV

From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Thu Nov 8 14:34:44 2018

From Newsgroup: comp.arch.embedded

Am 08.11.18 um 14:08 schrieb Tauno Voipio:

And you'll end up with a low-end Cortex ...

A low-end Cortex would still be far heavier than a Padauk variant with
an sp-relative addressing mode or a few registers added.
I think a more multithreading-friendly variant of the Padauk would even
still be simpler than an STM8.
But one could surely create a nice STM8-like (with a few STM8 weaknesses
fixed) processor with hardware multithreading.

Philipp

From upsidedown@upsidedown@downunder.com to comp.arch.embedded on Thu Nov 8 21:52:49 2018

From Newsgroup: comp.arch.embedded

On Thu, 8 Nov 2018 13:53:48 +0100, Philipp Klaus Krause <pkk@spth.de>
wrote:

Am 12.10.18 um 20:39 schrieb upsidedown@downunder.com:

Interesting, that would make it easy to run a multitasking RTOS
(foreground/background) monitor, which might justify the use of some
reentrant library routines :-). But in reality, the available memory
(ROM/RAM) is so small so that you could easily manage this with static
memory allocations.

But static memory allocation would require one copy of each function per
thread.

For a foreground/background monitor, the worst case would be two
copies of static data, if both threads use the same subroutine.

And the linker would have to analyze the call graph to always
call the correct function for each thread.

Linker for such small target ?

With such small processor, just track any dependencies manually.

Function pointers get complicated.

Do you really insist on using function pointers with such small
targets?

Unfortunately, reentrancy becomes even harder with
hardware-multithreading:

With two hardware threads, you would need at most two copies of static
data.

To access the stack, one has to construct a
pointer to the stack location in a memory location.

Why would you want to access the stack ?

The stack is usable for handling return addresses, but I guess that a
hardware thread must have its own return address stack pointer.

In fact, many minicomputers from the 1960s did not even have a stack
at all. The calling program just stored the return address in the
first word of the subroutine and then, at the end of the subroutine,
performed an indirect jump through that first word to
return to the calling program. Of course, this is not re-entrant, and
in those days one did not have to worry about multiple CPUs accessing
the same routines :-).
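
[Editorial sketch, not from the original post.] A toy C model of that stackless linkage shows why it is not re-entrant: the return address lives in the subroutine's own first word, so a second call before the first return overwrites it. The memory array, the addresses and the helper names are invented for illustration and do not model any particular machine.

```c
#include <stdint.h>

enum { MEM_WORDS = 64 };
static uint16_t mem[MEM_WORDS];   /* toy word-addressed memory */

/* "Call": store the return address in the subroutine's first word.
   Returns the address where execution would continue (entry + 1). */
static uint16_t call_sub(uint16_t entry, uint16_t return_addr)
{
    mem[entry] = return_addr;
    return (uint16_t)(entry + 1);
}

/* "Return": indirect jump through the subroutine's first word. */
static uint16_t return_from(uint16_t entry)
{
    return mem[entry];
}

/* Returns 1 if re-entering the subroutine loses the outer return address. */
int reentry_clobbers_return(void)
{
    const uint16_t sub = 40;

    call_sub(sub, 10);                    /* first caller resumes at 10 */
    call_sub(sub, 20);                    /* re-entered before returning */

    uint16_t inner = return_from(sub);    /* 20, as expected */
    uint16_t outer = return_from(sub);    /* still 20: address 10 is gone */

    return inner == 20 && outer == 20;
}
```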

BTW, who needs a program counter (PC)? Many microprograms run without
a PC, with the next instruction address stored at the end of the long
instruction word :-)

That memory location
(as any pseudo-registers) is then shared among all running instances of
the function. So it needs to be protected (e.g. with a spinlock), making
access even more inefficient. And that spinlock will cause issues with
interrupts (a solution might be to heavily restrict interrupt routines,
essentially allowing not much more than setting some global variables).

Disabling all interrupts for the duration of some critical operations
is often enough, but of course, the number of instructions executed
while interrupts are disabled should be minimized. In MACRO-11 assembler,
the standard practice was to start the comment field with one semicolon
normally, with two semicolons when task switching was disabled, and with
three semicolons when interrupts were disabled. That made it visually easy to
detect when interrupts were disabled and not mess too much with such
code sections.

Then there is the trade-off of using one such memory location per
function vs. per program (the latter reducing memory usage, but
resulting in less parallelism).

The pseudo-registers one would want to use are not so much a problem for
interrupt routines (they would just need saving and thus increase
interrupt overhead a bit), but for hardware parallelism. Essentially all
access to them would again have to be protected by a spinlock.

All these problems could have relatively easily been avoided by
providing an efficient stack-pointer-relative addressing mode. Having a
few general-purpose or index registers would have somewhat helped as well.

Philipp

From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Thu Nov 8 21:56:16 2018

From Newsgroup: comp.arch.embedded

Am 08.11.18 um 20:52 schrieb upsidedown@downunder.com:

But static memory allocation would require one copy of each function per
thread.

For a foreground/background monitor, the worst case would be two
copies of static data, if both threads use the same subroutine.

And the linker would have to analyze the call graph to always
call the correct function for each thread.

Linker for such small target ?

Of course. The support routines the compiler uses reside in some
library, the linker links them in if necessary. Also, the larger
variants are not that small, with up to 256 B of RAM and 8 KB of ROM.
One might want to e.g. have one .c file for handling I²C, one for the
soft UART, etc.

With such small processor, just track any dependencies manually.

See above.

Function pointers get complicated.

Do you really insist on using function pointers with such small
targets?

I want to have C, function pointers are part of it.

Unfortunately, reentrancy becomes even harder with
hardware-multithreading:

With two hardware threads, you would need at most two copies of static
data.

Padauk still makes one chip with 8 hardware threads (and it looks to me
as if there were more in the past, though they are not currently listed
on their website, one can find them e.g. in their IDE).

To access the stack, one has to construct a
pointer to the stack location in a memory location.

Why would you want to access the stack ?

For reentrancy, so I can use one function implementation for all
threads. It would also be useful to dynamically assign threads to
hardware threads (so no thread is tied to specific hardware, and some OS
schedules them).

The stack is usable for handling return addresses, but I guess that a hardware thread must have its own return address stack pointer.

Each hardware thread has its own flag register (4 bits), accumulator (8
bits), pc (12 bits) and stack pointer (8 bits).

That memory location
(as any pseudo-registers) is then shared among all running instances of
the function. So it needs to be protected (e.g. with a spinlock), making
access even more inefficient. And that spinlock will cause issues with
interrupts (a solution might be to heavily restrict interrupt routines,
essentially allowing not much more than setting some global variables).

Disabling all interrupts for the duration of some critical operations
is often enough, but of course, the number of instructions executed
during interrupt disabled should be minimized.

Disabling interrupts any time a spinlock is held or a thread is waiting
for one might be too much, especially if there are many threads, so the
spinlock is held often.

Philipp

From upsidedown@upsidedown@downunder.com to comp.arch.embedded on Fri Nov 9 00:35:55 2018

From Newsgroup: comp.arch.embedded

On Thu, 8 Nov 2018 21:56:16 +0100, Philipp Klaus Krause <pkk@spth.de>
wrote:

Am 08.11.18 um 20:52 schrieb upsidedown@downunder.com:

But static memory allocation would require one copy of each function per
thread.

For a foreground/background monitor, the worst case would be two
copies of static data, if both threads use the same subroutine.

And the linker would have to analyze the call graph to always
call the correct function for each thread.

Linker for such small target ?

Of course. The support routines the compiler uses reside in some
library, the linker links them in if necessary. Also, the larger
variants are not that small, with up to 256 B of RAM and 8 KB of ROM.
One might want to e.g. have one .c file for handling I²C, one for the
soft UART, etc.

A linker is required, if the libraries are (for copyright reasons)
delivered as binary object code only.

However, if the libraries are delivered as source files and the
compiler/assembler has even a rudimentary #include mechanism, just
include those library files you need. With an include or macro
processor with parameter passing, just invoke the same include file or
macro twice with different parameters for different static variable
instances.
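
[Editorial sketch, not from the original post.] In C, the "invoke the macro twice" idea could look roughly like this: one macro expands to a complete non-reentrant routine with its own statically allocated working storage, and it is instantiated once per thread. `DEFINE_MUL8` and the `fg`/`bg` suffixes are hypothetical names, and the shift-and-add multiply merely stands in for a typical compiler support routine.

```c
#include <stdint.h>

/* Expands to one 8x8 -> 16 bit multiply routine whose working
   variables are static, i.e. at fixed addresses instead of on a stack. */
#define DEFINE_MUL8(suffix)                                          \
    static uint8_t  mul8_a_##suffix;                                 \
    static uint8_t  mul8_b_##suffix;                                 \
    static uint8_t  mul8_i_##suffix;                                 \
    static uint16_t mul8_acc_##suffix;                               \
    uint16_t mul8_##suffix(uint8_t a, uint8_t b)                     \
    {                                                                \
        mul8_a_##suffix = a;                                         \
        mul8_b_##suffix = b;                                         \
        mul8_acc_##suffix = 0;                                       \
        for (mul8_i_##suffix = 0; mul8_i_##suffix < 8;               \
             mul8_i_##suffix++) {                                    \
            if (mul8_b_##suffix & 1u)                                \
                mul8_acc_##suffix += (uint16_t)(                     \
                    (uint16_t)mul8_a_##suffix << mul8_i_##suffix);   \
            mul8_b_##suffix >>= 1;                                   \
        }                                                            \
        return mul8_acc_##suffix;                                    \
    }

DEFINE_MUL8(fg)   /* copy used only by the foreground thread */
DEFINE_MUL8(bg)   /* copy used only by the background thread */
```

The foreground code calls `mul8_fg()` and the background code calls `mul8_bg()`, so the two never share scratch variables. The price is one copy of the code per thread, and keeping each caller on its own copy is a manual discipline here rather than something a linker checks.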

Of course, linkers are also needed, if very primitive compilation
machines are used, such as floppy based Intellecs or Exorcisers. It
could take a day to compile a large program all the way from sources,
with multiple floppy changes to get the final absolute file to a
single floppy, ready to be burnt into EPROMS for an additional hour or
two. In such an environment, compiling, linking and burning only the
source file changed would speed up program development a lot.

When using a modern PC for compilation, there are no such issues.

From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Fri Nov 9 09:00:41 2018

From Newsgroup: comp.arch.embedded

Am 08.11.18 um 23:35 schrieb upsidedown@downunder.com:

A linker is required, if the libraries are (for copyright reasons)
delivered as binary object code only.

However, if the libraries are delivered as source files and the
compiler/assembler has even a rudimentary #include mechanism, just
include those library files you need. With an include or macro
processor with parameter passing, just invoke the same include file or
macro twice with different parameters for different static variable
instances.

Of course, linkers are also needed, if very primitive compilation
machines are used, such as floppy based Intellecs or Exorcisers. It
could take a day to compile a large program all the way from sources,
with multiple floppy changes to get the final absolute file to a
single floppy, ready to be burnt into EPROMS for an additional hour or
two. In such an environment, compiling, linking and burning only the
source file changed would speed up program development a lot.

When using a modern PC for compilation, there are no such issues.

Separate compilation and then linking is the normal thing to do, and a
common workflow for small devices. This is e.g. how most people use
SDCC, a mainstream free compiler targeting various 8-bit architectures.

That doesn't mean it is the only way (and since SDCC does not have
link-time optimization it might not be the optimal way either). But it
is something people use and expect to work reasonably well.

So for anyone designing an architecture it would be wise to not put too
many obstacles into that workflow.

Philipp

From Philipp Klaus Krause@pkk@spth.de to comp.arch.embedded on Sun Nov 11 09:27:20 2018

From Newsgroup: comp.arch.embedded

Am 12.10.18 um 22:45 schrieb upsidedown@downunder.com:

On Fri, 12 Oct 2018 22:06:02 +0200, Philipp Klaus Krause <pkk@spth.de>
wrote:

Am 12.10.2018 um 20:30 schrieb upsidedown@downunder.com:

The real issue would be the small RAM size.

Devices with this architecture go up to 256 B of RAM (but they then cost
a few cents more).

Philipp

Did you find the binary encoding of various instruction formats, i.e.
how many bits allocated to the operation code and how many for the
address field ?

My initial guess was that the instruction word is a simple 8-bit opcode
+ 8 bit address, but the bit and word address limits for the smaller
models would suggest that for some op-codes, the op-code field might
be wider than 8 bits and address fields narrower than 8 bits (e.g. bit
and word addressing).

It is more complicated. Apparently the encoding changed from a 16-bit instruction word used by older types (https://www.mikrocontroller.net/topic/461002#5616813) to a 14-bit
instruction word used by newer types (https://www.mikrocontroller.net/topic/461002#5616603).

Padauk also dropped and added various instructions at some points (e.g.
ldtabh, ldtabl, mul, pushw, popw).

Philipp