Forum: War Ensemble BBS

Random Idea: Densely Packed Ternary

From BGB@cr88192@gmail.com to comp.arch on Thu Mar 19 01:59:30 2026

From Newsgroup: comp.arch

So, idea:
Schemes for densely packing ternary values in a way that allows for
cheap hardware decoding (assuming implementation on binary hardware).

Likely mapping here would map 0/1/-1 as 0/1/2

Encoding scheme here is partly inspired by Chen-Ho, but this is not the
only possibility here.

The idea here is to try to "factor" things such that decoding can look
at fewer than the full 7/8 bits to decode each trit.

So, first idea, 4 trits in 7 bits:
000abcd //abcd: all 0 or 1
001xxxx //abcd=2 (well, put this somewhere in here)
01yyxxx //-
1000bcd //bcd=0|1, a=2
1001acd //acd=0|1, b=2
1010abd //abd=0|1, c=2
1011abc //abc=0|1, d=2

11000ab //cd=2
11001ac //bd=2
11010bc //ad=2
11011bd //ac=2
11100cd //ab=2
11101ad //bc=2
11110xx //[a/b/c/d]=0, default=2
11111xx //[a/b/c/d]=1, default=2

There is enough encoding space in theory, but I am having a harder time
coming up with a good packing scheme for the 5 trit case (space is
tighter, and the 5 factor is also being annoying).

Though, one possible case could be:
Most cases:
Assume e=0|1, so just stick on another 'e' bit.
So, leverage the 7-bit scheme as the starter.
Then shove the 'e' trit into the remaining spaces.

Some other cases:
000abcde //abcde: all 0 or 1

00100xx0 //abcde=2
00101xx0 //-

00110xx0 //11110xx, e=2
00111xx0 //11111xx, e=2

001abcd1 //abcd=0|1, e=2

0100xxx0 //1000xxx, e=2
0100xxx1 //1001xxx, e=2
0101xxx0 //1010xxx, e=2
0101xxx1 //1011xxx, e=2

01100xx0 //11100xx, e=2
01100xx1 //11101xx, e=2

01110xx0 //11000xx, e=2
01110xx1 //11010xx, e=2
01111xx0 //11001xx, e=2
01111xx1 //11011xx, e=2

1xxxxxxx //punch through to 7b case, e=0|1

But, unclear if this is the best strategy...

Could partly allow the 4 and 5 trit cases to share some of the same
lookup tables, but may not benefit much in the face of "combinatorial
logic evilness" (if you MUX on the input side, then synthesis tends to
clone all the logic).

One option could be to put the 'e' bit on the MSB side, but this breaks
the pattern. One other option could be to always decode the 7 bit case
as-if it were the 8 bit case, but then ignore the 'e' trit in cases
where only 4 trits are expected.

Another option being to decode the 7 bit case the same as the 8-bit
case, just the LSB is ignored (as is the 'e' trit of the output)

Though, with any sort of DPT scheme, a lot would depend on how it is
intended to be unpacked and used.

Main use case I can think of ATM are mostly things like multiplying SIMD vectors by ternary values (say, to mask or sign-invert elements).

Say, for example:
dv = sv * (__vec4f_t) { 1.0, 1.0, -1.0, 0.0 };

But, maybe other use-cases could exist.

But, in normal code, even SIMD code, this sort of thing may not be
common enough to justify a special instruction for it (vs, say, loading
the vector into a register and doing a normal SIMD multiply).

Granted, even in the absence of a solid use-case, or explicit CPU
support, could still use normal RAM-based lookup tables.

Though, the relative merit of a more complex encoding scheme is
diminished in the case of normal RAM based lookup tables (it doesn't
hurt per-se, but does require more complex logic to initialize the
lookup tables, so this could arguably be a drawback).

But, any thoughts?...

--- Synchronet 3.21d-Linux NewsLink 1.2

From Michael S@already5chosen@yahoo.com to comp.arch on Thu Mar 19 11:51:50 2026

From Newsgroup: comp.arch

On Thu, 19 Mar 2026 01:59:30 -0500
BGB <cr88192@gmail.com> wrote:

So, idea:
Schemes for densely packing ternary values in a way that allows for
cheap hardware decoding (assuming implementation on binary hardware).

That's not very interesting.
It's obvious that 5t->8b is best and there are multiple ways of doing it cheaply.
There is no denser T->B packing until 17t->27b, which is obviously
impractical.

An opposite is a little more interesting, i.e. packing binary data to
ternary storage.
There could be two separate challenges: do it in the way that is easy
to implement on binary logic or do it in a way that is easy to
implement in ternary logic. Not that I know which basic ops are easy in
ternary logic.
The most likely candidate is 11b->7t, but figuring out details can be interesting for somebody with a lot of spare time at hand.

--- Synchronet 3.21d-Linux NewsLink 1.2

From BGB@cr88192@gmail.com to comp.arch on Thu Mar 19 05:39:57 2026

From Newsgroup: comp.arch

On 3/19/2026 4:51 AM, Michael S wrote:

On Thu, 19 Mar 2026 01:59:30 -0500
BGB <cr88192@gmail.com> wrote:

So, idea:
Schemes for densely packing ternary values in a way that allows for
cheap hardware decoding (assuming implementation on binary hardware).

That's not very interesting.
It's obvious that 5t->8b is best and there are multiple ways of doing it cheaply.
There is no denser T->B packing until 17t->27b, which is obviously impractical.

Granted...

An opposite is a little more interesting, i.e. packing binary data to
ternary storage.
There could be two separate challenges: do it in the way that is easy
to implement on binary logic or do it in a way that is easy to
implement in ternary logic. Not that I know which basic ops are easy in ternary logic.
The most likely candidate is 11b->7t, but figuring out details can be interesting for somebody with a lot of spare time at hand.

Possible, though I realized after doing a mock-up that my original idea
failed at one of its primary goals:
To be significantly cheaper to decode in hardware than the naive
polynomial...

Or, naive: enc=va+vb*3+vc*9+vd*27

It failed in that it would still require looking at all 7 bits of the
4t/7b case to determine the value of each trit.

I was then fiddling with other schemes, to hopefully find one that
allows for cheaper decoding (namely, requiring at most 5 or 6 bits to be examined to determine the value of any given trit).

But, at the moment, I haven't figured out a good encoding scheme that
could achieve this goal...

--- Synchronet 3.21d-Linux NewsLink 1.2

From BGB@cr88192@gmail.com to comp.arch on Thu Mar 19 12:46:03 2026

From Newsgroup: comp.arch

On 3/19/2026 5:39 AM, BGB wrote:

On 3/19/2026 4:51 AM, Michael S wrote:

On Thu, 19 Mar 2026 01:59:30 -0500
BGB <cr88192@gmail.com> wrote:

So, idea:
Schemes for densely packing ternary values in a way that allows for
cheap hardware decoding (assuming implementation on binary hardware).

That's not very interesting.
It's obvious that 5t->8b is best and there are multiple ways of doing it
cheaply.
There is no denser T->B packing until 17t->27b, which is obviously
impractical.

Granted...

An opposite is a little more interesting, i.e. packing binary data to
ternary storage.
There could be two separate challenges: do it in the way that is easy
to implement on binary logic or do it in a way that is easy to
implement in ternary logic. Not that I know which basic ops are easy in
ternary logic.
The most likely candidate is 11b->7t, but figuring out details can be
interesting for somebody with a lot of spare time at hand.

Possible, though I realized after doing a mock-up that my original idea failed at one of its primary goals:
To be significantly cheaper to decode in hardware than the naive polynomial...

Or, naive: enc=va+vb*3+vc*9+vd*27

It failed in that it would still require looking at all 7 bits of the
4t/7b case to determine the value of each trit.

I was then fiddling with other schemes, to hopefully find one that
allows for cheaper decoding (namely, requiring at most 5 or 6 bits to be examined to determine the value of any given trit).

But, at the moment, I haven't figured out a good encoding scheme that
could achieve this goal...

Next scheme:
00ab0cd //abcd: all 0 or 1
00ab10c //abc2
00ab11d //ab2d

010a0cd //a2cd
011b0cd //2bcd
010a10c //a2c2
011b10c //2bc2

010a11d //a22d
011b11d //2b2d

10xx0xx //-

10ab10x //ab22
110x0cd //22cd

100a11x //a222
101b11x //2b22
111x00c //22c2
111x01d //222d
111x11x //2222

The 7b/4t trace able to decode each trit using 4b of lookup.

The 5t case would be encoded in a similar way to before:
All the defined 4t cases punch through to having an extra 'e' bit (0/1);
The undefined cases in this scheme punch through to the "e=2" cases.

00ab0cde //abcde
10ab0cd0 //abcd2

10ab100e //ab22e
11000cde //22cde
10ab1010 //ab222
11010cd0 //22cd2

100a110e //a222e
101b110e //2b22e
100a1110 //a2222
101b1110 //2b222

111000ce //22c2e
111001de //222de
111100c0 //22c22
111101d0 //222d2

111011xe //2222e
111111x0 //22222

Which still needs 4b lookup to decode a/b/c/d, but 6b for 'e'.

...

--- Synchronet 3.21d-Linux NewsLink 1.2

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Mar 19 22:30:10 2026

From Newsgroup: comp.arch

BGB <cr88192@gmail.com> posted:

So, idea:
Schemes for densely packing ternary values in a way that allows for
cheap hardware decoding (assuming implementation on binary hardware).

Likely mapping here would map 0/1/-1 as 0/1/2

Consider binary NAND gate, then
consider binary NOR gate with inverted inputs !

All you are doing is assigning a value to a voltage.

-----------------------------
But, any thoughts?...

never going to fly so I waste no time on assembling a failure.
smatter people than me discarded it long ago in the age of tubes
where it was "not that hard" to build trit-gates and storage.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Robert Finch@robfi680@gmail.com to comp.arch on Thu Mar 19 19:17:08 2026

From Newsgroup: comp.arch

On 2026-03-19 6:30 p.m., MitchAlsup wrote:

BGB <cr88192@gmail.com> posted:

So, idea:
Schemes for densely packing ternary values in a way that allows for
cheap hardware decoding (assuming implementation on binary hardware).

Likely mapping here would map 0/1/-1 as 0/1/2

Consider binary NAND gate, then
consider binary NOR gate with inverted inputs !

All you are doing is assigning a value to a voltage.

-----------------------------
But, any thoughts?...

never going to fly so I waste no time on assembling a failure.
smatter people than me discarded it long ago in the age of tubes
where it was "not that hard" to build trit-gates and storage.

Now there are quantum computers and I think they are four-state.
So maybe they could handle ternary.
--- Synchronet 3.21d-Linux NewsLink 1.2

From BGB@cr88192@gmail.com to comp.arch on Fri Mar 20 00:54:39 2026

From Newsgroup: comp.arch

On 3/19/2026 6:17 PM, Robert Finch wrote:

On 2026-03-19 6:30 p.m., MitchAlsup wrote:

BGB <cr88192@gmail.com> posted:

So, idea:
Schemes for densely packing ternary values in a way that allows for
cheap hardware decoding (assuming implementation on binary hardware).

Likely mapping here would map 0/1/-1 as 0/1/2

Consider binary NAND gate, then
consider binary NOR gate with inverted inputs !

All you are doing is assigning a value to a voltage.

-----------------------------
But, any thoughts?...

never going to fly so I waste no time on assembling a failure.
smatter people than me discarded it long ago in the age of tubes
where it was "not that hard" to build trit-gates and storage.

Now there are quantum computers and I think they are four-state.
So maybe they could handle ternary.

FWIW:
Storing 4 states as bits is a lot easier than storing 3 states as bits
(since each 4-state can be stored as 2 bits).

Granted, one can store trits as 2-bits each, but this costs more bits.

But, yeah, my post from earlier was missing some states, and fitting
these last states in (for the 5t/8b case) in a way that doesn't wreck lookup-size is proving a little more difficult.

But, yeah, seems if I can't fully crack this problem, this leaves the
use of a lookup table as the more practical option.

It works OK for the 4-trits in 7-bits case, but thus far my efforts are
still failing for the 5 trit case...

As noted, 4-trit pattern:
0 0ab 0cd //abcd: all 0 or 1
0 0ab 10c //abc2
0 0ab 11d //ab2d

0 10a 0cd //a2cd
0 11b 0cd //2bcd

0 10a 10c //a2c2
0 11b 10c //2bc2

0 10a 11d //a22d
0 11b 11d //2b2d

1 0xx 0xx //-

1 0ab 10x //ab22
1 00a 110 //a222
1 01b 110 //2b22

1 10x 0cd //22cd
1 110 00c //22c2
1 110 01d //222d

1 110 110 //2222

Can resolve any trit with a 4 bit lookup...

Stuff falls on its face for the 5-trit / 8-bit case:

0 0ab 0cd e //abcde: all 0 or 1

0 0ab 10c e //abc2e
0 0ab 11d e //ab2de

0 10a 0cd e //a2cde
0 11b 0cd e //2bcde

0 10a 10c e //a2c2e
0 11b 10c e //2bc2e
0 10a 11d e //a22de
0 11b 11d e //2b2de

1 0ab 0cd 0 //abcd2
1 00a 00c 1 //a2c22
1 01b 00c 1 //2bc22
1 00a 01d 1 //a22d2
1 01b 01d 1 //2b2d2

1 0ab 100 e //ab22e
1 00a 110 e //a222e
1 01b 110 e //2b22e

1 0ab 101 0 //ab222
1 00e 111 0 //2222e
1 010 111 0 //22222
1 0ab 1c1 1 //abc22

1 100 0cd e //22cde
1 110 00c e //22c2e
1 110 01d e //222de

1 101 0cd 0 //22cd2
1 111 00c 0 //22c22
1 111 01d 0 //222d2
1 1a1 0cd 1 //a2cd2

1 10a 10x 0 //a2222
1 10b 11x 0 //2b222
1 110 10c 0 //22c22
1 110 11d 0 //222d2

1 1ab 10d 1 //ab2d2
1 1db 11c 1 //2bcd2

Where the logic still needs to look at all 8-bits to decode each trit,
even if it can be done with a decision tree...

But, the decision tree can't ignore any of the bits per trit.

The goal would have been 6 bits, but I seem to have failed after trying
to get all the cases fit in.

Issue I think is that with 243 states in 8 bits, there isn't any real
way to organize things to avoid needing all of the bits to fully
disambiguate some cases...

Whereas, it is a little easier to fit 81 states in 7 bits.

Having beaten at this basically all day, I think I may be defeated on
this for the time being.

Well, either this, or I am just not being smart enough for this level of mental puzzle.

Annoyingly, can't use a brute force as, even though the problem looks
simple enough, a brute force search through this combination space would likely take a very long time.

I guess, it could be like the branch-predictor FSM where one possible
option could be to throw a genetic algorithm at the problem. But, this
may be a problem for another day.

...

--- Synchronet 3.21d-Linux NewsLink 1.2

From Niklas Holsti@niklas.holsti@tidorum.invalid to comp.arch on Fri Mar 20 11:53:52 2026

From Newsgroup: comp.arch

On 2026-03-20 1:17, Robert Finch wrote:

On 2026-03-19 6:30 p.m., MitchAlsup wrote:

BGB <cr88192@gmail.com> posted:

So, idea:
Schemes for densely packing ternary values in a way that allows for
cheap hardware decoding (assuming implementation on binary hardware).

Likely mapping here would map 0/1/-1 as 0/1/2

Consider binary NAND gate, then
consider binary NOR gate with inverted inputs !

All you are doing is assigning a value to a voltage.

-----------------------------
But, any thoughts?...

never going to fly so I waste no time on assembling a failure.
smatter people than me discarded it long ago in the age of tubes
where it was "not that hard" to build trit-gates and storage.

Now there are quantum computers and I think they are four-state.

I don't see in what sense qubits are "four-state". AIUI the state of a
qubit is a superposition of "0" and "1" where the "proportions" of the
two pure values can depend (via entangling) in a complex way on the
states of other qubits. At the end of a quantum computation the
"read-out" value of a qubit is 0 or 1, so binary.

--- Synchronet 3.21d-Linux NewsLink 1.2

From BGB@cr88192@gmail.com to comp.arch on Fri Mar 20 05:18:19 2026

From Newsgroup: comp.arch

On 3/20/2026 4:53 AM, Niklas Holsti wrote:

On 2026-03-20 1:17, Robert Finch wrote:

On 2026-03-19 6:30 p.m., MitchAlsup wrote:

BGB <cr88192@gmail.com> posted:

So, idea:
Schemes for densely packing ternary values in a way that allows for
cheap hardware decoding (assuming implementation on binary hardware).

Likely mapping here would map 0/1/-1 as 0/1/2

Consider binary NAND gate, then
consider binary NOR gate with inverted inputs !

All you are doing is assigning a value to a voltage.

-----------------------------
But, any thoughts?...

never going to fly so I waste no time on assembling a failure.
smatter people than me discarded it long ago in the age of tubes
where it was "not that hard" to build trit-gates and storage.

Now there are quantum computers and I think they are four-state.

I don't see in what sense qubits are "four-state". AIUI the state of a
qubit is a superposition of "0" and "1" where the "proportions" of the
two pure values can depend (via entangling) in a complex way on the
states of other qubits. At the end of a quantum computation the "read-
out" value of a qubit is 0 or 1, so binary.

Yeah.

I had missed that this was in reference to qubits, but yeah.

Like, qubits would be represented as some sort of entangled state, and
then the result is measured by "collapsing" the state and seeing where
it lands.

Meanwhile, ironically, my idea here ATM had noting to do with "actual"
ternary (in the electronics sense), rather debating if it could make
sense in some niche use-cases in an otherwise binary context.

But, at present, it isn't looking all that promising.
Meh results;
Not much in terms of a strong use-case (where one would want such an encoding).

Only the 4t/7b case is looking promising ATM, and even then, still not
an obvious use-case. Using it so negate or zero SIMD elements is
possible, but not a strong use-case.

Even then, in the hypothetical SIMD use-case, would still need to
compete against ops that use a 2-bit scheme, IIRC:
00: Zero
01: Keep
10: Negate
11: Bitwise NOT (more typically for integer SIMD).

And, probably don't need to add more ops that don't have strong
use-cases ATM. Even if ironically, my SIMD stuff is still pretty minimal
if compared with RV-P or RV-V (both of which went a little excessive on "features").

...

--- Synchronet 3.21d-Linux NewsLink 1.2

Who's Online
Recent Visitors
- Toadster
  Sun Mar 22 09:52:29 2026
  from Ashland, Wi via Telnet
- Microbot
  Sun Mar 22 08:05:36 2026
  from Moore, Ok via Telnet
- Pixelrez
  Sat Mar 21 16:03:42 2026
  from Lenexa,ks via Telnet
- Pixelrez
  Sat Mar 21 15:57:39 2026
  from Lenexa,ks via Telnet

System Info

Sysop:	DaiTengu
Location:	Appleton, WI
Users:	1,104
Nodes:	10 (0 / 10)
Uptime:	492389:01:03
Calls:	14,151
Calls today:	2
Files:	186,281
D/L today:	2,617 files (990M bytes)
Messages:	2,501,190
Posted today:	1

Random Idea: Densely Packed Ternary

Who's Online

Recent Visitors

System Info