Hi,
Ok, I was looking at this learning challenge:
producing a vector (y1,y2,y3,y4) from a vector
(x1,x2,x3,x4). Can System R do it via least
squares?
| 0 0 0 1 | | x1 | | x4 |
| 0 0 1 0 | | x2 | = | x3 |
| 0 1 0 0 | | x3 | | x2 |
| 1 0 0 0 | | x4 | | x1 |
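As a sanity check, here is a minimal sketch
(my own, in Python rather than System R; the
setup is an assumption, not from the post):
fit y = A x by ordinary least squares from
random (x, reversed-x) pairs. Since the
relation is linear and noise-free, the fitted
A comes out as exactly the anti-diagonal
permutation above.

import numpy as np

# Hypothetical sketch: fit the linear map y = A x by
# ordinary least squares from random (x, reversed x)
# pairs; A should recover the 4x4 anti-diagonal
# permutation matrix shown above.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))     # 100 random input vectors
Y = X[:, ::-1]                    # targets: reversed inputs
A, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.round(A.T, 6))           # ~ the reversal matrix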
How it started:
"multiplicative RNNs arises naturally from a
proof-theoretic interpretation of next-token
prediction as nested intuitionistic implication"
Paul Tarau - 2026
https://arxiv.org/abs/2601.19915
How it's going:
"Dave uses a PDP-11 to train a real Neural
Network complete with Transformers and
Attention so you can see them at their most basic."
Mr. Taskmanager - 2026
https://www.youtube.com/watch?v=OUE3FSIk46g
We see Doctor Frankenstein in action from
the Bronze Age of Computing, producing
a Homunculus, the progenitor of today's
Bulgakov Shuriks in the Hyperscale Age!
Bye
P.S.: My impression is that neither cut to the
core: this incredible transformer most likely
produced this deterministic attention:
| -1 | * | k | + | 5 | = | k' |
Or, differently expressed, y_k = x_{5-k}.
How did the transformer do it? It produced
a neural network with 1216 parameters, but
didn't use embeddings or polar encoding of
positions. But if we strip the noise from the
position encoding and denoise it, the denoising
being done via softmax, we somehow must get
the above, right? I still need to verify my
claim! BTW: the PDP-11 assembly from 1979 uses
a wider example, not with n=4 but with n=8.
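To make the softmax claim concrete, a minimal
sketch (my own assumption about the mechanism,
untested against the video's actual network):
score position pairs (i, j) by how well they
satisfy i + j = 5, and a sharp enough softmax
over those scores becomes a near-one-hot
reversal implementing y_k = x_{5-k}.

import numpy as np

# Hypothetical sketch: deterministic "attention"
# implementing y_k = x_{5-k} for n = 4 (1-indexed),
# i.e. k' = (-1) * k + 5.
n = 4
i = np.arange(1, n + 1)
beta = 50.0                       # sharpness of the softmax
logits = -beta * (i[:, None] + i[None, :] - (n + 1)) ** 2
w = np.exp(logits)
w /= w.sum(axis=1, keepdims=True) # row softmax -> near one-hot
x = np.array([10.0, 20.0, 30.0, 40.0])
print(w @ x)                      # ~ [40, 30, 20, 10]

The same construction would cover the n = 8
case, with logits peaking where i + j = 9.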
Hi,
You just escaped AI doomsday. Humanity has
reset the entire internet and all computers
with an electromagnetic pulse, as a last resort
to prevent AGI from developing. You are stuck
in the Güttinger Wald and have hunted down a
deer with your bare hands; the deer was still
confused and tame because tourists had been
feeding it. Now you have no knife. What do
you do?
Chimpanzees Have Entered The Stone Age
https://www.youtube.com/watch?v=wPXX2I_uYjc
So we are just apes with internet.
Bye
Hi,
Let's get emotional! Varoufakis painted the
picture of cloud capital; that might have
mobilized "The Internationale", or another,
more defensive, less Molotov-throwing song:
Pink Floyd - Run Like Hell (Live)
https://www.youtube.com/watch?v=lKgOe1Rl8YY
Now that Anthropic is teaming up with xAI, we
might ask: do we see the next OneDrive of
Prolog on the horizon? Even a tame Erlang
dream, populate the Web with clever Prolog
agents:
https://trinity.elfenbenstornet.se/
It might have a nasty Prolog-as-SaaS aspect!
As long as we talk about services and not
assets, we might miss something. Who owns
the present and future LLMs/LRMs?
Bye
Hi,
Even the Buddos are clueless, while Tarau
might indeed appear in the annals of the Borg
as a notable human being, seeing connections.
The Buddos, meanwhile, are the man-mountains
of Jonathan Swift's Gulliver's Travels,
creating huge egg mountains, replaying some
rewriting-school inventions. They might
nevertheless be strapped down by Lilliputians:
Gulliver captured by the Lilliputians
https://www.lookandlearn.com/history-images/M301092/Scene-from-Gullivers-Travels
But who are these Lilliputians? Well, I was
just toying around with a DeepSeek v4
derivative in LM Studio, a model that came
out 9 days ago.
Etc., etc.; it shows more text, all generated
on a laptop that cost only $1000 since the
end of 2025, thanks to some discounts. The
laptop has the Windows Copilot+ specs.
The secret sauce? Some general matrix
multiplications (GEMM) tucked into your iGPU:
What is Xe Matrix eXtensions (XMX)?
https://www.intel.com/content/www/us/en/support/articles/000091112/graphics.html
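For reference, the operation those XMX units
accelerate is the classic GEMM update
C := alpha*A*B + beta*C. A minimal sketch of
just the math (nothing XMX-specific, and the
naive form, not the tiled hardware version):

import numpy as np

# Naive GEMM, C := alpha * A @ B + beta * C -- the one
# operation that XMX-style matrix engines accelerate
# in hardware.
def gemm(alpha, A, B, beta, C):
    return alpha * (A @ B) + beta * C

A = np.ones((2, 3)); B = np.ones((3, 2)); C = np.zeros((2, 2))
print(gemm(2.0, A, B, 1.0, C))    # [[6. 6.], [6. 6.]]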
Bye