Nearly 40 years ago, MIT Press published Parallel Distributed
Processing, Vols. 1 & 2, by Rumelhart, McClelland et al. I read those,
as well as similar material published at MIT in the early 90s, and
wrote some functional toy code (on an Osborne I).
But I haven't kept up.
Can someone suggest books at more or less the same level of
technicality that I might look at to catch up a bit on how neural
nets are now constructed, trained, connected, etc. to produce what are
being called "large language models"?
The net is of course rife, indeed inundated, with stuff on the topic.
But the vast bulk of it falls into one of two categories. One category
is mass media news and pop science reporting, intended to provoke an
"Oh, gee whiz" from the average person or, at best, a vague notion of
the subject for the literate but non-technical. The other category
is material intended for someone who has read all the technical
literature for the last 40 years or at least has obtained a master's
degree in AI computing/theory in the last decade. In the latter case,
just the terminology is a barrier.
I'm now an old guy. I'm not going to completely work through all the
math that has evolved since PDP, but I'd like to get a more or less
caught-up handle on how this stuff works internally.
Any suggestions?
[ Yes, I had a look at some of the AI newsgroups. Moribund or
hijacked by politics. ]