Comment on:
While it is generally true that "bigger is not always better...", the "scaling question" was already asked back in 2001 (Man and Thinking Machine... - will more transistors, more computing power etc. automatically lead to "thinking"?), and I myself am a proponent of, and work on, more conceptual ("symbolic") causal approaches which don't need to read the whole Internet. Still, it is more complex than that:
The need to think with world models (hierarchical simulators of virtual universes at different resolutions of causality control, methods for segmenting/granulating the sensory-motor data etc.) was defined in AGI in the early 2000s, and then rediscovered in the mainstream for the LLMs. However, the scaling of LLMs has for a long time not been just GPT-1 - GPT-2 - "Megatron" - GPT-3; it is more than just adding more data. It has become hyper-multimodal, with ensemble models; it includes robot data and physical interactions, RL, RLHF etc. Also, the big platforms are unlikely to be just a single LLM; they are probably multi-component systems, which can include all kinds of "tool use", simulators (another branch: Simulation Intelligence), physics-informed deep learning, game engines (= world models). Some foundation models from the last 2-3 years have also become "generalist agents", and they already do have practical "world models" of varying precision, span etc.; JEPA-2 as well (although 1 million hours of video is way too much and it is still "dumb").
"World model" is ambiguous, in the classical theories 20-some years ago there was the concept of "resolution of causality control" and the good LLMs actually *do" have an implicit "world model" with linguistic resolution for many cases. Not perfect, not complete etc. "World modelling", physics, simulations is about predicting what will come next, given particular known preconditions and consequences, that's the causality, i.e. the mechanism is the same, but with LLMs it is not always predicting from the correct premises and don't follow the right laws/reason.
The recent popular meaning of AGI - "economically feasible tasks", "better than the best human" (at these tasks) - was promoted by business opportunists. Originally AGI was about universal learning, prediction and compression in all modalities with a general algorithm ("next token prediction" was anticipated by AGI); i.e. in many ways this has already happened.
1. Humans demonstrate the same faults as LLMs, but worse
The LLMs are "vulgarized" and some of the faults of the models are the same in human capabilities and behavior, however the humans don't understand them or they lack integrity in order to admit it.
2. The phonological loop of 2-3 seconds - a conscious backward context window of just several words
Few humans can remember more than 2-3 seconds into the past in their "phonological loop": listen to a podcast or a video clip, interrupt it and write down what you heard in the previous seconds. On which word do you start to hesitate - "was it this or that word?... was it a car or a vehicle?" When do you start to lose the correct word order? On which word do you completely lose the thread and remember only that "it was about processors, the party, the movie, ..."?*
The people who can remember slightly more are still unlikely to recall even just 10 or 15 seconds, or they would be rare "freaks".
So how big is your "context window" then, counted in tokens comparable to LLM's? This is for the past, where the LLMs and computers can remember not just thousands, but even EVERYTHING, billions or trillions of tokens exactly (as memory).
Let's look forward now.
3. Next token prediction for humans - the context window for future tokens is even worse
How many "tokens" *ahead* can you predict correctly in your *own* utterances, in plain text and words? How well can you plan your sentences when you write on paper or on the computer?
One word, two words, rarely 3-4 or several words? Can you hold in your working memory a long, complex sentence of 15-20-25 words? So it seems that you are a 3-4-gram model. Besides, unlike the LLMs, which have the "log probabilities" and can perform various kinds of "sampling" or "beam search" and sample differently, you usually don't: it would be too hard and too slow for you to write down many variants, so you wouldn't generate even two or three possible continuations in your own "next word prediction task" - unless you are a writer composing a poem, a story etc.
I myself do this, but it takes time: you first generate an initial version of a verse or a poem, then you replay it many times and let time pass, and within minutes, hours, days you start to find better thoughts and words, ideas which better suit the overall message etc.**
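A minimal sketch of the contrast, assuming a toy trigram model over a made-up snippet of text (nothing here is how a production LLM actually works; it only illustrates what "log probabilities", "sampling" and a single greedy continuation mean in the n-gram comparison above):

```python
# A toy trigram ("3-gram") next-word model with log probabilities and two
# decoding modes: greedy vs. sampling. The "training" text is a made-up
# snippet; nothing here is how a production LLM works.

import math, random
from collections import defaultdict, Counter

text = ("you first generate an initial version of a verse or a poem "
        "then you replay it many times and let the time pass").split()

counts = defaultdict(Counter)               # (w1, w2) -> Counter of next words
for w1, w2, w3 in zip(text, text[1:], text[2:]):
    counts[(w1, w2)][w3] += 1

def next_word_logprobs(w1, w2):
    """Log probabilities over the next word, given a 2-word context."""
    c = counts[(w1, w2)]
    total = sum(c.values())
    return {w: math.log(n / total) for w, n in c.items()}

def generate(w1, w2, n=6, sample=False):
    out = [w1, w2]
    for _ in range(n):
        lp = next_word_logprobs(out[-2], out[-1])
        if not lp:
            break
        if sample:   # sample in proportion to probability (what humans rarely do)
            words, weights = zip(*((w, math.exp(l)) for w, l in lp.items()))
            out.append(random.choices(words, weights=weights)[0])
        else:        # greedy: always the single most likely continuation
            out.append(max(lp, key=lp.get))
    return " ".join(out)

print(generate("you", "first"))               # one deterministic continuation
print(generate("you", "first", sample=True))  # a different possible variant
```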
* Perhaps the working memory capacity of 7+-2 items is a common prerequisite for the limitations of this "loop" and these "buffer lengths".
4. Hallucinations - intentional lies and cheating, and sometimes the inability even to hallucinate
Humans not only "hallucinate", or make mistakes - say things about which they lack definite information, or they lack reliable proves that it is true (or they lack information that makes them believe that the information is true - it can be wrong either), and they are convinced and aggressive in their believe they are correct, and try to convince the others often based on authority and power, they are "bosses", they are "on the right side of the history", they are "the good guys" (in the Hollywood movies, LOL).
Humans intentionally and aggressively LIE and try to CHEAT and plant lies in order to take advantage of someone or of the situation, and they don't consider this wrong (LLMs learn from humans, too, LOL).
Even worse: in many ways and quite often humans can't even hallucinate when "prompted", or their attempts are grossly ridiculous, i.e. they have no clue what to say or what the next word would be, even at random; they cannot generate any coherent continuation with correct syntax, guided by a topic or by additional material.
This is true especially if the "HumanLLM" lacks the appropriate "pretraining" in the domain. Ask somebody what he thinks about an area of knowledge or an event where he has little or no expertise, or ask in a language she doesn't speak. What would she answer? What if you show a "HumanLLM" who has never programmed a piece of machine code in hexadecimal: A9 7F 8D 00 20 ... What "reasonable" content would the human generate if she were obliged to continue it and return 100, 200, 1000 tokens?
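Incidentally, those bytes happen to be valid 6502 machine code; a small sketch with a deliberately truncated opcode table (an assumption chosen just for these two instructions) shows what the right "pretraining" reads out of them, and how little there is to go on without it:

```python
# What "pretraining in the domain" buys you: the bytes A9 7F 8D 00 20 are
# valid 6502 machine code. A tiny, deliberately truncated opcode table is
# enough to "continue"/decode them; without it they are just noise.

OPCODES = {                      # truncated 6502 table, illustration only
    0xA9: ("LDA #${:02X}", 1),   # load accumulator, immediate operand
    0x8D: ("STA ${:04X}", 2),    # store accumulator, absolute address
}

code = bytes.fromhex("A97F8D0020")
i = 0
while i < len(code):
    mnemonic, n_operands = OPCODES[code[i]]
    operands = code[i + 1:i + 1 + n_operands]
    value = int.from_bytes(operands, "little")   # 6502 is little-endian
    print(mnemonic.format(value))
    i += 1 + n_operands
# Prints: LDA #$7F and STA $2000 - meaningless without the "pretraining".
```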
LLMs are blamed for "just retrieving", repeating existing knowledge, just recombining it.
What else do you do most of the time, humans? You can't do even that!
People with slightly better memory than the rest are called, or compared to, "computers" - that was the case even in the 1950s, LOL - or are said to have a "prodigious memory" if they can remember 50 or 1000 digits of the number pi (a meaningless record).
The "normal" situation is when humans have poor memory and remember up to 5-6-7-8... digits of Pi (3.1415926...). Even the most intelligent ones need to constantly consult with dictionaries, textbooks, books, mathematical tables with formulas, to lookup the code they write or repetitively reread the paragraphs of the posts and articles they write, to dig into the Internet, ... and to rely on LLMs to do the job for them.
In general, almost everything in the Universe and in computers - and the Universe is a Computer* - is a repetition or a retrieval from memory anyway; preservation has a far bigger weight than development (see Zrim, Creating Thinking Machines).
The creative humans do the same ("Creativity is imitation at the level of algorithms", 2003), while most humans CAN'T even DO THAT well: reproducing even 3 seconds from the past out of their own memories, their own intentions, thoughts etc.
Yet they blame the LLMs or computers - for their real nature - for being "dumb", "not intelligent" etc.
.....
* Yes, yet another (short) exercise in a "pamphlet against...", see Lem, 1963-1964 and Arnaudov, 2001 (Man and Thinking Machine: Analysis of ... etc., Theory of Universe and Mind - see it also for the Universe Computer, as well as the recently published appendix "Algorithmic Complexity"...).
** These days I did that with a new song, a cover of a Bulgarian rock-n-roll classic, however with new lyrics about something else. Stay tuned.
*** Yes, that belongs to a paper, a continuation of the mentioned one etc.
...
*** See also the intro etc. of: The Architecture of Cognitive Control in the Human Prefrontal Cortex, Science, 12.2003, 302(5648):1181-5, https://www.researchgate.net/publication/9010007_The_Architecture_of_Cognitive_Control_in_the_Human_Prefrontal_Cortex
More, and more recent, literature from neuroscience and LLM-related research is collected in one of the appendices of The Prophets of The Thinking Machines, currently called #listove.