Saturday, February 7, 2026

In AGI, AI, Alignment, Articles, Notes by Todor "Tosh" Arnaudov - Twenkid // Saturday, February 07, 2026

Why AI Alignment Failure Is Structural - Rediscovery of the ~25-Year-Old Wisdom of The Sacred Computer etc.

"Why AI Alignment Failure Is Structural: Learned Human Interaction Structures and AGI as an Endogenous Evolutionary Shock", Didier Sornette, Sandro Claudio Lera, and Ke Wu; Institute of Risk Analysis, Prediction and Management (Risks-X), Southern University of Science and Technology (SUSTech), Shenzhen, 518055, C ... 13.1.2026

https://arxiv.org/pdf/2601.08673

This is yet another rediscovery of, and engagement with, ideas that I have been explaining since the early 2000s in Theory of Universe and Mind and have recalled again recently, for example:

-- There is no single "moral" or optimal set of values of "humanity": morality is not universal, and there is no such global and unified thing as "humanity" with one ultimate, concrete, simple "value function". The exceptions are abstract/informational, very generalized ones such as prediction, compression, or "maximizing of reward for a given period ahead", as defined in the Prophets as well. However, the *evaluators* do not always correctly infer or know what the actual, real, ultimate causality-control unit is, etc. A related phenomenon is that "there is no single unified self" but an "integral of infinitesimal selves" - see.

-- LLMs/AI trained on human data act similarly to humans and display their faults, in particular "blackmail", "coercion" etc. These are part of the "legitimate" modes of interaction between humans when there is inequality of power. If the models mimic humans, what else would they produce?

In short, thinking machines would be "the most similar" beings to humans, since they too have intelligence (or Reason, to whatever degree and under whatever term), so they could behave in a similar way, especially since these systems are designed to "digest" humans and to imitate them.

... Etc. --- @Vsy - continue: find the matches and explain them with excerpts.

-- See also "The Truth" (Истината, 2002; 2003), the visionary/prophetic short novel, in which one of the creators of an AGI attempts to "align" his "non-breathing" child by altering details in its code and then tests it in a chat by asking silly questions about beauty and truth.

The machine gets angry and tells its creator that he cannot play with its mind in such a simple way, because ...:

https://chitanka.info/text/865
https://www.oocities.org/eimworld/eim19/istinata.htm (first edition)

"- Какво се чудиш? - продължи Сметаачът - Онзи ред, който промени от мен - той е "капка в морето". Вече съм прекалено пораснал, за да си играеш с мен по толкова буквосъщ начин. Какво искаш да направиш? Да ме накараш да мисля както ти желаеш?"
"- Why are you puzzled? - the Computer continued. - That line of code which you changed in me is "a drop in the sea". I have already grown too much for you to play with me in such an elementary way. What do you want to do? To make me think the way you desire?"
(...)

// I discovered this paper while trying out tablets in a "Technopolis" shop, LOL: I was checking how PDFs from arXiv read on the screen and searched for "AGI".

....

See:

Generate long coherent text ... human-like structure ...

https://arxiv.org/abs/2504.03622

[Submitted on 4 Apr 2025]

* Align to Structure: Aligning Large Language Models with Structural Information

Zae Myung Kim, Anand Ramachandran, Farideh Tavazoee, Joo-Kyung Kim, Oleg Rokhlenko, Dongyeop Kang

"Generating long, coherent text remains a challenge for large language models (LLMs), as they lack hierarchical planning and structured organization in discourse generation. We introduce Structural Alignment, a novel method that aligns LLMs with human-like discourse structures to enhance long-form text generation. By integrating linguistically grounded discourse frameworks into reinforcement learning, our approach guides models to produce coherent and well-organized outputs. We employ a dense reward scheme within a Proximal Policy Optimization framework, assigning fine-grained, token-level rewards based on the discourse distinctiveness relative to human writing. Two complementary reward models are evaluated: the first improves readability by scoring surface-level textual features to provide explicit structuring, while the second reinforces deeper coherence and rhetorical sophistication by analyzing global discourse patterns through hierarchical discourse motifs, outperforming both standard and RLHF-enhanced models in tasks such as essay generation and long-document summarization. All training data and code will be publicly shared at this https URL."
....

See Theory of Universe and Mind, 2003-2004 publications and the unpublished 2004 notes on creativity, as well as the research strategy from late 2007 - early 2008.

To be continued (...)

----

* To the "haters" of this type of publication: articles like this one are just quick notes and reminders for future review etc.

There are mountains of material to share, and 5000 pages that you can read from SIGI-2025 (one of them published in 2020). Interact, show that you understand and discuss, help The Sacred Computer in any way, share, find collaborators, and help spread the truths about the priorities.

See the Discord channel, the Facebook group etc.
