Read: "The Prophets of the Thinking Machines: Artificial General Intelligence and Transhumanism: History, Theory and Pioneers; Past, Present and Future", currently >4600-4800? pages (11.1.2026); earlier counts: 4000; 3300?, 3050-3100? pages (7.8.2025); 1240, 1600, >2400-2500? (2900+?) pages (6.2025). This work will continue with "Creating Thinking Machines". Visit SIGI-2025, where some works and volumes are already published, and check, join and help with the open projects, which are not stealth: the AGI infrastructure called "Vsy" or "Jack of All Trades" etc. Welcome to Artificial Mind, part of The Sacred Computer. I am always looking for friends, partners and collaborators to work with, interesting projects and new fields and things to study, explore and create. Join my exploration* or invite me to yours! (...) *Versatile (Limitless) Explorer and (Self-)Improver - one of my alternative terms for AGI/Universal Thinking Machines. What does Twenkid mean?

Thursday, May 28, 2026


How Bulgarian is the so-called Bulgarian institute INSAIT? Part I

The so-called "Bulgarian" international Swiss-Bulgarian-World Institute for AI and computer science INSAIT, which "strategically" was in many ways a cynical rip-off and a steal of my project published in 2003 :))), with arrogant and ridiculous claims of being "innovative", "visionary" or original, LOL (however "hyperscaled" in its financial demands and expenses), is promoting its success at CVPR 2026 and its role for "Bulgaria": 17 publications + 3 other participations, 20 in total. They are among the greatest in the world. [See more in the books "The First modern AI strategy ..." (Първата съвременна стратегия за развитие чрез изкуствен интелект ...) and "AI Institutes ..." (Институти и стратегии за изкуствен интелект на световно ниво ...) from the hyperbook "The Prophets of the Thinking Machines ...", SIGI-2025.]

Right, they are among the leading institutes in their own game and in this one; however, are they BULGARIAN, as they so aggressively emphasize and claim on every occasion? (Maybe that is why they do.)

Asking the free ChatGPT (no login) with an excerpt from the list of publications: ... 


==================================================

QUESTION:  How many papers? How many of them include Bulgarian or Slavic names? 
[see the full text below the answers]

ANSWER: There are 18 papers listed.

Papers that appear to include Bulgarian or other Slavic names (based on author names alone) are approximately:

  1. ConceptPose: Training-Free Zero-Shot Object Pose Estimation using Concept Vectors
    – includes Dani Velikova (likely Bulgarian/Slavic surname)
  2. SPEAR-1: Scaling Beyond Robot Demonstrations via 3D Understanding
    – includes Nikolay Nikolov and Aleksandar Yanev (clearly Bulgarian names)
  3. FireScope: Wildfire Risk Raster Prediction With a Chain-of-Thought Oracle
    – includes Mario Markov (Bulgarian/Slavic surname)

So the count is:

  • 18 total papers
  • About 3 papers with clearly Bulgarian/Slavic author names

If you want, I can also break it down into:

  • specifically Bulgarian names,
  • broader South Slavic / East Slavic / West Slavic names,
  • or estimate the nationality distribution of all authors.



QUESTION: Estimate the nationality distribution of all authors.

This can only be a rough estimate from names and affiliations conventions, not an actual nationality census. Many researchers work internationally, have multicultural backgrounds, or use romanized names that overlap across regions.

That said, from the author lists, the distribution appears heavily concentrated in Chinese researchers, with a secondary cluster from European institutions (especially Swiss/Western European labs connected to ETH Zurich / KU Leuven style collaborations around Luc Van Gool and Danda Pani Paudel).

Approximate estimate across all listed authors:

Estimated background/nationality cluster | Approx. share | Notes
Chinese | ~70–75% | Dominant throughout nearly every paper
South Asian / Nepali / Indian | ~5–8% | Mainly from names like Danda Pani Paudel, Mohammad Mahdi
Western/Central European | ~8–10% | Swiss, Germanic, Dutch, Belgian, etc.
Balkan / Slavic (incl. Bulgarian) | ~4–6% | Nikolov, Yanev, Markov, Velikova
Spanish / Catalan / Latin European | ~3–5% | Vazquez-Corral, Serrano-Lozano, Rocafort
Korean | ~1–2% | Kailun Yang is Chinese despite surname “Yang”; not many obvious Korean names
Japanese | ~0–1% | Essentially none obvious
Middle Eastern / Turkish | ~1–2% | Ahmet Berke Gokmen
Other / unclear | remainder | Mixed international researchers

A more concrete estimate by author count:

  • Total unique authors listed: roughly 90–100
  • Estimated:
    • Chinese-origin names: ~70
    • European-origin names: ~12–15
    • South Asian-origin names: ~5–7
    • Bulgarian/Slavic names: ~4–6
    • Spanish-origin names: ~3–4
    • Turkish-origin names: ~1–2

The most internationally connected recurring senior collaborators are:

  • Luc Van Gool
  • Danda Pani Paudel
  • Kailun Yang
  • Jinjin Gu

The overall pattern is very typical of modern CVPR:

  • strong Chinese participation,
  • Swiss/European lab hubs,
  • globally mixed PhD/postdoc collaborations,
  • and increasing cross-institution research teams
-------------------------------------------
==========================

The Content of First Prompt:

* Yuanfan Zheng, Kunyu Peng, Xu Zheng, Kailun Yang

Seeing Beyond: Extrapolative Domain Adaptive Panoramic Segmentation

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)

Paper

Xiaolong Qian, Qi Jiang, Lei Sun,  Zongxi Yu, Kailun Yang, Peixuan Wu, Jiacheng Zhou, Yao Gao,  Yaoguang Ma, Ming-Hsuan Yang, Kaiwei Wang

Learning Latent Transmission and Glare Maps for Lens Veiling Glare Removal

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Shaolin Su, Josep M. Rocafort, David Serrano-Lozano, Lei Sun, Danna Xue, Javier Vazquez-Corral

Bridging the Perception Gap in Image Super-Resolution Evaluation

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


 


Paper

Jiaqi Tan, Xu Zheng, Yang Liu

RMMSS: Towards Advanced Robust Multi-Modal Semantic Segmentation with Hybrid Prototype Distillation and Feature Selection

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)

Paper

Haoyu Chen, Keda Tao, Yizao Wang, Xinlei Wang, Lei Zhu, Jinjin Gu

Intelligent Photo Retouching with Language Model-Based Artist Agents

In: The Findings track of IEEE/CVF Conference on Computer Vision and Pattern Recognition 2026 (CVPR 2026 Findings)


Paper

Xiaoye Wang, Chen Tang, Xiangyu Yue, Wei-Hong Li

3D-Aware Multi-Task Learning with Cross-View Correlations for Dense Scene Understanding

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Zhiyuan You, Ke Wang, He Zhang, Xin Cai, Jinjin Gu, Tianfan Xue, Chao Dong, Zhoutong Zhang

PhotoFramer: Multi-modal Image Composition Instruction

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Xiang Yin, Jinfan Hu, Zhiyuan You, Kainan Yan, Yu Tang, Chao Dong, Jinjin Gu

How Far Have We Gone in Generative Image Restoration? A Study on Its Capability, Limitations and Evaluation Practices

In: The Findings track of IEEE/CVF Conference on Computer Vision and Pattern Recognition 2026 (CVPR 2026 Findings)


Paper

Zihao Dongfang, Xu Zheng, Ziqiao Weng, Yuanhuiyi Lyu, Danda Pani Paudel, Luc Van Gool, Kailun Yang, Xuming Hu

Are Multimodal Large Language Models Ready for Omnidirectional Spatial Reasoning?

In: The Findings track of IEEE/CVF Conference on Computer Vision and Pattern Recognition 2026 (CVPR 2026 Findings)


Paper

Bingwen Zhu, Bingwen_Zhu, Yuqian Fu, Qiaole Dong, Guolei Sun, Tianwen Qian, Yuzheng Wu, Danda Pani Paudel, Yanwei Fu, Xiangyang Xue

EgoSound: Benchmarking Sound Understanding in Egocentric Videos

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Liming Kuang, Dani Velikova, Mahdi Saleh, Jan-Nico Zaech, Danda Pani Paudel, Benjamin Busam

ConceptPose: Training-Free Zero-Shot Object Pose Estimation using Concept Vectors

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Ahmet Berke Gokmen, Ajad Chhatkuli, Luc Van Gool, Danda Pani Paudel

Inferring Compositional 4D Scenes without Ever Seeing One

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Website

Code

Nikolay Nikolov, Giuliano Albanese, Sombit Dey, Aleksandar Yanev, Luc Van Gool, Jan-Nico Zaech, Danda Pani Paudel

SPEAR-1: Scaling Beyond Robot Demonstrations via 3D Understanding

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Website

Jiancheng Pan, Runze Wang, Tianwen Qian, Mohammad Mahdi, Yanwei Fu, Xiangyang Xue, Xiaomeng Huang, Luc Van Gool, Danda Pani Paudel, Yuqian Fu

V^{2}-SAM: Marrying SAM2 with Multi-Prompt Experts for Cross-View Object Correspondence

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Yuheng Zhang, Mengfei Duan, Kunyu Peng, Yuhang Wang, Ruiping Liu, Fei Teng, Kai Luo, Zhiyong Li, Kailun Yang

ProOOD: Prototype-Guided Out-of-Distribution 3D Occupancy Prediction

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Xiaolong Qian,  Qi Jiang, Yao Gao,  Lei Sun, Zhonghua Yi, Kailun Yang, Luc Van Gool, Kaiwei Wang

Towards Universal Computational Aberration Correction in Photographic Cameras: A Comprehensive Benchmark Analysis

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Mario Markov, Stefan Maria Ailuro, Luc Van Gool, Konrad Schindler, Danda Pani Paudel

FireScope: Wildfire Risk Raster Prediction With a Chain-of-Thought Oracle

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)


Paper

Yue Li, Qi Ma, Runyi Yang, Mengjiao Ma, Bin Ren, Nikola Popovic, Nicu Sebe, Theo Gevers, Luc Van Gool, Danda Pani Paudel, Martin R. Oswald

Chorus: Multi-Teacher Pretraining for Holistic 3D Gaussian Scene Encoding

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)






Wednesday, May 20, 2026


Twenkid - The Child of AGI - is Challenging the Grandfather of AI Yann LeCun

Yann LeCun cites a post which acknowledges that his ideas were correct, etc.


A part of the concluding punch lines:

* It is "dime a dozen", but people decades older than me, who *literally* repeated and ripped off my suggestions and observations decades later, were rewarded with billions to *waste*, and I am not even mentioned. They did it even in my own country, where one Bulgarian-Canadian became an "architect" of an institute in Sofia, with statements which were a *20-years-late rip-off* of the above-cited essay and which were sold as "innovative" and ground-breaking :))), "for the first time in Eastern Europe" etc.

* As for "dime a dozen" --> yes, or even "five a dozen" -->
The current "supercomputer" of my lab is called "PETAK I", where "Pet" means "5", from: 1. "Pentium" (historically the CPU and brand on which TUM was created); 2. The CPUs of all nodes: Core i5 (all old ones, 11-14-year-old models, LOL); 3. The five nodes of the cluster (the initial full configuration); 4. A parody CPU name from a science fiction work from 2004 from that theory ("Pentium 5"; "Петият Петак", "The Fifth Petak"); and 5. In Bulgarian it also means "5 cents"... LMAO


"""Tunisia.AI

 
Administrator
Group expert on Artificial Intelligence and Machine Learning
 April 20 at 21:34 
Yann LeCun may have been right about something important: next-token and next-pixel prediction are probably not the most efficient path to real world understanding.
For years, the industry has been scaling generative models under the assumption that bigger models, more data, and more compute would eventually produce deeper intelligence. LeCun has been arguing the opposite: predicting every word or every pixel forces models to spend huge amounts of compute on surface details instead of learning the underlying structure of reality.
That’s the core idea behind JEPA (Joint-Embedding Predictive Architecture): instead of reconstructing the world pixel by pixel, learn a compact latent representation and predict what happens next inside that space.
The problem is that these models have historically been unstable. They suffer from “representation collapse,” where the latent space becomes too simple to carry useful information unless you add complex training tricks, auxiliary losses, or frozen components.
A new paper, LeWorldModel (LeWM), shows a much cleaner approach. It trains end-to-end from raw pixels using only two losses: a next-embedding prediction loss and a Gaussian regularizer on the latent space. This drastically simplifies the training setup compared to prior approaches.
The efficiency gains are striking. The model has around 15 million parameters, trains on a single GPU in a few hours, and can plan up to 48× faster than larger foundation-model-based world models, while staying competitive on several 2D and 3D control tasks. Its latent space also appears to capture meaningful physical structure and can detect physically implausible events in controlled environments.
This doesn’t mean generative AI is a dead end. LLMs remain extremely powerful. But it does reinforce a key technical point: for world modeling and physical reasoning, predictive latent-space approaches may be far more compute-efficient than brute-force generation.
The real shift might be this: not models that generate everything, but models that understand enough of the world to predict what actually matters."""
https://www.facebook.com/yann.lecun/posts/pfbid0oEkmbuzwdvNC6JWRDoRzaDAyLNtzPZzvvman5ob89Z1v5AYvuQdTrQEjFcGJ3958l?__cft__[0]=AZaoNFuahXF3_DoSg2wL8ID4WPPsHBRWvjU51k3Sa742m0aQX2r7nl0VWyDXlyXx5eBBbxc-OWgq18zIcllnPjdIm_HNuV1Mga9hvE-Nga9s6_vMV0SjQVFusWFQhxpNCbrhQX0HjKTjh5WwzRfqeSFqWjE6X0vy_kidQqxMcHABvF2r6roXC6OCa1EcctDCHZNSND1P9hmQT8Cl2AcxeawX&__tn__=%2CO%2CP-R 
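
The two-loss setup described in the quoted post (a next-embedding prediction loss plus a Gaussian regularizer on the latent space) can be sketched roughly as follows. This is only an illustrative sketch and not the LeWM implementation: the function names, the moment-matching form of the regularizer (zero mean, identity covariance) and the `reg_weight` parameter are assumptions of this example, and the regularizer actually used in the paper may differ.

```python
import numpy as np

def prediction_loss(pred_next, target_next):
    """Mean squared error between predicted and actual next embeddings."""
    return float(np.mean((pred_next - target_next) ** 2))

def gaussian_regularizer(z):
    """Penalty for batch statistics that deviate from a standard Gaussian.

    Assumed simplification: match only the first two moments, i.e. push the
    batch mean towards 0 and the batch covariance towards the identity.
    """
    mean = z.mean(axis=0)
    centered = z - mean
    cov = centered.T @ centered / (z.shape[0] - 1)
    eye = np.eye(z.shape[1])
    return float(np.sum(mean ** 2) + np.sum((cov - eye) ** 2))

def total_loss(pred_next, target_next, z, reg_weight=1.0):
    """Sum of the two terms; reg_weight is a hypothetical balancing factor."""
    return prediction_loss(pred_next, target_next) + reg_weight * gaussian_regularizer(z)

# A collapsed latent space (all embeddings identical) is penalized much more
# than a well-spread one, which is the point of the regularizer.
rng = np.random.default_rng(0)
z_spread = rng.standard_normal((4096, 8))
z_collapsed = np.zeros((4096, 8))
print(gaussian_regularizer(z_spread) < gaussian_regularizer(z_collapsed))
```

In this simplified form the regularizer is what discourages "representation collapse": a constant embedding has zero covariance and is therefore far from the identity target, while the prediction term alone would be trivially minimized by it.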

Todor Arnaudov

From the books "The First modern AI Strategy ..."... and "Stack Theory is Yet another fork of Theory of Universe and Mind" published at SIGI-2025

This idea, together with prediction and next-token prediction (but in a multi-scale, multi-precision hierarchy of resolutions of causality-control and perception), was published and explained nearly 25 years ago in Theory of Universe and Mind and presented during the world's first university courses in AGI in 2010 and 2011. Y. Bengio also rediscovered it in 2017-2018 (Consciousness prior), and his example is an almost literal repetition of an introductory definition from a treatise published about 14 years earlier. The author was a teenager, LOL.

Yann LeCun:

@Todor Arnaudov as I pointed out on another platform, ideas are a dime a dozen. The hard part, for something like this, is to implement it and to make it work.

The whole idea of hierarchical representations and learning by prediction is very old.

But learning hierarchies of representations didn't really work until convolutional nets were shown to do it in the late 1980s and more forcefully in the early 2010s (this took a while).

===

Todor Arnaudov:


Hi, first, thanks for your answer, as I didn't expect this honor. I don't disagree that there were earlier "prophets"; I recently published a hyperbook with a related name (nearly 5000 pages in total), where one of the intros in one of the sections with collections of related, prior and later work is a citation from the Holy Bible:

"There is nothing new under the Sun"

Some of the prior work doesn't get enough credit and is unknown; even the "fellow AI historian" Schmidhuber doesn't mention it, e.g. the Soviet lab of Bongard and his colleagues etc. (E.g. once I caught Chollet literally restating insights from the 1967 book "Проблема узнавания" - perhaps he didn't know; he also rediscovers definitions of general intelligence of mine, published in 2001 (he couldn't know about them) - see the link at the end and the reviews of the LLMs.)

 The Bible is called "The Prophets of the Thinking Machines: Artificial General Intelligence & Transhumanism: History, Theory and  Pioneers; Past, Present and Future", SIGI-2025 - and yes, almost nobody will bother to even open it. :))

BTW, e.g. IMO your PhD student Marc’Aurelio Ranzato deserves more credit for his pioneering work in DL and his insights (which perhaps [are] ~ also yours) -- his work is credited in my historical collections here: https://twenkid.com/agi/Lazar_The_Prophets_of_the_Thinking_Machines_20-8-2025.pdf ~p.21.

I do agree that I had to push to implementations immediately (not your type of NNs though), and perhaps my claims would be accepted after I implement them all by myself. (Or if I or somebody else had done it 20 years ago, with no collaborators or any funding and no mechanical Turks to label a gazillion data items and computing iterations, compared to 20 years later with all the collected resources in all senses of the word; i.e. IMO the difficulty of the implementation is supposed to decrease and be "discounted" with time, like in RL; an idea from 25 or 50 years ago may end up more "valuable" than an implementation in the present - see generative AI and the final citation below.)

* I know about your dismissive opinion about "ideas", e.g. your comments to Schmidhuber's recent challenge, that you also could find ideas in your unpublished notes or something etc. and I've listened to your answers to him since 2022, "The path towards autonomous AI..." - I remember you defended yourself with referring to Optimal Control etc.

However many works are proposals, theoretical etc. but still get recognized, while other prior ones - don't and are even "humiliated". Also the core novelty there in my reading of the paper was also matching the mentioned TUM (and too general, it was not an implementation too); in general it looked like another cognitive architecture, which were popular in the cognitive science and the AGI community decades earlier, perhaps I have to reread it.

* I understand that if you dismiss even the German, who is at a comparable status as yours or, say he has more ground to be believed that he is, then you (and almost anyone) wouldn't recognize the claimed "priority" or even just the "contribution" of some obscure "self-proclaimed" "crank" or the mentioned theory, no matter the evidence (maybe you wouldn't even bother to check any evidence or count it as "theory" or anything).

BTW, your recent work about the brain/humans as "not general ..." also matches and is closely related to my prior work/accounts, beginning in early 2000s, however with different interpretation of the observations. The limitations don't deny the concept of general intelligence and the possibility of general principles and modules (prediction-compression etc.) I may address the correspondences in a paper.


*  Stack Theory is yet another Fork of Theory of Universe and Mind, SIGI-2025

https://www.researchgate.net/publication/398934575_Stack_Theory_is_yet_another_Fork_of_Theory_of_Universe_and_Mind_-_Appendix_Volume_to_The_Prophets_of_the_Thinking_Machines_Artificial_General_Intelligence_and_Transhumanism_History_Theory_and_Pioneers 


* The first modern AI strategy was published by an 18-year old in 2003 and repeated and implemented by the whole world 15-20 years later: Bulgarian Prophecies: How would I invest one million for the greatest benefit for the development of my country? https://twenkid.com/agi/Purvata_Strategiya_UIR_AGI_2003_Arnaudov_SIGI-2025_31-3-2025.pdf (Bongard, 1967 vs Chollet,2024 p.169-170)


* BTW, cheers from Kyuchuk Paris - that's the district in the city of Plovdiv, where TUM was created. 🙂

* This is the world's first modern "AI strategy", 2003, repeated and implemented by "the whole world" 15-20 years later: https://twenkid.com/agi/proekt.htm 

* It is "dime a dozen", but people decades older than me, who *literally* repeated and ripped off my suggestions and observations decades later, were rewarded with billions to *waste*, and I am not even mentioned. They did it even in my own country, where one Bulgarian-Canadian became an "architect" of an institute in Sofia, with statements which were a *20-years-late rip-off* of the above-cited essay and which were sold as "innovative" and ground-breaking :))), "for the first time in Eastern Europe" etc.

* As for "dime a dozen" --> yes, or even "five a dozen" -->
The current "supercomputer" of my lab is called "PETAK I", where "Pet" means "5", from: 1. "Pentium" (historically the CPU and brand on which TUM was created); 2. The CPUs of all nodes: Core i5 (all old ones, 11-14-year-old models, LOL); 3. The five nodes of the cluster (the initial full configuration); 4. A parody CPU name from a science fiction work from 2004 from that theory ("Pentium 5"); and 5. In Bulgarian it also means "5 cents"... LMAO

 Also, as I predicted in 2013 (counterintuitively to all "experts" up to just a few years ago; I wrote this article precisely *because* clueless "experts" predicted the opposite; they were later cited thousands of times for their *WRONG* world-model and wrong predictions):

"Creative Intelligence will be First Surpassed and Blown Away by the Thinking Machines, not the "low-skill" workers whose jobs require agile and quick physical motion and interactions with human-sized and human-shaped environment"

https://artificial-mind.blogspot.com/2013/10/creative-intelligence-will-be-first.html

" (...) For the intellectual jobs - it's much easier to pick a computer, run the appropriate software or connect it to the service, and get it thinking - you already have decent cameras, microphones and many sensors even in smartphones. (...) The bottom line is that the "white collars" are more endangered in current-time economy. Perhaps that kind of economy could hardly survive the AGI revolution. I guess it may turn upside down for a while - the low-skill workers could get higher pay, because intellectual activities will be done in 1 ms for free... 😉  We, the smart guys (the smart asses, see "Super Smartasses" the graphical series ) wouldn't be needed by anyone... Not that we are needed now. :))"


 * The prediction of the generative AI (however it could have been created by the late 2000s-early 2010s - it came *too late*, not too quick as Hinton and Bengio "complain"; not with gradient-descent of course):  

 -- Creativity is Imitation at the Level of Algorithms - An outline sketch of a possible path of development of the Artificial Intelligence "Emil" 

https://www.researchgate.net/publication/395129890_Creativity_is_Imitation_at_the_Level_of_Algorithms_-_An_outline_sketch_of_a_possible_path_of_development_of_the_Artificial_Intelligence_Emil

 * Petak I: https://github.com/Twenkid/SIGI-2025/blob/main/petaki.md


Sunday, May 17, 2026


List of the Biggest Early GPT LLMs for Non-English Languages Circa 2020-2021-2022 - Update with Chinese, Russian and Spanish models

Updates to the table about the biggest GPT-models circa 2021 in the book "The First Modern Strategy ... " 

The GPT2-MEDIUM-BG seems to be among the biggest 6-7 models, trained on a free single Tesla T4 in Colab. :)) 

p.26

In this work, additional notes have been added to cited excerpts from the classical TUM (Theory of Universe and Mind) from [17] about the measures for a seed of mind and the stages of development. On the large language models: information about them and how they work and some of the newest publications, as well as a comparison of data about early GPT models for various languages (Arabic, French and many European ones, Japanese and Chinese), where the Bulgarian GPT2-MEDIUM-BG turns out to be one of the six or seven biggest models of this type in the world up to 2021 for languages other than English; bigger ones from around the same time or a little earlier exist only for Chinese, Arabic, Russian, Romanian and French; one of similar size is for Japanese[1], developed at the same time as the Bulgarian one. Short notes on the bigger project for an infrastructure for General AI and all kinds of projects, also related to generative models: Vsedarzhets (Вседържец)".(...)


[1] Possibly others too: on 18.5.2026 I added Chinese, Russian and a new Spanish one from 2022. See [236]



p.238 [236]

236. Early generative large language models of the GPT type for languages other than English: Bulgarian, French, Arabic, Spanish, Portuguese, German, Chinese; Greek, Serbian, Romanian, Japanese, Chinese, Russian - 2020-2021. The dates of some are taken from the dates of the model weight files, the date of a paper, etc. Until the end of 2021 only the Chinese, French, Arabic, Russian, Romanian, Japanese and Bulgarian ones have more than a hundred or so million parameters. The Romanian one is strong, trained on a 17 GB corpus. Only the Bulgarian one was probably developed by a single person with budget and support = 0, and the author represents the native computational linguistics in this discipline as a self-proclaimed "haidut", because the institutions and the more "elite" fighters waited until 2023-2024 [66]. Compare with the analogous case of DZBE (ДЗБЕ) around 2001-2003 and the inaction of the Institute for Bulgarian Language (IBE) of the Bulgarian Academy of Sciences (BAN) and of the other philologists from the universities towards the phenomena which DZBE opposed and against which it tried to "call up" "bands" (cheti) [16][40], while the "eminent" linguists (in the definition of Pavlin Stoychev, "PC World Bulgaria", 5.2003 [239]) watched indifferently and explained that these were "natural processes". Compare with the notes on "Добродетелната дружина и нехранимайковците" ("The Virtuous Company and the Good-for-Nothings") and [40], 2003, on whether the talents did not have the choice not to study at the "most prestigious universities" and to develop the local ones, etc. XLM-R from Facebook, 11.2019, is bigger, but Bulgarian is one of the 100 languages it was trained on, and the model is for classification and question answering, not for generation.
Tables: ordered by time of creation and by size:
Extended on 18.5.2026 with the Chinese, Russian and Spanish large models.

Early large language models of the "GPT" type for various languages, ordered by time

Language / model | Parameters | Date
GPT | 117 M | 6.2018
GPT2 | 1.554 B | 14.2.2019 (XL) (published 11.2019)
Italian | 117 M | 4.2020
Portuguese | 124 M? | 5.2020
Greek | 124 M? | 9.2020
German | 124 M? | 11.2020 – 8.2021
Chinese | 124 M? | 11.2020 – 5.2021
Spanish | 124 M? | 12.2020
Chinese | 2.6 B | 12.2020
Arabic | 1.46 B | 3.2021
Russian | 760 M | 5.2021
French | 1 B | 5.2021
Romanian | 774 M | 7.2021
Serbian | 124 M? | 7.2021
Bulgarian | 355 M | 6.2021 – 8.2021, Tosh
Japanese | 336 M | 16.8.2021
Japanese | 1 B | 20.1.2022
Spanish | 774 M | 1.4.2022
BAN | 124 M | 27.6.2023
INSAIT | 7.3 B | 2.2024


Early large language models of the "GPT" type, ordered by size

Language / model | Parameters | Date
Chinese | 2.6 B | 12.2020
Arabic | 1.46 B | 3.2021
French | 1 B | 5.2021
Russian | 760 M | 5.2021
Romanian | 774 M | 7.2021
Bulgarian | 355 M | 6.2021 – 8.2021, Tosh
Japanese | 336 M | 16.8.2021
Japanese | 1 B | 20.1.2022
Spanish | 124 M? | 12.2020
Portuguese | 124 M? | 5.2020
German | 124 M? | 11.2020 – 8.2021
Italian | 117 M | 4.2020
Chinese | 124 M? | 11.2020 – 5.2021
Greek | 124 M? | 9.2020
Serbian | 124 M? | 7.2021
BAN | 124 M | 27.6.2023
INSAIT | 7.3 B | 2.2024
GPT | 117 M | 6.2018
GPT2 | 1.554 B | 14.2.2019 (XL) (published 11.2019)

1. Todor Arnaudov, GPT2-MEDIUM-BG, The Sacred Computer, DZBE, ~6.2021 – 8.2021, 345M - Bulgarian - trained from scratch on a Tesla T4 in Colab [31][46]

[31. T.Arnaudov, Dataset preparation and training of GPT2-MEDIUM in Bulgarian, 6.2021: Train GPT2-MEDIUM Google Colab Tips & Tricks Any Language From Scratch
https://github.com/Twenkid/GPT2-Bulgarian-Training-Tips-and-Tools
https://github.com/Twenkid/GPT2-Bulgarian-Training-Tips-and-Tools/blob/main/bggpt_sacred_computer.ipynb
* T.Arnaudov, GPT2 Unlimited-Length Generation with Hidden Prompt Injections - Code Review, 2021 (1.2023) https://youtu.be/V1eO2OpsXBE
* T.Arnaudov, GPT2-Medium Training from Scratch on Colab for Any Language - Tips & Tricks by Twenkid, 2021: https://youtu.be/F-Xt-cK4L-g Code and a detailed guide for dataset preparation and training of GPT2 models for free in Google Colaboratory from scratch (in English). A popular video on the topic with over 4 thousand views, 68 likes, about 30 subscribers.]

[46. T.Arnaudov, 2021 (2025), gpt2-medium-bg, https://huggingface.co/twenkid/gpt2-medium-bg
* T.Arnaudov, https://github.com/Twenkid/GPT2-Bulgarian-Training-Tips-and-Tools
* T.Arnaudov, Train GPT2-Medium in Google Colab – Tips & Tricks – Any language from scratch, 2021
https://youtu.be/F-Xt-cK4L-g
* T.Arnaudov, Unlimited length with GPT2 … Update 6-5-2023 …
* T.Arnaudov, Hidden Prompt Injections: Unlimited Length GPT2 Generation, 1.2023 (work from 2021)
https://www.youtube.com/watch?v=V1eO2OpsXBE
]

2. Antoine Simoulin, Benoit Crabbé. Un modèle Transformer Génératif Pré-entrainé pour le ______ français. Traitement Automatique des Langues Naturelles, 6.2021, Lille, France. pp.246-255. hal-03265900 https://hal.science/hal-03265900 - French GPTfr-124M and GPTfr-1B with the GPT3 architecture, 5.2021
3. https://huggingface.co/dbddv01/gpt2-french-small - another small French model, SMALL 137M, also trained in Colab like the Bulgarian one, but with the paid Colab Pro service.
4. Wissam Antoun, Fady Baly and Hazem Hajj, AraGPT2: Pre-Trained Transformer for Arabic Language Generation, 7.3.2021 https://arxiv.org/pdf/2012.15520 - Arabic, 4 variants: 135M, 370M, 792M, 1.46B
5. https://huggingface.co/datificate/gpt2-small-spanish - Spanish: SMALL 124M? 12.2020 (fine-tuned from the English one, uses techniques from the Portuguese one)
6. https://huggingface.co/pierreguillou/gpt2-small-portuguese/tree/main - Portuguese, SMALL, 124M?, 5.2020
7. https://github.com/stefan-it/german-gpt2 - German small: SMALL: 11.2020 - 8.2021 (the second version - retrained with better results, using dbmdz)
8. https://huggingface.co/dbmdz/german-gpt2/tree/main - 8.2021 - 10.2021, SMALL 124M?
9. GePpeTto Carves Italian into a Language Model, Lorenzo De Mattei, Michele Cafagna, Felice Dell'Orletta, Malvina Nissim, Marco Guerini, 29.4.2020 - Italian SMALL 117M
10. Chinese GPT2 SMALL-like models: https://huggingface.co/uer/gpt2-chinese-cluecorpussmall SMALL 11.20 – 5.21 https://huggingface.co/ckiplab/gpt2-base-chinese
11. https://huggingface.co/nikokons/gpt2-greek - Greek, small, SMALL, 9.2020
12. https://huggingface.co/macedonizer/sr-gpt2 - Serbian, small, SMALL, 25.7.2021
13. https://huggingface.co/readerbench/RoGPT2-medium - Romanian, 124M, 354M, 774M LARGE, 7.2021 https://huggingface.co/readerbench/RoGPT2-large/tree/main A very large corpus for that time, 17 GB, and thorough performance tests in the paper:
RoGPT2: Romanian GPT2 for Text Generation, M.Niculescu, S.Ruseti, M.Dascalu, 11.2021, University Politehnica of Bucharest, 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI)
https://www.researchgate.net/publication/357227566_RoGPT2_Romanian_GPT2_for_Text_Generation
14. A Japanese model of the GPT2-MEDIUM type with 336 M parameters, 24 layers, 1024-dimensional vectors. https://huggingface.co/rinna/japanese-gpt2-medium * https://github.com/rinnakk/japanese-pretrained-models The Japanese one is very well trained (development perplexity, ppl 18, trained for 45 days on 8xV100 32 GB on the Japanese Wikipedia etc.), file from 16.8.2021. https://huggingface.co/rinna/japanese-gpt-1b - a 1-billion-parameter model, 20.1.2022, 24 layers, 2048-dimensional vector.

15. Unsupervised Cross-lingual Representation Learning at Scale, Alexis Conneau, Kartikay Khandelwal, …, Veselin Stoyanov, 11.2019/4.2020 - XLM-R, a multilingual language model, also trained on a Bulgarian corpus. Veselin Stoyanov took part in the development.

[Updates in the edition from 17.5.2026: Chinese, Russian, Spanish]
16. CPM: A Large-scale Generative Chinese Pre-trained Language Model, Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun, 1.12.2020, https://arxiv.org/abs/2012.00413

17. Methods for Detoxification of Texts for the Russian Language Daryna Dementieva‡ , Daniil Moskovskiy‡ , Varvara Logacheva‡ , David Dale‡ , Olga Kozlova† , Nikita Semenov† , and Alexander Panchenko‡ ‡Skolkovo Institute of Science and Technology, Moscow, Russia †Mobile TeleSystems (MTS), Moscow, Russia {daryna.dementieva, daniil.moskovskiy, v.logacheva, d.dale, a.panchenko}@skoltech.ru {oskozlo9,nikita.semenov}@mts.ru,  19.5.2021 https://arxiv.org/pdf/2105.09052 https://github.com/ai-forever/ru-gpts

18. Spanish Language Models, Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Marc Pàmies, Joan Llop-Palao, Joaquín Silveira-Ocampo, Casimiro Pio Carrino, Aitor Gonzalez-Agirre, Carme Armentano-Oller, Carlos Rodriguez-Penagos, Marta Villegas; 15.7.2021 – 5.4.2022 (v1 to v5); the GPT models appear in v3 from 1.4.2022
https://arxiv.org/abs/2107.07253v3

* GPT2 was announced by OpenAI in 2.2019, but was not released for use because of concerns about possible misuse: generation of "fake news", etc. In 8.2019 they released the 774M model, and in 11.2019 the one twice as big. OpenGPT2, presented in 8.2019, was trained on the "OpenWebText" corpus; the cost of the cloud services was reportedly about 50 thousand dollars. https://en.wikipedia.org/wiki/GPT-2
* The sizes are approximate and may be inaccurate for models which did not use exactly the same architecture; some have a different number of tokens (32000), etc. Not only the number of parameters matters, but also the quality of the data, the method of training, etc. At that stage and scale all models were experimental and had scientific and educational purposes.
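
The note above about approximate sizes and a different number of tokens can be illustrated with a rough parameter count. The sketch below assumes the standard GPT-2 layout (learned positional embeddings, tied input/output embeddings, biases everywhere, a 4x-wide MLP, 50257 tokens and a 1024-token context by default); the function name `gpt2_param_count` is hypothetical, and the models in the tables may deviate from this layout.

```python
def gpt2_param_count(n_layer, d_model, vocab_size=50257, n_ctx=1024):
    """Count parameters of a GPT-2-style decoder (assumed standard layout)."""
    # Token and learned positional embeddings (output layer is tied, so not counted twice).
    embeddings = vocab_size * d_model + n_ctx * d_model
    # Attention: fused QKV projection plus output projection, with biases.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to 4*d_model and project back, with biases.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * 2 * d_model
    per_layer = attention + mlp + layer_norms
    final_layer_norm = 2 * d_model
    return embeddings + n_layer * per_layer + final_layer_norm

print(gpt2_param_count(12, 768))                     # 124439808  (SMALL)
print(gpt2_param_count(24, 1024))                    # 354823168  (MEDIUM)
print(gpt2_param_count(36, 1280))                    # 774030080  (LARGE)
print(gpt2_param_count(24, 1024, vocab_size=32000))  # 336128000
```

Under these assumptions, shrinking the vocabulary from 50257 to 32000 tokens takes a 24-layer, 1024-dimensional model from about 355M to about 336M parameters, so two "MEDIUM" models can legitimately be listed with different sizes.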




[Submitted on 1 Dec 2020]

CPM: A Large-scale Generative Chinese Pre-trained Language Model


during training: batch size = 3,072;  3M tokens .. (vs 1 M for GPT3 training)
strong few shot learning ...



* Others: later than GPT2-MEDIUM-BG (2021)


Spanish - MarIA GPT-2 -

https://arxiv.org/pdf/2107.07253v1 - only BERT-like model in July 2021

GPT2 models, up to Large 774M, appear in v3, 1.4.2022:

https://arxiv.org/abs/2107.07253v3 

...

German -  https://www.kkirchheim.de/blog/german-gpt/ -  "Training a German LLM from scratch"

"Existing German models available on Hugging Face have 137M parameters and a context length of 1024 tokens, which is quite limited compared to recently released models, such as those in the LLAMA family."

