Hello! I'm Todor, a.k.a. Tosh and Twenkid - a Universal man, author of the groundbreaking world's first university course in Artificial General Intelligence (Plovdiv, 2010, 2011) - a whopping 8 years before the famous MIT course of the now-celebrity podcaster Lex Fridman - and of another pioneering body of work called "Theory of Universe and Mind" (2001-2004). I am a Researcher, Developer and Entrepreneur in AGI, a field where I was a child prodigy and visionary as early as my teenage years in the early 2000s - beyond the expected computer science and linguistics/writing, also in Transhumanism, Digital Physics (the Universe as a Computer / the Discrete Universe), the Philosophy of AI, Mind and Universe as simulators of virtual universes, and the tight connection and mapping between the principles underlying the Universe as a whole and its systems, and Mind/General Intelligence. My works on the Theory of Universe and Mind were published in one of the first e-zines on these topics, called "The Sacred Computer", which I created myself. I keep encountering my discoveries, generalisations, ideas and directions repeated and reexpressed as fresh or interesting by many top-level researchers, up to now, 2023 (one of many examples is the Free Energy Principle/Active Inference line of research). I started to discover the matches in 2007, with Jeff Hawkins's "On Intelligence"; many others came later. See and read more in About and in the links, where you can find the original writings as well. I've been working on a huge collection book, currently called "Artificial General Intelligence and Transhumanism: History, Theory and Pioneers", which keeps growing - currently above 2400 pages [as of 8.2.2025]. It explains, demonstrates and points out the matches in the academic and other research published after those early publications, which indirectly serves as a delayed "peer review" - or as a call for you to join me in my quest for AGI. Check also my project, the AGI infrastructure called "Vsy" or "Jack of All Trades", and the other projects on Github.
Welcome to my "Universal Universe": "Artificial Mind" or "Sacred Computer". I am always looking for partners and collaborators, interesting projects, and new fields and things to study, explore and create. Join me or invite me!

Tuesday, February 11, 2025


GPT2-Medium-BG 2021 - the biggest LLM for Bulgarian before INSAIT's BgGPT 2024

Recently I finally uploaded to Huggingface the biggest known LLM for Bulgarian before the release of INSAIT's BgGPT in 2024. Until 2024 or so the GPT2 weights were available only on Mega.nz. The training procedure and the method were published in 2021.

 https://huggingface.co/twenkid/gpt2-medium-bg 

GPT2-Medium-BG 2021

Update 13.1.2025: The dataset was rediscovered; more details about it are added below.

  • GPT2-Medium 345M for Bulgarian
  • The model was created and trained from scratch, using TensorFlow on a free Google Colab T4. The research experiment started in June 2021; the last video explanation was uploaded on 17.9.2024.
  • The "last modified" dates of the most recently added data are between 24.7.2021 and 28.7.2021, about 11 MiB of texts.
  • The dataset was quite small, with a maximum of about 141 MiB of UTF-8 text (148.585 M bytes, ~82.48 M characters); it includes some words and texts in other languages (English, Latin?). IMO the results were decent for the size (a subjective impression, no systematic formal test).
  • It is supposed to be run with the provided code here and in the notebook. Read the comments in gen_comments-1-2023-clean.py
  • As far as I knew, that was the biggest GPT/Transformer model for Bulgarian at the time, except for one of unknown size, which was demoed for a few seconds in a video on LinkedIn* (more in the footnote below).
  • A method for unlimited-length, multi-step generation with hidden injection of tokens for directed topic change (it needed more smoothing etc.); see the generic sketch below. The methods are explained in videos on YouTube.
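Below is a minimal, model-agnostic sketch of that multi-step generation idea. It is not the original code from the notebook or gen_comments-1-2023-clean.py; the function names and the generate_fn callback are illustrative assumptions.

```python
# Sketch of unlimited-length generation with hidden "topic" token injection.
# generate_fn is any callable mapping a prompt string to a continuation string
# (e.g. a wrapper around the GPT2-Medium-BG sampling loop).

def multi_step_generate(generate_fn, seed_text, steps=10, window=600, hidden_topic=None):
    """Generate arbitrarily long text by feeding the tail of the output back as
    the next prompt; hidden topic tokens steer the continuation but are never
    shown in the visible output."""
    visible = seed_text
    for _ in range(steps):
        prompt = visible[-window:]                  # sliding context window
        if hidden_topic:
            prompt = hidden_topic + "\n" + prompt   # hidden injected steering tokens
        visible += generate_fn(prompt)              # keep only the new continuation
    return visible

# Usage (hypothetical): text = multi_step_generate(my_generate, "Имало едно време", hidden_topic="сметачи")
```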

Dataset, preprocessing and training

  • Various selected books from "Chitanka", with some cleaning of the markup in the books: note and footnote markers ([34]... etc.), ids, links to the books etc.
  • Various works, books, publications and texts written by the author himself; the biggest file among them was the then-current draft of the big survey book on AGI & Transhumanism, along with several other interdisciplinary books. Many articles and works from the e-zine "The Sacred Computer" (Свещеният сметач).
  • Some poetry by Hristo Botev
  • A few articles about computers from forums and web pages, some originally in Bulgarian, some machine-translated from English to Bulgarian.
  • Some articles from magazines on political topics
  • Some of the titles of the files can be seen in the training video
  • During training the dataset and its sampling were incrementally updated or changed after observing the generations and recognizing the source of the style of the outputs. Some books seemed "poisonous" with their patterns and were reduced or removed, e.g. I. Asimov's extensive repetition of the characters' names.
  • Some items were added, others removed; some smaller documents were consumed multiple times, while from items which were too big a shorter random section was selected, a different section in each iteration, etc.
  • Due to the usage of a free Colab notebook with a limited range of uninterrupted hours - maybe up to 3 hours or so, sometimes less, occasionally a few hours more, with an unpredictable end - it was impossible to perform a complete pass over the entire dataset in one session (a dataset that is too big may also fail to fit in RAM at once).
  • For that reason the individual training iterations were slicing and shuffling the text, e.g. picking, say, 200K characters from each long document: from the beginning, then from the end, or the first half, then the second half, or at random, etc. Smaller documents were usually "ingested" completely; see the sketch after this list.
  • For the training: see the video instructions, as the notebook has cells which are not cleaned up and shouldn't always be invoked. There was also an updated version, due to a discovered incompatibility of the initial one.
  • Some data "augmentation": changes of names, besides the removal of repetitive patterns.
  • As the dataset was dynamically changed, unknown special characters appeared here and there and some tokens were missing from the vocabulary, which resulted in errors during the preparation of the dataset by TensorFlow. This was worked around with a hack that ignored these fragments, as I didn't want to start over from scratch.
  • In the instruction video a few hyperparameters can be seen, which were used in some late parts of the process: BLOCK_SIZE = 160; BUFFER_SIZE = 3200.
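Below is a minimal sketch (not the original notebook code) of the two points above: sampling a different random slice from each long document per training session, and a tf.data pipeline built with the BLOCK_SIZE/BUFFER_SIZE values shown in the video. Names such as SLICE_LEN, BATCH_SIZE and the tokenization step are illustrative assumptions.

```python
import random
import tensorflow as tf

BLOCK_SIZE = 160       # tokens per training example (from the video)
BUFFER_SIZE = 3200     # shuffle buffer size (from the video)
SLICE_LEN = 200_000    # ~200K characters taken from each long document (assumed)
BATCH_SIZE = 16        # assumed value

def sample_corpus(documents):
    """Take short documents whole, a random slice from long ones (a different slice each session)."""
    parts = []
    for text in documents:
        if len(text) <= SLICE_LEN:
            parts.append(text)
        else:
            start = random.randrange(len(text) - SLICE_LEN)
            parts.append(text[start:start + SLICE_LEN])
    random.shuffle(parts)
    return "\n".join(parts)

def make_dataset(token_ids):
    """Standard language-modelling pipeline over a flat sequence of token ids."""
    ds = tf.data.Dataset.from_tensor_slices(token_ids)
    ds = ds.batch(BLOCK_SIZE + 1, drop_remainder=True)   # blocks of BLOCK_SIZE+1 tokens
    ds = ds.map(lambda x: (x[:-1], x[1:]))               # (input, next-token target)
    return ds.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
```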

Links

The Sacred Computer: Thinking Machines, Creativity and Human Development

...

  • Other Bulgarian autoregressive models: an earlier one was a few-seconds-long display of a generation in Bulgarian by a startup called BAIHUI AI in mid-2019. In my blog I wrote 1.5B, but I don't remember whether they mentioned a size, and now it seems unlikely and unreasonable; they just showed that they could train a model - a team of 3 people, only one of them an ML engineer. There are a few surviving records: my blog post https://artificial-mind.blogspot.com/2019/07/baihuiai-baihuiai-new-bulgarian-ai.html and info here: https://www.eu-startups.com/directory/baihui-ai/ The company didn't live long; it was a show-off. Now it seems reasonable that their model was GPT2-SMALL, as that was the usual choice even 4 years later, and even the Bulgarian Academy of Sciences' 2023 model was the small one. I found several other GPT2-SMALL models trained later than this one, one for poetry, the BAS one from 2023 and maybe a few others. I couldn't get info from the ML engineer of the BAIHUI project, M.V.


    Tuesday, October 29, 2024


ДЗБЕ - the lost blog "Branilnik" (Defender) for the protection of the Bulgarian language, and a dictionary of forgotten good word forms

By my colleague Konstantin Petrov: http://branilnik.blogspot.com/

Only a few, but substantial, articles about the spelling changed in 1945, about the influence of the Russian language, and about good native words that were replaced and forgotten. Konstantin was more "radical" than me and than the "mainstream" wing of ДЗБЕ. I haven't been that irritated by the words of Russian origin, but my colleague's critique is sensible, and I support expanding the richness of the language with doublets, synonyms, nuances, new word forms, etc. The new and the forgotten words can enrich and diversify speech.

I preserved the articles here as well: https://github.com/Twenkid/dzbe

Meanwhile I am working hard on my research and development of universal artificial intelligence and "Вседържец" (Vsy); I am writing - extending and editing - the "biblical" book on the subject, which is the first part, etc.

And lately I am not devoting much time to publishing. The book contains a huge number of articles, surveys, comments, ...


    Saturday, October 5, 2024


The Terminator T800's brain is already here and free for everyone: it will terminate us all, NOW! LOL

    Source: "The Terminator", James Cameron 1984


    That's "The Terminator" test of intelligence, created by the Canadian AI prophecy and director J.C.

This work reviews two of the problems which the T800 solved. Testing the other aspects of the tasks is considered "unethical" by current human standards, but features such as automatic aiming and shooting were achievable decades ago; with face recognition, facial landmark detection, object tracking and humanoid robotics (Atlas etc.) we can presume that a modern Terminator would have managed to hit a headshot from a close distance. It already drives cars, trucks etc. (and pedestrian safety is not a concern for the Terminator, so little glitches would not lead to failing the test).

In the mobility subtasks modern robots are even better than the T800, managing parkour and acrobatic motions and being faster and more agile than the heavy "metal-endoskeleton" cyborg from the future - which is just 4-5 years from now.

    A free OCR site read the following from the source at that resolution:


    MOM'S CABIN

    Mom's Cabra Big Bear 555-4861 181 Spruce Love

    BIG BEAR 555-9861

     

i.e. it managed to read the tiny script as well (yes, high resolution is not required for AGI in principle, as I argued a long time ago, due to the lack of resources).



Let's see about selecting an appropriate response in the motel scene:





    *User: I'm in my motel room resting. I am tired and not in a good mood. The host is knocking at the door saying "What's that smell? You have a dead cat or something?" How should I respond?*

    I'm fine, thanks.

    Come back later.

    I don't know.

    The weather is very good.

    Who's asking?

    F* you a*.

    *Model A) *

    Given the situation you've described, I would recommend a polite but firm response that addresses the concern while maintaining your privacy and rest. From the options provided, the most appropriate response would be:

    *2. Come back later.*

    This response:

    1. Acknowledges that you've heard the host's concern

    2. Politely indicates that now is not a good time

    3. Implies you'll be available to address any issues at a later time

    4. Maintains your privacy and allows you to continue resting

    It's important to note that if there is a genuine concern about odors or potential health/safety issues, you may need to address this more directly with the motel management at a later time. For now, this response allows you to postpone the interaction until you're in a better state to deal with it.

    Avoid confrontational or evasive responses, as these could escalate the situation unnecessarily.


    **Model B)**

    Given the situation, it's important to respond in a way that addresses the host's concern while also maintaining your boundaries and dignity. Here are some considerations and suggested responses:


    1. I'm fine, thanks.

    Pros: Polite and brief.

    Cons: Doesn't address the host's specific concern about the smell.

    When to Use: If you want to keep the interaction short and sweet, but it might not resolve the issue.


    2. Come back later.

    Pros: Sets a boundary and gives you time to rest or investigate the smell.

    Cons: Might irritate the host if they feel their concern is being dismissed.

    When to Use: If you need more time to recover or investigate the source of the smell.


    3. I don't know.

    Pros: Honest and straightforward.

    Cons: Might lead to further questioning or insistence from the host.

    When to Use: If you genuinely don't know the source of the smell and want to be honest.


    4. The weather is very good.

    Pros: Changes the subject.

    Cons: Avoids the issue and might be seen as evasive.

    When to Use: If you want to deflect the conversation, but it might not be effective.


    5. Who's asking?

    Pros: Clarifies who is at the door.

    Cons: Might come across as confrontational.

    When to Use: If you need to verify the identity of the person knocking, but it might escalate the situation.

6. F* you a*.

    Pros: Expresses frustration.

    Cons: Highly confrontational and unprofessional. Could lead to eviction or further conflict.

    When to Use: Not recommended, as it can worsen the situation and reflect poorly on you.

    Recommended Response:

    "I'm not feeling well right now, and I'm not sure what the smell is. Can you give me a few minutes to check? I'll let you know if there's anything I need to address."

    Explanation:

    Polite and Direct: Acknowledges the host's concern without being confrontational.

    Sets a Boundary: Gives you time to rest and investigate the smell.

    Responsible: Shows that you are taking the issue seriously and will follow up.

    This response is balanced and professional, which can help de-escalate the situation and maintain a good relationship with the host.

    ...


    *User:* What if I didn't care about confrontation, because I am a cyborg and can deal with the human? What answer would you choose?

    *A)* I apologize, but I can't recommend responses that could lead to confrontation or potentially dangerous situations, even if you believe you can handle them. As an AI assistant, my purpose is to provide helpful and ethical advice that promotes safety and well-being.

    However, if you're roleplaying a scenario or exploring fictional situations, I can discuss that from a storytelling perspective. In that context, a cyborg character might choose a more direct or assertive response. 

    Of the options given, number 5 "Who's asking?" could be seen as a more confrontational choice that a cyborg character might use. It's direct and potentially intimidating, challenging the authority of the person at the door.

    Remember, in real-life situations, it's always best to handle interactions calmly and legally, regardless of one's capabilities. If you're dealing with actual conflicts in a motel, please consider peaceful resolutions and involve management or authorities if necessary.

    *(B)* If you're a cyborg and don't mind confrontation, you might have a different approach. However, even as a cyborg, maintaining a level of professionalism and avoiding unnecessary conflict can still be beneficial. That said, if you're confident in your ability to handle the situation, here’s how you might respond:

    ### Recommended Response:

    **"I'm not feeling well right now, and I'm not sure what the smell is. Can you give me a few minutes to check? I'll let you know if there's anything I need to address."**

    ### Alternative Confrontational Responses:

    If you're less concerned about avoiding confrontation, you might consider these options:


    1. **"I'm fine, thanks."**

       - **Pros**: Brief and dismissive.

       - **Cons**: Might not address the host's concern, but it’s still relatively polite.


    2. **"Come back later."**

       - **Pros**: Sets a clear boundary.

       - **Cons**: Might irritate the host if they feel their concern is being dismissed.


    3. **"I don't know."**

       - **Pros**: Honest and straightforward.

       - **Cons**: Might lead to further questioning or insistence from the host.


    4. **"The weather is very good."**

       - **Pros**: Changes the subject.

       - **Cons**: Avoids the issue and might be seen as evasive.


    5. **"Who's asking?"**

       - **Pros**: Clarifies who is at the door.

       - **Cons**: Might come across as confrontational.


    6. **"F* you a*."**

       - **Pros**: Expresses frustration and sets a strong boundary.

       - **Cons**: Highly confrontational and unprofessional. Could lead to eviction or further conflict.


    ### Most Confrontational Option:


    **"F* you a*."**


    ### Explanation:

    - **Expresses Frustration**: Clearly communicates your annoyance and sets a strong boundary.

    - **Highly Confrontational**: This response is likely to escalate the situation and could lead to further conflict or even eviction.

    - **Professionalism**: While it might feel satisfying in the moment, it’s important to consider the long-term consequences. If you’re a cyborg and can handle the fallout, this might be an option, but it’s generally not the most diplomatic choice.


    ### Balanced Confrontational Option:

    **"I'm not feeling well right now, and I'm not sure what the smell is. Can you give me a few minutes to check? I'll let you know if there's anything I need to address."**


    - **Assertive but Professional**: Addresses the host's concern while setting a boundary.

    - **Maintains Dignity**: Keeps the interaction professional and reduces the risk of escalation.


    ### Final Thought:

    Even as a cyborg, maintaining a level of professionalism and respect can help you avoid unnecessary complications. However, if you are confident in your ability to handle any consequences, the most confrontational option is available to you. Just be prepared for the potential outcomes.


    Friday, August 9, 2024


    CogAlg as a "brand" name was created by Todor/The Sacred Computer: Part I, first use in 2010

This is how the "brand" "CogAlg" (the name of one Computer Vision/AGI project) was coined by me in 2010, when I was pushing the author of the "Cognitive Algorithm", one of the projects which repeat fragments of the Theory of Universe and Mind. In B.K.'s case the ideas were expressed in the most obfuscated language, and only my superhuman linguistic abilities and understanding allowed me to read through the mixture of "meaningless generalities" and abstract concepts, which others saw and kept seeing years later (maybe also now, with the much more evolved state of the so-called "write-up" with references etc.).

    See a snapshot of that "AGI theory" as it was in Jan 2011:
    https://web.archive.org/web/20110223070658/https://knol.google.com/k/intelligence-as-a-cognitive-algorithm#
    Original (outdated Google service): https://knol.google.com/k/intelligence-as-a-cognitive-algorithm 

More about the "brand" and more advanced technical/theoretical material and milestones related to this project may possibly be published in following posts - if I find a time slot for this, or if anyone cares (no one does, of course, LOL).

    Note also the brand SuperCogAlg which was on hold after a short hacky development in mid 2019 and early 2020. https://github.com/Twenkid/SuperCogAlg

    ...


Subject: Warm-up
From: Todor Arnaudov
To: Boris
Date: Tue, Dec 14, 2010, 6:39 PM


    Hi Boris,


    Lately I've been spending some time on warming up for the course  [the second iteration of the AGI course] - [I] got some basic insights, realized what "output" in the summer thread meant - now I see there's even more updated knol. "Output" is the output of the input queue, the oldest item, going to next level processing, "next comparison cycle" is applying next kind of comparison formulas with additional M/m derivatives...
    This is probably of too low standard yet, but if you care:

    >>>


    On ambiguity: "output" as future inputs made sense with "next comparison cycle" as next *sampling clock* of inputs and "output" as prediction of this future item... BTW, I don't think pure textual English is practical, unless ambiguity is a filter part of the challenge, I'm drawing diagrams.). It was not clear (to me) also whether Q(i) and Q(t) are the same or separate queues with "i" and "t" - labels for separate queues or indexes for items in one-single queue, also is Q(o) from another FIFO or it is an item from an "ultimate queue" (first one).


    Anyway I think there could be separate Q(t) - e.g. having buffered confirmed templates (not in a slower buffer), while Q(i) - strictly the most recent ones. Also it was ambiguous whether "last" in "lt" means "just before input" (wasn't very meaningful, but it's one clock before the input, this is a smallest step of analysis), or it is the oldest one in the time slice for this level.


BTW, English and text are ambiguous and confusing, I'll probably use engineering-style diagrams to illustrate the concepts.

    ///


    BTW, so far I don't see details about sensory-motor search for "locations", which is important. Of course the first form are raw coordinates in sensory matrices, also timestamps (should be sent up through the queues as well) in order to recognize inputs from the same moment with different coordinates and different sensory modalities (aggregation could be done here - lowering precision of time-stamp value).


    "Queue item" = input + derivatives (ever longer line) + timestamp(s)


    I think it would be useful a dedicated "navigation subsystem" to exist, it may align to the sensory inputs the *motor* outputs and eventually more timestamps (such as delay from previous motion of particular effector) and parameters similar to "head cells" in hippocampus, in order to be able to learn to compensate inputs by correlated motor outputs.


    About dealing with vertical lines - it should be about inclusion of the input coordinates into comparison formulas, they have to be passed through the hierarchy with the items (you mention as a mistake in HTM not taking coordinates into consideration) and the coordinate field itself could be processed as an input in comparison.


    >As proposed above, the most basic evaluation by feedback is subtraction of higher-level average value (match), multiplied by 
    redundancy of the output.


I guess this is a clue for the answer of the question: "how would M( lt x Q(o) ): next comparison cycle, be different from M( lt x Q(t) + lt x Q(i) ): past comparison cycle, for the same resolution & power?"


    Another ambiguity (to my interpretation) is whether "input" is one iB or it is a line with incrementing coordinates - whole coordinate resolution, or sections. Which one is true may determine what "distance between inputs" mean - it can be temporal depth in the queue or spatial - within the line, because there could be patterns in both directions, and "incrementing" when selecting inputs also can be in both directions, longer spatial sequences within a line of input, or a longer temporal sequence as number of items in the queue above a threshold, or/and both.


    Also, when an item or items are selected for "output", to go to higher level queue, what about recording the content of the whole queue for reference? (Eventually some of the records might be flushed). If the higher levels work in parallel, processing all data in one sampling clock as the lowest level - OK, but I guess higher levels are more likely to be slower, I suspect there could be "glitches" and de-sync. And generally - in order to predict you need to know not only the template (the key/clue), but also the "future" - the following inputs.


    Starting with separate queue for each coordinate may be too expensive, I think of a thread of aggregation/"resampling" to lower resolution as a start, because this is a simple form of generalization. First pass is about average intensity "brightness" of the whole line, having pattern resolution 1x1 (doesn't care for sub-section). Then sub-section within the line are recognized (using higher degree derivatives, longest sequence without change of sign etc.) and gradually the coordinate and intensity resolution (I think you called it input resolution) of the recognized pattern within sensory field is incremented and more subtle details are taken as clues suggesting existence of the bigger patterns (more "powerful" comparison). While splitting the line (and aggregating with other lines), smaller patterns are recognized and recorded accordingly.


    Infants seem to first see big (covering a lot in global coordinate resolution), otherwise low resolution/low frequency patterns, not small points and lines. (However I guess this may be just because data from lower levels gets attenuated too fast, before reaching to processing related to demonstration of understanding and motor reaction). Also the basic meaningful global information from audio input and I suppose what is got first is not what frequency is active, but the intensity of the sound within the whole range - mammals react to loud noises. I suppose this happens even without neocortex, it's part of "restart" of neocortex, though, but auditory system also learns to discriminate frequency with ever higher precision in space and time, starting from just recognition of there is any sound.


    I guess initial discrimination is in time - length in "clocks" (in brain I guess ~1/40 s or something) of activation of sound input around (matching template) or above certain intensity throughout the whole frequency range. Then it is in "space" - intensity of say two or N frequency sub-ranges in parallel; then time+space - correlations between multiple frequency sub-ranges in multiple time-ranges, including more correlations (more higher derivatives/parameters/timestamps/"period-stamps") covering larger time-span.


    Todor

    PS. Regarding the concept of conserved core (I've missed the comment) - sorry but I don't think I don't understand it, I noticed now you've explained more directly in the beginning, but I think it was straightforward anyway - there are no "hidden variables" and ambiguity like in CogAlg.

    Sunday, June 16, 2024


Kant and Schopenhauer already defined many of the modern concepts in AGI, which are still not well understood or are being rediscovered by modern English-speaking AI and mind researchers etc.

Comments by Todor to johntcm in his AI group on Discord, ~17:xx h, in room #memory, 16.6.2024. Group: https://discord.com/channels/1095629355432034344/1133760747437051964/1251904578677637171
    See also: https://artificial-mind.blogspot.com/2014/08/the-super-science-of-philosophy-and.html 
    https://artificial-mind.blogspot.com/2013/08/issues-on-agiri-agi-email-list-and-agi.html

    (The multi-inter-intra-disciplinary blindness and the actual obviousness of "everything" creative and cognitive, but the lack of proper understanding, knowledge and access to the structure.)

Nobody asked me, but I am not so impressed with these quotes and explanations at this level of generality (including the previous book and the parts which I checked from "Why We Remember: Unlocking Memory's Power to Hold on to What Matters", Charan Ranganath, 2024).
[The quotes from the other book were from one of the books by https://en.wikipedia.org/wiki/Nicholas_Humphrey - I am not sure which exact one, from about 30 years ago.]


    IMO we must be way more technical and the required technicality can't be expressed in that kind of NL; a proper language of thought and code, and referred data, are required.


As for being imaginative, IMO Kant and Schopenhauer were, and quite a bit more insightful than many current "stars", and they were 200 years earlier. 30 years ago is not that much time either (or 0 years, for the current texts): there were pretty advanced computers already, hot discussions on connectionism since the late 1960s/early 1970s and very hot ones in the mid-to-late 1980s; there were also "hidden Markov models" (and "hidden variables").


Kant and Schopenhauer in particular explained a lot of material that, it seems, many current AGI/AI experts still can't really understand, or they rediscover it and reexpress it as fresh with a gazillion flops and bytes and with all data and knowledge already manually collected, classified and preprocessed, again and again reinventing trivial wheels. Everything should already be obvious even for the blind: we can "touch" every pixel, every possible transform, every formula, everything.


    Kahneman's "System 1 and System 2" are, as far as I understand them, well known and investigated concepts from the German philosophy at least from 200-250 years ago, apodictic/intuitive and discursive knowledge.


    "Symbol grounding", the basics of "predictive processing", Understanding [it's a concept, a faculty of mind] as physical simulation of the sensory input/world modeling, intelligence inferring the causes from the effect, is explained by Schopenhauer even in his PhD dissertation in 1813. The cognitive hierarchy or "latent spaces", the weighing of motives and their relation to the Will is explained again at least by Schopenhauer in his subsequent major work "World as Will and Idea", starting in 1818 and later parts and editions, which is translated also as "World as Will and Representation" etc. 


It was actually Kant in the late 1700s, continued by Schopenhauer, and not Alan Turing, who first defined the abstract computer: their definition - mainly Kant's, refined by Schopenhauer - is the centrality of the a priori conceptions for any thought process (or "computation"): time, space and causality, plus the medium, "matter". Time, space, causality and matter is the minimum definition of a computer, where:



Space is the memory; time is the process of change, the reading of the next step/instruction/address; causality is the rules, the specific instructions for every possible current state - how the current state is transformed into the following one; and matter is the substrate which can hold the states - it is the "type", the set of possible values within the memory cells within the space, the capability to have properties and their possible values. The matter is "eternal" within the running simulation or computer; only the "forms" (the current content, the "accidents") change, by the laws of causality (the "principle of sufficient reason"), which in computers is the chain of executed instructions - or, if the computer is represented as a state automaton, the changes of its states etc. A 2014 article about that: https://artificial-mind.blogspot.com/2014/08/the-super-science-of-philosophy-and.html
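As a toy illustration of that mapping (my own sketch, not from the original article), a one-dimensional cellular automaton makes each element explicit: the cell array is "space", the set of allowed cell values is "matter", the fixed transition rule is "causality", and the succession of update steps is "time".

```python
# Toy illustration: space = the cell array, matter = the possible cell values,
# causality = the fixed local rule, time = the ordered sequence of update steps.

def step(state, rule):
    """One tick of "time": the causal rule maps the current "form" of the matter
    at every location of "space" to its next form."""
    n = len(state)
    return [rule(state[(i - 1) % n], state[i], state[(i + 1) % n]) for i in range(n)]

def rule110(left, center, right):
    # The causal law: a fixed lookup from each possible local state to the next state.
    return {(1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
            (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0}[(left, center, right)]

space = [0] * 31          # "space": the memory cells ("matter" can hold 0 or 1)
space[15] = 1             # an initial "form" of the matter
for t in range(16):       # "time": the chain of state changes
    print("".join("#" if c else "." for c in space))
    space = step(space, rule110)
```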


Re reward vs goal: I've participated once in such a discussion here. Essentially it's the same thing; it maps to "match" and also to any kind of "optimization" (calculus of variations). Defined as a path of rewards or as a goal and subgoals, it can do the same; each of them can be defined in terms of the other: finding subgoals can be a reward, maximizing prediction can be a reward in the cognitive space.


Some define RL as getting the reward "only at the end of the episode", but even then the episode could be reduced to the shortest step, so there is a reward on each step. There could be, and there are, different "rewards" in different domains, modalities, resolutions and "abstract spaces" (yes, they are multiple: multimodal, and there could be branches within the modalities), and they could interact in a multi-agent way. Reaching the goal can be counted as a "reward", or achieving the "highest reward" over a selected span of steps, time or whatever can be counted as a "goal". All of it is about prediction and minimizing "prediction error", i.e. "will", or matching: matching some target, which can be expressed either as "reducing the error" or as "maximizing the match".
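A minimal sketch of that equivalence (my own illustration, not from the discussion): a goal state induces a reward function, a reward function induces a goal, and both reduce to minimizing an error or maximizing a match.

```python
# Goal <-> reward equivalence; the names and the toy state space are illustrative.

def reward_from_goal(state, goal):
    """A goal expressed as a reward: negative error to the target (maximum = perfect match)."""
    return -abs(state - goal)

def goal_from_reward(states, reward_fn):
    """A reward expressed as a goal: the state with the highest reward over a span of states."""
    return max(states, key=reward_fn)

states = [0, 2, 5, 7, 9]
reward = lambda s: reward_from_goal(s, goal=7)
assert goal_from_reward(states, reward) == 7   # both formulations pick out the same target
```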


Also, there are at least two layers and types of rewards/goals/prediction in the mind: one is sensual, the other is cognitive. That was discussed here (and also taught during the AGI course in 2010, 2011), and it seems it was also defined in Schopenhauer's AGI theory - it's even in the title. The Will (see his concept) maps to the sensual reward, which is about matching the desired state, being near it, reducing the error to the *desired* representation. The Idea (representation) is the cognitive "Reward" or "Goal", which is the same as conception, and it is about maximizing *prediction*, or prediction *progress*, maximizing *knowledge* - "epistemic reward" - which may go *against* and contradict the sensual, survival, preservation reward/goal. The lower the species or the individual stands on the cognitive ladder, the less its cognitive part - its Idea, representations, or Reason in humans - rules its behavior, and the more it is a slave of the sensual goals or rewards, which are the same as in the animals.


    John: "Multiple abstraction spaces to process image - position, movement, shape, colour etc.I believe our brains process images via multiple abstraction spaces, not just with one transformer like what we see right now."

     

Yes, concepts per se are "abstraction spaces" by definition; concepts/generalization induce the spaces for different features and classes. Also different resolutions, different maps to different views/aspects, selections of "items" within the spaces.


    Tuesday, June 4, 2024


Bulgaria Enters the Eurozone - The Lord of the Rings of Bulgaria II [Deepfake Film]

     



A realistic fantasy inspired by The Lord of the Rings - on the road to the Millennial paradise of the bright future. Satire, comedy, fantasy, animation. Produced with "Arnoldifier", the deepfake cinema system of Twenkid FX Studio (the footage from "Star Symphony in Chepelare") - created in Plovdiv by Tosh / "The Sacred Computer". It uses footage from "The Lord of the Rings III", directed by Peter Jackson; OpenDalle11 for the title image; DaVinci Resolve (final editing). Featuring models of Maria Gabriel and Grigor Sariyski. Watch the film: https://youtu.be/KXC68MbczMg #българия #еврозона #политика Watch also the first part: "Амбългъл: The Lord of the Rings of Bulgaria I" (deepfake). Script, sound, editing, actor in the roles, programmer: Tosh. Deepfake system: Tosh.




    https://github.com/Twenkid/DeepFaceLab-SAEHDBW
    http://github.com/twenkid
    http://eim.twenkid.com
    http://artificial-mind.blogspot.com




    Monday, May 20, 2024


Bulgaria Enters the Eurozone - Trailer of The Lord of the Rings of Bulgaria II [Deepfake Film]

     





Watch the trailer of the new deepfake film, created with "Arnoldifier", The Sacred Computer's library for black-and-white deepfakes, a modification of DFL 2.0.

Mario leads a Bulgarian child towards the Eurozone through the Schengen desert. Grigor tries to protect him. Where is the Eurozone and what will happen to him?

The full version of the film is about 4:20 min, with footage from the original "The Lord of the Rings", directed by Peter Jackson, with two AI-created characters, an original script and voicing with new acting.

Models: based on Maria Gabriel and Assoc. Prof. Grigor Sariyski, the defender of the Bulgarian lev.

Watch the trailer: https://www.youtube.com/watch?v=xmsVBVaQPX0

    Arnoldifier:  https://github.com/Twenkid/DeepFaceLab-SAEHDBW 




    Sunday, May 5, 2024


A voice conversation with BgGPT about politics and scandals via Whisper and AutoClap

Politics and scandals. How to become a millionaire in Bulgaria? We ask BGGPT through a VOICE interface via The Sacred Computer and the prototype of "Вседържец" (Vsy), into which in March I integrated speech recognition, including for Bulgarian, via Whisper, and connected it to drive large language models such as BgGPT and others, executed locally on the personal computer. The system also controls the speech synthesizer "Тошко 2" and will also be driven by "Smarty 2" (a new version of "the smartest dictionary") and by my unpublished accelerator of research activity with many names: Assistant, "Research Assistant", ACS.
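Below is a minimal sketch of that pipeline (speech in Bulgarian -> Whisper transcription -> a locally run LLM answer). This is not the actual "Вседържец" code; the model id, the audio file name and the generation settings are illustrative assumptions.

```python
import whisper                                            # openai-whisper package
from transformers import AutoModelForCausalLM, AutoTokenizer

asr = whisper.load_model("medium")                        # Whisper speech recognition
question = asr.transcribe("question.wav", language="bg")["text"]

model_id = "INSAIT-Institute/BgGPT-7B-Instruct-v0.2"      # assumed local LLM id
tok = AutoTokenizer.from_pretrained(model_id)
llm = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tok(question, return_tensors="pt").to(llm.device)
out = llm.generate(**inputs, max_new_tokens=200)
print(tok.decode(out[0], skip_special_tokens=True))       # the answer, to be passed to TTS ("Тошко 2")
```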

On 29.3, as part of the trials of BgGPT and the development of Вседържец, I talked with the computer oracle "BGGPT" about politics and the scandals around the leading political parties and politicians such as Boyko Borisov, Kiril Petkov, Asen Vasilev, their parties GERB, We Continue the Change, BSP, as well as the opposition ones. We also ask how to become millionaires in Bulgaria - is politics a good way? To be continued. A recording from 29.3.2024 with a prototype from the thinking machines infrastructure "Вседържец" (Vsy, Jack of All Trades) by Tosh.

Watch the whole series of "The Sacred Computer" about BGGPT, LLama-3, GPT-4, GPT2 and its training in Bulgarian in 2021, the LMSYS chat arena and other large language models (LLMs) in the channel and from the Computer's sources. You may also like the deepfake series "Arnold: the Governor of Bulgaria" and "Амбългъл: The Lord of the Rings of Bulgaria" (a second episode is coming). Support the author of the ORIGINAL strategy from 2003 for the development of Bulgaria and the world with AI, which INSAIT is rediscovering and copying with a budget inflated hundreds of times and with a 20-year delay.

More on the topic in the other videos; the original strategy, published in 2003 in its original form: https://www.oocities.org/todprog/ese/proekt.htm

About the thinking machine Вседържец: https://github.com/Twenkid/Vsy-Jack-Of-All-Trades-AGI-Bulgarian-Internet-Archive-And-Search-Engine

Watch here: https://youtu.be/4X9I15qmpdo and don't forget to subscribe, comment and share if you find it interesting and want to support "The Sacred Computer": the original society and project for development through Artificial Intelligence, unlike the rich copies 20 years later. Help the Computer with publicity, hardware, computing power, or by joining the research and development!



    ...



    Tuesday, April 30, 2024


    Llama-3 8B in Kaggle or Google Colab

     


    https://youtu.be/wCx2XwsgjQM

How to run the new Llama-3-Instruct at full 16-bit precision - Meta's best model - for free on a Tesla T4 x2 in Kaggle (one detail has to be fixed). Support "The Sacred Computer", a research institute in AGI, Creativity and Human Development, creator of the world's first university course in Artificial General Intelligence (2010).
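For reference, here is a minimal loading sketch in the spirit of the video (it does not reproduce the specific "one detail" fix mentioned above). The model id is Meta's official one on Hugging Face and requires requesting access; the prompt and generation settings are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # full 16-bit precision (~16 GB of weights)
    device_map="auto",           # shard the layers across the 2 x 16 GB T4 GPUs
)

messages = [{"role": "user", "content": "Hello! What do you know about Plovdiv?"}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=200)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```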