Some possible explanations of mine. Bulgarian version: once I get to translating it... :)
Babies can distinguish between animate and inanimate objects early on, etc...
Todor: They can, because the difference is easy to extract from the raw sensory inputs:
Inanimate objects are:
- Static. E.g. places, which seem to drive "place cells" in the mammalian brain, are simpler stimuli than those caused by "animate" objects.
- Dynamic, but the mind can predict their future with a precision that it believes is "high enough". E.g. when one throws a ball, one doesn't know exactly where it will fall, but the ball wouldn't fly at all if an animate object hadn't thrown it.
- They change in parallel with the will, in step with the motor commands.
- Linear (generally).
- Places are immovable inanimate objects.
An object is a set of correlated stimuli.
Animate objects are:
- Unpredictable: their future cannot be predicted with high enough precision, and their behavior cannot be predicted precisely enough from observations alone. Sometimes they are totally unpredictable.
- They can start to do a sequence of actions on their own, a behavior.
- They can react to your action, without being physically affected/touched.
- The mind can assume that they have internal states which are unobservable. This is the animate object's "free will" or its "state of mind", opinion etc., as opposed to predictable entities, whose state is observable.
- Non-linear.
- It's hard to synchronize your will with the will of another "animate object", or to do so you need a complex sequence of voluntary actions.
Here "high enough" precision is a threshold that the mind itself decides.
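The prediction criterion above can be sketched in a few lines. This is only an illustration under my own assumptions, not a model of what the brain actually computes: the observer is reduced to a constant-velocity (linear) predictor over a 1D track, and the names `mean_prediction_error`, `classify` and `tolerance` are hypothetical. The `tolerance` parameter plays the role of the "high enough" precision that the mind decides.

```python
from typing import Sequence


def mean_prediction_error(track: Sequence[float]) -> float:
    """Mean absolute error of a constant-velocity (linear) predictor.

    Each position is predicted from the two previous ones by linear
    extrapolation; this stands in for the observer's expectation.
    """
    if len(track) < 3:
        raise ValueError("need at least three observations")
    errors = []
    for t in range(2, len(track)):
        predicted = 2 * track[t - 1] - track[t - 2]  # linear extrapolation
        errors.append(abs(track[t] - predicted))
    return sum(errors) / len(errors)


def classify(track: Sequence[float], tolerance: float = 0.5) -> str:
    """Label a track "inanimate" if it is predictable within `tolerance`.

    `tolerance` is an assumed stand-in for the "high enough" precision.
    """
    if mean_prediction_error(track) <= tolerance:
        return "inanimate"
    return "animate"


# Made-up example tracks (positions sampled at equal time steps).
wall = [2.0, 2.0, 2.0, 2.0, 2.0]                 # static
rolling_ball = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]    # dynamic but linear
thrown_ball = [0.0, 4.0, 6.0, 6.0, 4.0, 0.0]     # smooth arc, small steady error
wandering_cat = [0.0, 3.0, -1.0, 4.0, 0.0, 6.0]  # erratic

for name, track in [("wall", wall), ("rolling_ball", rolling_ball),
                    ("thrown_ball", thrown_ball), ("wandering_cat", wandering_cat)]:
    print(name, classify(track), classify(track, tolerance=2.5))
```

Note that the thrown ball changes label depending on `tolerance`: with a strict threshold the linear predictor is "surprised" by the arc, with a looser one the arc counts as predictable enough. That is the sense in which the threshold is something the mind sets rather than a property of the object.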
Recognition of faces
A baby can easily spot that the face is the entity that changes the most in its first perceptions, especially the eyes and mouth. I think the eyes, two spots that move in the same direction at the same speed, can show a developing mind that they are correlated. The same goes for the nose and mouth, which all translate in parallel. Entities which are correlated can be grouped together: this is "a face", i.e. a set of dynamic correlated elements.
Actually, this may be one of the first coherent and stable sets of visual correlations/patterns that the mind understands, and it may serve as a basis for later ones, which may explain the existence of "face cells" or the like. Faces are seen everywhere afterwards, and we tend to see faces in inanimate objects.
Besides, these stimuli are conditioned early, and then further and strongly, with the baby's own feelings and behavior and with other agents' behaviors - these functions are exercised a lot.
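The grouping of elements that move together can be sketched as follows. This is a toy illustration under assumptions of mine, not a claim about how the visual system does it: each feature is reduced to a list of made-up horizontal displacements per frame, similarity is plain Pearson correlation, and the names `pearson`, `group_by_common_motion` and `min_corr` are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 if either sequence is constant (no motion to correlate).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def group_by_common_motion(motions: Dict[str, Sequence[float]],
                           min_corr: float = 0.9) -> List[List[str]]:
    """Greedily group features whose motions are mutually correlated.

    A feature joins the first group in which it correlates at least
    `min_corr` with every member; otherwise it starts a new group.
    """
    groups: List[List[str]] = []
    for name, motion in motions.items():
        for group in groups:
            if all(pearson(motion, motions[other]) >= min_corr for other in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Made-up per-frame horizontal displacements of four tracked spots.
motions = {
    "left_eye":   [1.0, 2.0, -1.0, 0.5, -2.0],
    "right_eye":  [1.0, 2.0, -1.0, 0.5, -2.0],
    "mouth":      [1.1, 1.9, -0.9, 0.6, -2.1],
    "mobile_toy": [-0.5, 0.3, 1.0, -1.2, 0.4],
}

print(group_by_common_motion(motions))
```

With this data the eyes and mouth end up in one group (the "face") and the toy in another. The greedy grouping and the 0.9 threshold are arbitrary choices for the sketch; any clustering over the correlation matrix would illustrate the same point.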
Why does a baby's cry irritate us?
Nativists: "Evolutionarily encoded" etc., to call the mother to deal with it...
Actually, a very simple explanation is that it reminds us of our own cry, and these sound patterns were conditioned with our own early unpleasant experiences.
Why does "mama" mean mother in so many languages?
This is also a simple one to explain - my theory is that "mama" requires one of the simplest possible articulations, if not the simplest: the mouth just opens and closes twice, plus breathing.* The sound is recognizable even without breathing, and even dogs can be trained to say "mama".
I think it is important that there are two open-close operations, because the second may serve as a confirmation of the first syllable.
So I believe that babies themselves coined the word for their mothers, and then their parents started to use it too.
*BTW, I've been a speech synthesis designer and developer and have some projects, but not the time....
- head direction cells
- spatial view cells
- place cells
At least several, or even all, of these can be generalized. Places and navigation go together. Places are long-term memories of static immovable inanimate objects (the agent has no experience of these entities moving).
Navigation, head direction, spatial view and place cells are all sets of correlations found between motor and sensory information and long-term memories, which are invoked by the ongoing motor and sensory patterns.
The static immovable inanimate objects (places) change (they translate, rotate etc.) most rapidly in correlation with head direction (position) and head movements.
Navigation and spatial view are derived from all of these.
This one needs work, but in short I would say that:
Language is a hierarchical redirection/abstraction/generalization/compression of sequences of sensory inputs and motor outputs, and of records and predictions for both.
Chimps can communicate using sign language; see e.g. Washoe: http://en.wikipedia.org/wiki/Washoe_(chimpanzee)
Mirror neurons were found experimentally in macaque monkeys (it's well known that monkeys tend to imitate people). Monkeys can imitate facial expressions and hand operations such as picking.
There is famous research by Meltzoff showing that newborn human babies, a few weeks old, are capable of imitating some facial expressions, like sticking out the tongue. They have never seen themselves, and other experiments show that babies at that age are not capable of recognizing different faces, or of telling a face without a nose or mouth from a complete one, or 2D from 3D, etc. (a baby smiles when it sees fake faces; Fantz, R., 1966, via R. Stamatov, "Child Psychology").
This is interesting; I don't know how far the experiments have gone in showing babies stimuli that are similar to faces.
This capability may require inborn links between basic image analysis (to find contrast/color changes etc.) and motor commands to the facial and tongue muscles. I suspect that part of this may be done outside the neocortex, and it can be of a very low resolution. Or there may be some pre-wired mirror neurons in the neocortex related to this.
Other research states that a one-month-old baby's vision has about 1/30 of an adult's resolution, which grows rapidly to about 1/4 at 8 months.
Regarding the imitation of manual operations by monkeys, though, I believe that this can be learned by mapping similarities. Human arms and hands are visually similar to a monkey's: they are "sticks" and "planes" which translate, rotate etc.
Unlike with the face, a monkey can see its own hands and other hands and can build a mapping between the two. I don't know how it's done, but I think it can be done even without a pre-wired map.
I would quote my own teenage Theory of Mind and Universe (in Bulgarian only so far: http://eim.hit.bg/razum), where I concluded that the mind is a universal hierarchical simulator and predictor of virtual universes, where these virtual universes are derived from sensory inputs.
Google keywords and expressions: "mirror neurons", "Meltzoff", "chimps sign language" etc....
Keywords: Artificial General Intelligence, Seed AI, Development, Cognition, Developmental psychology, Neuroscience