The Development of Embodied
Cognition: Six Lessons from Babies
Linda Smith
Psychology Department
Indiana University
Bloomington, IN 47405
Michael Gasser
Computer Science Department
Indiana University
Bloomington, IN 47405
Keywords
Development, cognition, language,
embodiment, motor control
Abstract The embodiment hypothesis is the idea that intelligence
emerges in the interaction of an agent with an environment and as a
result of sensorimotor activity. We offer six lessons for developing
embodied intelligent agents suggested by research in developmental
psychology. We argue that starting as a baby grounded in a physical,
social, and linguistic world is crucial to the development of the flexible
and inventive intelligence that characterizes humankind.
1 Introduction
Traditional theories of intelligence concentrated on symbolic reasoning, paying little attention to the
body and to the ways intelligence is affected by and affects the physical world. More recently, there
has been a shift toward ideas of embodiment. The central idea behind the embodiment hypothesis is
that intelligence emerges in the interaction of an agent with an environment and as a result of
sensorimotor activity. This view stands in opposition to more traditional notions of internal
representation and computation and in general has had little to say about symbols, symbolic
reasoning, and language. In this article we offer six lessons for developing embodied intelligent agents
suggested by research in developmental psychology. We argue that starting as a baby grounded in a
physical, social, and linguistic world is crucial to the development of the flexible and inventive
intelligence that characterizes humankind. In preview, the six lessons are these:
1. Babies’ experience of the world is profoundly multimodal. We propose that multiple overlapping
and time-locked sensory systems enable the developing system to educate itself without defined
external tasks or teachers just by perceiving and acting in the world.
2. Babies develop incrementally, and they are not smart at the start. We propose that their initial
prematurity and the particular path they take to development are crucial to their eventual outpacing
of the world’s smartest AI programs.
3. Babies live in a physical world, full of rich regularities that organize perception, action, and
ultimately thought. The intelligence of babies resides not just inside themselves but is distributed
across their interactions and experiences in the physical world. The physical world serves to
bootstrap higher mental functions.
4. Babies explore: they move and act in highly variable and playful ways that are not goal-oriented
and are seemingly random. In doing so, they discover new problems and new solutions.
Exploration makes intelligence open-ended and inventive.
© 2005 Massachusetts Institute of Technology Artificial Life 11: 13–29 (2005)
5. Babies act and learn in a social world in which more mature partners guide learning and add
supporting structures to that learning.
6. Babies learn a language, a shared communicative system that is symbolic. And this changes
everything, enabling children to form even higher-level and more abstract distinctions.
These ideas are not without precedent in the robotics and artificial intelligence literature. For
example, Brooks et al. [6] and Pfeifer and Scheier [33] have demonstrated how solutions may fall
out of physical embodiment. Breazeal [5] and the Kismet project are beginning to explore how social
interactions can bootstrap learning. And exploration is a key idea in machine learning, especially
reinforcement learning [50]. The lessons to be learned from human development, however, are not
fully mined. Greater interaction between those who study developmental processes in children and
those who attempt to create artificial devices that develop through their interactions in the world
would be beneficial to both sets of researchers. Accordingly, we offer these lessons from the
perspective of researchers who study how babies become smart.
2 Six Lessons
2.1 Lesson 1: Be Multimodal
People make contact with the physical world through a vast array of sensory systems: vision,
audition, touch, smell, proprioception, balance. Why so many? The answer lies in the concept of
degeneracy [15]. The notion of degeneracy in neural structure means that any single function can be
carried out by more than one configuration of neural signals and that different neural clusters also
participate in a number of different functions. Degeneracy creates redundancy such that the system
functions even with the loss of one component. For example, because we encounter space through
sight, sound, movement, touch, and even smell, we can know space even if we lack one modality.
Being blind, for example, does not wipe out spatial concepts; instead, as studies of blind children
show [25], comparable spatial concepts can be developed through different clusters of modalities.
Degeneracy also means that sensory systems can educate each other, without an external teacher.
Careful observers of infants have long noted that they spend literally hours watching their own
actions [34, 7]: holding their hands in front of their faces, watching as they turn them back and
forth, and, some months later, intently watching as they squeeze and release a cloth. This second
characteristic of multimodality is what Edelman [15] calls reentry, the explicit interrelating of multiple
simultaneous representations across modalities. For example, when a person experiences an apple
and immediately characterizes it as such, the experience is visual, but also invokes the smell of the
apple, its taste, its feel, its heft, and a constellation of sensations and movements associated with
various actions on the apple. Importantly, these multimodal experiences are time-locked and
correlated.
Changes in the way the hand feels when it moves the apple are time-locked with the changes one
sees as the apple is moved. The time-locked correlations create a powerful learning mechanism, as
illustrated in Figure 1, which shows four related mappings. One map is between the physical
properties of the apple and the neuronal activity in the visual system. Another map is between the
physical properties of the apple and neuronal activity in the haptic system. The third and fourth
maps are what Edelman calls the reentrant maps: Activity in the visual system is mapped to the
haptic system, and activity in the haptic system is mapped to the visual system. Thus the two
independent mappings of the stimulus (the sight and the feel) provide qualitatively different
glosses on the world, and by being correlated in real time, they educate each other. At the same time,
the visual system is activated by time-varying changes in shading and texture and collinear movement
of points on the apple, and the haptic system is activated by time-locked changes in pressures and
textures. At every step in real time, the activities in each of these heterogeneous processes are
mapped to each other, enabling the system in its own activity to discover higher-order regularities
that transcend particular modalities.
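The idea that time-locked modalities can teach each other without an external teacher can be illustrated with a toy sketch. This is not Edelman's model or the authors' proposal; it is a minimal co-occurrence (Hebbian-style) learner written under loud assumptions: three hypothetical objects, one discrete "code" per object in each modality, and independent sensor noise. The names `sense`, `learn_reentrant_maps`, `strongest`, and the `haptic_code` table are invented for illustration.

```python
import random

def sense(true_code, n_codes, noise, rng):
    """Return the true sensory code, or a random code with probability `noise`."""
    if rng.random() < noise:
        return rng.randrange(n_codes)
    return true_code

def learn_reentrant_maps(haptic_code=(2, 0, 1), n_steps=600, noise=0.1, seed=0):
    """Build vision->touch and touch->vision maps from time-locked experience.

    Assumption: object i produces visual code i and haptic code haptic_code[i].
    No labels are given; only simultaneous activity in the two modalities.
    """
    rng = random.Random(seed)
    n = len(haptic_code)
    v_to_h = [[0] * n for _ in range(n)]  # counts: visual code -> haptic code
    h_to_v = [[0] * n for _ in range(n)]  # counts: haptic code -> visual code
    for _ in range(n_steps):
        obj = rng.randrange(n)                    # an object is encountered
        seen = sense(obj, n, noise, rng)          # what vision reports
        felt = sense(haptic_code[obj], n, noise, rng)  # what touch reports
        # Time-locked co-activation strengthens both reentrant maps.
        v_to_h[seen][felt] += 1
        h_to_v[felt][seen] += 1
    return v_to_h, h_to_v

def strongest(row):
    """Index of the most strongly associated code in one row of a map."""
    return max(range(len(row)), key=lambda j: row[j])

v_to_h, h_to_v = learn_reentrant_maps()
# After learning, seeing object 0 alone predicts how it should feel.
predicted_feel = strongest(v_to_h[0])
```

In this sketch each modality ends up able to predict the other from its own activity alone, which is the sense in which the two maps "educate each other"; real sensory systems are of course continuous and far richer than three discrete codes.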
Artificial Life Volume 11, Number 1–2
One clear demonstration of the power of this idea comes from a study of how babies come to
understand transparency. Transparency is a problematic concept; think of birds who harm
themselves by trying to fly through windows. Transparency is a problem because correlations
between visual cues and the haptic cues that characterize most of our encounters with the world do
not work in this case. So babies, like birds, are confused by transparency. In one study, Diamond [13]
presented infants with toys hidden under boxes such that there was an opening on one side, as
illustrated in Figure 2. These boxes were either opaque (hiding the toy) or transparent so that the
infants could see the toy under the box. The key result is that 9-month-old infants are better able to
retrieve the toy from the opaque than from the transparent container. The problem with the
transparent container is that infants attempt to reach for the toy directly, through the transparent
surface, rather than searching for and finding the opening.
Infants readily solve this problem, however, if they are given experience with transparent
containers. Titzer, Thelen, and Smith [55] gave 8-month-old babies a set of either opaque or
transparent buckets to play with at home. Parents were given no instructions other than to put these
containers in the toy box, making them available to the infants during play. The infants were then
tested in Diamond’s task when they were 9 months old. The babies who had been given opaque
containers failed to retrieve objects from transparent ones just as in the original Diamond study.
However, infants who had played with the transparent containers sought out and rapidly found the
openings and retrieved the object from the transparent boxes.
Why? These babies, in their play with the containers (in the interrelation of seeing and
touching), had learned to recognize the subtle visual cues that distinguish solid transparent surfaces
from no surface whatsoever and had learned that surfaces with the visual properties of transparency
are solid. The haptic cues from touching the transparent surfaces educated vision, and vision
educated reaching and touch, enabling infants to find the openings in transparent containers. These
results show how infants’ multimodal experiences in the world create knowledge about openings,
object retrieval, and transparent surfaces.
Figure 1. Illustration of the time-locked mappings of two sensory systems to the events in the world and to each other.
Because visual and haptic systems actively collect information (by moving hands, by moving eyes), the arrows
connecting these systems to each other also can serve as teaching signals for each other.
Figure 2. A toy (ball) hidden under a transparent box and an opaque box in the Diamond task. The opening is indicated
by the arrow.
Recent experimental studies of human cognition suggest that many concepts and processes may
be inherently multimodal in ways that fit well with Edelman’s idea of reentrance [3, 16, 24]. One line
of evidence for this conclusion is that even in tasks meant to be explicitly unimodal, multiple
modalities contribute to performance. For example, visual object recognition appears to automatically
activate the actions associated with the object. In one study, adults were shown a picture of a water
pitcher such as that illustrated in Figure 3. The task was simple: to indicate by pressing a button
whether the object was a pitcher (‘‘yes’’) or it was not (‘‘no’’). Response time was the dependent
measure. This is a purely visual object recognition task. Yet the participants were much faster at
recognizing the object if the button pressed to indicate the ‘‘yes’’ response was on the same side as
the pitcher’s handle, as if seeing the handle primed (and readied) the motor response of reaching to
that side. Similar results have been reported with a wide variety of objects and in tasks using several
different methods. In general, people are faster in visual recognition tasks when the response to be
made is compatible with a real action on the object. These results tell us that visual recognition is of a
piece with, in the same internal language as, action. This is how it must be under the idea of reentrant
mappings, where visual recognition is built out of and educated by its time-locked connections with
actions on objects.
2.2 Lesson 2: Be Incremental
Traditionally, both machine learning and human learning have concentrated on non-incremental
learning tasks, tasks in which the entire training set is fixed at the start of learning and then is either
presented in its entirety or randomly sampled. This is not, however, the way children encounter the
world. The experiences of a 3-month-old are very different from (and much more constrained than)
the experiences of a 1-year-old, whose experiences, in turn, are very different from those of a 2-year-old.
All indications are that these systematic changes in the input, in the range and kind of
experiences, matter: that, in fact, they determine the developmental outcome.
Infants’ early experiences are strongly ordered by the development of sensory systems and
movement systems. At birth, audition and vision are online, but vision is limited by the infant’s ability
to focus. Nonetheless, shortly after birth, infants look in the direction of sound [31, 57].
Incrementally, over the next few months, subtler and subtler sound properties in relation to visual
events begin to take control of visual attention, so that infants look at the visual event that matches
what they hear. For example, given two visual displays of bouncing balls, 4-month-olds look at the
displays that are in temporal synchrony with the sound of a bouncing ball [49]. This coupling of
hearing and looking organizes infants’ attention and thus what they learn. Indeed, children without
audition, deaf children, show altered and more disorganized visual attention [47].
Infants’ coordination of looking and listening is a form of the reentrant mappings and
multimodal learning highlighted under lesson 1. But the important point for lesson 2 is that these
correlations do not stay the same over developmental time. After looking at and listening to the world for 3
or 4 months, infants begin to reach for objects, and the multimodal correlations change. Once
infants can reach, they can provide themselves with new multimodal experiences involving vision, haptic
exploration, proprioceptive input from self-movement, and audition as the contacted objects squeak,
rattle, or squeal. After weeks and months of living in this new multimodal venue of sitting, looking,
listening, reaching, and manipulating objects, infants’ experiences (and the correlations available to
them) again change radically, as they begin to crawl and then to stand up and walk. Self-locomotion
changes the nature of the visual and auditory input even more dramatically, and the evidence
suggests that it also profoundly changes infants’ cognitive development.
Figure 3. Illustration of the Tucker-Ellis task. On each trial, the participant is shown one pitcher and is asked to answer as
rapidly as possible the question: ‘‘Is this a pitcher?’’ On some trials the pitcher’s handle is on the left; on some trials it is
on the right. Half the participants answer ‘‘yes’’ by pressing a button on the right and half by pressing a button on the left.
Participants are faster when the handle is on the same side as the ‘‘yes’’ response.
We review one piece of evidence for this idea that dramatic shifts in the input (shifts that result
from changes in the infants’ own behavior) cause equally dramatic shifts in cognitive development.
Our example involves one of the best-studied tasks of infant cognition, the so-called object-concept
or A-not-B task [48, 51]. Piaget devised this task to assess when infants understand that objects
persist in time and space independent of one’s own actions on them. In this task, illustrated in
Figure 4, the experimenter hides a tantalizing toy under a lid at location A. A 3–5 s delay is imposed
before the infant is allowed to search. Typically infants reach correctly to the hiding location A and
find the hidden toy. This A-location trial is repeated several times. Then, there is the critical switch
trial: The experimenter hides the object at a new location, B. A brief delay is again imposed, and then
the infant is allowed to reach. Infants 8–10 months of age make a curious ‘‘error.’’ They reach, not
to where they saw the object disappear, but back to A, where they had found the object previously.
This ‘‘A-not-B’’ error is especially compelling in that it is tightly linked to a highly circumscribed
developmental period; infants older than 12 months search correctly on the critical B trials. Why this
dramatic shift in search behavior between 8 and 12 months of age?
The shift appears to be tightly tied to self-locomotion, which also emerges in this same period.
Individual infants stop making the error when they begin to self-locomote. Critically, one can take
infants who do not yet self-locomote and who make the error and, by putting them in walkers, make
them self-locomote 3 to 4 months earlier than they normally would. When one experimentally
induces early experiences in self-locomotion, one also accelerates the development of successful
search in the A-not-B task [4]. Why should experience in moving oneself about the world help one
remember and discriminate the locations of objects in a hide-and-seek reaching task? The answer is
that moving oneself about (over things, by things, into things, around things) presents new
experiences, new patterns of spatiotemporal relations, that alter the infant’s representation of objects,
space, and self.
All in all, infants’ experiences (the regularities the learning system encounters) change
systematically as a function of development itself. Each developmental achievement on the part
of the infant (hand-eye coordination, sitting, crawling, walking) opens the infant to whole new
sets of multimodal regularities. Now here is the question that is critical for the creation of artificial life:
Does the ordering of experiences matter in the final outcome? Could one just as well build an
intelligent 2-year-old by starting with a baby that listened, looked, reached, and walked all together
right from the beginning?
Figure 4. A schematic illustration of the course of events in the A-not-B task. After the delay, the hiding box is moved
forward, allowing the infant to reach and search for the hidden toy.
Studies of comparative development make clear that the developmental ordering of sensory
systems matters greatly. Different species show decidedly different orderings in the development of
sensory systems, and these differences are related to their specific form of intelligence. The
differential timing or heterochronicity is one way that evolution selects for different adaptive
outcomes (see especially [53]). Experimental studies show that reorderings (changes in the normal
developmental path) dramatically alter developmental outcomes. For example, opening kittens’ eyes
early disrupts olfactory development and the subsequent coordination of vision and olfaction [27,
38]. Similarly, disrupting the developmental order of audition and vision in owls disrupts spatial
localization in both modalities [23]. One of the ingredients in building biological intelligence is
ordering the training experiences in the right way.
Several attempts to model human learning [17, 35, 40] have shown that neural networks
sometimes fail to learn the task when the entire data set is presented all at once, but succeed when
the data are presented incrementally with an easy-to-difficult ordering. These demonstrations have
been criticized by some as cheating. But to those of us who study how intelligence gets made in real
live babies, they seem to have the right idea. Of course, in real development, this ordering of training
experiences is not forced on the learner by some omnipresent teacher, but rather emerges as a
consequence of development itself.
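The easy-to-difficult presentation described above can be sketched as a simple staging scheme. The snippet below is only an illustration of how training data might be ordered, not a reproduction of the cited models [17, 35, 40], and it makes no claim about learning outcomes. The function name `incremental_stages`, the use of sentence length as a stand-in for difficulty, and the example sentences are all assumptions introduced here.

```python
def incremental_stages(examples, difficulty, n_stages):
    """Yield growing training sets, easiest examples first.

    `difficulty` is any function that scores an example; lower means easier.
    Stage k contains roughly the easiest k/n_stages of the data, so later
    stages keep the early material and add harder material on top of it.
    """
    ranked = sorted(examples, key=difficulty)
    for stage in range(1, n_stages + 1):
        # Ceiling division so the final stage always contains every example.
        cutoff = (len(ranked) * stage + n_stages - 1) // n_stages
        yield ranked[:cutoff]

# Hypothetical data: sentence length (in words) stands in for difficulty.
sentences = [
    "dogs bark",
    "the dog that chased the cat barks",
    "dogs chase cats",
    "the cat the dog chased ran away quickly",
]
stages = list(incremental_stages(sentences, lambda s: len(s.split()), n_stages=2))
# A learner would be trained on stages[0] first, then on stages[1].
```

Unlike this sketch, where an external scheduler imposes the ordering, the ordering in real development emerges from the learner's own changing abilities, as the text notes.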
2.3 Lesson 3: Be Physical
Not all knowledge needs to be put into the head, into dedicated mechanisms, into representations.
Some knowledge can be realized in the body, a fact dramatically illustrated by passive walkers.
Knowledge of the alternating limb movement of bipedal locomotion (knowledge traditionally
attributed to a central pattern generator) appears to reside in the dynamics of two coupled
pendulums [30]. Some of our intelligence also appears to be in the interface between the body and
the world. The phenomenon of change blindness is often conceptualized in this way. People do not
remember the details of what is right before their eyes, because they do not need to remember what
they can merely look at and see [32]. Similarly, Ballard and colleagues [2] have shown that in tasks in
which people are asked to rearrange arrays of squares, they offload their short-term memory to the
world (when they can). This offloading in the interface between body and world appears to be a pervasive
aspect of human cognition and may be critical to the development of higher-level cognitive functions
or in the binding of mental contents that are separated in time. We briefly present some new data
that illustrate this point [45].
The experimental procedure derives from a task first used by Baldwin [1] and illustrated in
Figure 5. The participating subjects are very young children, 1½ to 2 years of age. The experimenter
sits before a child at a table, and (a) presents the child with first one object to play with and then (b)
with a second. Out of sight of the child, the two objects are then put into containers, and the two
containers (c) are placed on the table. The experimenter looks into one container (d) and says, ‘‘I see
a dax in here.’’ The experimenter does not show the child the object in the container. Later the
objects are retrieved from the containers (e) and the child is asked which one is ‘‘a dax.’’ Notice that
the name and the object were never jointly experienced. How then can the child join the object name
to the right object? Baldwin showed that children as young as 24 months could do this, taking the
name to refer to the unseen object that had been in the bucket at the same time the name was offered.
How did children do this? How, if you were building an artificial device, would you construct a
device that could do this, that could know the name applied to an object not physically present when
the name was offered?
There are a number of solutions that one might try, including reasoning and remembering about
which objects came out of which containers and about the likely intentions of speakers when they
offer names. The evidence, however, indicates that young children solve this problem in a much
simpler way, exploiting the link between objects and locations and space. What children do in this
task is make use of a deep and foundationally important regularity in the world: A real object is
perceptually distinguished from others based on its unique location; it must be in a different place
from any other object. The key factor in the Baldwin task is that in the first part of the experimental
procedure, one object is presented on the right, the other on the left. The containers are also
presented one on the right, one on the left, and the name is presented with attention directed (by the
experimenter’s looking into the bucket) to one location, for example, on the right. The child solves
this task by linking the name to the object associated with that location. We know this is the case
because we can modify the experiment in several crucial ways. For example, one does not need
containers or hidden objects to get the result at all. One can merely present the target object on the
right and have children attend to and play with it there, then present the distracter object on the left
and have children attend to and play with it there. Then, with all objects removed, with only an
empty and uniform table surface in view, one can direct children’s attention to the right and offer the
name (dax) or to the left and offer the name. Children consistently and reliably link the name to the
object that had been at that location.
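For readers building artificial agents, the location-based solution can be sketched as a very small data structure. This is a hedged illustration of the binding idea only, not a model of children's memory; the class `SpatialBinder` and its method names are hypothetical.

```python
class SpatialBinder:
    """Bind names to objects through shared location, with no inference step."""

    def __init__(self):
        self.object_at = {}  # location -> object most recently attended there
        self.name_of = {}    # object -> name bound to it

    def attend_object(self, location, obj):
        """The learner plays with an object at a given location."""
        self.object_at[location] = obj

    def hear_name(self, location, name):
        """A name is heard while attention is directed to a location.

        The name is attached to whatever object was last experienced there,
        even if nothing is visible at that location now.
        """
        obj = self.object_at.get(location)
        if obj is not None:
            self.name_of[obj] = name
        return obj

binder = SpatialBinder()
binder.attend_object("right", "target toy")
binder.attend_object("left", "distracter toy")
# Table is now empty; the experimenter looks to the right and says "dax".
bound = binder.hear_name("right", "dax")
```

The design choice worth noticing is that the location serves as the only key: the sketch needs no representation of containers, speaker intentions, or which object came out of where.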
Young children’s solution to this task is simple, a trick in a sense, that makes very young children
look smarter than they perhaps really are. But it is a trick that will work in many tasks. Linking
objects to locations and then using attention to that location to link related events to that object
provides an easy way to bind objects and predicates [2]. People routinely and apparently
unconsciously gesture with one hand when speaking of one protagonist in a story and gesture with
the other hand when speaking of a different protagonist. In this way, by hand gestures and direction
of attention, they link separate events in a story to the same individual. American Sign Language
formally uses space in this way in its system of pronouns. People also use space as a mnemonic,
looking in the direction of a past event to help remember that event. One experimental task that
shows this is the Hollywood Squares experiments of Richardson and Spivey [36]. People were
presented at different times with four different videos, each from a distinct spatial location. Later,
with no videos present, the subjects were asked about the content of those videos. Eye-tracking
cameras recorded where people looked when answering these questions, and the results showed that
they systematically looked in the direction where the relevant information had been previously
presented.
This is all related to the idea of deictic pointers [2, 22] and is one strong example of how sensorimotor
behaviors (where one looks, what one sees, where one acts) create coherence in our
cognitive system, binding together related cognitive contents and keeping them separate from other
distinct contents. In sum, one does not necessarily need lots of content-relevant knowledge or
inferential systems to connect one idea to another. Instead, there is a cheaper way: by using the world
and the body’s pointers to that world.
Figure 5. Events in the Baldwin task (see text for further clarification).
2.4 Lesson 4: Explore
How can a learner who does not know what there is to learn manage to learn anyway? This is a more
difficult question than it might first appear. The issue is whether one needs to prespecify the learning
tasks and the learning goals: whether the agent or its designer has to know what needs to be learned
in order to learn. Evidence from human development gets us out of this quandary by showing that
babies can discover both the tasks to be learned and the solution to those tasks through exploration,
or non-goal-directed action. In babies, spontaneous movement creates both tasks and opportunities
for learning. One elegant demonstration concerns the study of reaching [11]. The week-by-week
development of four babies was tracked over a 3-month period as they transitioned from not
reaching to reaching. Four very different patterns of development were observed. Some babies in the
non-reaching period hardly lifted their arms at all, but sat placidly watching the world. Other babies
were more high-strung and active, flailing and flapping and always moving. These different babies
had to learn to solve very different problems in order to learn to reach out and grasp an object. The
flailer would have to learn to become less active, to lower his hands, to bring them into midline. The
placid baby would have to learn to be more active, to raise her hands, to lift them up from their usual
positions on her side. Each baby did learn, finding a solution that began with exploration of the
movement space.
The course of learning for each baby appeared to be one of arousal, exploration, and the selection
of solutions from that exploration space. In basic form, the developmental pattern is this: The
presentation of an enticing toy is arousing and elicits all sorts of nonproductive actions, and very
different individual actions in different babies. These actions are first, quite literally, all over the place
with no clear coherence in form or direction. But by acting, by movements that explore the whole
range of the movement space, each baby, in its own unique fashion, sooner or later makes contact
with the toy: banging into or brushing against it or swiping it. These moments of contact select
some movements in this space, carving out patterns that are then repeated with increasing frequency.
Over weeks, the cycle repeats: arousal by the sight of some toy, action, and occasional contact. Over
cycles, increasingly stable, more efficient, and more effective forms of reaching emerge. What is
remarkable in the developmental patterns of the children is that each found a solution (and
eventually converged to highly similar solutions) by following individually different developmental
pathways. As they explored different movements, in their uncontrolled actions initiated by the
arousing sight of the toy, they each discovered initially different patterns; each had a different
developmental task to solve. The lesson for building intelligent agents is clear: A multimodal system
that builds reentrant maps from time-locked correlations only needs to be set in motion, to move
about broadly, even randomly, to learn and through such exploration to discover both tasks and
solutions.
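The cycle of arousal, exploration, and selection by contact can be sketched as a toy simulation. It is a deliberately crude illustration, not the dynamic model behind the reaching study [11]: the movement space is reduced to eight discrete directions, contact is all-or-none, and the two starting profiles (a "flailer" biased toward one direction, a "placid" baby biased toward another) are hypothetical, as are all function names.

```python
import random

def explore_and_select(initial_weights, toy_direction, n_trials=500, seed=0):
    """Sample movements at random; contact with the toy strengthens that movement."""
    rng = random.Random(seed)
    weights = [float(w) for w in initial_weights]
    directions = list(range(len(weights)))
    for _ in range(n_trials):
        # Exploration: choose a movement in proportion to its current strength.
        move = rng.choices(directions, weights=weights, k=1)[0]
        # Selection: only movements that happen to contact the toy are reinforced.
        if move == toy_direction:
            weights[move] += 1.0
    return weights

def preferred(weights):
    """The movement direction with the greatest accumulated strength."""
    return max(range(len(weights)), key=lambda d: weights[d])

# Two hypothetical babies with different starting biases; the toy is at direction 3.
flailer = explore_and_select([5, 1, 1, 1, 1, 1, 1, 1], toy_direction=3, seed=1)
placid = explore_and_select([1, 1, 1, 1, 1, 5, 1, 1], toy_direction=3, seed=2)
```

Neither simulated learner is told that reaching toward direction 3 is the task; the task is discovered through contact. Both start from different biases and follow different trajectories, yet end up preferring the same movement.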
The power of movement as a means for exploration is also illustrated by an experimental
procedure known as infant conjugate reinforcement [39]. Infants (as young as 3 months) are placed on
their backs, and their ankles are attached by a ribbon to a mobile, which is suspended overhead.
Infants, of course, through their own actions, discover this link. As the infants kick their feet, at first
spontaneously, they activate the mobile. Within a few minutes they learn the contingency between
their foot kicks and the jiggling of the mobile, which presents interesting sights and sounds. The
mobile responds conjugately to the infants’ actions: The more infants kick and the more vigorously
they move, the more motion and sound they produce in the mobile. In this situation, infants increase
their kicking to above the baseline spontaneous levels apparent when babies simply look at a non-moving
mobile. Infants’ behavior as they discover their control is one of initial exploration of a wide
variety of actions and the selection of the optimal pattern to make the interesting events (the
movement of the mobile) occur.
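The conjugate contingency can be written down as a simple feedback loop. The sketch below is an assumption-laden illustration rather than a model from the infant literature [39]: kicking is summarized as a single rate, mobile motion is taken to be proportional to that rate, and the update rule, the parameter values, and the name `simulate_kicking` are all invented for the example.

```python
def simulate_kicking(attached, steps=20, baseline=5.0, gain=1.0,
                     learning_rate=0.1, ceiling=30.0):
    """Return the kick rate over time, with or without the ribbon attached."""
    kicks = baseline
    history = []
    for _ in range(steps):
        # Conjugate response: mobile motion scales with how much the infant kicks.
        mobile_motion = gain * kicks if attached else 0.0
        # The interesting event feeds back on kicking, up to a physical ceiling.
        kicks = min(ceiling, kicks + learning_rate * mobile_motion)
        history.append(kicks)
    return history

with_ribbon = simulate_kicking(attached=True)
without_ribbon = simulate_kicking(attached=False)
```

With the ribbon attached, the simulated kick rate climbs above its spontaneous baseline; without it, spontaneous kicking produces no effect in the world and the rate stays at baseline.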
Although this is an experimental task, and not an everyday real-world one, it is a very appropriate
model for real-world learning. The mobile provides the infant with many time-locked patterns of
correlations. More importantly, infants themselves discover the relations through their own
exploratory movement patterns. The infants themselves are moving contingently with the mobile;
the faster and harder they kick, the more vigorously the mobile jiggles and sways. This is for infants a
highly engaging task; they smile and laugh, and often become angry when the contingency is
removed. Thus, the experimental procedure, like the world, provides complex, diverse, and never
exactly repeating events, yet all perfectly time-locked with infants’ own actions. And it is exploration,
spontaneous non-task-related movement, that starts the process off. Without spontaneous movement,
without exploration, there is nothing to learn from the mobile.
Young mammals, including children, spend a lot of time in behavior with no apparent goal. They
move, they jiggle, they run around, they bounce things and throw them, and generally abuse them in
ways that seem, to mature minds, to have no good use. However, this behavior, commonly called
play, is essential to building inventive forms of intelligence that are open to new solutions.
2.5 Lesson 5: Be Social
Let us re-imagine the infant conjugate reinforcement paradigm. In this case, however, instead of
coupling the infant’s leg by ribbon to a mobile, we couple the infant’s face by mutual gaze to another
face, the face of a mature partner. Many developmental researchers have observed mother-infant
face-to-face interactions, and they report a pattern of activity and learning that looks very much like
conjugate reinforcement, but with an added twist [9, 37, 43, 52]. Mothers’ facial gestures and the
sounds they make are tightly coupled to the babies’ behavior. When babies look into their mother’s
eyes, mothers look back and smile and offer a sound with rising pitch. When babies smile, mothers
smile. When babies coo, mothers coo. Babies’ facial actions create interesting sights and sounds from
mothers, just as their kicks create interesting sights and sounds from attached mobiles. And just as in
the case of the ribbon-tethered mobiles, these contingencies create a context for arousal and
exploration. In the initial moments as infants and mothers interact, infants’ vocalizations and facial
expressions become more active, broader, and more diverse. This exploration sets up the opportunity
for learning time-locked correspondences between infants’ facial actions and vocalizations and
those of the mother, such that the infants’ actions become transformed by the patterns they produce
in others.
But crucially, the social partner in these adventures offers much more than a mobile, and this
changes everything. Mature social partners do not just react conjugately to the infant’s behavior; they
build on it and provide scaffolding to support it and to transform it into conventionally shared
patterns. For example, very early infant behavior shows a natural rhythmic pattern of intense
excitement alternating with patterns of relative calm [9, 37, 43, 52]. Caregivers are thus able to create
a conversation-like exchange by weaving their own behavior around the child’s natural activity
patterns. Initially, it appears that the caregiver alone is responsible for the structure of interaction.
But babies’ behaviors are both entrained by the mother’s pattern and educated by the multimodal
correspondences those interactions create. Incrementally and progressively, the babies become active
contributors, affecting the mother by their own reactions to her behavior, and keeping up their own
end of the conversation.
Imitation provides another example of the scaffolding mature partners provide to the develop-
mental process. Although the evidence that babies reflexively imitate parental facial gestures at birth
is controversial, other research does strongly suggest that infants learn to imitate parent vocalizations.
Parents provide the structure for this learning by imitating their babies! That is, parents do not
just respond to their infants’ smiles and vocalizations; they imitate them. This sets up a cyclical
pattern: vocalization by the infant, imitation by the parent, repeated vocalization by the infant,
imitation by the parent, and so on. This creates opportunities for learning and fine-tuning the infant’s
facial and vocal gestures to match the adult model. In brief, the cycle works to strengthen and
highlight certain patterns of production as parents naturally select those that they take to be
meaningful [9, 29, 43].
Mature social partners also provide multimodal supports to help ground early language learning.
When a parent introduces an object to a toddler and names it, the parent musters a whole array of
sensorimotor supports to bring the child’s attention to the object and to bind that object to the
word [19–21]. Parents look at the object they are naming, they wave it so that the child will look at it,
and they match the intonation patterns in which they present the name to their very actions in
gesturing at or waving the object. In one study, Yoshida and Smith [54] observed that both English-
speaking and Japanese-speaking parents routinely couple action and sound when talking to young
children. For example, one parent demonstrated a toy tape measure to their child and when pulling
the tape out said, ‘‘See, you pullllllll it,’’ elongating the word pull to match the stopping and starting
of the action of pulling. This same parent, when winding the tape back in, said, ‘‘Turn it round and
round and round and round and round and round,’’ with each ‘‘round’’ coinciding with the start of a
new cycle of turning. By tying action and sound, parents ground language in the same multimodal
learning processes that undergird all of cognition, and in so doing, they capture children’s attention,
rhythmically pulling it to the relevant linguistic and perceptual events, and tightly binding those
events together.
Again the lesson for building intelligent agents is clear: Raise them in a social world, coupling their
behavior and learning to agents who add structure and support to those coupled interactions.
2.6 Lesson 6: Learn a Language
Language appears to begin highly grounded in the perceptual here and now, in sensorimotor and
social processes that are not specific to language but are rather quite open learning systems, capable of
discovering and solving an infinite variety of tasks. But language is without a doubt a very
special form of regularity in the world, and one that profoundly changes the learner.
First, language is an in-the-world regularity that is a shared communicative system [18]. Its shared
aspect means that it is very stable, continually constrained by the many local communicative acts of
which it is composed. As an emergent product made out of many individual communicative
encounters, the structure of natural languages may be nearly perfectly adapted to the learning
community. At any rate, in the lives of humans, language is as pervasive, as ubiquitous in its role in
intelligence, as is gravity.
Second, language is special because it is a symbol system. At the level of individual words
(morphemes really), the relation between events in the world and the linguistic forms that refer to
them is mainly arbitrary. That is, there is no intrinsic similarity between the sound of most words and
their referents: the form of the word dog gives us no hints about the kinds of thing to which it refers.
And nothing in the similarity of the forms of dig and dog conveys a similarity in meaning. It is
interesting to ask why language is this way. One might expect that a multimodal, grounded,
sensorimotor sort of learning would favor a more iconic, pantomime-like language in which symbols
were similar to referents. But language is decidedly not like this. Moreover, the evidence suggests that
although children readily learn mappings supported by multimodal iconicity, they fail if there is too
much iconicity between the symbol and the signified.
One intriguing demonstration of this comes from the research of DeLoache [12], which is
directed not to language learning, but to children’s use of scale models. DeLoache’s experimental task
is a hiding game, and the children are 2-year-olds. On each trial, a toy is hidden in a real life-size
room, say, under a couch. The child’s task is to find the toy, and on every trial the experimenter tells
the child exactly where the toy is, using a model of some kind. This model might be a blueprint, a
drawing of the room, a photograph, a simple scale model, a richly detailed and exact scale model, or
a life-size model. Here is the very robust but counterintuitive result: Young children fail in this task
whenever the model is too similar to the real room. For example, they are much more likely to
succeed when the solution is shown in a picture than in a scale model and much more likely to
succeed when the scale model is a simplified version of the real room than an accurate rep-
resentation. Why mustn’t a symbol be too lifelike, too much like the real world? One possibility is
that children must learn what a symbol is, and to learn what a symbol is, there must be some
properties that are common to the set of symbols, for example, the properties that distinguish
pictures from real objects or spoken words from other sounds.
The fact that all the world’s languages are symbol systems, together with the fact that too much similarity
between a symbol and the signified disrupts the learning of a mapping between them, suggests that
arbitrary symbols confer some unique and valuable computational power. That power might lie in
the property of orthogonality. For the most part, individual words pick out or point to unique
categories. At the very least, this is true in the lexicons of 2- to 3-year-olds [42]. We also know that
young children act as if it were true, a phenomenon sometimes referred to as the mutual exclusivity
constraint [8, 28]. More specifically, children act as if each object in the world received one and only
one name. For example, shown two novel objects and told the name of one (e.g., ‘‘This is a dax’’),
children will assume that any new name (e.g., ‘‘wug’’) refers to the second previously unnamed
object. The arbitrariness and mutual exclusivity of linguistic labels may be computationally powerful
because they pull the overlapping regularities that create perceptual categories apart, as illustrated in
Figure 6. There is evidence to support this idea that orthogonality is computationally powerful,
enabling children to form second-order, rule-like generalizations. To explain this developmentally
powerful aspect of language learning, we must first provide some background on children’s word
learning.
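The mutual exclusivity behavior described above (a new name goes to the object that does not yet have one) can be written down in a few lines. This is a minimal sketch of the behavioral pattern only, not a claim about the mechanism children use; the function name `resolve_novel_name` and the object labels are hypothetical.

```python
def resolve_novel_name(objects_in_view, lexicon, novel_name):
    """Pick a referent for novel_name under a mutual exclusivity assumption.

    lexicon maps known names to objects. Objects that already have a name
    are ruled out; if exactly one unnamed object remains, it is chosen and
    added to the lexicon. Returns None when the choice is ambiguous.
    """
    named = set(lexicon.values())
    candidates = [obj for obj in objects_in_view if obj not in named]
    if len(candidates) == 1:
        lexicon[novel_name] = candidates[0]
        return candidates[0]
    return None

# Two novel objects; the child has been told "This is a dax" about the first.
lexicon = {"dax": "object_A"}
referent = resolve_novel_name(["object_A", "object_B"], lexicon, "wug")
print(referent)  # object_B
```

The sketch treats names as a one-to-one mapping onto objects, which is the simplifying assumption the text attributes to young children rather than a property of the adult lexicon.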
Figure 6. Illustration of Colunga’s [10] proposal of how orthogonal labels may pull similarities apart. (a) Word forms
become associated with members and features of object categories. (b) The orthogonality of the words leads to the
divergence of initially similar conceptual representations.
Children comprehend their first word at around 10 months; they produce their first word at
around 12 months. Their initial progress in language learning is surely built on multimodal clusters
and categories emergent in the infant’s interactions in the world. Nonetheless, progress at first is
tentative, slow, and fragile. For the 6 months or longer after the first word, children acquire
subsequent words very slowly, and often seem to lose previously acquired ones. Moreover, they seem
to need to hear each individual word in many contexts before they apprehend its range. Then,
between 18 and 20 months, most children become very rapid word learners, adding new words to
their vocabularies at the staggering rates of 4 to 9 a day. During this time, they seem to need only to
hear a word used to label a single object to know the whole class of things to which the word refers
[44]. This one-instance to whole-category learning is especially remarkable in that different kinds of
categories are organized in different ways. For example, animate categories are organized by many
different kinds of similarities across many modalities; artifact categories are organized by shape, and
substance categories by material.
The evidence from both experimental studies and computational models indicates that children
learn these regularities as they slowly learn their first words and that this learning then creates their
ability to learn words in one trial. The nature of this learning can be characterized by four steps,
illustrated in Figure 7 [41, 46]. The figure illustrates just one of the regularities that children learn:
that artifact categories are organized by shape. Step 1 in the learning process is the mapping of
names to objects: the name ‘‘ball’’ to a particular ball and the name ‘‘cup’’ to a particular cup, for
example. This is done multiple times for each name as the child encounters multiple examples. And
importantly, in the early lexicon, solid, rigidly shaped things are in categories typically well organized
by similarity in shape [42]. This learning of individual names sets up step 2: first-order generalizations
about the structure of individual categories, that is, the knowledge that balls are round and
cups are cup-shaped. The first-order generalization should enable the learner to recognize novel balls
and cups.
Another higher-order generalization is also possible. Because most of the solid and rigid things
that children learn about are named by their shape, children may also learn the second-order
generalization that names for artifacts (solid, rigid things) in general span categories of similar-shaped
things. As illustrated in step 3 of the figure, this second-order generalization requires
generalizations over specific names and specific category structures. But making this higher-order
generalization should enable the child to extend any artifact name, even one encountered for the
first time, to new instances by shape. At this point, the child behaves as if it has an abstract and
variablized rule: For any artifact, whatever its individual properties or individual shape, form a
category by shape. Step 4 illustrates the potential developmental consequence of this higher-order
generalization: attention to the right property, shape, for learning new names for artifacts. The
plausibility of this account has been demonstrated in experimental studies that effectively accelerate
the vocabulary acquisition function by teaching children the relevant correlations and in simulation
studies with neural nets [41, 46].
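The four steps can be laid out as a toy computation. This is a deliberately simple counting sketch meant only to show the structure of the argument; it is not the neural network simulations reported in [41, 46]. The toy vocabulary, the three feature dimensions, and the function names `first_order`, `second_order`, and `extend_name` are all assumptions introduced here.

```python
from collections import Counter, defaultdict

# Step 1: names mapped to individual objects (hypothetical toy data).
observations = [
    ("ball", {"shape": "round", "color": "red", "material": "rubber"}),
    ("ball", {"shape": "round", "color": "blue", "material": "plastic"}),
    ("cup", {"shape": "cup-shaped", "color": "red", "material": "plastic"}),
    ("cup", {"shape": "cup-shaped", "color": "white", "material": "ceramic"}),
]

def first_order(observations):
    """Step 2: for each name, find the dimensions shared by all its instances."""
    by_name = defaultdict(list)
    for name, obj in observations:
        by_name[name].append(obj)
    shared = {}
    for name, objs in by_name.items():
        shared[name] = {
            dim for dim in objs[0]
            if all(o[dim] == objs[0][dim] for o in objs)
        }
    return shared

def second_order(shared):
    """Step 3: find the dimension that most often organizes a named category."""
    counts = Counter(dim for dims in shared.values() for dim in dims)
    return counts.most_common(1)[0][0]

def extend_name(exemplar, candidates, dimension):
    """Step 4: extend a novel name from one exemplar along the learned dimension."""
    return [c for c in candidates if c[dimension] == exemplar[dimension]]

bias = second_order(first_order(observations))
print(bias)  # shape

# A novel name heard once, for a single novel object.
dax = {"shape": "U-shaped", "color": "green", "material": "wood"}
test_objects = [
    {"shape": "U-shaped", "color": "yellow", "material": "metal"},
    {"shape": "square", "color": "green", "material": "wood"},
]
matches = extend_name(dax, test_objects, bias)
```

In this toy run the novel name is extended to the object that matches the exemplar in shape, even though the other test object matches it in both color and material. The second-order step never refers to any particular name or shape; it generalizes over the first-order category structures themselves.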
How special is language’s role in enabling the formation of second-order generalizations? Perhaps
very special indeed. Recent simulation studies by Colunga [10] suggest that the arbitrariness and
orthogonality of the linguistic labels may be critical. Neural networks that readily form second-order
generalizations and yield accelerating rates of vocabulary acquisition do not do this if the labels, the
words, are not orthogonal. We strongly suspect that even if orthogonality does not prove in the limit
to be necessary, it will prove to be strongly beneficial to the formation of second-order generalizations.
This work is a beginning hint at an answer to what we take to be a deeply important
question for those of us who wish to understand intelligent systems: Why, in so profoundly
multimodal a sensorimotor agent as ourselves, is language an arbitrary symbol system? What
computational problems are solved by language taking this form?
The advantages conferred by arbitrary symbols go well beyond those hinted at here. More
familiar are the properties of symbol systems, capacities that result from the possibility of combining
symbols. For natural languages, this is the domain of grammar. All known natural languages have
two fundamental properties of symbol systems.
First, they are at least approximately compositional. That is, in the domain of grammar, unlike the
domain of individual morphemes, language is anything but arbitrary. Compositionality permits
hearers to comprehend unfamiliar combinations of morphemes and speakers to produce combinations
of morphemes they’ve never produced or heard before. An English speaker who knows what a
dax is automatically knows that daxes refers to more than one of them.
Figure 7. Four steps in the development of one-trial word learning. Step 1: mapping of names to objects. Step 2:
first-order generalizations about the structure of individual categories. Step 3: second-order generalization. Step 4: attention
to the right property in learning new names.
Second, words as symbols permit structured representations, in particular, those with embedding.
Embedding is possible because symbols representing relations between symbols can themselves play
the role of symbols representing objects. So we can say things like John thinks Mary doubts he likes her
and the woman who teaches the class I like.
It may be the orthogonal nature of linguistic representations, deriving ultimately from the
arbitrary nature of the form-meaning relationship at the level of morphemes, that is behind these
properties of language as well. If the representations for the words in a sentence overlapped
significantly, it would be impossible to keep them separate in composing the meanings of the words.
Orthogonal representations permit several separate items to be maintained simultaneously in short-
term memory without significant interference. This does not deny the rich, distributed representa-
tions for the concepts behind these words; it simply brings out the value of orthogonal pointers to
those representations. These pointers can be manipulated (composed, associated in structures)
without making direct reference to their meanings or their pronunciations.
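The interference argument can be illustrated with a toy memory that holds several words at once by summing their codes. This is a sketch under simplifying assumptions, not the authors’ or Colunga’s model: the four-unit binary codes, the summation memory, and the dot-product readout are all hypothetical choices made for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def superpose(vectors):
    """Hold several items at once by summing their vectors."""
    return [sum(components) for components in zip(*vectors)]

def readout(memory, codebook):
    """Score each word by the dot product of its code with the memory trace."""
    return {word: dot(code, memory) for word, code in codebook.items()}

# Orthogonal (here one-hot) pointers: no two codes share an active unit.
orthogonal = {
    "dog": [1, 0, 0, 0],
    "dig": [0, 1, 0, 0],
    "cup": [0, 0, 1, 0],
    "cat": [0, 0, 0, 1],
}
# Overlapping codes: every pair of distinct words shares at least some units.
overlapping = {
    "dog": [1, 1, 0, 0],
    "dig": [1, 0, 1, 0],
    "cup": [0, 0, 1, 1],
    "cat": [0, 1, 0, 1],
}

held = ["dog", "cup"]  # the two words currently kept in short-term memory
for label, codebook in [("orthogonal", orthogonal), ("overlapping", overlapping)]:
    memory = superpose([codebook[w] for w in held])
    print(label, readout(memory, codebook))
# orthogonal {'dog': 1, 'dig': 0, 'cup': 1, 'cat': 0}
# overlapping {'dog': 2, 'dig': 2, 'cup': 2, 'cat': 2}
```

With orthogonal codes the readout picks out exactly the two held words. With these overlapping codes every word scores the same, so the trace no longer says which words were stored. The codes here stand in for the pointers only; the rich distributed representations of the concepts they point to are outside the sketch.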
But this power goes beyond the grammar of natural languages. This same potential for
composition and for structured representations holds for symbolic processing more generally and
seems to characterize human activities such as explicit planning and mathematics. It has long been
suggested that the way into symbolic processing is through language [56], and though this idea
remains controversial, we believe it is worth taking seriously. First, because sentence structure maps
onto event structure, language could teach children about how to attend to event structure in the
same way that it apparently teaches them to attend to particular dimensions of objects. Second, the
orthogonal symbols that allow language to be compositional and structured, once learned, could
provide the basis for other symbol systems, such as the one that is behind algebra.
Developing in a linguistic world makes children smarter in at least three kinds of ways. First, and
most obviously, by learning a language, children gain more direct access to the knowledge that others
have. Children can be instructed, and when they are unsure of something, they can ask questions and
can eventually search for the information in written form. While knowledge in this explicit verbal
form may not have the richness of knowledge that results from direct experience, it can supplement
the experience-based knowledge, especially in areas where children have no possibility of direct
experience.
Second, in learning a language, children are presented with an explicit categorization of the
objects, attributes, and relations in the world. Each morpheme in a natural language represents a
generalization over a range of sensory, motoric, and cognitive experiences, and by labeling the range,
morphemes function as a form of supervised category learning that is unavailable to other
organisms. Thus, one result of learning a language is an ontology. Not only does this permit
children to notice regularities they might miss otherwise (for example, the relevance of shape for
artifacts or motion for animates), but because the ontology is shared by the community of speakers
of the language, it guarantees a degree of commonality in the way the members of the community
will respond to the world.
Third, and as we suggested here, learning a language may be the key to becoming symbolic and by
its very nature may change the computational power of the learner. Each word associates a
distributed phonological pattern and a distributed conceptual pattern in what is apparently a local, or
at least orthogonal, fashion. It may be the largely arbitrary nature of this association that facilitates
the learning of local lexical representations; because similarity in word forms does not entail
similarity in the corresponding meanings, and vice versa, mediating representations that do not
overlap are the most efficient alternative. Whatever the reason, research on lexical access in language
production [14, 26] points, first, to the psychological reality of a distinct lexical level of representation
and, second, to the fundamentally orthogonal and competitive nature of these representations. The
advantage of these local representations is that complex reasoning can be carried out on them
directly: They can be associated with one another and even arranged in hierarchical structures,
representing symbolically what could not be achieved with the distributed overlapping representations
of component concepts. Thus the power of symbolic reasoning (planning, logic, and
mathematics) may derive ultimately from words in their function as pointers to concepts.
3 Conclusion
Artificial life attempts to model living biological systems through complex algorithms. We have
suggested in this article that developmental psychology offers usable lessons for creating the
intelligence that lives in the real world, is connected to it, and knows about that world. Babies begin
with a body richly endowed with multiple sensory and action systems. But a richly endowed body
that is simply thrown into a complex world, even with the benefits of some pre-programming and
hardwiring by its designers, would fail to meet the standard of even a 3-year-old unless it were tuned
to the detailed statistics of that world. We have argued that embodied intelligence develops. In an
(embodied) human child, intelligence emerges as the child explores the world, using its sophisticated
statistical learning abilities to pick up on the subtle regularities around it. Because the child starts
small, because its intelligence builds on the progress it has already made, because development brings
the child to different regularities in the world, because those regularities include couplings between
the child and smart social partners, and because the world includes a symbol system, natural
language, the child achieves an intelligence beyond that of any other animal, let alone any current
artificial device. The lesson from babies is: intelligence isn’t just embodied; it becomes embodied.
Acknowledgments
The preparation of this manuscript and much of the work reported in it were supported by NIH-
NIMH R01MH60200.
References
1. Baldwin, D. (1993). Early referential understanding: Infants’ ability to recognize referential acts for what
they are. Developmental Psychology, 29(5), 832–843.
2. Ballard, D., Hayhoe, M., Pook, P., & Rao, R. (1997). Deictic codes for the embodiment of cognition.
Behavioral and Brain Sciences, 20, 723–767.
3. Barsalou, L. W. (in press). Abstraction as a dynamic construal in perceptual symbol systems. In
L. Gershkoff-Stowe & D. Rakison (Eds.), Building object categories in developmental time. Hillsdale, NJ:
Erlbaum.
4. Bertenthal, B., Campos, J., & Barrett, K. (1984). Self-produced motion: An organizer of emotional,
cognitive, and social development in infancy. In R. Emde & R. Harmon (Eds.), Continuities and discontinuities
(pp. 175–210). New York: Plenum Press.
5. Breazeal, C. (2002). Designing sociable robots. Cambridge, MA: MIT Press.
6. Brooks, R., Breazeal, C., Marjanovic, M., Scassellati, B., & Williamson, M. (1998). The cog project: Building
a humanoid robot. In C. Nehaniv (Ed.), Computation for metaphors, analogy and agents. Springer-Verlag.
7. Bushnell, E. (1994). A dual processing approach to cross-modal matching: Implications for development.
In D. Lewkowicz & R. Lickliter (Eds.), The development of intersensory perception (pp. 19–38). Mahwah,
NJ: Erlbaum.
8. Clark, E. (1987). The principle of contrast: A constraint on language acquisition. In B. MacWhinney (Ed.),
Mechanisms of language acquisition (pp. 1–33). Hillsdale, NJ: Erlbaum.
9. Cohn, J. F., & Tronick, E. Z. (1988). Mother-infant face to face interaction: Influence is bi-directional
and unrelated to periodic cycles in either partner’s behavior. Developmental Psychology, 24, 386–392.
10. Colunga, E. (2003). Local vs. distributed representations: Implications for language learning. In preparation.
11. Corbetta, D., & Thelen, E. (1996). The developmental origins of bimanual coordination.
Journal of Experimental Psychology: Human Perception and Performance, 22, 502–522.
12. DeLoache, J. (2002). The symbol-mindedness of young children. In W. Hartup & R. Weinberg (Eds.),
Child psychology in retrospect and prospect: In celebration of the 75th anniversary of the Institute of Child Development
(pp. 73–101). Mahwah, NJ: Erlbaum.
13. Diamond, A. (1990). Developmental time course in human infants and infant monkeys and the neural bases
of inhibitory control in reaching. In A. Diamond (Ed.), The development and neural bases of higher cognitive functions
(pp. 637–676). New York: New York Academy of Sciences Press.
14. Dell, G. S., Juliano, C., & Govindjee, A. (1993). Structure and content in language production: A theory of
frame constraints in phonological speech errors. Cognitive Science, 17, 149–195.
15. Edelman, G. (1987). Neural Darwinism. New York: Basic Books.
16. Ellis, R., & Tucker, M. (2000). Micro-affordance: The potentiation of components of action by seen objects.
British Journal of Psychology, 91(4), 451–471.
17. Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small.
Cognition, 48, 71–99.
18. Freyd, J. (1983). Shareability: The social psychology of epistemology. Cognitive Science, 7(3), 191–210.
19. Gogate, L., & Bahrick, L. (2001). Intersensory redundancy and 7-month-old infants’ memory for arbitrary
syllable-object relations. Infancy, 2(2), 219–231.
20. Gogate, L., & Walker-Andrews, A. (2001). More on developmental dynamics in lexical learning.
Developmental Science, 4(1), 31–37.
21. Gogate, L., Walker-Andrews, A., & Bahrick, L. (2001). The intersensory origins of word comprehension:
An ecological-dynamic systems view. Developmental Science, 4(1), 1–18.
22. Hurford, J. (in press). The neural basis of predicate-argument structure. Behavioral and Brain Sciences.
23. Knudsen, E. (2003). Instructed learning in the auditory localization pathway of the barn owl. Nature,
417(6886), 322–328.
24. Lakoff, G. (1994). What is a conceptual system? In W. F. Overton & D. S. Palermo (Eds.), The nature and
ontogenesis of meaning. The Jean Piaget symposium series (pp. 41–90). Hillsdale, NJ: Erlbaum.
25. Landau, B., & Gleitman, L. (1985). Language and experience. Cambridge, MA: Harvard University Press.
26. Levelt, W. J. M. (2001). Spoken word production: A theory of lexical access. Proceedings of the National Academy
of Sciences, 98, 13,464–13,471.
27. Lickliter, E. (1993). Timing and the development of perinatal perceptual organization. In G. Turkewitz &
D. Devenney (Eds.), Developmental time and timing (pp. 105–123). Hillsdale, NJ: Erlbaum.
28. Markman, E., & Wachtel, G. (1988). Children’s use of mutual exclusivity to constrain the meaning of words.
Cognitive Psychology, 20(2), 121–157.
29. Masur, E., & Rodemaker, J. (1999). Mothers’ and infants’ spontaneous vocal, verbal, and action imitation
during the second year. Merrill-Palmer Quarterly, 45(3), 392–412.
30. McGeer, T. (1990). Passive dynamic walking. International Journal of Robotics Research, 9(2), 62–82.
31. Mendelson, M. J., & Haith, M. M. (1976). The relation between audition and vision in the human newborn.
Monographs of the Society for Research in Child Development, 41(4); Serial No. 167.
32. O’Regan, J. K., & Noë, A. (in press). A sensorimotor account of vision and visual consciousness. Behavioral
and Brain Sciences, 24, 939–973.
33. Pfeifer, R., & Scheier, C. (1999). Understanding intelligence. Cambridge, MA: MIT Press.
34. Piaget, J. (1963). The origins of intelligence in children. New York: Norton.
35. Plunkett, K., & Marchman, V. (1991). U-shaped learning and frequency effects in a multi-layered
perceptron: Implications for child language acquisition. Cognition, 38, 1–60.
36. Richardson, D., & Spivey, M. (2000). Representation, space, and Hollywood Squares: Looking at things that
aren’t there anymore. Cognition, 76, 269–295.
37. Rogoff, B. (1990). Apprenticeship in thinking: Cognitive development in social context. Oxford, UK: Oxford
University Press.
38. Rosenblatt, J. S., Turkewitz, G., & Schneirla, T. C. (1969). Development of home orientation in newborn
kittens. Transactions of the New York Academy of Sciences, 31, 231–250.
39. Rovee-Collier, C., & Hayne, H. (1987). Reactivation of infant memory: Implications for cognitive
development. In H. Reese (Ed.), Advances in child development and behavior, Vol. 20 (pp. 185–238).
San Diego, CA: Academic Press.
40. Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tense of English verbs. In
J. L. McClelland & D. E. Rumelhart (Eds.), Parallel distributed processing: Explorations in the microstructure
of cognition, Volume 2. Cambridge, MA: MIT Press.
41. Samuelson, L. (2002). Statistical regularities in vocabulary guide language acquisition in connectionist
models and 15–20-month olds. Developmental Psychology, 38, 1011–1037.
42. Samuelson, L. K., & Smith, L. B. (1999). Early noun vocabularies: Do ontology, category structure and
syntax correspond? Cognition, 73(1), 1–33.
43. Schaffer, H. R. (1996). Social development. Oxford, UK: Blackwell.
44. Smith, L. B. (2000). How to learn words: An associative crane. In R. G. K. Hirrsh-Pasek (Ed.), Breaking the
word learning barrier (pp. 51–80). Oxford, UK: Oxford University Press.
45. Smith, L. B. (2003). How space binds words and referents. In preparation.
46. Smith, L. B., Jones, S. S., Landau, B., Gershkoff-Stowe, L., & Samuelson, L. (2002). Object name learning
provides on-the-job training for attention. Psychological Science, 13(1), 13–19.
47. Smith, L. B., Quittner, A. L., Osberger, M. J., & Miyamoto, R. (1998). Audition and visual attention: The
developmental trajectory in deaf and hearing populations. Developmental Psychology, 34(5), 840–850.
48. Smith, L. B., Thelen, E., Titzer, R., & McLin, D. (1999). Knowing in the context of acting: The task
dynamics of the A-not-B error. Psychological Review, 106(2), 235–260.
49. Spelke, E. S. (1979). Perceiving bi-modally specified events in infancy. Developmental Psychology, 15, 626–636.
50. Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
51. Thelen, E., Schoener, G., Scheier, C., & Smith, L. B. (2001). The dynamics of embodiment: A field theory of
infant perseverative reaching. Behavioral & Brain Sciences, 24(1), 1–86.
52. Trevarthen, C. (1988). Infants trying to talk. In R. Söderbergh (Ed.), Children’s creative communication. Lund,
Sweden: Lund University Press.
53. Turkewitz, G., & Kenny, P. A. (1985). The role of developmental limitations of sensory input on
sensory/perceptual organization. Journal of Developmental and Behavioral Pediatrics, 6, 302–306.
54. Yoshida, H., & Smith, L. B. (2003). Sound symbolism and early word learning in two languages. Submitted
to Annual Conference of the Cognitive Science Society.
55. Titzer, R., Thelen, E., & Smith, L. B. (2003). Learning about transparency. Unpublished manuscript.
56. Vygotsky, L. S. (1962). Thought and language. New York: MIT Press and Wiley.
57. Wertheimer, M. (1961). Psychomotor coordination of auditory-visual space at birth. Science, 134, 1692.