Research Limitations

4. Research Limitations

The models mentioned in the previous pages make very weak assumptions about the transmission process itself: no language is easier or harder for learners to acquire than any other. In these models, fitness (the need to communicate effectively with others), rather than learning, can be said to be the driving force in the dynamics of language.

In addition, computational models might not match the abilities and biases of real human learners to a realistic extent. Laboratory experiments, for their part, inevitably use modern humans, and it is hard to determine what humans were like at the point when language emerged. However, it is worth noting that replicating the evolutionary history of language is not the objective of such experimental approaches.

Part II: Computational Models

3. Computational models

When studying a social phenomenon such as language, there are certain “predictable” patterns in human behavior that we can try to model with statistical physics or mathematics to represent large-scale population behavior (Loreto & Steels, 2007). As mentioned in Chapter 4, agent-based modelling is a new analytical method for the social sciences. It enables one to build models in which individual entities and their interactions are directly represented. The collective result (“macro” phenomena) of interactions among individual agents (“micro” dynamics) can be observed in, or inferred from, these modelling efforts (Castellano, Fortunato, & Loreto, 2009). In other words, it allows one to represent multiple scales of analysis in a natural and efficient way. In linguistics, we can use agent-based modelling to study the emergence and evolution of language (Sierra-Santibáñez, 2015).
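To make the micro/macro relationship concrete, the sketch below simulates a very small agent-based model. It is an illustrative assumption rather than a model described in this chapter: a minimal "naming game" in which pairs of agents negotiate a word for a single object. The function name `naming_game`, the population size, and the word labels are all hypothetical. Only pairwise interactions (the "micro" dynamics) are coded; a shared convention (the "macro" phenomenon) is never programmed in directly.

```python
import random


def naming_game(n_agents=10, max_rounds=5000, seed=0):
    """Run a minimal naming game; return (rounds_to_consensus, inventories)."""
    rng = random.Random(seed)
    # Each agent starts with an empty set of candidate words for one object.
    inventories = [set() for _ in range(n_agents)]
    next_word = 0
    for round_no in range(1, max_rounds + 1):
        speaker, hearer = rng.sample(range(n_agents), 2)
        # A speaker with no word invents a new one.
        if not inventories[speaker]:
            inventories[speaker].add(f"w{next_word}")
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:
            # Success: both agents keep only the word that worked.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:
            # Failure: the hearer adds the unfamiliar word.
            inventories[hearer].add(word)
        # Consensus: every agent holds exactly the same single word.
        if all(len(inv) == 1 and inv == inventories[0] for inv in inventories):
            return round_no, inventories
    return None, inventories


rounds, final_inventories = naming_game()
print("consensus after", rounds, "interactions")
```

The design choice worth noting is that agents have no global view of the population: any population-level agreement has to arise from repeated local interactions.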

3.1 Literature on agent-based modeling

Agent-based modeling originated in the 1940s, when Von Neumann and Ulam created the concept of cellular automata (Von Neumann, 1966; Ulam, 1960). In linguistics, earlier work in this area sought to explain the role of interaction and negotiation, or of learners' biases, in shaping communication systems, focusing mainly on the conditions under which communicatively optimal, socially learnt communication systems would emerge (Kirby, Griffiths, & Smith, 2014). Thereafter, researchers tried to find out how linguistic structure can arise from iterated learning (Kirby, 2001). Emphasis was placed on the role of the learning bottleneck, which was thought to be the driving force behind the evolution of structure, since language learners must try to learn an infinitely expressive linguistic system on the basis of a small set of linguistic data (Kirby, 2001). A major finding is that compositional languages emerge from unstructured languages through repeated transmission via the learning bottleneck. Language structure appears as an adaptive response by language itself to the problem of being transmitted through a narrow bottleneck, since the presence of compositional rules enables a learner to infer, from a small sample, the rules underpinning the whole language (Kirby, 2001).

Other models address the emergence of systematicity in phonological systems through communicative interaction and iterated learning. One example is the work of De Boer, who looked at the cultural evolution of vowel systems, showing that universal features of the organisation of vowel systems across the world's languages can arise through repeated interaction between simulated agents under certain reasonable articulatory and perceptual constraints.

Besides the findings on compositionality and vowel systems, there are other areas of communication and language that we can examine using computational models. In the sections below, we present how computational models (in the form of autonomous computer programs in physical robot bodies, called “embodied agents”) demonstrate the emergence of certain behaviors that underpin communication (see also: Chapter 4). First, a case study (Ampatzis, Tuci, Trianni, & Dorigo, 2010) shows how a basic level of cooperation can be achieved among embodied agents via acoustic signalling, something that occurs prior to the emergence of language. Second, we present an experiment by Luc Steels (2010) showing how embodied agents can form an inventory of spatial categories for communication by switching between their own perspective and that of other agents.

3.2 Embodied agents in communication studies

Embodied agents come in two forms: physical and virtual. Unlike a virtual agent, a physical embodied agent possesses a physical body. Some physical agents are built with human-like or animal-like features that give them the physical capabilities to perform certain tasks or actions (Sukthankar, 2008). The iCub (Parmiggiani et al., 2015), a humanoid robot that resembles a 3-year-old child, is one example of a physical embodied agent.

On the other hand, the virtual embodied agent, also known as the interface agent, is represented by a simulated avatar that one sees on a computer screen (Serenko, Bontis, & Detlor, 2007). The computer simulation is achieved with the use of relevant software and artificial intelligence (Serenko, Bontis, & Detlor, 2007). For example, students who attend online lessons can interact with a conversational agent that provides tutoring services.

These embodied agents form a communication system that resembles human language (Parisi, 2010). By constructing artificial organisms that behave like real-life organisms, we can improve our understanding of the behaviours of the latter, which aids linguistic and other scientific research (Parisi, 2010). Before researchers can progress to the aspects of grammar and lexicon in language evolution using embodied agents, it is important to maximise the potential of these agents’ communication capabilities. Hence, the emergence of communication in embodied agents is a crucial phase in the evolution of language itself. Further, devices and other technological equipment can be created or improved to assist humans in their daily activities (Parisi, 2010).

3.2.1 General framework of embodied cognition

To enable embodied agents to communicate, Mirolli and Nolfi (2010) emphasise the importance of the general framework of embodied cognition. The framework is primarily a collection of different ideas about understanding behaviour, which are related in that they challenge the classical cognitive science paradigm, where the physicality of agents and their environments is not taken into consideration (Mirolli & Nolfi, 2010). Accordingly, they discuss three important aspects relating to the physicality of these agents and their environments.

Firstly, “situatedness” refers to the environment in which the agent is located (Mirolli & Nolfi, 2010). Parameters must be clearly defined to regulate the interaction between the agent and its external environment and, in some cases, its interaction with other agents in the same environment (Mirolli & Nolfi, 2010). In short, “situatedness” provides the agent with the details of the activity and environment (Mirolli & Nolfi, 2010).

Secondly, “embodiment” refers to the physical properties or characteristics of the agent's body (Mirolli & Nolfi, 2010). The important characteristics include the agent’s weight, height, shape and size, and the type, position, and number of its actuators and sensors (Mirolli & Nolfi, 2010). These properties affect how the agent behaves and solves problems (Mirolli & Nolfi, 2010). In addition, the control system, or “brain”, fundamentally influences the agent’s behaviour. When the agent is required to possess characteristics similar to those of natural organisms, control systems such as artificial neural networks are preferred; for agents that are simpler and less bio-mimetic, look-up tables or production rules can be used instead (Mirolli & Nolfi, 2010).

Lastly, “adaptivity” refers to the understanding that communication is not the sole action performed by the agents (Mirolli & Nolfi, 2010). The adaptive value of communication needs to be taken into consideration, with communication investigated as a way of sub-serving other, non-communicative behaviours (Mirolli & Nolfi, 2010). This aids in studying how communicative and non-communicative behaviours develop and adapt together (Mirolli & Nolfi, 2010).

While these three aspects are crucial in understanding how agents should behave and communicate, it is important to note that they do not form a clear-cut dichotomy: in setting up experiments with embodied agents, a continuum exists between having all three aspects fully present and having none of them present at all (Mirolli & Nolfi, 2010).

3.2.2 Evaluation criteria in the assessment of communication

Based on these agents’ sensory-motor experiences, researchers are able to study how signals and meanings originate (Mirolli & Nolfi, 2010). Properties of these signals, such as their expressive power and organizational complexity, help in monitoring the progress of communication in embodied agents. The number of signals produced allows us to understand the adaptiveness of the agents. Further, the type of signal emitted provides information on the meaning of the message. For instance, the distinction between deictic and displaced signals indicates whether a signal refers to the current context experienced by the issuer and the recipient. A set of rules is also necessary to regulate how signals are exchanged among agents, so that the agents can function appropriately for the circumstance or environment. While the structure of signals is an important dimension in the evaluation, structured forms of communication have yet to be developed by embodied agents themselves; their emergence would represent an achievement in this field of research.

Other criteria in monitoring the agents’ communicative performance include understanding the adaptive role of the agents, recording the level of robustness or stability of the communication system, and modelling original or new theories (Mirolli & Nolfi, 2010).

3.2.3 Case study of the evolution of signalling in a multi-robot system

Fig. 1. The set-up of the experiment (a) Environment A and (b) Environment B (Ampatzis, Tuci, Trianni, & Dorigo, 2010, p. 163).

In this experiment (Ampatzis, Tuci, Trianni, & Dorigo, 2010), there are two circular zones with a diameter of 120 cm, namely environment A and environment B. Each environment contains a light source, and two s-robots are placed randomly at 75 and 95 cm away from the light source. The light source in each environment is surrounded by a coloured band, which represents the “danger” zone. In environment A, however, a section of the band is removed to create a “way in” zone, a path along which the robots can travel to reach the light source; environment B has no “way in” zone. The ultimate goal of the experiment is for both robots to discriminate between the two environments, moving towards the light source safely in environment A and away from the danger zone in environment B.

Equipped with light sensors, floor sensors and a sound signalling system, the s-robots first navigate around the environment on their wheels. When faced with “danger”, they emit a sound to indicate that they are aware of it and begin to move away accordingly. It was found that the sound signals were not only observable by the experimenter: the other robot in the same zone also responded to the signal and moved away from the coloured band. The s-robots were thus not just signalling but behaving socially. Furthermore, the sound signal was shown to encode sensory information that is integrated over time, which increases the reliability of the categorisation process.
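The idea of a signal that encodes sensory information integrated over time can be illustrated with a small sketch. This is not the controller used in the study; it is a hypothetical leaky integrator, and the function name `signal_trace`, the `leak` factor, and the `threshold` are assumptions chosen only for illustration. A single floor-sensor reading is not enough to trigger the sound, but sustained readings are.

```python
def signal_trace(floor_readings, leak=0.8, threshold=2.0):
    """Return, for each time step, whether a sound signal would be emitted.

    floor_readings: sequence of numbers (1 = coloured band detected, 0 = not).
    leak: fraction of accumulated evidence kept from one step to the next.
    threshold: accumulated evidence needed before the sound is emitted.
    """
    evidence = 0.0
    emitted = []
    for reading in floor_readings:
        # Old evidence decays while new sensory input is added.
        evidence = leak * evidence + reading
        emitted.append(evidence >= threshold)
    return emitted


# A brief, isolated detection never reaches the threshold.
print(signal_trace([1, 0, 0, 0]))
# Sustained detection accumulates and eventually triggers the signal.
print(signal_trace([1, 1, 1, 1]))
```

Because the decision depends on accumulated rather than instantaneous input, a momentary sensor glitch is less likely to produce a false alarm, which is one way to read the reliability claim above.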

3.3 Embodied agents in language studies

Besides communication studies, certain aspects of language such as name formation, spatial inventories, and grammatical case can also be investigated using embodied agents. Due to the constraints in this wiki chapter, only the spatial language and perspective reversal game (Steels, 2010) will be briefly introduced.

Consider this example: when two people are standing opposite each other, they see the world differently because each adopts a different vantage point. However, one's own body remains a constant landmark that can be used as a reference point. Thus, one could use various spatial terms such as left, right, up, and down (with respect to one's own body) to communicate with the other about directions. How can embodied agents with no predefined inventory of spatial terms convey meaningful utterances to one another?

3.3.1 Methodology

In this experiment, the researchers test the assumption of egocentric perspective transformation (EPT) (Steels, 2010). EPT is the agent’s ability to transform certain features of an object with respect to the position of another object. Over the course of approximately 5000 games, the experiments revealed that the agents are able to self-organize a communication system that includes the formation of an inventory of spatial categories (Steels, 2010). The embodied agents used in the study were five AIBO robot dogs that could move freely within an indoor laboratory.
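As a geometric illustration of what a perspective transformation involves, the sketch below converts an object's position from one agent's body-centred frame into another agent's frame. It is a simplified assumption, not the procedure implemented on the robots: poses are taken to be 2D `(x, y, heading)` tuples, egocentric points are `(forward, left)` pairs, and all function names are hypothetical.

```python
import math


def ego_to_world(pose, point):
    """Convert a (forward, left) point in an agent's frame to world coordinates."""
    x, y, heading = pose
    forward, left = point
    return (x + forward * math.cos(heading) - left * math.sin(heading),
            y + forward * math.sin(heading) + left * math.cos(heading))


def world_to_ego(pose, point):
    """Convert a world point to (forward, left) in an agent's frame."""
    x, y, heading = pose
    dx, dy = point[0] - x, point[1] - y
    return (dx * math.cos(heading) + dy * math.sin(heading),
            -dx * math.sin(heading) + dy * math.cos(heading))


def perspective_transform(own_pose, other_pose, point_in_own_frame):
    """Re-express a point seen from one's own body as seen from the other's body."""
    return world_to_ego(other_pose, ego_to_world(own_pose, point_in_own_frame))


def side(point):
    """Name the side of the body on which a (forward, left) point lies."""
    return "left" if point[1] > 0 else "right"


# Two agents standing 2 units apart and facing each other.
speaker = (0.0, 0.0, 0.0)
hearer = (2.0, 0.0, math.pi)
ball_for_speaker = (1.0, 0.5)  # 1 unit ahead, 0.5 units to the speaker's left
ball_for_hearer = perspective_transform(speaker, hearer, ball_for_speaker)
print(side(ball_for_speaker), side(ball_for_hearer))  # prints "left right"
```

The example reproduces the situation of two interlocutors standing opposite each other: the same object lies to the left of one and to the right of the other, so a spatial term is only interpretable once the perspective is known.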

Fig. 2. Graph of experimental results comparing the communicative success of AIBO robots that (a) did not need EPT, (b) required EPT but could not perform it, (c) could perform EPT but did not have a spatial language lexicon, and (d) could perform EPT and had spatial categories marked in language (Steels, 2010, p. 251).

3.3.2 Results

It was found that EPT was essential for the robots' communicative success because EPT reduced the amount of cognitive effort required of language users (Steels, 2010). As seen in the graph above, robots without EPT (graph b) achieved only 15% communicative success. In graph c, we see that robots with EPT unmarked in language achieved greater success, but at the expense of greater cognitive effort, because the hearer needs to adopt the speaker’s perspective and then perform EPT in the hearer’s world model (Steels, 2010). Robots with EPT marked in language (graph d) have an added “perspective indicator” in their conceptualization. Cognitive effort drops significantly, and the hearer knows instantly which perspective to use (Steels, 2010).

3.3.3 Discussion

EPT was found to be essential for communicative success (Steels, 2010). Spatial categories emerged as predicates in the robots’ conceptualization, which helped reduce the cognitive effort of the hearer. This finding is significant for language evolution studies. From an evolutionary perspective, humans come from the superfamily Hominoidea, whose members are characterized by having no tails, being bipedal, and, most significantly, having large brains. This comparatively large brain consumes a quarter of the body’s energy even when at rest; thinking was thus a “costly” activity (Harari, 2014). It is possible that marking spatial categories in language was an evolutionarily more efficient way of using the brain.

Part I: Iterated Learning Model (ILM)


2. Iterated Learning Model (ILM)

2.1 Theory of Iterated Learning


The aforementioned models are largely based on iterated learning, the key mechanism of the cultural evolution of language.

Iterated learning refers to the process by which an individual acquires a behavior by observing a similar behavior in another individual who acquired it in the same way.

Early studies of iterated learning that observed human behaviour in a laboratory setting were designed to learn about cultural transmission. One of the pioneering studies in iterated learning is Bartlett’s (1932) ‘serial reproduction’ experiment, in which participants were exposed to some stimulus (such as drawings) and were then asked to reproduce the same material from memory. Their reproduced work served as the stimulus for a second participant, and so on. Bartlett observed that material transmitted in this manner changed as participants impressed onto it their expectations of what the right and appropriate content was, thus causing it to be restructured. For example, if one participant was shown a picture of an apple and told to draw it, the next would observe the resulting drawing and produce a new drawing of an apple. An interesting observation that came out of the study was that drawings could change toward conventional, prototypical forms of the object drawn (Bartlett, 1932).


Spoken and signed languages, birdsong, and music are transmitted via iterated learning as opposed to explicit teaching. One’s linguistic behavior is thus a product of one’s observation of others’ similar behavior, which was in turn induced by that of those who came before. The chain of diffusion occurs not only in cultural transmission across generations, but also in horizontal negotiation of conventions between peers of the same generation. Iterated learning is thus manifested both along a cross-generational chain of different individuals and back-and-forth within a dyad.

Iterated learning can have profound effects on linguistic structure. One study of such effects involved the learning of an artificial language by participants who were organized into diffusion chains. Such studies show that iterated learning is an adaptive process, in which the linguistic behavior being transmitted gains input from each generation to overcome key constraints. Constraints on a language’s transmission over time include error rate and ambiguity.

2.2 Simulating cultural transmission of language


The question of how language emerged concerns itself with both the biological evolution of the various cognitive capacities deemed necessary for language and the cultural evolution of languages, beginning from theorized proto-languages. It should be noted that cultural evolution should not be considered in isolation from its biological counterpart, as the cognitive adaptations an individual is equipped with bear implications for social interaction and learning.

Once established, a language must be learned by each subsequent generation from the prior one, a cross-generational process known as vertical cultural transmission.

People learn a language from other people who once learned that language themselves.

Studies on cultural transmission seek to explain the changes an emergent language system undergoes.

The cultural transmission of a language can be said to give rise to design without intention or a designer.

Exposure to linguistic behavior exhibited by members of one’s speech community induces one’s production of particular language properties. The resulting language used by one in turn translates to observable linguistic behavior which shapes the language of further members. Cultural evolution of the language is thus enabled by this cycle of repeated induction and elicitation of linguistic behavior.

Simulations of cultural transmission are based on the belief that when language is culturally transmitted, it develops:

  • Structure
  • Key design features unique to language over other communication systems
  • Enhanced learnability through minimization of errors

The models of research include computational agent-based simulations, mathematical models, and, most recently, laboratory experiments.

Pioneering work on agent-based simulations sought to explain how learners’ biases, and the interaction and negotiation between learners, shape communication systems. Subsequent studies focused largely on the development of linguistic structure as a byproduct of cultural learning (specifically iterated learning), despite poverty of the stimulus. Work on mathematical models followed, supplementing the findings of such agent-based simulations through mathematical characterizations of the changes effected by cultural transmission.

To support prior computational and mathematical models empirically, laboratory experiments aim to demonstrate how cumulative, adaptive, and non-intentional the cultural evolution of language is, by using human participants.

2.2.1 Challenges

One significant challenge faced by studies on cultural evolution remains the arguably reductionist approach taken. Any given language is constituted by thousands of language systems (capturing pragmatic, semantic, morphological, and phonological distinctions) and language strategies, all of which are intertwined. No model can hope to closely replicate all aspects of language evolution through simulation.

Despite the availability of real-life observations of cultural transmission, such as Nicaraguan Sign Language, the study of genuine emergence remains limited by the lack of direct, natural data. Hence, only indirect evidence can be drawn upon.

2.3 Laboratory experiments with human subjects

Several studies have combined iterated learning techniques with artificial language learning or communication game paradigms in a bid to explore how languages and other communication systems evolve through learning and use. In language evolution research, iterated learning has become an experimental paradigm that uses artificial languages: human participants learn a set of items in the language and then produce linguistic behavior from which subsequent individuals learn, and so on. This paradigm was introduced by Kirby, Cornish, and Smith. A learning bottleneck is also imposed on transmission: each participant is asked to learn the target language from exposure to only a subset of the language items produced by the previous participant. In this chain, the nth participant provides the input to participant n + 1, and so forth. In short, participants have access to only a limited set of data.
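The transmission protocol described above (a chain in which participant n provides the input to participant n + 1 through a bottleneck) can be sketched as follows. This is a minimal illustration under stated assumptions, not the experimental software: the function names, the label alphabet, and the bottleneck size are hypothetical, and the simulated learner has no generalisation bias (it reproduces the items it saw and invents random labels for the rest), so the sketch shows the protocol only and is not expected to produce structure.

```python
import itertools
import random


def random_label(rng, length=4):
    """Invent an arbitrary label (illustrative alphabet)."""
    return "".join(rng.choice("aeiouklmnp") for _ in range(length))


def run_chain(meanings, generations=5, bottleneck=4, seed=0):
    """Pass a language down a chain of learners through a bottleneck."""
    rng = random.Random(seed)
    # Generation 0: a random label for every meaning.
    language = {m: random_label(rng) for m in meanings}
    history = [language]
    for _ in range(generations):
        # Bottleneck: the learner is exposed to only a subset of items.
        seen = rng.sample(sorted(language), bottleneck)
        learned = {m: language[m] for m in seen}
        # The learner must still label every meaning, including unseen ones.
        language = {m: learned.get(m, random_label(rng)) for m in meanings}
        history.append(language)
    return history


meanings = list(itertools.product(["black", "blue"],
                                  ["circle", "triangle"],
                                  ["bouncing", "looping"]))
history = run_chain(meanings)
for generation, language in enumerate(history):
    print(generation, language[("black", "circle", "bouncing")])
```

Replacing the random guess for unseen meanings with a learner that generalises from the items it has seen is the step at which structure could begin to accumulate across the chain.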

2.3.1 Literature on laboratory experiments with humans

Many types of experiments relating to exploring the phenomenon of language evolution have been conducted in the laboratory. They are divided into experiments focusing on signal creation, the emergence of communication systems, and cultural transmission itself.

Signal Creation

Understanding the origins of language involves uncovering the cognitive capacities necessary for linguistic communication and for detecting communicative intentions. Studies on signal creation investigate how individuals recognise the communicative nature of certain behaviour, before even questioning how meaning is created from the signals. Scott-Phillips et al.'s (2009) embodied communication game (ECG) is a two-player game designed to serve this investigation. It requires participants to travel around a 2×2 grid with movement as their only communicative resource. This forces participants to find ways of revealing which movements are communicative in nature rather than simply acts of travel. The difficulty of this task revealed that common ground is especially key to the emergence of communication channels. Related work (as cited in Scott-Phillips & Kirby, 2010) also led to similar conclusions. The challenge of these games lies in the difficulty of communicating one’s communicative intent.

Emergence of communication systems

Once communicative intent is established, individuals face the task of negotiating the forms and meanings of symbols to create a communication system. In a pioneering study by Galantucci (2005), pairs of participants were tasked with inventing and agreeing on a set of signs to use in solving a coordination problem. The study aimed to illustrate how human communication can be understood as a form of joint action. One line of research that has grown out of the pioneering studies on the role of interaction in the emergence of communication systems is the use of graphical communication tasks. Such tasks are advantageous in that they provide a medium which allows new signs to be invented and used in an interactive context. Examples of studies that illustrate this include those by Garrod et al. (2007, 2010).

Cultural transmission

Following the establishment of some sort of language, cultural transmission, which is an instance of iterated learning, takes place. In Kirby et al.’s (2008) experiment, participants were asked to learn labels for coloured moving shapes, where the initial artificial language provided a randomly generated, unique label for each shape. Their findings revealed a structured language that developed from the initial unstructured set of meaning-signal associations as a result of iterated learning. By the 10th participant, each label consisted of a prefix which specified colour (e.g. ne– for black, la– for blue), a stem for shape (e.g. –ho– for circle, –ki– for triangle) and a suffix specifying motion (e.g. –plo for bouncing, –pilu for looping). As predicted by mathematical modelling results, languages developed over time into ones that facilitated generalisations and rules. Compositional languages also developed, in which the components or sub-parts of each complex label or word specified components of the picture that the label referred to.
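The compositional scheme described above can be written out as a small sketch, using only the example morphemes quoted in the text (the full inventory in the experiment is not reproduced here, and the function name `label` and the dictionary names are illustrative). It shows why such a language passes easily through a bottleneck: six morphemes are enough to label eight distinct meanings, so a learner who has seen only some labels can still construct the rest.

```python
import itertools

# Example morphemes quoted in the text; the dictionary names are illustrative.
COLOUR = {"black": "ne", "blue": "la"}
SHAPE = {"circle": "ho", "triangle": "ki"}
MOTION = {"bouncing": "plo", "looping": "pilu"}


def label(colour, shape, motion):
    """Build a label as colour prefix + shape stem + motion suffix."""
    return COLOUR[colour] + SHAPE[shape] + MOTION[motion]


all_labels = {
    meaning: label(*meaning)
    for meaning in itertools.product(COLOUR, SHAPE, MOTION)
}

print(label("black", "circle", "bouncing"))  # prints "nehoplo"
print(label("blue", "triangle", "looping"))  # prints "lakipilu"
print(len(all_labels), "meanings from",
      len(COLOUR) + len(SHAPE) + len(MOTION), "morphemes")
```

In a holistic language, by contrast, each of the eight labels would have to be memorised separately, and any label missing from the learner's sample could not be recovered.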

Chapter 8 – Language Evolution in the Laboratory

2019: Nur Amirah Bte Rosman, Joanne Tan Hui San, Cheng Wei Cong Jonathan
2014: Goh Xiao-Qing, Ng Si, Ning You Jing


This is a site dedicated to Language Evolution, mainly focusing on the topic of Language Evolution in the Laboratory. We’ve divided this topic into pages in a chronological order. Hope you’ll have a good read!

Start reading about experiments on language evolution and the ideas behind them now!

1. Introduction

1.1 Observing Language Transmission

The origins of natural language cannot be observed directly. In recent years, evolutionary linguists have designed experiments in the laboratory to study the human cognitive capacities necessary for language and the emergence of new languages. There are three main methodologies for understanding how the acquisition of language occurs among populations of individuals. Firstly, there are computational/robotic models, in which embodied agents (see also: Chapter 4) interact with one another in simulated environments (Kirby, 2012). Less commonly, there are mathematical models, which focus on mathematical techniques. Lastly, the iterated learning model (ILM), which uses human subjects, is based on the principle that individuals learn by observing instances of others’ behavior and that population-level behavior is a collective result of the interaction of individual subjects (Kirby, 2012). These models represent and explore plausible hypotheses about the historical origins of languages.

1.2 Research Significance

Language evolution experiments thus focus on the emergence of new languages and how they are used by human participants. A central theme in language evolution laboratory research is the transformation of individual-level behaviours observed in each participant into linguistic phenomena that can occur at the level of an entire population (Kirby, 2012). The emergence of languages or linguistic phenomena cannot be explained only by reference to the evolution of our biological capacities, such as our cognitive mechanisms. As such, we need to consider factors of interaction that operate from generation to generation, such as cultural transmission and feedback, in providing a considered account of the evolution of language. Research in the laboratory is thus able to provide empirical data that can further the investigation into the role of cultural transmission and feedback.

In Part I of this chapter, we will explain iterated learning as the underlying concept of laboratory experiments on language evolution. It explores the main hypotheses of and previous work on the cultural transmission of language. In Part II, we present some studies of computational models which help us corroborate the validity of certain theories in language. Lastly, the limitations faced by this area of research are briefly discussed.

Conclusion and References

5. Conclusion

We have introduced gestures in this topic of language evolution by exploring the different types of gestures, namely representational gestures, beat gestures and interactive gestures. In both infants and animals, imitating gestures is a way of learning how to communicate. Infants imitate to match what they see to what they do, while animals imitate as a form of social learning – by imprinting behaviors from an adult or peer to a child.

We also discussed the popular Gestural Theory of the evolution of language, comparing it to the Theory of Language in terms of human communication. To sum up, the Gestural Theory provides a wide range of evidence accounting for the precedence of gestures over speech in the origins of language, garnering support from many linguists and evolutionists. For example, neurophysiological evidence provided by mirror neurons supports the development of language from gestures. The vast amount of research done on gesturing in animal communication, and the homology in mirror systems in the brain, support the view that gestures served as an important means of primate communication before vocalizations ultimately led to speech. Gestures in animals, in particular non-human primates, are hence important in providing evidence of this form of primate communication before vocalizations were possible. Studies of both apes and monkeys support the communicative functions of gesturing in animals, with their wide variety of gesture repertoires providing clues to speech repertoires today.

The case study on Nicaraguan Sign Language (NSL) provides an interesting insight into how home sign systems and gestures resulted in the birth of a new language across two cohorts of children, following the formation of a deaf community in Nicaragua. It shows just how impactful gestures and signs are in our world, where languages are constantly evolving over time.

We hope you have gained greater insight into gestures as a mode of communication that was not only present before vocalization and speech, but has also been integrated into our communication systems today as a supplement to speech.


6. References

Alibali, M. W., Heath, D. C., & Myers, H. J. (2001). Effects of visibility between speaker and listener on gesture production: Some gestures are meant to be seen. Journal of Memory and Language, 44(2), 169-188. doi:10.1006/jmla.2000.2752

Arbib, M. A., Liebal, K., & Pika, S. (2008). Primate vocalization, gesture, and the evolution of human language. Current Anthropology, 49(6), 1053-1076. doi:10.1086/593015

Bekkering, H., Wohlschlager, A., & Gattis, M. (2000). Imitation of gestures in children is goal-directed. Quarterly Journal of Experimental Psychology Section A, 53(1), 153-164. doi:10.1080/027249800390718

Brooks-Pollock, T. (2014). The 66 gestures which show how chimpanzees communicate. Retrieved April 2, 2015, from

Corballis, M. C. (2002). From hand to mouth, the origins of language. Princeton: Princeton University Press.

Fay, J. M. (1989). Hand-clapping in western lowland gorillas (Gorilla gorilla gorilla). Mammalia, 53(3).

Fogassi, L., & Ferrari, P. F. (2004). Mirror neurons, gestures and language evolution. Interaction Studies, 5(3), 345-363. doi:10.1075/is.5.3.03fog

Gillespie-Lynch, K., Greenfield, P. M., Feng, Y., Savage-Rumbaugh, S., & Lyn, H. (2013). A cross-species study of gesture and its role in symbolic development: Implications for the gestural theory of language evolution. Frontiers in Psychology, 4.

Goodall, J. (1986). The Chimpanzees of Gombe: Patterns of behaviour. Cambridge: Belknap.

Hobaiter, C., & Byrne, R. W. (2014). The meanings of chimpanzee gestures. Current Biology, 24(14), 1596-1600.

Kimura, D. (1993). Neuromotor mechanisms in human communication. Oxford: Oxford University Press.

Kohler, E., Keysers, C., Umilta, M. A., Fogassi, L., Gallese, V., & Rizzolatti, G. (2002). Hearing sounds, understanding actions: Action representation in mirror neurons. Science, 297, 846-858.

Kummer, H. (1968). Social organization of hamadryas baboons. Chicago: University of Chicago Press.

Maestripieri, D. (1999). Primate social organization, gestural repertoire size, and communication dynamics. The origins of language: What nonhuman primates can tell, pp. 55-77.

McLane, J. (1996). The voice on the skin: Self-mutilation and Merleau-Ponty’s theory of language. Hypatia, 11(4), 107-118.

McNeill, D. (1985). So you think gestures are nonverbal? Psychological Review, 92(3), 350-371. doi:10.1037//0033-295X.92.3.350

Meltzoff, A. N., & Moore, M. K. (1977). Imitation of facial and manual gestures by human neonates. Science, 198(4312), 75-78. doi:10.1126/science.897687

Morgan, G., & Kegl, J. (2006). Nicaraguan Sign Language and Theory of Mind: The issue of critical periods and abilities. Journal of Child Psychology & Psychiatry, 47(8), 811-819. doi:10.1111/j.1469-7610.2006.01621.x

Ogden, J., & Schildkraut, D. (1991). Compilation of gorilla ethograms. Atlanta: Gorilla Behavior Advisory Group.

Paget, R. A. S. (1963). Human speech: Some observations, experiments and conclusions as to the nature, origin, purpose and possible improvement of human speech.

Parnell, R. J., & Buchanan-Smith, H. M. (2001). Animal behaviour: An unusual social display by gorillas. Nature, 412(6844).

Pika, S. (2008). Gestures of apes and pre-linguistic human children: Similar or different? First Language, 28(2), 114-140. doi:10.1177/0142723707080966

Redshaw, M., & Locke, K. (1976). The development of play and social behaviour in two lowland gorilla infants. Journal of the Jersey Wildlife Preservation Trust, Thirteenth Annual Report, pp. 71-86.

Senghas, A., & Coppola, M. (2001). Children Creating Language: How Nicaraguan Sign Language Acquired a Spatial Grammar. Psychological Science (Wiley-Blackwell), 12(4),

Senghas, A., Kita, S., & Özyürek, A. (2004). Children Creating Core Properties of Language: Evidence from an Emerging Sign Language in Nicaragua. Science, 305(5691), 1779-1782.

Skoyles, J. R. (2000). Gesture, language origins, and right handedness. Psycoloquy, 11(24).

Zentall, T. R., & Akins, C. (2001). Imitation in animals: Evidence, function, and mechanisms. Cybernetics and Systems. doi:10.1080/019697201300001812

Part IV: Nicaraguan Sign Language

5. From Simple Signs to a Full Sign Language: The Nicaraguan Sign Language

How do individual simple signs become a full-blown language over time?

A case study that illustrates language evolution in progress is the emergence of Nicaraguan Sign Language (NSL). NSL was created through the interaction between previously isolated deaf people and subsequent cohorts of children exposed to this initial gestural communication (Morgan & Kegl, 2006). Across successive cohorts of learners, the community has systematized the grammar of the language. According to Senghas & Coppola (2001), language systematicity in NSL stems from children aged ten and younger, indicating that young children collectively possess the capacity to learn and create language. This is substantiated by one of NSL’s structurally complex features, spatial modulation.

One of the signs for “Nicaragua”.

Spatial modulations are the building blocks of grammar in sign languages. In developed sign languages, spatial modulations express grammatical relationships such as subject and object, as well as deictic, locative and temporal information (Senghas & Coppola, 2001). Senghas & Coppola’s (2001) study aimed to understand the role of spatial modulations in the systematicity of NSL grammar in terms of prevalence, function and production rate. The results showed that early-exposed signers of the second cohort used spatial modulations more frequently than those of the first cohort, revealing that the second NSL cohort did not merely reproduce the language of their predecessors, but also modified it as they learned it. Spatial modulations were also found to be the means of marking long-distance grammatical relationships among words in NSL, as in other established sign languages. Additionally, they were produced alongside an increase in overall fluency.

In their research on NSL, Senghas et al. (2004) focused on two of Hockett’s design features of language: discreteness and combinatorial patterning. These properties arose naturally as a product of the language-learning mechanism, although they were not available in the surrounding environment. The researchers found a predisposition for linear sequencing and a segmental approach to bundles of information.

For example, to describe complex motions, like rolling down a hill, participants of the second NSL cohort signed the manner and path sequentially, whereas the first cohort articulated the manner and path simultaneously. According to Senghas et al. (2004), when representations express manner and path separately, the iconicity of the simultaneous movement is no longer clear. However, this combinatorial change in communication, which allows for more potential and ambiguity, denotes a shift from gestural to more language-like expressions in NSL. Such combinatorial changes, which enable the production of infinite utterances from a finite set of elements, explain various core, universal properties of mature languages, such as how discrete elements (words and morphemes) are combined to form hierarchically organized constructions (phrases and sentences). These changes are further reinforced in the newer cohorts. Likewise, word order regularities driven by children are well documented in creoles, where similar sequencing elements have also been identified.

Simultaneous manner and path vs. sequential manner and path in NSL

Part III: Speech Before Gestures – Is It Possible?

4. Speech Before Gestures – Is It Possible?

In many studies of child acquisition of language, gestures pave the way for children’s early nouns. One simple illustration of this is a child producing a deictic gesture for a particular object, for example a dog, approximately 3 months before they are able to verbally label it (Iverson & Goldin-Meadow, 2005). However, does this always mean that gestures precede speech?

In Özçalışkan et al.’s (2013) study of iconic gestures and children’s early verbs, it was revealed that the use of deictic gestures begins in children at 10 months, preceding the production of verbs by six months. However, the onset of iconic gestures conveying action meanings follows, rather than precedes, the child’s first verbs. Examples of iconic gestures include flapping the arms to depict a bird flying or moving an empty fist forcefully forward to convey the meaning of “throwing”. Unlike pointing gestures, iconic gestures involve representing a referent with a particular symbol, and thus impose greater cognitive demands than deictic gestures. Eventually, gestures come to complement vocalized ideas. These dynamic iconic gestures hence suggest the need for an inner verbal system before an idea can be communicated, be it in gestures or speech.

The findings of the study also suggest that children use gestures to expand their repertoire of action meanings, but only after they have begun to acquire the verb system underlying their language. Perhaps the theory that gestures precede speech holds only for gestures that place fewer cognitive demands on the brain, such as deictic gestures. The acquisition of verbs and nouns is ultimately crucial to producing meaningful gestures that express relational concepts. Therefore, although investigating whether speech occurred before gestures seems hardly possible, it is arguable that an early inner verbal system fundamental to communication may have preceded gestures in human communication, as suggested by studies of children’s early verbs.

Gestures as primate communication – before speech. Photo: Dr. Catherine Hobaiter

Part II: Gestural Communication in Animals

3. Gestural Communication in Animals

Numerous studies show that non-human primates, our closest relatives, gesture as part of their communication systems. The pioneering studies of Goodall (1986) and Kummer (1968) documented different gestures used by monkeys and apes.

The studies of gestural communication in apes both in captivity and in the wild reveal the following:

    1. The use of communicative gestures is common across the species.
    2. There is considerable variability in gesture repertoire from group to group.
    3. Gestures are used flexibly in different contexts, depending on the behavior of the recipient.

Gesturing by primates is argued to be partially genetic, with apes performing gestures that are characteristic of their species even without having observed the gesture before. One piece of evidence for this is the “chest beat”, a display of threat, which was performed by two gorillas that had never witnessed the gesture (Redshaw & Locke, 1976). Similarly, chimpanzees from peer groups that had essentially no opportunity to observe older conspecifics developed many of the same play gestures as individuals from groups with a more natural composition (Berdecio & Nash, 1981). Hence, the Gestural Theory stresses the importance and intrinsic nature of gestures in primate communication, holding that gestures were the precursors to language and communication.

A gorilla’s gesture of “chest beat” is an intrinsic part of primate communication that conveys a message of threat.

3.1 Gestural Repertoires in Non-Human Primates

Pika (2008) explained that primates are able to produce different types of gestures, namely,

  1. Auditory gestures – dependent on the sound produced
  2. Tactile gestures – dependent on physical contact
  3. Visual gestures – dependent on visual information

Gestural repertoires vary depending on the individual’s age, sex and group affiliation. For example, at least twenty different gestures have been observed among siamangs, a group representative of small apes and gibbons. Such gestures include “embrace” and “offer body part”. The production of species-typical gestures may be due to genetic dispositions (Arbib et al., 2008), as chimpanzees from peer groups that had no opportunity to observe older chimpanzees developed many of the same play gestures as individuals from groups with a more natural composition.

3.2 Monkeys and Apes

Pika’s research (2008) revealed that monkeys are able to notify fellow members of the family. This was determined when one monkey made close eye-to-eye contact with another monkey before moving on to another task. It is possible that the first monkey was emphasizing a point to the other monkey. The research also revealed that monkeys were able to attract a fellow monkey’s attention by slapping the ground. Different species of monkeys have also been found to use different varieties of manual gestures and postures, varying as a function of social rank and context (Maestripieri, 1999).

Pika (2008) also showed that apes produce communicative gestures that are specific to their species. Ogden & Schildkraut (1991) suggested that gorillas are able to produce a combination of auditory and visual gestures. Some specific gestures include the splash display (Parnell & Buchanan-Smith, 2001) and hand clapping (Fay, 1989). Arbib et al. (2008) reported that gorillas in captivity are able to use at least 30 different tactile, visual and auditory gestures.

A recent study conducted by the University of St Andrews also records up to 66 different gestures that show how chimpanzees communicate, such as tapping another chimpanzee to indicate ‘stop that’ and raising an arm to indicate ‘give me that’ (Brooks-Pollock, 2014). Even with such a “gestural lexicon” for chimpanzees, many variations were observed in the communication of a given meaning, suggesting that different groups of chimpanzees develop their own sets of gestural repertoires. Nonetheless, Hobaiter & Byrne (2014) explained that, just as with human words, some chimpanzee gestures have several senses, but the meanings remain the same irrespective of who uses them.

This information on the gestural system of non-human primates points to the primitive origins of gestures as a form of communication even in the absence of speech, providing additional support for the gesture-first theory, which has drawn particular attention in evolutionary linguistics in recent years.

Here’s a video that shows how the “chest beat” (as performed innocently by the little girl) has agitated a gorilla, due to the gesture signifying an act of dominance and aggression:

Part I: Gestural Theory

2. Gestural Theory

McLane (1996) described gesturing as a means of communicating an experience. While this might encourage a person to speak, there are times when speech simply cannot be produced, for example when a topic is taboo or when the mind does not have the words to express what is intended. Based on the research she conducted on trauma patients, McLane suggests that physical and psychological trauma are examples of situations in which communication by speech is challenging. As a result, people may turn to gestures as a way of conveying their message.

The theory of language suggests that people use language as an extension of themselves because of a need to communicate their experiences. In other words, people need to share the things that have happened to them (McLane, 1996). This is what supports communication, as people “communicate to hear and to be heard”. As McLane (1996) put it, “We will say our lives in order to have or live our lives” (p. 107). This quote illustrates that in order to experience life, there is a need to express one’s experiences.

The Gestural Theory states that human language developed from gestures that were a primitive form of communication, as opposed to the vocal signals that might have been adopted by non-human primates. According to Gillespie-Lynch et al. (2013), bipedalism might have influenced gesturing, as walking on two feet frees both hands for gesturing. To date, the hypothesis that gestures preceded speech in human language remains a popular topic of discussion among both evolutionists and linguists. A substantial body of anatomical and neurophysiological data supports the view that human language evolved from gestural communication (Paget, 1963; Corballis, 2002; Kimura, 1993).

We will be focusing on the Gestural Theory to further our discussion on gestures in the evolution of language.

2.1 Imitating Gestures

Through the years, animals and humans have learned from imitation. Imitation makes learning more efficient because it bypasses the need for time-consuming trial and error. Humans nevertheless make conscious efforts to move away from imitation, especially when innovating or starting something new, both of which require some measure of trial and error. For animals, by contrast, the colloquial phrase “Monkey See, Monkey Do” is often used to describe the imitation of one another.

In humans, imitation is most commonly seen among children. A study by Meltzoff & Moore (1977) suggests that infants can imitate from as early as twelve days old, mimicking both manual actions and facial gestures. The research also illustrates that imitation occurs when infants attempt to match what they see (visual input) to their own actions (motor output). Another proposed theory is that imitation in children is goal-oriented, meaning that infants are assumed to respond to a stimulus by imitating it.

A series of experiments conducted by Bekkering et al. (2000) showed that infants react in the following ways:

    1. Towards an object – when reaching to touch something
    2. Towards an agent – when reaching towards an interactive object such as an adult’s fingers
    3. Towards a movement path – when reaching along a given direction
    4. Towards salient features – when reaching towards gestures such as arms crossing

Similarly, imitation is frequently seen in animals. Although it is less apparent in their gestures, animals seem to imitate more commonly in their behavior. Zentall & Akins (2001) speculated that one determining factor for imitation could be the need for social learning among animals. Imprinting is seen as a social learning method in which one animal imitates another. This form of phase-sensitive learning is independent of the consequences of the behavior. One common example is how ducklings follow their mother as she moves from one place to another. Lorenz (1935) hypothesized that this form of social learning has a critical period of 13 to 16 hours after hatching, just as language acquisition in humans is held to follow the Critical Age Hypothesis.

Like language, imprinting, a form of social learning by imitation in animals, is hypothesized to have a critical period for learning.

Hence, gesturing can be seen through the imitation of actions by both animals and humans, suggesting the presence of an indirect mode of communication that does not require speech.

2.2 Mirror Neurons

Neurophysiological evidence supports the theory that gestural communication served as the precursor of human language. Fogassi & Ferrari (2004) investigated area F5 of the monkey premotor cortex, where mirror neurons are located. These mirror neurons are activated when an animal executes a goal-related action or observes one performed by another. According to Skoyles (2000), mirror neurons are able to explain how signs are produced and interpreted. Mirror neurons may also have supported the emergence of spoken language after primitive gesturing systems declined as a means of communication.

According to research by Gallese et al. (1996) and Rizzolatti et al. (1996), these visuomotor neurons (mirror neurons) in area F5 of the brain are activated when monkeys perform hand actions, or when they observe another individual producing a similar action.


Two categories of mirror neurons illustrate the connection between mirror neurons and communication.

  1. Audiovisual mirror neurons become activated not only when monkeys observe an action, but also when they hear the sound of that action (Kohler et al., 2002).
    • For example, a monkey would respond to a peanut being broken open whether the action is seen, heard, or both. However, a monkey would not respond to the sight or sound of an unrelated action.
  2. Mouth mirror neurons become activated when a monkey observes or executes ingestive mouth actions such as biting, sucking and licking.
    • For example, a monkey would respond and react in the presence of food or in anticipation of food.

The activation of mirror neurons appears to be directly tied to both observed and produced actions. Hence, action observation and execution may shape how actions are interpreted.

Monkey See, Monkey Do? Fogassi & Ferrari (2004) revealed that mirror neurons in monkeys’ brains are activated when observing and executing both hand and mouth actions, suggesting that they played a role in the emergence of speech during evolution.

Fogassi & Ferrari (2004) suggest that the properties of mirror neurons are part of the basic neural mechanism that associates gestures with meaningful sounds, and that this pre-adaptation subsequently led to the emergence of speech.

In that case, how do mirror neurons link to the emergence of language in humans?
Skoyles (2000) explains that mirror neurons are found in area 44, located in Broca’s area. Broca’s area is a region in the frontal lobe of the left hemisphere of the brain that is responsible for language processing and speech production. Brain imaging experiments have shown that both area F5 and Broca’s area are activated during the observation of hand and mouth actions. Given this homology between area 44 in Broca’s area and the F5 region (where mirror neurons in monkeys are located), Fogassi & Ferrari (2004) argue that humans have a mirror system for action understanding, paralleling the activation of mirror neurons in monkeys.

Homology of area 44 in humans (left) and the F5 region in a monkey’s brain shows that humans have a mirror system of understanding based on action observation and execution, similar to that of the dyadic communication in monkeys.

These observations in area 44, the cortical precursor region of Broca’s area, revealed its capacity to execute and understand hand and mouth actions, indicating primitive forms of dyadic communication (Fogassi & Ferrari, 2004). These homologies, based on neurophysiological evidence, support the gestures-first hypothesis: that human language evolved from a gesture performance and understanding system implemented by mirror neurons, allowing language to encompass features such as action understanding, imitation learning, and simulation of others’ behaviors as communication evolved over time.

Chapter 7 – Natural laboratories for language evolution: Pidgins, Creoles, and Sign Language

2015: Esther Wong Rui Li, Low Lin Yi, Lyn
2014: Jessica Chua, Lee Mui Wei, Lew Xu Hong


Hi there! Welcome to Gestures, Speech and Sign Language in Language Evolution. In this chapter, you will learn more about gestures and their role in the evolution of language.

1. Introduction to Gestures

Gestures are a type of non-verbal communication used by a speaker to aid communication. McNeill (1985) interprets gestures as a second channel of communication alongside the first channel, actual speech, implying that gestures add to speech production. Alibali et al. (2001) conducted a series of experiments to see whether speakers would use gestures differently depending on whether their listeners could see them. The results revealed that speakers continued to gesture regardless of whether the listener could see their gestures.

In this chapter, we will discuss the role of gestures in language evolution by focusing on the Gestural Theory and on how gestures have contributed to the emergence of a new sign language.

1.1 Representational Gestures

Representational gestures are gestures that carry some form of speech content. This form of gesture relies on visibility: the listener has to be in the speaker’s line of sight for the message to be conveyed successfully. Speakers also use representational gestures more frequently when they are able to see their listeners during communication (Alibali et al., 2001).
Interestingly, speakers still produced representational gestures even when they could not see their listeners, suggesting that representational gestures are a part of speech production. The visibility of the listener changes how often these gestures are produced, but not whether they are produced at all. Hence, this form of gesturing has become a subconscious way of communicating ideas.

Iconic gestures are a form of representational gesture that refers to a concrete referent. For example, a kicking motion with the foot conveys the meaning of the action “to kick”. Speakers use iconic gestures to emphasize what they are talking about, as iconic gestures refer specifically to an action or object that the speaker intends to communicate.

1.2 Beat Gestures

Beat gestures are gestures that do not carry any speech content. They convey non-narrative content and are more in tune with the rhythm of speech. According to Alibali et al. (2001), beat gestures are used regardless of whether the speaker can see the listener. Beat gestures therefore accentuate what is being conveyed without directly referring to it, emphasizing certain words and phrases during speech.

Still unsure about what beat gestures are? Here’s a video to show an example of beat gestures!

1.3 Interactive Gestures

Interactive gestures are gestures that combine both representational and beat gestures. This form of gesture is commonly seen in dialogues that consist of a back-and-forth flow of communication between speakers.