Research Overview
Natural languages are hybrid systems, products of both a common biological endowment (shared across languages) and a particular ecological niche (specific to a particular language). That shared endowment – the architecture of the human nervous system – serves as a powerful constraint on how languages vary and evolve. Nevertheless, the world’s languages exhibit remarkable diversity in sound, meaning, and structural organization. In my work, I take the view that human languages are the end point of complex processes of cultural evolution, occurring over generations, and that their features can thus be analyzed as adaptive solutions to a complex constraint-satisfaction problem.
In particular, my research seeks to understand 1) how cognitive principles of learning and memory serve to constrain cross-linguistic variation, 2) how social and historical contingencies select for certain designs, and 3) how different design choices can incur trade-offs between early acquisition and adult processing. These questions are approached from the dual perspectives of adaptation, which assesses the fit between a language’s features and the mechanisms of mind and structure of the environment, and development, which assesses the impact of transmission across generations of learners.
Projects:
Variation and Fitness
On the cultural transmission model (Tomasello, 2003), a language’s structural features are subject to selection pressures, and variation among languages results both from random drift and selective adaptation to variable circumstances. A central question is how to establish the ‘fitness’ of a particular linguistic feature. Information theory (Shannon, 1948) supplies an answer: Its theorems, and their correlates, specify how to construct a maximally efficient code (such that communication proceeds as rapidly and reliably as possible) and how to quantify the extent to which a given code deviates from this theoretical maximum. A feature’s fitness can thus be measured in terms of its communicative efficiency.
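To make this measure concrete, here is a minimal sketch of how a code's efficiency can be scored against Shannon's entropy lower bound. The message distribution and the two codes are invented for illustration:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: the lower bound on mean code length."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def efficiency(probs, code_lengths):
    """Ratio of the entropy lower bound to the mean code length (1.0 = optimal)."""
    mean_len = sum(p * l for p, l in zip(probs, code_lengths))
    return entropy(probs) / mean_len

# Hypothetical distribution over four messages, plus two candidate codes:
# a fixed-length code and a prefix code matched to the distribution.
probs = [0.5, 0.25, 0.125, 0.125]
fixed = [2, 2, 2, 2]
matched = [1, 2, 3, 3]
print(efficiency(probs, fixed))    # 0.875
print(efficiency(probs, matched))  # 1.0
```

A feature's "fitness" on this view is the gap between 1.0 and the efficiency score it achieves under the demands of its communicative niche.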
In this work, a guiding principle is that simple model systems can do much to illuminate more complex phenomena. For instance, we might wonder: why is it that when we want to express an idea, we chain words together in sequence? Why not instead simply invent more words to capture all the thoughts we might want to express? These questions get to the heart of one of language’s core design principles, which sets it apart from animal communication systems—that it is combinatorial (Hockett, 1960). To better understand why, we selected personal name grammars as a basis for investigating how the lexicon is structured to permit its growth and development, and how it balances the pressure to expand against cognitive and physiological limitations. The theoretical branch of this work revealed why combinatorial systems are a necessary solution to the problem of naming, and illustrated how information used for identification can be effectively distributed over a signal. The experimental component showed how an information theoretic framing can then be used to generate predictions across a variety of classic memory paradigms. Such empirical demonstrations help quantify the costs of deviating from principles of rational coding—costs that have real-world social consequences.
A closely related project applied a similar logic to a different model system. In that investigation, we undertook a comparative analysis of nominal systems in German and English. This work illustrated how these two closely related languages have adopted alternative solutions to the shared problem of noun retrieval with concomitant costs and benefits. In particular, while German relies on grammatical gender to facilitate noun selection, English employs prenominal adjectives. Notably, one system is deterministic, the other probabilistic. Despite their differences, both systems act to efficiently smooth information over discourse, making nouns more equally predictable in context. The choice of solution reflects an interaction between the universal (cognitive constraints) and the particular (social and historical forces). While both languages have adopted solutions that strike a balance between efficiency and learnability, the precise nature of this compromise reflects the peculiar demands of different populations of speakers.
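The claim that both systems "smooth information over discourse" can be illustrated by comparing the entropy of nouns in isolation with their conditional entropy given a prenominal cue. The toy corpus below is invented for illustration; the cue column stands in for either a German gender article or an English adjective:

```python
import math
from collections import Counter, defaultdict

def entropy(counts):
    """Entropy in bits of a frequency table."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

# Hypothetical (cue, noun) pairs: the cue partitions the noun set.
pairs = [("der", "Hund"), ("der", "Tisch"), ("die", "Katze"),
         ("die", "Tuer"), ("das", "Haus"), ("das", "Kind"),
         ("der", "Hund"), ("die", "Katze")]

nouns = Counter(n for _, n in pairs)
h_noun = entropy(list(nouns.values()))  # H(noun): uncertainty over nouns alone

# H(noun | cue): uncertainty remaining once the cue has been heard
by_cue = defaultdict(Counter)
for cue, noun in pairs:
    by_cue[cue][noun] += 1
h_cond = sum(sum(c.values()) / len(pairs) * entropy(list(c.values()))
             for c in by_cue.values())

print(h_noun > h_cond)  # True: the cue lowers the peak uncertainty at the noun
```

The same calculation applies whether the cue is assigned deterministically (gender) or probabilistically (adjective choice); what differs is how reliably the partition is induced by learners.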
- Dye, M., Milin, P., Futrell, R., & Ramscar, M. (2018). Alternative solutions to a language design problem: The role of adjectives and gender marking in efficient communication. Topics in Cognitive Science. **CogSci proceedings paper awarded the Marr Prize for Best Student Paper, 2017**
- Dye, M., Milin, P., Futrell, R., & Ramscar, M. (2017). A functional theory of gender paradigms. In F. Kiefer, J.P. Blevins, & H. Bartos (Eds.) Morphological Paradigms and Functions. Brill: Leiden.
- Dye, M., Johns, B. T., Jones, M.N., & Ramscar, M. (2016). The structure of names in memory: Deviations from uniform entropy impair memory for linguistic sequences. Proceedings of the 38th Annual Conference of the Cognitive Science Society, Philadelphia, PA.
- Riordan, B., Dye, M., & Jones, M.N. (2015). Grammatical number processing and eye movements in English spoken language comprehension. Frontiers in Psychology. doi: 10.3389/fpsyg.2015.00590
- Ramscar, M., Smith, A.H., Dye, M., Futrell, R., Hendrix, P., Baayen, H. & Starr, R. (2013). The ‘universal’ structure of name grammars and the impact of social engineering on the evolution of natural information systems. Proceedings of the 35th Meeting of the Cognitive Science Society, Berlin, Germany.
The Limits of Information
Adopting an information theoretic model of communication requires, at a minimum, a reworking of traditional conceptions of language descended from structuralism (Ramscar & Baayen, 2013). On the structuralist account, language is a means of direct material exchange, in which messages are passed back and forth, much like a parcel delivery system (de Saussure, 1916). An information theoretic framework, by contrast, adopts the indirect model of telegraphy, in which the sender first translates the message into a signal that can be sent across a physical channel. Successful communication relies on the receiver’s ability to reconstruct the original message from the received signal using a common code.
A starting assumption, then, is that the forms of the language – speech, writing, and gesture – act as signals that discriminate among the repertoire of concepts and experiences available to the speaker. However, there is a significant wrinkle to consider: In the electronic systems that information theory was designed to describe, the sender and receiver share identical repertoires. In natural language, by contrast, conceptual and linguistic knowledge is never strictly identical across speakers, nor is it fixed across time (Labov, 1972). The extent to which communication succeeds depends, in part, on the extent to which the conceptual stores and the codes for encoding and decoding signals align. A rich literature attests to the variability in speakers’ repertoires, as well as the ways in which they may be motivated to align them in service of more effective communication—coordinating aspects of their speech, gaze, and gesture.
The picture of natural language provided by information theory is thus incomplete, recalling George Box’s injunction that “all models are wrong, but some are useful”. Notably missing from this account is how the system is first acquired in childhood, and how it can continue to flexibly develop and adapt over the lifespan. Computational models of learning are thus a necessary complement to the picture supplied by Shannon. My research program touches on each of these areas, asking:
- How the nature of learning in childhood guarantees the development of common predictive codes among speakers of the same language,
- Whether sequences that are efficient for processing are similarly well-suited for acquisition,
- How interlocutors with different priors coordinate their communicative efforts to converge on the same message.
The Nature of Learning in Childhood
For communicative systems to function as such, they must be learnable and transmissible over generations of speakers. In the language acquisition literature, a central question is the extent to which the acquisition process is guided by deep-seated instincts and hardwired rules. To address this question, my research with Prof Michael Ramscar and colleagues has focused on what a simple classical conditioning model (Rescorla & Wagner, 1972) can and cannot explain about linguistic development. With a two-layer network, we have elegantly captured several of the key phenomena in early word learning, without recourse to innate constraints, or a more complex learning architecture. One of the virtues of this modeling approach is its interpretability: Given some set of assumptions about the state of the learner and the environmental input, the model can establish whether a complex behavior is, in principle, learnable without substantive priors, and if it is, what parameters influence its learning trajectory.
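The Rescorla-Wagner update itself is compact enough to state in a few lines. The sketch below is a minimal two-layer implementation (cue names, trial design, and parameter values are illustrative), demonstrated on a classic blocking design:

```python
import numpy as np

def rescorla_wagner(trials, cues, outcomes, alpha=0.1, beta=1.0):
    """Trial-by-trial cue-outcome weight updates (Rescorla & Wagner, 1972).
    Each trial is (present_cues, present_outcomes); lambda is 1 for present
    outcomes and 0 for absent ones, and every present cue shares the
    prediction error."""
    W = np.zeros((len(cues), len(outcomes)))
    ci = {c: i for i, c in enumerate(cues)}
    oi = {o: j for j, o in enumerate(outcomes)}
    for present_cues, present_outcomes in trials:
        rows = [ci[c] for c in present_cues]
        lam = np.zeros(len(outcomes))
        for o in present_outcomes:
            lam[oi[o]] = 1.0
        pred = W[rows].sum(axis=0)              # summed activation of present cues
        W[rows] += alpha * beta * (lam - pred)  # error-driven update
    return W

# Hypothetical blocking design: cue A alone predicts the outcome,
# then A and B are presented in compound.
trials = [(["A"], ["food"])] * 50 + [(["A", "B"], ["food"])] * 50
W = rescorla_wagner(trials, cues=["A", "B"], outcomes=["food"])
print(W[0, 0] > W[1, 0])  # True: A blocks learning about the redundant cue B
```

Because A already predicts the outcome, the prediction error on compound trials is near zero and B acquires almost no associative strength; it is exactly this error-driven dynamic that does the explanatory work in our word-learning simulations.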
This approach has offered insight into questions like: How do kids learn the meanings of words in ambiguous contexts? Why do preschoolers over-regularize plural forms (saying "mouses" instead of "mice") and how do they recover from these errors? How do children abstract the properties of colors and numbers, and why are these categories so hard for them to learn? In a similar vein, we found evidence that the use and interpretation of noun compounds and inflected verbs are best explained in terms of learned semantic conventions, rather than deterministic rules. Taken together, these findings highlight how – in typical development – the richness of the input available to young learners drives canonical patterns of acquisition.
- Ramscar, M., Dye, M., Blevins, J., & Baayen, H. (2015). Morphological development. In A. Bar-On & D. Ravid (Eds.) Handbook of Communication Disorders. De Gruyter Mouton.
- Ramscar, M., Dye, M. & McCauley, S. (2013). Error and expectation in language learning: The curious absence of ‘mouses’ in adult speech. Language 89(4): 760-793.
- Ramscar, M., Dye, M. & Klein, J. (2013). Children value informativity over logic in word learning. Psychological Science 24(6): 1017-1023.
- Ramscar, M, Dye, M., & Hubner, M. (2013). When the fly flied and when the fly flew: How semantics can make sense of inflection. Language and Cognitive Processes 28(4): 468-97.
- Ramscar, M., & Dye, M. (2011). Learning language from the input: Why level ordering can’t explain noun-compounding. Cognitive Psychology 62(1): 1-40.
- Ramscar, M., Dye, M., Popick, H. & O’Donnell-McCarthy, F. (2011). The enigma of number: Why children find the meanings of even small number words hard to learn and how we can help them do better. PLoS ONE 6(7): e22501. Winner of the IES Prize for Excellence in Research on Cognition & Student Learning.
- Dye, M. (2010). Why Johnny can't name his colors. Scientific American Mind, 22(2): 48-51.
- Kosaraju, R., Dye, M. & Ramscar, M. (2010). Of frames and frequencies: How early language production is influenced by the distribution. Poster presented at the 36th Meeting of the Society for Philosophy & Psychology, Portland, Oregon.
Our research suggests that associative learning mechanisms have far greater explanatory power than is often supposed. However, this cannot be the whole story: Human languages are markedly more complex and expressive than other animal communication systems. What explains human uniqueness? Prefrontal maturation is clearly an important piece of the puzzle: Young humans undergo a much longer period of cognitive immaturity than other animals, in which they are unable to actively direct their actions or attention. This extended developmental timetable appears to be crucial to the transmission of language and culture, as it enforces a sensitive period in which children sample their environments in broadly consistent ways and systematize variation in their input. At the same time, it exacts a steep cost, leaving these young learners vulnerable and relatively fixed in their behaviors.
Given this, one of the challenges of studying language as it develops is that a given behavioral output represents both 1) past learning history and 2) present cognitive function, contributions that are easily confounded. To tease these apart, we have shown how very young children, who cannot yet exert cognitive control online, can nevertheless learn to match their behavior to context in surprisingly sophisticated ways, in classic tasks like the A-not-B task and the Dimensional Change Card Sort. Counterintuitive as it may seem, the fact that children typically struggle with these tasks is reflective of a developing system well-adapted for learning.
- Ramscar, M., Dye, M., Gustafson, J.W., & Klein, J. (2013). Dual routes to cognitive flexibility: Learning and response conflict resolution in the Dimensional Change Card Sort task. Child Development 84(4): 1308-23.
- Popick, H., Dye, M., Kirkham, N. & Ramscar, M. (2011). Investigating how infants learn to search in the A-not-B task. Proceedings of the 33rd Meeting of the Cognitive Science Society, Boston, MA.
- Dye, M. (2010). The advantages of being helpless: Why human brains are slow to develop. Scientific American Mind.
Learnability vs. Processing
Learning models can tell us something about how language is learned; information theoretic models can tell us something about how language, once learned, is processed. In bridging between these accounts, an important question is whether linguistic features that scaffold early learning are similarly well-suited for adult processing. In other words, are the demands of learnability and processing one and the same, or are they different?
To answer this, an ongoing project with Profs Michael Ramscar and Michael N. Jones seeks to highlight parallels between learning and information theory. Information theory dictates that well-designed verbal sequences should follow a characteristic tree-branching structure, which maintains a constant entropy rate over elements. Inverting this structure has the consequence of concentrating entropy over the first element. This much is known. What has gone unrecognized is that these alternative sequencings directly correspond to alternative schemas in discriminative learning (Osgood, 1949; Ramscar et al., 2010). In particular, differences in ‘optimal’ and ‘suboptimal’ coding map neatly onto ‘convergent’ and ‘divergent’ learning schemas, which have been shown to produce markedly different behavioral outcomes.
Practically, this means that different sequence types should pose a trade-off between learning and processing. The existence of such a trade-off is corroborated across a range of experiments, which reveal that while optimally coded sequences are more efficiently processed and better recalled, ‘suboptimal’ sequences are actually better structured for semantic learning. Since languages are designed to be both usable and learnable, their structures must necessarily reflect a compromise between these desiderata; hence, the applicability of information theoretic principles depends closely on the communicative goal.
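The contrast between a constant entropy rate and front-loaded entropy can be made concrete by computing the conditional entropy at each sequence position. The four-message toy lexicon below is invented for illustration:

```python
import math
from collections import Counter, defaultdict

def positional_entropies(sequences):
    """Conditional entropy (bits) of each position given the preceding prefix,
    assuming the sequences are uniformly distributed."""
    hs = []
    for i in range(len(sequences[0])):
        by_prefix = defaultdict(Counter)
        for s in sequences:
            by_prefix[s[:i]][s[i]] += 1
        n = len(sequences)
        h = 0.0
        for counts in by_prefix.values():
            total = sum(counts.values())
            h += total / n * -sum(c / total * math.log2(c / total)
                                  for c in counts.values())
        hs.append(h)
    return hs

# Four messages. A tree-branching code spreads entropy evenly over positions;
# inverting it concentrates all the uncertainty on the first element.
tree     = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
inverted = [("ax", "s"), ("ay", "s"), ("bx", "s"), ("by", "s")]
print(positional_entropies(tree))      # [1.0, 1.0]
print(positional_entropies(inverted))  # [2.0, 0.0]
```

Both codes carry the same two bits in total; what differs is where the uncertainty, and hence the processing or learning burden, falls within the sequence.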
- Dye, M., Jones, M., Yarlett, D., & Ramscar, M. (2017). Refining the distributional hypothesis: A role for time and context in semantic representation. Proceedings of the 39th Annual Conference of the Cognitive Science Society.
- Ramscar, M., Yarlett, D.G., Dye, M., Denny, K., & Thorpe, K. (2010). The effects of feature-label-order and their implications for symbolic learning. Cognitive Science 34(6): 909-957.
- Ramscar, M., Suh, E. & Dye, M. (2011). How pitch category learning comes at a cost to absolute frequency representations. Proceedings of the 33rd Meeting of the Cognitive Science Society, Boston, MA.
- Dye, M. & Ramscar, M. (2009). No representation without taxation: The costs and benefits of learning to conceptualize the environment. Proceedings of the 31st Meeting of the Cognitive Science Society, Amsterdam, Netherlands.
Adaptive Communication
Languages are spoken and understood by people from very different backgrounds. Indeed, two speakers of the same language may have significantly different representations of that language, even if they are perfectly capable of making themselves understood. An abiding question is how individual speakers develop these distinct representations as a function of their exposure over a lifetime, and how such differences play out in processing and production. For instance: Do differences in their daily reading materials mean that a doctor and a lawyer will process a newspaper article differently? Can readers flexibly adapt their expectations on the basis of genre, in the same way that some bilingual speakers can code-switch?
- Dye, M., Jones, M.N., & Ramscar, M. (in prep). Sensitivity to linguistic structure emerges as a function of expertise.
- Samson, K., Dye, M., & Sherman, S.J. (2014). Is chocolate as bittersweet as nostalgia? Cross-modal priming effects for taste and emotion. Poster presented at the 36th Annual Meeting of the Cognitive Science Society, Quebec, Canada.
- Dye, M., Cox, G., Frey S. (2012). Detours in understanding: The temporal dynamics of malapropisms. Paper presented at the International Society for the Empirical Study of Literature and Media, Montreal, QC, Canada.
- Kao, J., Ryan, R., Dye, M. & Ramscar, M. (2010). An acquired taste: How reading literature affects sensitivity to word distributions when judging literary texts. Proceedings of the 32nd Meeting of the Cognitive Science Society, Portland, OR.
- Ramscar, M., Matlock, T., & Dye, M. (2010). Running down the clock: The role of expectation in our understanding of time and motion. Language and Cognitive Processes 25(5): 589-615.
Learning in Context
Words are rarely experienced in isolation. Rather, they are heard and read in particular verbal and situational contexts, which are integral to our ability to learn a functional lexicon. In a series of experiments with Profs Michael N. Jones, Brendan Johns, and Richard Shiffrin, we have investigated how the particulars of our experience with a lexical item can affect subsequent recognition and comprehension processes. We have specifically been interested in assessing and teasing apart the contributions of recency, frequency, and contextual diversity, using both experimental manipulations and quantitative analyses of verbal stimuli.
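The key distinction here, between how often a word occurs and how many distinct contexts it occurs in, can be sketched with a simple pair of counts over a corpus. The mini-corpus below is invented for illustration, with documents standing in for contexts:

```python
from collections import Counter, defaultdict

def frequency_and_diversity(documents):
    """Token frequency vs contextual diversity: the number of distinct
    documents a word appears in. The two are correlated but dissociable."""
    freq = Counter()
    contexts = defaultdict(set)
    for doc_id, doc in enumerate(documents):
        for word in doc.split():
            freq[word] += 1
            contexts[word].add(doc_id)
    return freq, {w: len(s) for w, s in contexts.items()}

# 'tea' is frequent but contextually narrow; 'time' is equally frequent
# but spread across contexts.
docs = ["tea tea tea time", "time for lunch", "time to go"]
freq, div = frequency_and_diversity(docs)
print(freq["tea"], div["tea"])    # 3 1
print(freq["time"], div["time"])  # 3 3
```

Matching words on raw frequency while varying their contextual diversity (or vice versa) is what lets the experimental work pull these two contributions apart.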
- Jones, M. N., Dye, M., & Johns, B. T. (2017). Context as an organizing principle of the lexicon. In B. Ross (Ed.), The Psychology of Learning and Motivation (pp. 239-283).
- Dye, M., Jones, M., & Shiffrin, R. (2017). Vanishing the mirror effect: The influence of prior history & list composition on recognition memory. Proceedings of the 39th Annual Conference of the Cognitive Science Society.
- Dye, M., Jones, M., Yarlett, D., & Ramscar, M. (2017). Refining the distributional hypothesis: A role for time and context in semantic representation. Proceedings of the 39th Annual Conference of the Cognitive Science Society.
- Dye, M., Ramscar, M., & Jones, M. (2017) Representing the richness of linguistic structure in models of episodic memory. Proceedings of the 39th Annual Conference of the Cognitive Science Society.
- Johns, B.T., Dye, M., & Jones, M.N. (2015). The influence of contextual variability on word learning. Psychonomic Bulletin & Review 23: 1214–1220.