Causal approaches to investigating language evolution:
New studies of the association between phonology and climate

May 18th 2024

Workshop at the International Conference on the Evolution of Language

Organised by Seán Roberts and Frederik Hartmann

Register for the workshop

(Registration for the workshop does not require registering for the full EvoLang conference)

How can we use currently available cross-cultural data to make inferences about the evolution of linguistic features over long time periods?

Demonstrating evidence for evolutionary processes in cultural systems has been a central task for evolutionary linguistics and a common theme for presentations at EvoLang. Evidence of selective pressures at work in real linguistic data provides a bridge between the linguistics of present-day languages and inferences about the distant past, and designing empirical methods to test hypotheses has acted as a catalyst for theoretical development.

One of the most debated hypotheses, the topic of the inaugural special issue of the Journal of Language Evolution, is the idea that humidity affects the cultural evolution of lexical tone (Everett, Blasi & Roberts, 2016). Motivated by similar investigations of the relationship between linguistic and extra-linguistic phenomena in global data (Fought et al., 2004; Ember & Ember, 2007; Dediu & Ladd, 2007; Lupyan & Dale, 2010; Maddieson, 2011; Atkinson, 2011), these studies exposed theoretical and methodological questions that are still at the forefront of research today (Maddieson, 2023). How can we access the best data for cross-cultural comparison? How can we control for the influence of inheritance and borrowing? How do we model complex dynamic relationships between linguistic features? And what are the implications for our understanding of languages in the deep past?

New directions

This year, several exciting studies have broadened the debate about the relationship between linguistic features and the physical environment in which they are used. These are collected in this workshop to present the cutting edge of progress in the field. This includes improvements in the measurement of climatic and linguistic variables. For example, Maddieson & Benedict (2023) include advanced geospatial methods for defining the geographic ranges of languages and integrating the temporal variability of environmental variables. They use this to measure the correlation between various linguistic and climatic features, including a replication of the correlation between tone and humidity.

Furthermore, there is new evidence that the evolution of tone is bound up with many other cultural features. Wu et al. (2023) investigate the co-evolution of tone and other phonological features, finding a strong phylogenetic signal. In addition, languages with more lexical tone have shorter word lengths (Wichmann, 2023), and shorter word lengths are associated with larger population sizes across macroareas (Wichmann & Holman, 2023).

In another innovative study, Liang et al. (2023) analyse the relationships between humidity, voice quality, and number of tones in 997 language varieties in China from over a million voice recordings. The results show that lower humidity is associated with poorer voice quality (more acoustic jitter and shimmer), and that poorer voice quality is associated with the variety having fewer contrastive tones.

There are also methodological developments: Hartmann et al. (under review) use historical climate models to map historical estimates of humidity onto a geo-phylogenetic tree in order to test diachronic change; and Grollemund et al. (2023) continue to integrate what we know about genetic and linguistic variation to make inferences about historical population movements through different climatic regions. Finally, there are theoretical contributions: Everett (2021) connects what we know about current language/climate relations to the attempt to reconstruct prehistoric speech profiles.

For a long time, language has been understood to be shaped by our biology. Conferences such as EvoLang have helped reveal the additional role of culture. The recent studies present a great opportunity to share work on the next frontier: how language is shaped by our wider ecology.

Confirmed speakers

All times are in local Madison, Wisconsin time (Central Daylight Time, UTC−05:00).

09:15 Tonogenesis as a complementary mechanism in the structural evolution of Sino-Tibetan languages
Baihui Wu & Menghan Zhang

The origin of tone, known as tonogenesis, has long fascinated researchers studying language evolution and human cognition. Linguistic investigations of tonal languages have proposed various hypotheses regarding the origin of tone[1-3], but these hypotheses have not been quantitatively tested in an evolutionary framework. In this study, we evaluated these tonogenetic hypotheses by applying phylogenetic comparative methods[4, 5] to a Sino-Tibetan language dataset including a large-scale phylogeny and several tonogenetic potentials. Our results revealed a strong phylogenetic pattern in the distribution of tones and suggested that Proto-Sino-Tibetan was likely non-tonal. Moreover, we identified specific phonological structures, such as the loss of syllable-final consonants and voice quality on vowels, that were closely associated with the origin of tone. These associations were in line with the linguistic suggestion that phonological simplification could induce the origin of tonal contrast. Interestingly, we also found that the presence of several tonal features did not significantly affect the diversification rate of Sino-Tibetan languages. Collectively, our findings shed light on the nature of tone, revealing that it emerged as a compensatory mechanism to facilitate the structural organization and evolution of languages[6, 7]. This research contributes to a quantitative understanding of the origins and functions of tonal features within the broader context of language evolution and human cognition.


1. Thurgood G. Vietnamese and tonogenesis: Revising the model and the analysis. Diachronica. 2002;19(2):333-63.

2. Michaud A, Sands B. Tonogenesis. Oxford Research Encyclopedia of Linguistics. 2020.

3. Kingston J. Tonogenesis. The Blackwell Companion to Phonology. 2011. p. 1-30.

4. Pagel M. Detecting correlated evolution on phylogenies: a general method for the comparative analysis of discrete characters. Proceedings of the Royal Society of London Series B: Biological Sciences. 1994;255(1342):37-45.

5. Pagel M, Meade A. Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo. The American Naturalist. 2006;167(6):808-25.

6. Matisoff JA, editor. Tibeto-Burman tonology in an areal context. Proceedings of the symposium “Crosslinguistic studies of tonal phenomena: Tonogenesis, Japanese Accentology, and Other Topics”; 1999; Tokyo: Tokyo University of Foreign Studies, Institute for the Study of Languages and Cultures of Asia and Africa.

7. Matisoff JA. Tonogenesis in Southeast Asia. Consonant types and tone. 1973;1:71-96.

09:40 Unraveling the influence of essential climatic factors on the number of tones through an extensive database of languages in China
Shuai Wang, Yuzhu Liang, Tianheng Wang, Wei Huang, Ke Xu, Aleksandr Mitkov, Lining Wang, Yongdao Zhou, Quansheng Xia & Qibin Ran

Recent research provides strong support for the hypothesis that humidity influences the tonal system of language, and encourages a more comprehensive exploration of the connection between climate and tone. Based on a substantial database of 1,525 language varieties in China, as well as 41 years of monthly climate data, we conduct a thorough investigation into the relationships between multiple climatic factors and the number of tones, while also investigating voice quality and intra-word pitch variation as intermediary factors. The findings reveal that climatic factors affecting the pitch manifestation, voice quality, and the number of tones are multifaceted, with specific humidity, precipitation, and temperature as essential factors. Furthermore, better voice quality and smaller pitch fluctuations, both related to humid and warm regions, are often associated with languages with more tones. Our results offer additional evidence for interactions between ecology and human behavior mediated through physiological mechanisms.
10:05 Break

10:15 Tone, word length, and population size across languages
Søren Wichmann

Previous work established some correlations which the present paper tentatively fits together in a causal chain. Wichmann et al. (2011) and Wichmann and Holman (2023) found that word length is inversely correlated with the speaker populations of languages. This relationship resonates with observations in Lupyan and Dale (2010) suggesting an inverse relation between language complexity in general and population size. Subsequently, Wichmann (2023) observed that the number of tones tends to increase as word length decreases across the world's languages. This would ultimately be an outcome of processes of tonogenesis (Wu et al. 2023, Michaud and Sands 2020). Thus, both processes are inherently plausible and in this sense not unexpected. A difference between them, however, is that the correlation between word length and population size found by Wichmann and Holman (2023), although strong (r = -0.92), only emerges across macro-areas, not within areas or families, which suggests that the relationship takes several thousand years to develop. By contrast, a phylogenetic correlation analysis showed support for the inverse correlation between word length and the number of tones in as many as half of the families for which the size and presence of tones allow for this type of analysis. Thus, it seems to be a relationship that can emerge relatively quickly, perhaps within centuries. The two processes can be tied together in a causal chain: population size > word length > tone, where the greater-than sign means "influences" without necessarily implying a strong, deterministic force. In addition, population size, which thus seems to be an important player in the patterning of linguistic diversity, is itself influenced by a variety of factors, among which climate seems to be of central importance (e.g., Tallavaara et al. 2015). Therefore, in addition to exploring the causal chain just mentioned, including the typical tempo of changes along the chain, this paper will inquire into the effects of climatic conditions on language speaker populations, drawing upon recent resources such as Beyer et al. (2020).


Beyer, R. M., Krapp, M. and Manica, A. 2020. High-resolution terrestrial climate, bioclimate and vegetation for the last 120 000 years. Scientific Data 7, 236.

Lupyan, G. and Dale, R. 2010. Language structure is partly determined by social structure. PLoS One 5, e8559.

Michaud, A. and Sands, B. 2020. Tonogenesis. In Oxford Research Encyclopedia of Linguistics (ed. M. Aronoff). Oxford, UK: Oxford University Press. doi:10.1093/acrefore/9780199384655.013.748

Tallavaara, M., Luoto, M., Korhonen, N., and Seppä, H. 2015. Human population dynamics in Europe over the Last Glacial Maximum. Proceedings of the National Academy of Sciences of the U.S.A. 112(27), 8232-8237.

Wichmann, S. and Holman, E.W., 2023. Cross-linguistic conditions on word length. PLoS ONE, 18(1), e0281041.

Wichmann, S., Rama, T., and Holman, E.W. 2011. Phonological diversity, word length, and population sizes across languages: The ASJP evidence. Linguistic Typology 15, 177–197.

Wu, B., Zhang, H. and Zhang, M. 2023. Phylogenetic insight into the origin of tones. Proceedings of the Royal Society B, 290(2002), 20230606.

10:40 Tone as signal simplification
Ian Maddieson & Karl Benedict

Languages without tone are widely distributed throughout the world, with the notable exception of sub-Saharan Africa, whereas languages with tone are concentrated in regions relatively close to the equator where generally warm and humid conditions prevail, apart from (primarily western) North America and some marginal cases in Europe. Everett et al. (2015, 2016) suggest that low ambient humidity leads to loss of tone distinctions, since precise control of phonation is impeded by dry air. An implicit assumption here is that tonality is the ‘natural’ state of language, and its absence is derived (as is claimed by Brown 2017, among others). Comparative evidence suggests that tone is not original: it can nearly always be shown to be derived through documentable phonological change (Maddieson 2023). Hence, if the claimed correlation between the distribution of tone and humidity is valid (see Roberts 2018 for discussion), the explanation requires a motivation that encourages tonality in warm, wet climates rather than one that discourages it in cold, dry ones. Our analyses suggest that it is worth further pursuing the hypothesis that differences in transmission conditions, rather than concerns with production, influence such design features of languages. Tone contrasts concern steady to slow-moving changes in the acoustic signal, which are less disrupted by ambient factors affecting fidelity of transmission, such as dense vegetation and high temperature, as found in tropical regions, compared to distinctions between consonants, especially obstruents. Thus, for example, transferring contrasts between consonant types to distinctions of tone on the following vowel (the most frequent route to tonogenesis) provides a more robust acoustic identity for the intended message under the prevailing conditions.
11:05 Break

11:15 Causal approaches for testing the effect of humidity on tone
Frederik Hartmann, Seán G. Roberts, Paul Valdes & Rebecca Grollemund

Previous work has proposed various mechanisms by which the environment may affect the emergence of linguistic features. For example, dry air may cause careful control of pitch to be more effortful, and so affect the emergence of linguistic distinctions that rely on pitch such as lexical tone or vowel inventories. Criticisms of these proposals point out that there are both historical and geographic confounds that need to be controlled for. We take a causal inference approach to this problem to design the most detailed test of the theory to date. We analyse languages from the Bantu language family, using a prior geographic-phylogenetic tree of relationships to establish where and when languages were spoken. This is combined with estimates of humidity for those times and places, taken from historical climate models. We then estimate the strength of causal relationships in a causal path model, controlling for various influences of inheritance and borrowing. We find no evidence to support the previous claims that humidity affects the emergence of lexical tone. This study shows how using causal inference approaches lets us test complex causal claims about the cultural evolution of language.

11:40 Contextualizing the spatial autocorrelation for some climatic and linguistic variables
Caleb Everett & Steven Moran

A number of recent studies have examined potential effects of environmental factors on the use of sounds in speech. Work on this topic now includes acoustic data, in the form of over one million audio files recorded across China, which show an association between extreme ambient aridity and patterns in the fundamental frequency of vocal cord vibration (Liang et al. 2023). The effects in question – increased jitter and shimmer in very dry regions – are not expected to be transmitted via language contact or inheritance. Other higher-level patterns in sound systems that are potentially impacted by environmental factors are definitely influenced by genealogy and/or contact. Recent work aims to control for the latter confounding variables in various ways. With respect to language contact, some work has incorporated tests for spatial autocorrelation between languages rather than, for instance, including geographic regions as random factors in a linear mixed model (e.g., Bromham & Yaxley 2023). Such recent work advances the discussion, but as we suggest here, could be improved by: a) more careful consideration of the causal mechanisms proposed in the literature, and b) contextualizing the autocorrelation results with “sanity checks”, i.e. tests of well-known environmental influences on human behavior. We aim to enhance work on this topic by addressing these two points. We do so while incorporating new and more detailed data on ambient water vapor, obtained via NASA’s AQUA MODIS satellite data repository. These water vapor data are plotted in Figure 1. We test the potential association between aridity and phonation patterns in speech, finding some support for the association even while controlling for spatial autocorrelation. We contextualize this association with results from tests of better established environmental influences on human cultures.

Figure 1. Map of ambient water vapor data worldwide, gathered from NASA satellite data. Brightness corresponds to the amount of water vapor in the air.
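
For readers unfamiliar with the term, spatial autocorrelation is commonly summarised with Moran's I, which is close to 1 when nearby locations have similar values and negative when neighbours tend to differ. The sketch below is only an illustration of that general statistic and is not taken from the abstract or from any of the cited studies. The function names, the binary distance-cutoff weighting, and the 1000 km default are all assumptions chosen for brevity.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def morans_i(values, coords, cutoff_km=1000.0):
    """Moran's I with binary weights: 1 if two sites lie within cutoff_km, else 0.

    values: one number per language (hypothetical, e.g. a tone count).
    coords: matching list of (lat, lon) pairs.
    The cutoff weighting scheme is an assumption of this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0      # weighted sum of products of deviations
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and haversine_km(coords[i], coords[j]) <= cutoff_km:
                cross += dev[i] * dev[j]
                weight_sum += 1.0
    variance_sum = sum(d * d for d in dev)
    if weight_sum == 0 or variance_sum == 0:
        raise ValueError("Moran's I is undefined: no neighbours or no variation")
    return (n / weight_sum) * (cross / variance_sum)


# Toy data: two pairs of neighbouring sites, far apart from each other.
toy_coords = [(0, 0), (0, 1), (50, 0), (50, 1)]
clustered = morans_i([1, 1, 0, 0], toy_coords)    # neighbours share a value
alternating = morans_i([1, 0, 1, 0], toy_coords)  # neighbours differ
print(clustered, alternating)
```

In the toy data the clustered case gives a positive value and the alternating case a negative one. In practice an observed value would be compared against a permutation-based null distribution before drawing conclusions.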


Bromham, L., & Yaxley, K. J. (2023). Neighbours and relatives: Accounting for spatial distribution when testing causal hypotheses in cultural evolution. Evolutionary Human Sciences, 5, e27.

Liang, Y., Wang, L., Wichmann, S., Xia, Q., Wang, S., Ding, J., ... & Ran, Q. (2023). Languages in China link climate, voice quality, and tone in a causal chain. Humanities and Social Sciences Communications, 10(1), 1-10.

12:05 Discussion: integrating culture, population history, and climate
Rebecca Grollemund

Discussion and future directions.


Atkinson, Q.D., 2011. Phonemic diversity supports a serial founder effect model of language expansion from Africa. Science, 332(6027), pp.346-349.

Dediu, D. and Ladd, D.R., 2007. Linguistic tone is related to the population frequency of the adaptive haplogroups of two brain size genes, ASPM and Microcephalin. Proceedings of the National Academy of Sciences, 104(26), pp.10944-10949.

Ember, C.R. and Ember, M., 2007. Climate, econiche, and sexuality: Influences on sonority in language. American Anthropologist, 109(1), pp.180-185.

Everett, C., 2021. The sounds of prehistoric speech. Philosophical Transactions of the Royal Society B, 376(1824), p.20200195.

Everett, C., Blasi, D.E. and Roberts, S.G., 2016. Language evolution and climate: the case of desiccation and tone. Journal of Language Evolution, 1(1), pp.33-46.

Fought, J.G., Munroe, R.L., Fought, C.R. and Good, E.M., 2004. Sonority and climate in a world sample of languages: Findings and prospects. Cross-Cultural Research, 38(1), pp.27-51.

Grollemund, R., Schoenbrun, D. and Vansina, J., 2023. Moving Histories: Bantu Language Expansions, Eclectic Economies, and Mobilities. The Journal of African History, 64(1), pp.13-37.

Hartmann, F., Roberts, S. G., Valdes, P. & Grollemund, R. under review. Investigating environmental effects on phonology using diachronic models. Evolutionary Human Sciences.

Liang, Y., Wang, L., Wichmann, S., Xia, Q., Wang, S., Ding, J., Wang, T. and Ran, Q., 2023. Languages in China link climate, voice quality, and tone in a causal chain. Humanities and Social Sciences Communications, 10(1), pp.1-10.

Lupyan, G. and Dale, R., 2010. Language structure is partly determined by social structure. PLoS ONE, 5(1), p.e8559.

Maddieson, I., 2011, August. Phonological Complexity in Linguistic Patterning. In ICPhS (pp. 28-34).

Maddieson, I., 2023. Investigating the ‘what’, ‘where’ and ‘why’ of global phonological typology. Linguistic Typology.

Maddieson, I. and Benedict, K., 2023. Demonstrating environmental impacts on the sound structure of languages: challenges and solutions. Frontiers in Psychology, 14.

Wichmann, S., 2023. Tone and word length across languages. Frontiers in Psychology, 14, p.1128461.

Wichmann, S. and Holman, E.W., 2023. Cross-linguistic conditions on word length. PLoS ONE, 18(1), p.e0281041.

Wu, B., Zhang, H. and Zhang, M., 2023. Phylogenetic insight into the origin of tones. Proceedings of the Royal Society B, 290(2002), p.20230606.