Introduction

This Rmarkdown script contains the full results supporting the main paper (but very little interpretation). As described in detail in the README.md document, this script uses various types of input data (linguistic and genetic) and multiple methods of analysis.

Data

Tone

The data sources

WALS

WALS uses a categorical classification with 3 ordered categories ‘None’ < ‘Simple’ < ‘Complex.’ There are 513 languages with data.

None	Simple	Complex
301	127	85

Figure 1. Distribution of tone in WALS.

LAPSyD

LAPSyD gives both a categorical classification with 5 ordered categories ‘None’ < ‘Marginal’ < ‘Simple’ < ‘Moderately complex’ < ‘Complex,’ and the actual count of tones. There are 569 languages with data.

None	Marginal	Simple	Moderately complex	Complex
386	8	94	39	42

Figure 2. Distribution of tone in LAPSyD.

Figure 3. Distribution of tone counts in LAPSyD.

Dediu & Ladd (2007)

This uses a categorical classification with 2 (presence/absence) categories ‘No’ and ‘Yes.’ There are 60 languages with data.

No	Yes
30	30

Figure 4. Distribution of tone in Dediu & Ladd (2007)’s database.

PHOIBLE

PHOIBLE gives the actual count in 2030 languages with data.

0	1	2	3	4	5	6	7	8	9	10
1495	4	148	173	101	60	25	11	4	6	3

Figure 5. Distribution of tone in PHOIBLE.

WPHON

WPHON gives the actual count in 3160 languages with data.

0	1	2	3	4	5	6	7	8	9	10	11	12
2193	3	427	222	174	66	43	11	15	2	1	2	1

Figure 6. Distribution of tone in WPHON.

Relationships between data sources

WALS - LAPSyD

languages with values in at least one classification: 724
shared languages: 358
languages with values only in WALS: 155
languages with values only in LAPSyD: 211
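The four counts reported for each pair of sources are plain set operations on the language identifiers. A minimal Python sketch follows, assuming each source is reduced to a set of language codes; the function name `overlap_summary` and the toy codes are hypothetical and are not taken from the original R script.

```python
def overlap_summary(langs_a, langs_b):
    """Count languages coded in either, both, or only one of two sources."""
    a, b = set(langs_a), set(langs_b)
    return {
        "either": len(a | b),   # values in at least one classification
        "shared": len(a & b),   # values in both
        "only_a": len(a - b),   # values only in the first source
        "only_b": len(b - a),   # values only in the second source
    }

# Toy example with made-up language codes (not the real data).
wals = {"aaaa1111", "bbbb2222", "cccc3333"}
lapsyd = {"bbbb2222", "cccc3333", "dddd4444", "eeee5555"}
summary = overlap_summary(wals, lapsyd)
print(summary)
```

The counts are linked by either = shared + only_a + only_b, which for the WALS - LAPSyD numbers above reads 724 = 358 + 155 + 211.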

	None	Marginal	Simple	Moderately complex	Complex
None	229	2	1	0	0
Simple	4	4	59	12	4
Complex	1	0	2	16	24

Figure 7. Relationship between tone in WALS and LAPSyD.

Pearson’s Chi-squared test: `cooc_tab`
Test statistic	df	P value
515.4	8	3.407e-106 * * *

Pearson’s Chi-squared test with simulated p-value (based on 10000 replicates): `cooc_tab`
Test statistic	df	P value
515.4	NA	9.999e-05 * * *
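The test statistic and degrees of freedom above can be checked directly from the WALS - LAPSyD co-occurrence table. The following Python sketch computes the plain Pearson chi-squared statistic by hand; it is not the script's own code (which is R), it does not compute p-values, and it does not attempt the simulated-p-value variant.

```python
def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Rows: WALS None / Simple / Complex.
# Columns: LAPSyD None / Marginal / Simple / Moderately complex / Complex.
cooc_tab = [
    [229, 2, 1, 0, 0],
    [4, 4, 59, 12, 4],
    [1, 0, 2, 16, 24],
]
stat, df = pearson_chi2(cooc_tab)
print(stat, df)  # should be close to the 515.4 and 8 reported above
```

Note that many expected cell counts in this table are small, which is presumably why the report also gives the simulation-based p-value.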

WALS - Dediu & Ladd (2007)

languages with values in at least one classification: 550
shared languages: 23
languages with values only in WALS: 490
languages with values only in Dediu & Ladd (2007): 37

	No	Yes
None	12	1
Simple	0	5
Complex	0	5

Figure 8. Relationship between tone in WALS and Dediu & Ladd (2007)’s database.

Pearson’s Chi-squared test: `cooc_tab`
Test statistic	df	P value
19.3	2	6.44e-05 * * *

Pearson’s Chi-squared test with simulated p-value (based on 10000 replicates): `cooc_tab`
Test statistic	df	P value
19.3	NA	9.999e-05 * * *

LAPSyD - Dediu & Ladd (2007)

languages with values in at least one classification: 609
shared languages: 20
languages with values only in LAPSyD: 549
languages with values only in Dediu & Ladd (2007): 40

	No	Yes
None	12	0
Marginal	0	1
Simple	0	1
Moderately complex	0	2
Complex	0	4

Figure 9. Relationship between tone in LAPSyD and Dediu & Ladd (2007)’s database.

Pearson’s Chi-squared test: `cooc_tab`
Test statistic	df	P value
20	4	0.0004994 * * *

Pearson’s Chi-squared test with simulated p-value (based on 10000 replicates): `cooc_tab`
Test statistic	df	P value
20	NA	9.999e-05 * * *

PHOIBLE - WALS

languages with values in at least one classification: 2074
shared languages: 469
languages with values only in PHOIBLE: 1561
languages with values only in WALS: 44

	None	Simple	Complex
0	272	70	41
1	1	0	0
2	1	17	4
3	2	7	12
4	0	6	6
5	0	4	7
6	1	6	3
7	0	1	2
8	0	0	1
9	0	0	3
10	0	0	2

Figure 10. Relationship between tone in PHOIBLE and WALS (barplot).

Figure 11. Relationship between tone in PHOIBLE and WALS (boxplots).

Analysis of Variance Model
	Df	Sum Sq	Mean Sq	F value	Pr(>F)
wa_tone	2	371.4	185.7	76.53	1.826e-29
Residuals	466	1131	2.427	NA	NA

	diff	lwr	upr	p adj
Simple-None	1.225	0.8137	1.637	3.976e-11
Complex-None	2.292	1.829	2.754	1.3e-11
Complex-Simple	1.066	0.5312	1.602	1.099e-05

PHOIBLE - LAPSyD

languages with values in at least one classification: 2132
shared languages: 467
languages with values only in PHOIBLE: 1563
languages with values only in LAPSyD: 102

	None	Marginal	Simple	Moderately complex	Complex
0	314	6	57	13	15
1	1	0	0	0	0
2	2	1	13	0	1
3	0	0	3	9	3
4	0	0	2	4	2
5	0	0	1	3	6
6	1	0	1	3	0
7	0	0	1	0	1
8	0	0	0	0	1
9	0	0	0	0	2
10	0	0	0	0	1

Figure 12. Relationship between tone in PHOIBLE and LAPSyD (barplot).

Figure 13. Relationship between tone in PHOIBLE and LAPSyD (boxplots).

Analysis of Variance Model
	Df	Sum Sq	Mean Sq	F value	Pr(>F)
la_tone	4	368.3	92.06	61.43	1.324e-41
Residuals	462	692.3	1.499	NA	NA

	diff	lwr	upr	p adj
Marginal-None	0.2511	-1.03	1.532	0.9835
Simple-None	0.7475	0.3239	1.171	1.812e-05
Moderately complex-None	2.34	1.719	2.962	4.919e-12
Complex-None	2.84	2.219	3.462	4.874e-12
Simple-Marginal	0.4963	-0.8264	1.819	0.8425
Moderately complex-Marginal	2.089	0.6904	3.488	0.0004863
Complex-Marginal	2.589	1.19	3.988	5.735e-06
Moderately complex-Simple	1.593	0.8892	2.297	1.263e-08
Complex-Simple	2.093	1.389	2.797	4.965e-12
Complex-Moderately complex	0.5	-0.3381	1.338	0.4765

Figure 14. Relationship between tone in PHOIBLE and number of tones in LAPSyD (jittered for increased visibility).

Pearson’s product-moment correlation: `la_n_tones` and `ph_n_tones`
Test statistic	df	P value	Alternative hypothesis	cor
13.86	465	8.444e-37 * * *	two.sided	0.5406

Spearman’s rank correlation rho: `la_n_tones` and `ph_n_tones`
Test statistic	P value	Alternative hypothesis	rho
7516233	1.916e-39 * * *	two.sided	0.5572

PHOIBLE - Dediu & Ladd (2007)

languages with values in at least one classification: 2050
shared languages: 40
languages with values only in PHOIBLE: 1990
languages with values only in Dediu & Ladd (2007): 20

	No	Yes
0	20	7
1	0	0
2	1	3
3	0	3
4	0	2
5	0	3
6	0	0
7	0	0
8	0	1
9	0	0
10	0	0

Figure 15. Relationship between tone in PHOIBLE and Dediu & Ladd (2007)’s database (barplot).

Figure 16. Relationship between tone in PHOIBLE and Dediu & Ladd (2007)’s database (boxplots).

Welch Two Sample t-test: `ph_n_tones` by `dl_tone` (continued below)
Test statistic	df	P value	Alternative hypothesis
-4.264	19.13	0.0004133 * * *	two.sided

mean in group No	mean in group Yes
0.09524	2.421
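The group means, test statistic and degrees of freedom above can be reconstructed from the PHOIBLE - Dediu & Ladd (2007) table. The Python sketch below expands the table into one value per language and applies the standard Welch formulas; the helper names `expand` and `welch_t` are illustrative and not from the original script, and the p-value is not computed.

```python
from math import sqrt
from statistics import mean, variance

def expand(counts):
    """Turn {tone count: number of languages} into one value per language."""
    return [value for value, n in counts.items() for _ in range(n)]

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = variance(x) / len(x)  # squared standard error of mean(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Non-zero cells of the table above: PHOIBLE tone counts by Dediu & Ladd group.
no_group = expand({0: 20, 2: 1})
yes_group = expand({0: 7, 2: 3, 3: 3, 4: 2, 5: 3, 8: 1})

t, df = welch_t(no_group, yes_group)
print(mean(no_group), mean(yes_group), t, df)
```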

WPHON - WALS

languages with values in at least one classification: 3188
shared languages: 485
languages with values only in WPHON: 2675
languages with values only in WALS: 28

	None	Simple	Complex
0	270	9	3
1	0	0	0
2	13	73	9
3	6	24	18
4	1	11	20
5	0	2	10
6	0	1	8
7	0	0	2
8	0	0	3
9	0	0	1
10	0	0	0
11	0	0	0
12	0	0	1

Figure 17. Relationship between tone in WPHON and WALS (barplot).

Figure 18. Relationship between tone in WPHON and WALS (boxplots).

Analysis of Variance Model
	Df	Sum Sq	Mean Sq	F value	Pr(>F)
wa_tone	2	1094	546.8	490	7.358e-117
Residuals	482	537.9	1.116	NA	NA

	diff	lwr	upr	p adj
Simple-None	2.151	1.882	2.421	5.838e-11
Complex-None	3.954	3.633	4.276	5.838e-11
Complex-Simple	1.803	1.438	2.169	5.838e-11

WPHON - LAPSyD

languages with values in at least one classification: 3211
shared languages: 518
languages with values only in WPHON: 2642
languages with values only in LAPSyD: 51

	None	Marginal	Simple	Moderately complex	Complex
0	334	1	11	2	1
1	0	0	1	0	0
2	19	6	49	6	6
3	3	0	11	20	6
4	2	0	6	6	9
5	0	0	2	1	7
6	0	0	1	0	4
7	0	0	0	0	1
8	0	0	0	0	3
9	0	0	0	0	0
10	0	0	0	0	0
11	0	0	0	0	0
12	0	0	0	0	0

Figure 19. Relationship between tone in WPHON and LAPSyD (barplot).

Figure 20. Relationship between tone in WPHON and LAPSyD (boxplots).

Analysis of Variance Model
	Df	Sum Sq	Mean Sq	F value	Pr(>F)
la_tone	4	868.5	217.1	278.1	6.119e-127
Residuals	513	400.6	0.7808	NA	NA

	diff	lwr	upr	p adj
Marginal-None	1.561	0.6375	2.484	4.595e-05
Simple-None	1.97	1.672	2.267	2.014e-10
Moderately complex-None	2.732	2.304	3.16	2.014e-10
Complex-None	4.063	3.645	4.48	2.014e-10
Simple-Marginal	0.4092	-0.5438	1.362	0.7655
Moderately complex-Marginal	1.171	0.1699	2.173	0.01257
Complex-Marginal	2.502	1.505	3.499	3.885e-10
Moderately complex-Simple	0.7623	0.273	1.252	0.0002305
Complex-Simple	2.093	1.613	2.573	2.014e-10
Complex-Moderately complex	1.331	0.7601	1.901	4.023e-09

Figure 21. Relationship between tone in WPHON and number of tones in LAPSyD (jittered for increased visibility).

Pearson’s product-moment correlation: `la_n_tones` and `wp_tone`
Test statistic	df	P value	Alternative hypothesis	cor
31.51	516	2.475e-122 * * *	two.sided	0.8112

Spearman’s rank correlation rho: `la_n_tones` and `wp_tone`
Test statistic	P value	Alternative hypothesis	rho
3591376	2.359e-142 * * *	two.sided	0.845

WPHON - Dediu & Ladd (2007)

languages with values in at least one classification: 3176
shared languages: 44
languages with values only in WPHON: 3116
languages with values only in Dediu & Ladd (2007): 16

	No	Yes
0	22	1
1	0	0
2	1	6
3	0	7
4	0	4
5	0	0
6	0	1
7	0	2
8	0	0
9	0	0
10	0	0
11	0	0
12	0	0

Figure 22. Relationship between tone in WPHON and Dediu & Ladd (2007)’s database (barplot).

Figure 23. Relationship between tone in WPHON and Dediu & Ladd (2007)’s database (boxplots).

Welch Two Sample t-test: `wp_tone` by `dl_tone` (continued below)
Test statistic	df	P value	Alternative hypothesis
-8.362	22.18	2.64e-08 * * *	two.sided

mean in group No	mean in group Yes
0.08696	3.286

WPHON - PHOIBLE

languages with values in at least one classification: 3760
shared languages: 1430
languages with values only in WPHON: 1730
languages with values only in PHOIBLE: 600

	0	1	2	3	4	5	6	7	8	9	10
0	918	1	8	6	8	3	1	0	0	0	0
1	3	0	0	0	0	0	0	0	0	0	0
2	131	0	27	21	16	7	2	1	1	0	0
3	48	0	17	37	6	5	6	2	1	0	0
4	43	0	14	4	13	6	7	2	1	0	0
5	9	0	2	5	4	7	1	0	0	1	1
6	5	0	1	4	1	2	2	1	0	1	0
7	3	0	0	1	0	1	1	0	0	0	0
8	2	0	0	0	1	0	0	2	0	2	0
9	0	0	1	0	0	1	0	0	0	0	0
10	0	0	0	0	0	0	0	0	0	0	0
11	0	0	0	0	0	1	0	0	0	0	0
12	0	0	0	1	0	0	0	0	0	0	0

Figure 24. Relationship between tone in WPHON and PHOIBLE (barplot).

Figure 25. Relationship between tone in WPHON and PHOIBLE (scatterplot).

Pearson’s product-moment correlation: `wp_tone` and `ph_n_tones`
Test statistic	df	P value	Alternative hypothesis	cor
26.83	1428	9.157e-129 * * *	two.sided	0.579

Spearman’s rank correlation rho: `wp_tone` and `ph_n_tones`
Test statistic	P value	Alternative hypothesis	rho
1.97e+08	3.554e-138 * * *	two.sided	0.5959

Reconciling the sources

Collapse LAPSyD 4-way

The 5-level coding in LAPSyD is too fine-grained. In particular, “Marginal” is very rare and, judging by its behaviour in the other data sets, seems more similar to “Simple” than to “None.” On the other hand, “Moderately complex,” while quite similar to “Complex” (but not to “Simple”), seems to have an identity of its own. Thus, I collapsed “Marginal” into “Simple,” resulting in a 4-way classification: “None” < “Simple” < “Moderately complex” < “Complex.”
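As a quick check of what this recoding does to the LAPSyD distribution, here is a small Python sketch; the mapping and the function name `collapse_lapsyd` are illustrative and are not the script's own R code.

```python
from collections import Counter

# Only "Marginal" is recoded; every other category maps to itself.
COLLAPSE = {"Marginal": "Simple"}

def collapse_lapsyd(category):
    """Map a 5-level LAPSyD category onto the 4-level scheme."""
    return COLLAPSE.get(category, category)

# Original 5-level distribution reported for LAPSyD.
original = {
    "None": 386,
    "Marginal": 8,
    "Simple": 94,
    "Moderately complex": 39,
    "Complex": 42,
}

collapsed = Counter()
for category, n in original.items():
    collapsed[collapse_lapsyd(category)] += n

print(dict(collapsed))
```

The 8 “Marginal” languages join the 94 “Simple” ones, giving 102 “Simple” languages, while the total stays at 569.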

The content of the sources

With this (and as a reminder), the sources contain the following information:

LAPSyD:
- 4-way classification for 569 languages: “None” (386), “Simple” (102), “Moderately complex” (39), “Complex” (42)
- counts for 569 languages: 0 (386), 1 (6), 2 (89), 3 (47), 4 (22), 5 (6), 6 (8), 7 (3), 9 (1), 11 (1)
WALS:
- 3-way classification for 513 languages: “None” (301), “Simple” (127), “Complex” (85)
Dediu & Ladd (2007):
- binary classification for 60 languages: “No” (30), “Yes” (30)
WPHON:
- counts for 3160 languages: 0 (2193), 1 (3), 2 (427), 3 (222), 4 (174), 5 (66), 6 (43), 7 (11), 8 (15), 9 (2), 10 (1), 11 (2), 12 (1)
PHOIBLE:
- counts for 2030 languages: 0 (1495), 1 (4), 2 (148), 3 (173), 4 (101), 5 (60), 6 (25), 7 (11), 8 (4), 9 (6), 10 (3)

The reconciliation rules

I designed a set of rules for deciding on a set of two “agreement” categorical classifications, based on a precedence of the sources and the patterns of (dis)agreement between them:

a binary classification: no tone (“No”) vs any form of tone (“Yes”), and
a 3-way classification: “None” < “Simple” < “Complex.”

More precisely, I preferred manually curated categorical classifications over count sources, resulting in the following (rough) ordering: LAPSyD > WALS > Dediu & Ladd (2007) > WPHON > PHOIBLE.

For the sources that give actual numbers (i.e., counts of tones or tone symbols), we observe that 1 is very rare, probably signalling coding errors, marginal systems (“pitch-accent”) or theoretical arguments, so it can probably be safely collapsed into 2, after which everything is moved “one step down” (i.e., 2 → 1, 3 → 2, etc.) so that we have a continuum of counts from 0 onward. With this, the pairwise correlations between the count sources become:
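The collapse-and-shift rule can be sketched in a few lines of Python; the function name `shift_count` is hypothetical, and the example applies it to the WPHON distribution reported earlier.

```python
from collections import Counter

def shift_count(n):
    """Merge an original count of 1 into 2, then move every non-zero count down by one."""
    if n == 0:
        return 0          # no tone stays at 0
    merged = max(n, 2)    # original 1 is collapsed into 2
    return merged - 1     # 2 -> 1, 3 -> 2, ...

# Original WPHON distribution: tone count -> number of languages.
wphon = {0: 2193, 1: 3, 2: 427, 3: 222, 4: 174, 5: 66, 6: 43,
         7: 11, 8: 15, 9: 2, 10: 1, 11: 2, 12: 1}

shifted = Counter()
for count, n_languages in wphon.items():
    shifted[shift_count(count)] += n_languages

print(sorted(shifted.items()))
```

After the shift, the original counts 1 and 2 share the value 1 (3 + 427 = 430 languages) and the WPHON scale runs from 0 to 11 instead of 0 to 12.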

Figure 26. Relationships between counts after merging 1 into 2 and moving everything down by 1.

Thus, the main idea is to use LAPSyD wherever these data exist, followed by WPHON and finally PHOIBLE (thus with precedence LAPSyD > WPHON > PHOIBLE). Please note that the counts in WPHON and PHOIBLE are “corrected” to better map onto those in LAPSyD and to “predict” missing data, using quadratic regression (i.e., the “corrected” counts are computed as WPHON_corr = 0.079 + 0.919·WPHON − 0.04·WPHON², and PHOIBLE_corr = 0.394 + 0.68·PHOIBLE − 0.037·PHOIBLE², respectively).
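The precedence rule and the two quadratic corrections can be sketched as follows in Python. The coefficients are the rounded values quoted above; the function names are hypothetical, and I assume (the text does not say so explicitly) that the corrections take the already merged-and-shifted counts as input and that a missing value is represented by `None`.

```python
def wphon_corrected(x):
    """Quadratic mapping of a WPHON count onto the LAPSyD scale (rounded coefficients)."""
    return 0.079 + 0.919 * x - 0.04 * x ** 2

def phoible_corrected(x):
    """Quadratic mapping of a PHOIBLE count onto the LAPSyD scale (rounded coefficients)."""
    return 0.394 + 0.68 * x - 0.037 * x ** 2

def agreement_count(lapsyd=None, wphon=None, phoible=None):
    """Unrounded agreement count with precedence LAPSyD > WPHON > PHOIBLE."""
    if lapsyd is not None:
        return float(lapsyd)               # LAPSyD is used as-is
    if wphon is not None:
        return wphon_corrected(wphon)      # otherwise the corrected WPHON count
    if phoible is not None:
        return phoible_corrected(phoible)  # PHOIBLE only as a last resort
    return None                            # no source has data

print(agreement_count(lapsyd=3, wphon=5, phoible=2))  # LAPSyD wins
print(agreement_count(wphon=0))                       # corrected WPHON
print(agreement_count(phoible=2))                     # corrected PHOIBLE
```

Under these assumptions a language with a WPHON count of 0 and no LAPSyD data receives an unrounded value of 0.079 rather than exactly 0, so unrounded and rounded agreement counts can differ even for languages without tone.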

The “agreement” classifications

Distributions

Binary classification

# languages with data: 3798:

No	Yes
2541	1257

Figure 27. Distribution of the binary agreement classification of tone.

3-way classification

# languages with data: 3785:

None	Simple	Complex
2538	936	311

Figure 28. Distribution of the 3-way agreement classification of tone.

Counts

Rounded

# languages with data: 3785:

0	1	2	3	4	5	6	8	10
2544	516	524	114	56	26	3	1	1

Figure 29. Distribution of the agreement counts of tone.

Unrounded

# languages with data: 3785:

Figure 30. Distribution of the agreement counts of tone (unrounded).

Relationships with original sources

With WALS

Binary classification

	No	Yes
None	297	4
Simple	4	123
Complex	0	85

Figure 31. Relationship between tone in WALS and the agreement binary classification.

Pearson’s Chi-squared test: `cooc_tab`
Test statistic	df	P value
480.7	2	4.049e-105 * * *

Pearson’s Chi-squared test with simulated p-value (based on 10000 replicates): `cooc_tab`
Test statistic	df	P value
480.7	NA	9.999e-05 * * *

3-way classification

	None	Simple	Complex
None	298	3	0
Simple	4	119	4
Complex	1	2	82

Figure 32. Relationship between tone in WALS and the agreement 3-way classification.

Pearson’s Chi-squared test: `cooc_tab`
Test statistic	df	P value
921	4	4.723e-198 * * *

Pearson’s Chi-squared test with simulated p-value (based on 10000 replicates): `cooc_tab`
Test statistic	df	P value
921	NA	9.999e-05 * * *

With LAPSyD

Binary classification

	No	Yes
None	385	1
Simple	0	102
Moderately complex	0	39
Complex	0	42

Figure 33. Relationship between tone in LAPSyD and the agreement binary classification.

Pearson’s Chi-squared test: `cooc_tab`
Test statistic	df	P value
564.4	3	5.148e-122 * * *

Pearson’s Chi-squared test with simulated p-value (based on 10000 replicates): `cooc_tab`
Test statistic	df	P value
564.4	NA	9.999e-05 * * *

3-way classification

	None	Simple	Complex
None	386	0	0
Simple	0	102	0
Moderately complex	0	12	27
Complex	0	0	42

Figure 34. Relationship between tone in LAPSyD and the agreement 3-way classification.

Pearson’s Chi-squared test: `cooc_tab`
Test statistic	df	P value
1028	6	7.755e-219 * * *

Pearson’s Chi-squared test with simulated p-value (based on 10000 replicates): `cooc_tab`
Test statistic	df	P value
1028	NA	9.999e-05 * * *

Counts

Rounded

Figure 35. Relationship between tone in LAPSyD and the agreement counts.

Pearson’s product-moment correlation: `la_n_tones` and `n_tones`
Test statistic	df	P value	Alternative hypothesis	cor
Inf	567	0 * * *	two.sided	1

Spearman’s rank correlation rho: `la_n_tones` and `n_tones`
Test statistic	P value	Alternative hypothesis	rho
0	0 * * *	two.sided	1

Unrounded

Figure 36. Relationship between tone in LAPSyD and the agreement counts (unrounded).

Pearson’s product-moment correlation: `la_n_tones` and `n_tones_raw`
Test statistic	df	P value	Alternative hypothesis	cor
Inf	567	0 * * *	two.sided	1

Spearman’s rank correlation rho: `la_n_tones` and `n_tones_raw`
Test statistic	P value	Alternative hypothesis	rho
0	0 * * *	two.sided	1

With Dediu & Ladd (2007)

Binary classification

	No	Yes
No	30	0
Yes	0	30

Figure 37. Relationship between tone in Dediu & Ladd (2007) and the agreement binary classification.

Pearson’s Chi-squared test with Yates’ continuity correction: `cooc_tab`
Test statistic	df	P value
56.07	1	7.005e-14 * * *

Pearson’s Chi-squared test with simulated p-value (based on 10000 replicates): `cooc_tab`
Test statistic	df	P value
60	NA	9.999e-05 * * *

3-way classification

	None	Simple	Complex
No	24	0	0
Yes	2	8	13

Figure 38. Relationship between tone in Dediu & Ladd (2007) and the agreement 3-way classification.

Pearson’s Chi-squared test: `cooc_tab`
Test statistic	df	P value
39.61	2	2.502e-09 * * *

Pearson’s Chi-squared test with simulated p-value (based on 10000 replicates): `cooc_tab`
Test statistic	df	P value
39.61	NA	9.999e-05 * * *

With PHOIBLE

Counts

Rounded

Figure 39. Relationship between tone in PHOIBLE and the agreement counts.

Pearson’s product-moment correlation: `ph_n_tones` and `n_tones`
Test statistic	df	P value	Alternative hypothesis	cor
41.42	2028	2.854e-272 * * *	two.sided	0.677

Spearman’s rank correlation rho: `ph_n_tones` and `n_tones`
Test statistic	P value	Alternative hypothesis	rho
358843117	0 * * *	two.sided	0.7426

Unrounded

Figure 40. Relationship between tone in PHOIBLE and the agreement counts (unrounded).

Pearson’s product-moment correlation: `ph_n_tones` and `n_tones_raw`
Test statistic	df	P value	Alternative hypothesis	cor
38.8	2028	9.067e-247 * * *	two.sided	0.6527

Spearman’s rank correlation rho: `ph_n_tones` and `n_tones_raw`
Test statistic	P value	Alternative hypothesis	rho
488625970	1.276e-243 * * *	two.sided	0.6495

With WPHON

Counts

Rounded

Figure 41. Relationship between tone in WPHON and the agreement counts.

Pearson’s product-moment correlation: `wp_tone` and `n_tones`
Test statistic	df	P value	Alternative hypothesis	cor
156	3158	0 * * *	two.sided	0.9408

Spearman’s rank correlation rho: `wp_tone` and `n_tones`
Test statistic	P value	Alternative hypothesis	rho
136411420	0 * * *	two.sided	0.9741

Unrounded

Figure 42. Relationship between tone in WPHON and the agreement counts (unrounded).

Pearson’s product-moment correlation: `wp_tone` and `n_tones_raw`
Test statistic	df	P value	Alternative hypothesis	cor
168.4	3158	0 * * *	two.sided	0.9486

Spearman’s rank correlation rho: `wp_tone` and `n_tones_raw`
Test statistic	P value	Alternative hypothesis	rho
689289938	0 * * *	two.sided	0.8689

Conclusions about tone

These three agreement tone codings were obtained using the full information from the 5 sources, but, of course, the genetic data cover far fewer languages, so we end up using only a subset of these languages in this study.

After this sub-setting, in summary, I used 5 primary sources:

WALS: categorical with 3 ordered categories ‘None’ < ‘Simple’ < ‘Complex,’
LAPSyD: categorical, recoded with 4 ordered categories ‘None’ < ‘Simple’ < ‘Moderately complex’ < ‘Complex’ by collapsing ‘Marginal’ into ‘Simple,’ and count, from 0 to 10 tones (mean 0.62 and median 0), by collapsing the original 1 tone into the original 2 tones and moving all tones one step down (i.e., original 2 tones become 1 tone),
Dediu & Ladd (2007): categorical with 2 (presence/absence) categories ‘No’ and ‘Yes,’
WPHON: count, from 0 to 11 tones (mean 0.67 and median 0), by collapsing the original 1 tone into the original 2 tones and moving all tones one step down (i.e., original 2 tones become 1 tone), and
PHOIBLE: count, from 0 to 9 tones (mean 0.66 and median 0), by collapsing the original 1 tone into the original 2 tones and moving all tones one step down (i.e., original 2 tones become 1 tone).

From these, I built 3 “agreement” combined and reconciled measures:

tone_binary: a binary (presence/absence) variable with categories ‘No’ and ‘Yes,’
tone_3way: a categorical variable with 3 ordered categories ‘None’ < ‘Simple’ < ‘Complex,’ and
n_tones: count, from 0 to 10 tones (mean 0.61 and median 0).

However, for the analyses reported here, I used the following variables:

tone1: this represents directly tone_binary and encapsulates the question “does the language use tone?” contrasting no tone (“No”) versus any type of tone system (“Yes”),
tone2: this is the dichotomisation of tone_3way into the question “does the language use a complex tone system?” contrasting complex tone systems (“Yes”) versus no tone and simple tone systems (“No”), and
tone counts: this is the n_tones, counting the number of tones/tone symbols in the language.
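The two binary variables can be derived from the agreement classifications as in the Python sketch below. The function names mirror the variable names above but are otherwise hypothetical, and the example counts are the retained 3-way distribution reported in the next section.

```python
def tone1(tone_binary):
    """'Does the language use tone?': tone_binary taken over unchanged."""
    return tone_binary

def tone2(tone_3way):
    """'Does the language use a complex tone system?'"""
    return "Yes" if tone_3way == "Complex" else "No"

# Retained 3-way distribution: None 248, Simple 39, Complex 27.
retained_3way = {"None": 248, "Simple": 39, "Complex": 27}

tone2_counts = {"No": 0, "Yes": 0}
for category, n in retained_3way.items():
    tone2_counts[tone2(category)] += n

print(tone2_counts)
```

With this dichotomisation, “None” and “Simple” are pooled into “No” (248 + 39 = 287 languages), leaving 27 languages coded “Yes.”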

For counts, I will also use the unrounded (i.e., raw) “counts,” n_tones_raw, varying between 0 and 10 tones (mean 0.67 and median 0.0793991), to avoid any biases induced by numerically rounding to integer counts.

Distribution of retained tone data

Binary

# languages with data: 321:

No	Yes
251	70

Figure 43. Distribution of binary tone.

3-way

# languages with data: 314:

None	Simple	Complex
248	39	27

Figure 44. Distribution of 3-way tone.

Counts

# languages with data: 314:

0	1	2	3	4	5	6
249	26	23	6	5	3	2

Figure 45. Distribution of tone counts.

Counts (unrounded)

# languages with data: 314:

Figure 46. Distribution of tone counts (unrounded).

Intersection

There are 314 languages with data for binary, 3-way and counts.

ASPM-D and MCPH1-D population frequencies

I will denote the “derived” alleles of ASPM and MCPH1 (Microcephalin) as ASPM-D and MCPH1-D, respectively.

ASPM-D

ASPM-D was originally defined in relation to “haplotype 63” and two of its polymorphic nonsynonymous sites in exon 18 in an open reading frame (ORF), A44871G and C45126A, with the ancestral alleles A and C and the derived alleles G and A, respectively (Mekel-Bobrov et al., 2005, p. 1720). Later relevant publications (Patrick C. M. Wong, Chandrasekaran, & Zheng, 2012; Patrick C. M. Wong et al., 2020), however, use SNP rs41310927 with ancestral allele T and derived allele C. While most databases contain information about this SNP, some do not, so I also collected data about SNPs in very tight LD with it: rs41308365, rs3762271, rs41304071, rs147068597 and rs61819087 (the LD data were obtained from LDlink’s “LDproxy Tool” using all populations in that database).

Thus, I collected the following data:

Locus/SNP	“derived” allele	Databases	Position and LD to target
“haplotype 63”	“haplogroup D”	MB2005	the target
rs41310927	C	WONG2020, LDLink, gnomAD, dbSNP	the target
rs41308365	A	LDLink, gnomAD, dbSNP	chr1:197070707; D’=1.00, R²=1.00
rs3762271	T	LDLink, gnomAD, dbSNP, ALFRED	chr1:197070442; D’=1.00, R²=1.00
rs41304071	T	LDLink, dbSNP	chr1:197063352; D’=1.00, R²=1.00
rs147068597	A	LDLink	chr1:197058136; D’=1.00, R²=1.00
rs61819087	G	LDLink, dbSNP	chr1:197084857; D’=1.00, R²=1.00

where the databases are identified as:

Database	URL	Info	ID
Mekel-Bobrov et al. (2005)	https://science.sciencemag.org/content/309/5741/1720	The original source; 59 populations	MB2005
Patrick C. M. Wong et al. (2020)	https://advances.sciencemag.org/content/6/22/eaba5090	Massive experimental study in Cantonese speakers; 1 population	WONG2020
LDLink	https://ldlink.nci.nih.gov/?tab=home	“[…] a suite of web-based applications designed to easily and efficiently interrogate linkage disequilibrium in population groups”; 1000 genomes data in 32 individual and grouped populations	LDLink
gnomAD	https://gnomad.broadinstitute.org/	Genome Aggregation Database v2.1.1; very broad populations	gnomAD
dbSNP	https://www.ncbi.nlm.nih.gov/snp/	aggregation of info from multiple databases, mostly using very broad populations	dbSNP
1000 genomes	https://www.internationalgenome.org/	this info is included in other databases (gnomAD) so is not specifically used here	1KG
ALFRED	https://alfred.med.yale.edu/alfred/index.asp	The ALlele FREquency Database; lots of info in many populations; unfortunately, for ASPM only one SNP in strong LD with the target rs41310927 (rs3762271) is available	ALFRED

All SNPs

I ended up with frequency data about these loci in 170 unique samples coming from 127 unique meta-populations (such as “Han Chinese,” “Italians” or “Finnish”). After making sure the frequencies of these SNPs are very highly correlated (in those samples where they do co-occur), I computed their weighted average frequency (weighted by the number of sampled individuals).
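The weighted average can be sketched as below in Python. The representation of each estimate as a `(frequency, number of sampled individuals)` pair, the function name `weighted_frequency` and the toy numbers are all assumptions made for illustration; they are not the script's actual data structures or data.

```python
def weighted_frequency(estimates):
    """Average allele frequency, weighted by the number of sampled individuals.

    `estimates` is a list of (frequency, n_individuals) pairs, one per SNP
    (or database) available for the same sample.
    """
    total_n = sum(n for _, n in estimates)
    if total_n == 0:
        raise ValueError("no sampled individuals to weight by")
    return sum(freq * n for freq, n in estimates) / total_n

# Toy example: two estimates for the same sample (made-up numbers).
estimates = [(0.20, 100), (0.30, 300)]
freq = weighted_frequency(estimates)
print(freq)
```

Here the estimate based on 300 individuals counts three times as much as the one based on 100, so the result lies closer to 0.30 than the unweighted mean of 0.25 would.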

Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0.1012	0.2291	0.2416	0.3886	0.684

Figure 48. Distribution of the frequency of the “derived” allele of ASPM across the world.

Excluding “proxy” SNPs

Of these 7 SNPs, 5 are “proxy” SNPs (rs147068597, rs3762271, rs41304071, rs41308365, rs61819087), representing 289 unique samples (and 233237 total alleles) out of 396 (73%) unique samples (and 367519 total alleles; 63.5%) available for ASPM-D.

Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0	0.0995	0.2073	0.2275	0.38	0.6

Figure 50. Distribution of the frequency of the “derived” allele of ASPM across the world, excluding the “proxy” SNPs.

Due to this high proportion of the data being represented by “proxy” SNPs, I also conducted separate analyses excluding these SNPs.

New samples

Moreover, 111 are new samples from 84 unique (meta)populations, compared to the 59 samples in 56 (meta)populations in the original Mekel-Bobrov et al. (2005). These new samples are distributed as:

Africa	Eurasia	America	Papunesia
12	90	5	4

and the corresponding new (meta)populations as:

Africa	Eurasia	America	Papunesia
12	63	5	4

Figure 51. Distribution of the “original” and “new” samples of ASPM-D across the world.

MCPH1-D

MCPH1-D was originally defined in relation to G37995C in exon 8 in an open reading frame (ORF) with the ancestral allele G, and the derived one C (Evans et al., 2005, p. 1717). Later relevant publications (Patrick C. M. Wong et al., 2020) however, use SNP rs930557 with ancestral allele G and derived allele C. While most databases do contain info about this SNP, others do not, such that I also collected info about the SNP rs1129706 which is in very tight LD with it (the linkage data was obtained from LDlink’s “LDproxy Tool” using all populations in that database).

Thus, I obtained the following data:

Locus/SNP	“derived” allele	Databases	Position and LD to target
G37995C	C	MB2005	the target
rs930557	C	WONG2020, LDLink, dbSNP	the target
rs1129706	G	ALFRED	chr8:6304814; D’=0.995, R²=0.936

All SNPs

I ended up with frequency data about these loci in 166 unique samples coming from 128 unique meta-populations. After making sure the frequencies of these SNPs are very highly correlated (in those samples where they do co-occur), I computed their weighted average frequency (weighted by the number of sampled individuals).

Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0.0315	0.658	0.7986	0.7125	0.8652	1

Figure 53. Distribution of the frequency of the “derived” allele of MCPH1 across the world.

Excluding “proxy” SNPs

Of these 3 SNPs, 1 is a “proxy” SNP (rs1129706), representing 141 unique samples (and 13028 total alleles) out of 245 (57.6%) unique samples (and 107258 total alleles; 12.1%) available for MCPH1-D.

Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
0.033	0.5634	0.7737	0.6729	0.8357	1

Figure 55. Distribution of the frequency of the “derived” allele of MCPH1 across the world, excluding the “proxy” SNPs.

Due to this high proportion of the data being represented by “proxy” SNPs, I also conducted separate analyses excluding these SNPs.

New samples

Moreover, 107 are new samples from 85 unique (meta)populations, compared to the 59 samples in 56 (meta)populations in the original Evans et al. (2005). These new samples are distributed as:

Africa	Eurasia	America	Papunesia
12	86	5	4

and the corresponding new (meta)populations as:

Africa	Eurasia	America	Papunesia
12	64	5	4

Figure 56. Distribution of the “original” and “new” samples of MCPH1-D across the world.

The original Dediu & Ladd (2007) samples

These are the same for ASPM-D and MCPH1-D:

Africa	Eurasia	America	Papunesia
15	37	5	2

and the corresponding (meta)populations as:

Africa	Eurasia	America	Papunesia
14	35	5	2

Figure 57. Distribution of the “original” and “new” samples of MCPH1-D across the world.

Putting tone and genes together

When combining the linguistic and genetic data, we are left with 175 unique samples in 129 unique (meta)populations speaking 321 unique “languages” (i.e., Glottolog codes) (from now on, denoted as 175:129:321), of which:

Information for	Number of samples:(meta)pops:languages	Missing samples:(meta)pops:languages
tone binary	175:129:321	0:0:0 = {} : {} : {}
tone 3-way	170:124:314	5:5:7 = {SA001471N, SA001477T, SA001487U, SA001491P, SA001681Q} : {Burunge, Hazara, Mozabite, Oroqen, Xibe} : {buru1320, efee1239, gyel1242, haza1239, oroq1238, tumz1238, xibe1242}
tone counts	170:124:314	5:5:7 = {SA001471N, SA001477T, SA001487U, SA001491P, SA001681Q} : {Burunge, Hazara, Mozabite, Oroqen, Xibe} : {buru1320, efee1239, gyel1242, haza1239, oroq1238, tumz1238, xibe1242}
ASPM-D	170:127:319	5:2:2 = {FINRISK, GenDan, GenNed5, KRGDB, Qatari} : {Dutch, Qatari} : {dutc1256, gulf1241}
MCPH1-D	166:128:320	9:1:1 = {gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish} : {Bulgarian} : {bulg1262}
ASPM-D & MCPH1-D	161:126:318	14:3:3 = {FINRISK, GenDan, GenNed5, gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish, KRGDB, Qatari} : {Bulgarian, Dutch, Qatari} : {bulg1262, dutc1256, gulf1241}
tone binary & ASPM-D & MCPH1-D	161:126:318	14:3:3 = {FINRISK, GenDan, GenNed5, gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish, KRGDB, Qatari} : {Bulgarian, Dutch, Qatari} : {bulg1262, dutc1256, gulf1241}
tone 3-way & ASPM-D & MCPH1-D	156:121:311	19:8:10 = {FINRISK, GenDan, GenNed5, gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish, KRGDB, Qatari, SA001471N, SA001477T, SA001487U, SA001491P, SA001681Q} : {Bulgarian, Burunge, Dutch, Hazara, Mozabite, Oroqen, Qatari, Xibe} : {bulg1262, buru1320, dutc1256, efee1239, gulf1241, gyel1242, haza1239, oroq1238, tumz1238, xibe1242}
tone counts & ASPM-D & MCPH1-D	156:121:311	19:8:10 = {FINRISK, GenDan, GenNed5, gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish, KRGDB, Qatari, SA001471N, SA001477T, SA001487U, SA001491P, SA001681Q} : {Bulgarian, Burunge, Dutch, Hazara, Mozabite, Oroqen, Qatari, Xibe} : {bulg1262, buru1320, dutc1256, efee1239, gulf1241, gyel1242, haza1239, oroq1238, tumz1238, xibe1242}

Some pair-wise differences in terms of samples:(meta)populations:languages with data:

Present in…	… but absent from	samples:(meta)pops:languages
tone binary	tone 3-way (and counts)	5:5:7 = {SA001471N, SA001477T, SA001487U, SA001491P, SA001681Q} : {Burunge, Hazara, Mozabite, Oroqen, Xibe} : {buru1320, efee1239, gyel1242, haza1239, oroq1238, tumz1238, xibe1242}
tone binary	ASPM-D	5:2:2 = {FINRISK, GenDan, GenNed5, KRGDB, Qatari} : {Dutch, Qatari} : {dutc1256, gulf1241}
tone binary	MCPH1-D	9:1:1 = {gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish} : {Bulgarian} : {bulg1262}
tone binary	ASPM-D & MCPH1-D	14:3:3 = {FINRISK, GenDan, GenNed5, gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish, KRGDB, Qatari} : {Bulgarian, Dutch, Qatari} : {bulg1262, dutc1256, gulf1241}
tone 3-way (and counts)	ASPM-D	5:2:2 = {FINRISK, GenDan, GenNed5, KRGDB, Qatari} : {Dutch, Qatari} : {dutc1256, gulf1241}
tone 3-way (and counts)	MCPH1-D	9:1:1 = {gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish} : {Bulgarian} : {bulg1262}
tone 3-way (and counts)	ASPM-D & MCPH1-D	14:3:3 = {FINRISK, GenDan, GenNed5, gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish, KRGDB, Qatari} : {Bulgarian, Dutch, Qatari} : {bulg1262, dutc1256, gulf1241}

Stats

tone1 (is there tone?)

I kept only the entries with non-missing data for tone1, ASPM-D and MCPH1-D, and if there is more than one possible language or allele frequency for a given sample, I kept only those entries that have different tone or allele data. The resulting dataset has 181 observations, distributed among 119 unique Glottolog codes in 35 families (ranging from a minimum of 1 language per family to a maximum of 48, with a mean of 5.2 and a median of 2 languages per family) and 4 macroareas.

There are 161:126:119 unique samples:(meta)populations:languages retained, dropping 14:3:202 = {FINRISK, GenDan, GenNed5, gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish, KRGDB, Qatari} : {Bulgarian, Dutch, Qatari} : {adze1240, ajie1238, amar1272, ambu1247, anei1239, apma1241, arak1252, arib1241, arop1243, aros1241, aulu1238, awtu1239, ayiw1239, baba1268, bahi1254, bann1247, bign1238, bili1260, boik1241, bulg1262, caro1242, cham1313, chek1238, chuu1238, dehu1237, dumb1241, dutc1256, east2443, east2447, fiji1243, futu1245, fwai1237, gapa1238, geez1241, gela1263, gilb1244, gulf1241, guma1254, hali1244, hang1263, hano1246, hmon1264, hoav1238, iaai1238, iatm1242, idak1243, idun1242, iris1253, iwam1256, juho1239, kaia1245, kair1263, kamb1297, kapi1249, kara1486, kaul1240, kela1255, kele1258, kiku1240, kili1267, kire1240, koko1269, kosr1238, kuan1247, kuan1248, kuma1276, kung1261, kwai1243, kwam1251, kwam1252, kwas1243, kwom1262, labu1248, lala1268, lame1260, lauu1247, lena1238, lese1243, lewo1242, long1395, loni1238, lonw1238, louu1245, lusi1240, maee1241, mais1250, male1289, malo1243, mana1295, mana1298, maor1246, mars1254, masa1299, matu1261, mbal1255, mbul1263, mehe1243, meke1243, mele1250, mina1269, ming1252, moch1256, moki1238, moks1248, mono1273, motl1237, motu1246, mudu1242, muri1260, muso1238, muss1246, muyu1244, naka1262, nali1244, nama1264, nami1256, natu1246, naur1243, ndon1254, neha1247, neng1238, ngan1300, niua1240, niue1239, nort2646, nort2836, nort2845, nuku1260, onto1237, paam1238, pate1247, patp1243, pile1238, ping1243, pohn1238, port1285, pulu1242, qima1242, raoo1244, rapa1244, renn1242, rotu1241, rovi1238, russ1264, saaa1240, saam1283, saka1289, sali1295, samo1305, sapo1253, scot1243, siar1238, siee1239, sina1266, sioo1240, sobe1238, sons1242, sout2642, sout2679, sout2807, sout2856, sout2866, sout2869, stan1318, sude1239, surs1246, tahi1242, tain1252, 
taki1248, tawa1275, tean1237, teop1238, tiga1245, tigr1271, tiri1258, toab1237, toba1266, toke1240, tong1325, tsot1241, tswa1253, tuam1242, tuml1238, tung1290, tuva1244, ulit1238, urav1235, urip1239, vinm1237, waim1251, wall1257, wata1253, west2500, west2519, woga1249, wole1240, xamt1239, xara1244, yabe1254, yess1239, yima1243, zulu1248}.

	Africa	Eurasia	America	Papunesia	Sum
No	9	100	4	7	120
Yes	27	26	6	2	61
Sum	36	126	10	9	181

Figure 58. Distribution of tone1.

Figure 59. Map of tone1.

Figure 60. Relationship between tone1, ASPM-D and MCPH1-D.

Regressions

`glmer`

All data

null model: R² = 0.0%¹, ICC = 70.4%² (but generates warnings: Model is nearly unidentifiable: very large eigenvalue)
macroarea: R² = 23.3%, p_{macroarea/null} = 0.00082³
ASPM:
- by itself: R² = 10.0%, β = -1.00 ± 0.37, p_ASPM/null = 0.0041
- quadratic: R² = 10.7%, β_ASPM2 = -0.97 ± 0.38, p_ASPM2/ASPM = 0.6
- with macroarea: R² = 24.4%, β = -0.37 ± 0.45, p_{macroarea/ASPM} = 0.028, p_{ASPM/macroarea} = 0.42
MCPH1:
- by itself: R² = 9.4%, β = -1.04 ± 0.39, p_MCPH1/null = 0.0064
- quadratic: R² = 9.3%, β_MCPH12 = -1.01 ± 0.39, p_MCPH12/MCPH1 = 0.21
- with macroarea: R² = 24.1%, β = -0.39 ± 0.57, p_{macroarea/MCPH1} = 0.021, p_{MCPH1/macroarea} = 0.5
both alleles (no macroarea):
- ASPM + MCPH1: R² = 13.4%, β_ASPM = -0.72 ± 0.40, p_ASPM/MCPH1 = 0.074, β_MCPH1 = -0.62 ± 0.40, p_MCPH1/ASPM = 0.12, p_{ASPM+MCPH1/null} = 0.0049,
- interaction: R² = 13.2%, p_{ASPM:MCPH1/ASPM+MCPH1} = 0.86

Alleles on macroarea

To better understand this overlap between family, macroarea and the two “derived” alleles, I regressed (separately) the ASPM-D and MCPH1-D on the macroarea, using mixed-effects beta regression (after replacing all \(0.0\) values by \(10^{-7}\) and all \(1.0\) by \(1.0-10^{-7}\), respectively) with language family as random effect:

the alleles are very strongly clustered within families:
- ASPM: ICC = 100.0%
- MCPH1: ICC = 100.0%
macroarea predicts their distribution very strongly:
- ASPM: p = 3.4e-16, R² = 57.9%
- MCPH1: p = 3.1e-12, R² = 70.3%
separating Africa vs the rest of the world seems to drive most of this effect (both alleles have lower frequencies in Africa):
- ASPM: p = 2.3e-14, R² = 39.2%
- MCPH1: p = 3.2e-09, R² = 32.6%
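
The boundary adjustment described above (needed because beta regression is defined only on the open interval between 0 and 1) can be sketched as follows. This is a minimal Python illustration, not the R code used for the analyses; the function name `squeeze_unit_interval` is hypothetical, and only the offset of 10^-7 comes from the text.

```python
EPS = 1e-7  # boundary offset quoted in the text


def squeeze_unit_interval(freqs, eps=EPS):
    """Replace exact 0.0 by eps and exact 1.0 by 1.0 - eps; keep other values."""
    adjusted = []
    for f in freqs:
        if f == 0.0:
            adjusted.append(eps)
        elif f == 1.0:
            adjusted.append(1.0 - eps)
        else:
            adjusted.append(f)
    return adjusted


# Hypothetical allele frequencies, for illustration only.
adjusted = squeeze_unit_interval([0.0, 0.25, 1.0])
```

Only the exact boundary values are moved, so all other frequencies enter the regression unchanged.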

Randomization

For these randomization analyses there are several important parameters:

Parameter	Meaning	Values
`permute`	what to permute?	`nothing` = the original data
		`tone` = permute the tone variable
		`alleles-together` = permute the two alleles together
		`alleles-independent` = permute the two alleles separately, i.e., each is independently permuted
`within`	how are the permutations constrained?	`unrestricted` = all the observations are freely permuted (i.e., there are no constraints, no structure in the data is preserved)
		`families` = only observations within the same language family are permuted (i.e., the structure of the families is preserved)
		`macroareas` = only observations within the same macroarea are permuted (i.e., the structure of the macroareas is preserved)
`macroarea`	how do we control for macroareas?	`none` = no control for macroareas at all
		`fixef` = as fixed effects

I performed 1000 independent replications of each of these parameter combinations, and below are the distributions of the permuted values versus the original ones (i.e., those obtained on the original, non-permuted data).
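
The `within` constraint can be illustrated with a short Python sketch (an assumption-laden illustration, not the R code behind these results; `permute_within` and `percent_below` are hypothetical helper names). Values are shuffled only among observations that share a group label, and passing no groups gives the `unrestricted` case.

```python
import random
from collections import defaultdict


def permute_within(values, groups=None, rng=None):
    """Return a permuted copy of `values`.

    groups=None  -> unrestricted: all observations are shuffled freely.
    groups=[...] -> values are shuffled only among observations with the
                    same group label (e.g. language family or macroarea).
    """
    rng = rng or random.Random()
    permuted = list(values)
    if groups is None:
        rng.shuffle(permuted)
        return permuted
    index_by_group = defaultdict(list)
    for i, g in enumerate(groups):
        index_by_group[g].append(i)
    for idx in index_by_group.values():
        shuffled = [values[i] for i in idx]
        rng.shuffle(shuffled)
        for i, v in zip(idx, shuffled):
            permuted[i] = v
    return permuted


def percent_below(permuted_stats, original_stat):
    """Percent of permuted statistics strictly smaller than the original."""
    hits = sum(s < original_stat for s in permuted_stats)
    return 100.0 * hits / len(permuted_stats)


# Toy data (hypothetical): tone values and their macroareas.
tone = ["Yes", "No", "Yes", "No", "No", "Yes"]
area = ["Africa", "Africa", "Africa", "Eurasia", "Eurasia", "Eurasia"]
shuffled = permute_within(tone, groups=area, rng=random.Random(1))
```

Passing a list of (ASPM-D, MCPH1-D) pairs as `values` keeps the two alleles together, whereas calling the function once per allele permutes them independently.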

Regressions on 1000 permuted data. The first 3 columns show the permutation constraints (if any), how the *macroarea* is considered (if at all), and what is permuted. The next columns show the percent of the permutations that, in order, have a better AIC compared to the original fit, are significantly better than the null model (thus testing the effect of both alleles simultaneously), have a significant effect of *ASPM*-D, have a smaller effect (β) of *ASPM*-D than the original fit, and the same for *MCPH1*-D.
Permute within	Macroarea	Permute	AIC	Signif.	p_ASPM-D	β_ASPM-D	p_MCPH1-D	β_MCPH1-D
unrestricted	none	tone	0%	4%	6%	0%	4%	0%
unrestricted	none	alleles-together	0%	5%	4%	0%	5%	2%
unrestricted	none	alleles-independent	1%	6%	6%	0%	6%	1%
unrestricted	fixef	tone	0%	5%	5%	8%	5%	25%
unrestricted	fixef	alleles-together	68%	6%	7%	15%	4%	23%
unrestricted	fixef	alleles-independent	68%	5%	6%	16%	6%	20%
macroareas	none	tone	0%	95%	42%	4%	86%	28%
macroareas	none	alleles-together	26%	76%	10%	7%	59%	73%
macroareas	none	alleles-independent	32%	83%	20%	15%	66%	78%
macroareas	fixef	tone	0%	7%	7%	11%	6%	29%
macroareas	fixef	alleles-together	65%	5%	5%	19%	5%	35%
macroareas	fixef	alleles-independent	66%	4%	5%	20%	4%	35%
families	none	tone	2%	16%	2%	3%	14%	36%
families	none	alleles-together	2%	11%	3%	5%	5%	16%
families	none	alleles-independent	2%	16%	13%	10%	12%	20%
families	fixef	tone	1%	8%	3%	16%	11%	74%
families	fixef	alleles-together	66%	4%	5%	46%	2%	16%
families	fixef	alleles-independent	66%	3%	5%	37%	3%	22%

Regressions on 1000 permuted data. Each plot shows the original result (vertical dashed black line) and the distribution of the permutations for the three possible things to be permuted (colored curves) for each combination of permutation constraints (horizontal panels) and control for macroarea (vertical panels) in terms of the effect size β; ASPM-D is on the left and MCPH1-D on the right. The vertical dotted black thin line is at 0.0.

Restricted sampling

Figure 61. Results for 1000 restricted samplings. For ASPM-D (left): 100% of βs are negative when regressing tone on ASPM alone (one-sided t-test < 0: t(999) = -81.1, mean = -0.81, p = 0), 82.7% when controlling for the macroarea (t(999) = -32.0, mean = -0.45, p = 2.2e-155), and 82.2% when controlling for both macroarea and MCPH1 (t(999) = -30.5, mean = -0.45, p = 4.9e-145). For MCPH1-D (right): 100% of βs are negative when regressing tone on MCPH1 alone (one-sided t-test < 0: t(999) = -107.4, mean = -0.59, p = 0), 68.7% when controlling for the macroarea (t(999) = -15.5, mean = -0.38, p = 3.9e-49), and 67.6% when controlling for both macroarea and ASPM (t(999) = -14.3, mean = -0.37, p = 1.7e-42).
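
The summary reported in the caption above can be sketched in Python under stated assumptions: restricted sampling is taken to mean picking one random entry per language family, and the model refit on each sample is omitted, so the slopes below are made-up numbers. The helper names are hypothetical, and this is not the R code used for the analyses.

```python
import math
import random
import statistics
from collections import defaultdict


def restricted_sample(rows, family_of, rng):
    """Pick one random row per family."""
    by_family = defaultdict(list)
    for row in rows:
        by_family[family_of(row)].append(row)
    return [rng.choice(members) for members in by_family.values()]


def summarise_slopes(betas):
    """Percent of negative slopes and the one-sample t statistic against 0."""
    n = len(betas)
    mean = statistics.fmean(betas)
    se = statistics.stdev(betas) / math.sqrt(n)
    return {
        "pct_negative": 100.0 * sum(b < 0 for b in betas) / n,
        "mean": mean,
        "t": mean / se,
        "df": n - 1,
    }


# Hypothetical (language, family) entries and hypothetical slopes.
rows = [("lang1", "famA"), ("lang2", "famA"), ("lang3", "famB"), ("lang4", "famC")]
sample = restricted_sample(rows, lambda r: r[1], random.Random(0))
summary = summarise_slopes([-1.0, -0.5, 0.5, -1.0])
```

With 1000 samples the t statistic has 999 degrees of freedom, which matches the t(999) values in the caption.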

`brms`

tone1 on ASPM-D and MCPH1-D in a mixed-effects Bayesian framework (using brms) with macroarea, language family and (meta)population as (nested) random effects. The ROPE is the region of practical equivalence around 0.0, usually [-0.1, 0.1] but it may vary by regression type; the idea is that the HDI should have as small an intersection as possible with the ROPE. Another take is represented by p_ROPE, which is the proportion of the whole posterior distribution (i.e., 100% HDI) inside the ROPE, so it can be interpreted like a “classic” p-value.
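
To make these quantities concrete, here is a minimal Python sketch (not the brms code used here; the draws are invented and the helper names are hypothetical). It assumes that p_ROPE is the share of all posterior draws inside the ROPE, that an evidence ratio is the posterior odds p / (1 - p), and that the ROPE of ±0.18 used for these logistic models corresponds to 0.1 × π/√3.

```python
import math


def p_rope(draws, low, high):
    """Proportion of all posterior draws lying inside the ROPE [low, high]."""
    return sum(low <= d <= high for d in draws) / len(draws)


def evidence_ratio(p):
    """Posterior odds of a hypothesis with posterior probability p."""
    return p / (1.0 - p)


# Assumption: logit-scale ROPE half-width of 0.1 * pi / sqrt(3) (about 0.18).
half_width = 0.1 * math.pi / math.sqrt(3)

# Hypothetical posterior draws of a slope, for illustration only.
draws = [-1.2, -0.9, -0.7, -0.4, -0.1, 0.05, 0.3, -0.6, -1.5, 0.15]
inside = p_rope(draws, -half_width, half_width)
odds = evidence_ratio(0.8)
```

A posterior probability of 0.8 gives odds of 4, in line with the "p(β=0) = 0.8 (evidence ratio = 4)" entry reported below; small mismatches elsewhere are expected because the printed probabilities are rounded.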

ASPM only:
- β = -0.69, 89%HDI = [-1.59, 0.25]
- posterior probability p(β<0) = 0.89 (evidence ratio = 7.9), p(β=0) = 0.73 (evidence ratio = 2.8)
- ROPE⁴ = [-0.18, 0.18], % HDI inside ROPE = 13.3%; p_ROPE = 0.118
- comparison ‘null’ vs ‘ASPM’: [B> L= W=(71%:29%) K>]: moderate evidence for null against ASPM (BF=3.83), LOO=0.70 [SE=1.43], WAIC=0.92 [SE=1.19], KFOLD=4.48 [SE=3.46]⁵
MCPH1 only:
- β = -0.63, 89%HDI = [-1.66, 0.46]
- posterior probability p(β<0) = 0.83 (evidence ratio = 5), p(β=0) = 0.76 (evidence ratio = 3.2)
- ROPE = [-0.18, 0.18], % HDI inside ROPE = 15.2%; p_ROPE = 0.135
- comparison ‘null’ vs ‘MCPH1’: [B> L= W=(61%:39%) K>]: moderate evidence for null against MCPH1 (BF=3.17), LOO=-0.01 [SE=1.03], WAIC=0.45 [SE=0.77], KFOLD=2.33 [SE=2.31]
both alleles:
- comparison ‘null’ vs ‘both’: [B> L= W=(79%:21%) K>]: moderate evidence for null against both (BF=9.56), LOO=1.30 [SE=1.53], WAIC=1.35 [SE=1.44], KFOLD=4.67 [SE=3.70]
- interaction:
  - posterior probability p(=0) = 0.82 (evidence ratio = 4.5)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 21.5%; p_ROPE = 0.134
  - comparison ‘no interaction’ vs ‘with interaction’: [B> L> W=(46%:54%) K>]: moderate evidence for no interaction against with interaction (BF=3.06), LOO=0.88 [SE=0.81], WAIC=-0.14 [SE=0.46], KFOLD=6.05 [SE=3.43]
- ASPM (partial):
  - β = -0.61, 89%HDI = [-1.62, 0.32]
  - posterior probability p(β<0) = 0.84 (evidence ratio = 5.3), p(β=0) = 0.78 (evidence ratio = 3.6)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 17.6%; p_ROPE = 0.156
- MCPH1 (partial):
  - β = -0.46, 89%HDI = [-1.55, 0.78]
  - posterior probability p(β<0) = 0.75 (evidence ratio = 2.9), p(β=0) = 0.8 (evidence ratio = 4)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 19.4%; p_ROPE = 0.173

Figure 62. Posterior distributions (with 50% probability mass highlighted) versus 0.0 (the vertical line) for ASPM-D (left) and MCPH1-D (right).

Figure 63. Conditional effects of ASPM-D (left) and MCPH1-D (right).

Figure 64. Posterior predictive checks for ASPM-D (left) and MCPH1-D (right).

Figure 65. Confusion matrices for ASPM-D (left) and MCPH1-D (right).

Mediation and path analysis

Here, I try to disentangle the fact that macroarea is a very good predictor of tone1, but also of the frequency of the two alleles, from any effect that the alleles might have on tone1. For this, I conducted mediation analysis and path analysis, where I model the effect of macroarea on tone1 as partially mediated by the two alleles.

Please note that there are several technical issues with these approaches:

for mediation analysis, the method used (as implemented by function mediate in package mediation):
- cannot deal with a factor with several levels → I focused on the contrast between Africa and the rest of the world;
- cannot deal with language family as random effect → I use “flat” regressions throughout, but I did perform restricted sampling as well as a method to control for family.
to address these issues, I also conducted Bayesian mediation analysis (using brms) with logistic regression for the outcome, beta regression for the “derived” allele frequencies, and family and (meta)population as random effects (the macroarea cannot be a random effect as it is the treatment, i.e., Africa vs the rest of the world).
for path analysis, the method used (as implemented by function sem with robust estimators in package lavaan):
- cannot deal with binary variables unless they are either converted to numeric (0 vs 1) or ordered (i.e., assume that there is an intrinsic ordering between the two values), affecting both the binary contrast between Africa and the rest of the world (coded as Africa=1, or ordered as “rest of the world” < “Africa”) and tone1 (coded as Yes=1, or No < Yes); I tested both codings separately;
- cannot deal with language family as random effect, but I did perform restricted sampling as well as a method to control for family.

Mediation analysis

Figure 66. Graphical representation of the mediation model for the two alleles considered separately. Blue = direct effect of macroarea on tone1; red = indirect effect mediated by the alleles.

`(g)lm`

All data

For ASPM-D:

total effect (TE) of being in Africa on tone: 0.49 (0.33, 0.63), p=0, decomposed into:
average direct effect (ADE): 0.27 (0.08, 0.47), p=0.008, and
average indirect effect (ACME) mediated by ASPM-D: 0.22 (0.11, 0.34), p=0, mediating 44.9% (19.1%, 79.5%), p=0 of the effect, resulting from:
- effect of being in Africa on ASPM-D: -1.25 ±0.16, p=7.7e-13, and
- effect of ASPM-D on tone: -0.90 ±0.24, p=0.00015.

For MCPH1-D:

TE: 0.50 (0.34, 0.65), p=0, decomposed into:
ADE: 0.55 (0.19, 0.75), p=0.002, and
ACME: -0.05 (-0.22, 0.25), p=0.49, mediating -14.7% (-51.3%, 56.8%), p=0.49 of the effect, resulting from:
- effect of being in Africa on MCPH1-D: -2.19 ±0.09, p=9.9e-59, and
- effect of MCPH1-D on tone: 0.20 ±0.38, p=0.6.
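
As a quick arithmetic check, the total effect should equal the sum of the direct and indirect effects, and the proportion mediated is the indirect effect divided by the total. The Python sketch below uses the rounded point estimates quoted above; `decompose` is a hypothetical helper, not part of the mediation package.

```python
def decompose(ade, acme):
    """Total effect and proportion mediated from direct and indirect effects."""
    total = ade + acme
    return total, acme / total


# Rounded point estimates quoted in the text.
te_aspm, prop_aspm = decompose(ade=0.27, acme=0.22)
te_mcph1, prop_mcph1 = decompose(ade=0.55, acme=-0.05)
```

For ASPM-D this recovers TE = 0.49 and a mediated proportion of about 44.9%. For MCPH1-D it recovers TE = 0.50, but the mediated proportion computed from the rounded values (about -10%) differs from the reported -14.7%, presumably because the reported value is derived from the unrounded, simulation-based estimates.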

Restricted sampling

Figure 67. Mediation analysis for 1000 restricted samples (i.e., picking one random language per family). The leftmost panels show the distribution of point estimates of the Total Effect (TE), the Direct Effect (ADE) and the Indirect Effect (ACME) for ASPM-D and MCPH1-D; the middle panels show the distribution of the p-values for the same effects, while the rightmost panels show the distribution of the regression slopes (β) for the two alleles, top: for the regression of the allele frequency on within vs outside Africa, and bottom: for the regression of tone on the allele while controlling for within vs outside Africa. The black vertical lines show: 0.0 (solid), 0.05 (dashed) and 0.10 (dotted).

For ASPM-D:

TE: mean = 0.38, median = 0.38; 44.5% significant at α-level 0.05 and 72.8% significant at α-level 0.10; 100.0% > 0.0; one-sample one-sided t-test vs 0: t(999) = 134.2, p = 0;
ADE: mean = 0.28, median = 0.28; 8.2% significant at α-level 0.05 and 29.6% significant at α-level 0.10; 99.6% > 0.0; one-sample one-sided t-test vs 0: t(999) = 87.2, p = 0;
ACME: mean = 0.094, median = 0.091; 3.6% significant at α-level 0.05 and 20.1% significant at α-level 0.10; 99.5% > 0.0; one-sample one-sided t-test vs 0: t(999) = 61.4, p = 0;
β(Africa → allele): mean = -0.86, median = -0.87; 79.8% significant at α-level 0.05 and 96.0% significant at α-level 0.10; 100.0% < 0.0; one-sample one-sided t-test vs 0: t(999) = -211.9, p = 0;
β(allele → tone | Africa): mean = -0.61, median = -0.6; 10.2% significant at α-level 0.05 and 27.7% significant at α-level 0.10; 99.1% < 0.0; one-sample one-sided t-test vs 0: t(999) = -60.0, p = 0.

For MCPH1-D:

TE: mean = 0.38, median = 0.39; 44.3% significant at α-level 0.05 and 72.6% significant at α-level 0.10; 100.0% > 0.0; one-sample one-sided t-test vs 0: t(999) = 133.1, p = 0;
ADE: mean = 0.41, median = 0.44; 6.0% significant at α-level 0.05 and 18.9% significant at α-level 0.10; 96.9% > 0.0; one-sample one-sided t-test vs 0: t(999) = 74.0, p = 0;
ACME: mean = -0.029, median = -0.052; 0.1% significant at α-level 0.05 and 1.4% significant at α-level 0.10; 35.7% > 0.0; one-sample one-sided t-test vs 0: t(999) = -6.2, p = 1;
β(Africa → allele): mean = -2.5, median = -2.5; 100.0% significant at α-level 0.05 and 100.0% significant at α-level 0.10; 100.0% < 0.0; one-sample one-sided t-test vs 0: t(999) = -884.5, p = 0;
β(allele → tone | Africa): mean = 0.42, median = 0.42; 0.2% significant at α-level 0.05 and 1.5% significant at α-level 0.10; 25.5% < 0.0; one-sample one-sided t-test vs 0: t(999) = 21.4, p = 1.

Given the small sample size (N = 35 unique families), relatively few effect sizes are large enough to be significant in any individual analysis; however, the effect of the allele on tone, β(allele → tone | Africa), is significant much more often for ASPM-D than for MCPH1-D: 10.2% vs 0.2% (51.0 times) for α-level 0.05, and 27.7% vs 1.5% (18.5 times) for α-level 0.10.

`brms`

Figure 68. Graphical representation of the Bayesian mediation analysis for ASPM-D showing the means of the effects and the actual partial regression coefficients, with their 89% HDIs and p-ROPEs. The colors reflect the sign of the mean estimate (blue=negative, red=positive, gray=(p-ROPE >= 0.05)); solid=(0 not in the HDI), dashed=(0 is in the HDI).

Figure 69. Graphical representation of the Bayesian mediation analysis for MCPH1-D showing the means of the effects and the actual partial regression coefficients, with their 89% HDIs and p-ROPEs. The colors reflect the sign of the mean estimate (blue=negative, red=positive, gray=(p-ROPE >= 0.05)); solid=(0 not in the HDI), dashed=(0 is in the HDI).

Figure 70. Graphical representation of the Bayesian mediation analysis for both ASPM-D and MCPH1-D showing the means of the effects and the actual partial regression coefficients, with their 89% HDIs and p-ROPEs. The colors reflect the sign of the mean estimate (blue=negative, red=positive, gray=(p-ROPE >= 0.05)); solid=(0 not in the HDI), dashed=(0 is in the HDI).

Path analysis

All data

With Africa and tone1 coded numerically, the model fits the data very well⁶ (χ²(1)=0.22, p=0.64; CFI=1.00, TLI=1.01, NNFI=1.01 and RFI=1.00):

Figure 71. Path analysis model with standardised coefficients and significance stars. tone1 and macroarea (Africa vs non-Africa) are coded as numeric binary (tone_bin_num with Yes=1 and Africa_num with in Africa=1); ASPM_z is ASPM-D and MCPH1_z is MCPH1-D.

## lavaan 0.6-8 ended normally after 25 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         8
##                                                       
##   Number of observations                           181
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.225
##   Degrees of freedom                                 1
##   P-value (Chi-square)                           0.635
## 
## Model Test Baseline Model:
## 
##   Test statistic                               371.522
##   Degrees of freedom                                 6
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.013
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -448.206
##   Loglikelihood unrestricted model (H1)       -448.094
##                                                       
##   Akaike (AIC)                                 912.413
##   Bayesian (BIC)                               938.000
##   Sample-size adjusted Bayesian (BIC)          912.664
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.154
##   P-value RMSEA <= 0.05                          0.704
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.005
## 
## Parameter Estimates:
## 
##   Standard errors                           Robust.sem
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper   Std.lv  Std.all
##   tone_bin_num ~                                                        
##     Africa_num        0.390    0.166    2.351    0.019    0.065    0.716    0.390    0.330
##     ASPM_z           -0.144    0.035   -4.163    0.000   -0.212   -0.076   -0.144   -0.305
##     MCPH1_z           0.025    0.061    0.410    0.682   -0.095    0.145    0.025    0.053
##   ASPM_z ~                                                              
##     Africa_num       -1.249    0.111  -11.254    0.000   -1.467   -1.032   -1.249   -0.500
##   MCPH1_z ~                                                             
##     Africa_num       -2.190    0.081  -27.039    0.000   -2.349   -2.031   -2.190   -0.877
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper   Std.lv  Std.all
##    .tone_bin_num      0.165    0.015   10.844    0.000    0.135    0.195    0.165    0.740
##    .ASPM_z            0.746    0.074   10.144    0.000    0.602    0.890    0.746    0.750
##    .MCPH1_z           0.230    0.039    5.899    0.000    0.154    0.307    0.230    0.232
## 
## R-Square:
##                    Estimate
##     tone_bin_num      0.260
##     ASPM_z            0.250
##     MCPH1_z           0.768
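
The estimates in the output above can be combined by the standard product-of-coefficients rule for linear path models: an indirect effect is the product of the paths along it. The Python sketch below does this; these derived quantities are not part of the lavaan output and are shown only for illustration, and `indirect` is a hypothetical helper.

```python
# Unstandardised estimates copied from the lavaan output above.
paths = {
    ("Africa", "tone"): 0.390,
    ("Africa", "ASPM"): -1.249,
    ("Africa", "MCPH1"): -2.190,
    ("ASPM", "tone"): -0.144,
    ("MCPH1", "tone"): 0.025,
}


def indirect(paths, source, mediator, target):
    """Indirect effect of source on target through a single mediator."""
    return paths[(source, mediator)] * paths[(mediator, target)]


via_aspm = indirect(paths, "Africa", "ASPM", "tone")
via_mcph1 = indirect(paths, "Africa", "MCPH1", "tone")
total = paths[("Africa", "tone")] + via_aspm + via_mcph1
```

Under this rule the path through ASPM-D contributes about 0.18, the path through MCPH1-D about -0.05, and the implied total effect of Africa on tone1 is about 0.52.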

Likewise, with Africa and tone1 coded as ordered binary factors, the model also fits the data very well (χ²(1)=0.57, p=0.45; CFI=1.00, TLI=1.07, NNFI=1.07 and RFI=0.92):

Figure 72. Path analysis model with standardised coefficients and significance stars. tone1 and macroarea (Africa vs non-Africa) are coded as ordered binary factors (tone_bin_ord with No < Yes, and Africa_ord with outside Africa < in Africa); ASPM_z is ASPM-D and MCPH1_z is MCPH1-D.

Restricted sampling

Here I use only the numerical coding.

Figure 73. Path analysis for 1000 restricted samples (i.e., picking one random language per family). The two plots on the left show the coefficient estimates and the p-values, respectively, for the five paths in the model (see the path plots above). The rightmost plot shows the various fit indices. The black horizontal lines show: 0.0 (solid), 0.05 (dashed) and 1.0 (dotted).

model fits:
- 94.7% of the p-values are not significant
- mean(CFI) = 0.99, median(CFI) = 1, sd(CFI) = 0.01, IQR(CFI) = 0.02
- mean(TLI) = 0.97, median(TLI) = 0.99, sd(TLI) = 0.1, IQR(TLI) = 0.16
- mean(NNFI) = 0.97, median(NNFI) = 0.99, sd(NNFI) = 0.1, IQR(NNFI) = 0.16
- mean(RFI) = 0.9, median(RFI) = 0.92, sd(RFI) = 0.09, IQR(RFI) = 0.14
Africa → ASPM-D: mean = -0.87, median = -0.89, sd = 0.12, IQR = 0.17, 100.0% < 0; 98.7% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -2.2e+02, p = 0;
Africa → MCPH1-D: mean = -2.5, median = -2.5, sd = 0.086, IQR = 0.12, 100.0% < 0; 100.0% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -9e+02, p = 0;
Africa → tone1: mean = 0.43, median = 0.44, sd = 0.32, IQR = 0.46, 89.5% > 0; 12.8% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = 41, p = 3.2e-219;
ASPM-D → tone1: mean = -0.11, median = -0.11, sd = 0.058, IQR = 0.089, 98.6% < 0; 36.2% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -62, p = 0;
MCPH1-D → tone1: mean = 0.041, median = 0.041, sd = 0.11, IQR = 0.17, 36.7% < 0; 0.9% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = 11, p = 1.

Machine Learning techniques

Here I apply various “machine learning” techniques to explore how well the macroarea and the two alleles predict tone1. For these techniques, in general I:

fit the model to the full data and estimate how well the model fits, but also
repeatedly split the data into a training set and the complementary test set; the first usually contains a random subset of 80% of the data and is used to fit the model, while the second, containing the remaining 20% of the data, is used to check how well the model generalizes to new data.

Thus, these techniques can:

quantify the amount of information about tone contained by macroarea and the alleles,
but also give an estimate of the relative importance of these variables as predictors.
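
The repeated 80%/20% splitting described above can be sketched in Python as follows. This is a stdlib-only illustration, not the R code used for the analyses; the function names and the seed are hypothetical, and the model fitting on each training set is left out.

```python
import random


def train_test_split(rows, train_fraction=0.8, rng=None):
    """Shuffle the rows and cut them into a training and a test set."""
    rng = rng or random.Random()
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def repeated_splits(rows, n_repeats=100, train_fraction=0.8, seed=0):
    """Generate independent random training/test splits."""
    rng = random.Random(seed)
    return [train_test_split(rows, train_fraction, rng) for _ in range(n_repeats)]


# 181 placeholder observations, matching the size of the tone1 dataset.
splits = repeated_splits(list(range(181)), n_repeats=100)
```

Each split is complementary, so every observation lands in exactly one of the two sets, and performance on the test sets is summarised over the repeats as a mean with its spread.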

Decision trees

Including macroarea

Using the frequency of the two alleles and the macroarea as predictors, the fit to the data is: accuracy = 77.3%, sensitivity = 71.7%, specificity = 79.3%, precision = 54.1%, and recall = 71.7%.
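
For reference, the textbook definitions of these five measures can be written as a short Python sketch. The counts are hypothetical and `classification_metrics` is an illustrative helper; this is not the code that produced the figures in this section.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard summary measures computed from a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "recall": sensitivity,  # recall is another name for sensitivity
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=30, fp=10, tn=50, fn=10)
```

Sensitivity and recall are the same quantity, which is why they always take identical values in the results reported here.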

Figure 74. Decision tree on the full data using the two alleles and macroarea. ASPM.D = ASPM-D.

On the 100 training/testing sets, the fit is: accuracy = 77.1% ±6.6%, sensitivity = 71.6% ±15.2%, specificity = 79.3% ±6.8%, precision = 52.9% ±12.4%, recall = 71.6% ±15.2%.

Figure 75. The success of generalising to the testing sets from the training sets (yellow boxplots) compared to the success on the full data (red segments).

Excluding macroarea

When using the frequency of the two alleles only as predictors, the fit to the data is: accuracy = 75.1%, sensitivity = 75.0%, specificity = 75.2%, precision = 39.3%, and recall = 75.0%:

Figure 76. Decision tree on the full data using the two alleles only. ASPM.D = ASPM-D, MCPH1.D = MCPH1-D.

On the 100 training/testing sets, the fit is: accuracy = 70.1% ±7.6%, sensitivity = 61.3% ±19.1%, specificity = 76.2% ±8.7%, precision = 44.0% ±23.0%, recall = 61.3% ±19.1%.

Figure 77. The success of generalising to the testing sets from the training sets (yellow boxplots) compared to the success on the full data (red segments).

Random forests

I use two methods: random forests as implemented by randomForest() in package randomForest, and conditional random forests as implemented by cforest() in package partykit. As (conditional) random forests do internal bootstrapping, there is no need for the explicit training/testing set repeated refitting.

Including macroarea

When using the frequency of the two alleles and the macroarea as predictors, the models' fit to the full data is:

random forests: accuracy = 77.7% ±0.8%, sensitivity = 68.9% ±1.9%, specificity = 81.6% ±0.4%, precision = 62.0% ±1.0%, recall = 68.9% ±1.9%,
conditional random forests: accuracy = 84.3% ±0.7%, sensitivity = 78.8% ±0.5%, specificity = 86.8% ±0.9%, precision = 73.0% ±2.0%, recall = 78.8% ±0.5%.

Figure 78. The success of the two random forest methods on the full data.

Figure 79. Variable importance using three methods: mean decrease in accuracy, mean decrease of the Gini coefficient, and unconditional importance. ASPM_freq_wavg = ASPM-D, MCPH1_freq_wavg = MCPH1-D.

Excluding macroarea

When using the frequency of the two alleles only, the models' fit to the full data is:

random forests: accuracy = 70.7% ±1.0%, sensitivity = 56.3% ±1.4%, specificity = 78.6% ±0.8%, precision = 59.0% ±1.8%, recall = 56.3% ±1.4%,
conditional random forests: accuracy = 82.1% ±0.5%, sensitivity = 81.8% ±0.7%, specificity = 82.3% ±0.6%, precision = 60.5% ±1.6%, recall = 81.8% ±0.7%.

Figure 80. The success of the two random forest methods on the full data.

Figure 81. Variable importance using three methods: mean decrease in accuracy, mean decrease of the Gini coefficient, and unconditional importance. ASPM_freq_wavg = ASPM-D, MCPH1_freq_wavg = MCPH1-D.

Diachronic analyses

Here I try various analyses that explicitly take into account the diachronic nature of the processes.

The families with more than 2 tips are:

Figure 82. Phylogenies with tone1 (0=“No,” 1=“Yes”), ASPM-D and MCPH1-D for the families with at least 2 languages.

It can be seen that, unfortunately, there are very few families with more than 2 languages with data (17), and even for those with relatively many languages, there is very little variation in tone1 and in the frequencies of the two “derived” alleles. Combined with the issues concerning branch lengths in language family trees, this precludes the use of correlated evolution or phylogenetic regression methods.

tone2 (is there complex tone?)

I kept only the entries with non-missing data for tone2, ASPM-D and MCPH1-D, and if there is more than one possible language or allele frequency for a given sample, I kept only those entries that have different tone or allele data. The resulting dataset has 180 observations, distributed among 118 unique Glottolog codes in 35 families (ranging from a minimum of 1 language per family to a maximum of 47, with a mean of 5.1 and a median of 2 languages per family) and 4 macroareas.

There are 156:121:118 unique samples:(meta)populations:languages retained, dropping 19:8:203 = {FINRISK, GenDan, GenNed5, gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish, KRGDB, Qatari, SA001471N, SA001477T, SA001487U, SA001491P, SA001681Q} : {Bulgarian, Burunge, Dutch, Hazara, Mozabite, Oroqen, Qatari, Xibe} : {adze1240, ajie1238, amar1272, ambu1247, anei1239, apma1241, arak1252, arib1241, arop1243, aros1241, aulu1238, awtu1239, ayiw1239, baba1268, bahi1254, bann1247, bign1238, bili1260, boik1241, bulg1262, buru1320, caro1242, cham1313, chek1238, chuu1238, dehu1237, dumb1241, dutc1256, east2443, east2447, efee1239, fiji1243, futu1245, gapa1238, geez1241, gela1263, gilb1244, gulf1241, guma1254, gyel1242, hali1244, hang1263, hano1246, haza1239, hmon1264, hoav1238, iaai1238, iatm1242, idak1243, idun1242, iris1253, iwam1256, juho1239, kaia1245, kair1263, kamb1297, kapi1249, kara1486, kaul1240, kela1255, kele1258, kili1267, kire1240, koko1269, kosr1238, kuan1247, kuan1248, kuma1276, kung1261, kwai1243, kwam1251, kwam1252, kwom1262, labu1248, lala1268, lame1260, lauu1247, lena1238, lewo1242, long1395, loni1238, lonw1238, louu1245, lusi1240, maee1241, mais1250, male1289, malo1243, mana1295, mana1298, maor1246, mars1254, masa1299, matu1261, mbal1255, mbul1263, mehe1243, meke1243, mele1250, mina1269, ming1252, moch1256, moki1238, moks1248, mono1273, motl1237, motu1246, mudu1242, muri1260, muso1238, muss1246, muyu1244, naka1262, nali1244, nami1256, natu1246, naur1243, ndon1254, neha1247, neng1238, ngan1300, niua1240, niue1239, nort2646, nort2836, nort2845, nuku1260, onto1237, oroq1238, paam1238, pate1247, patp1243, pile1238, ping1243, pohn1238, port1285, pulu1242, qima1242, raoo1244, rapa1244, renn1242, rotu1241, rovi1238, russ1264, saaa1240, saam1283, saka1289, sali1295, samo1305, sapo1253, scot1243, siar1238, siee1239, sina1266, sioo1240, sobe1238, sons1242, sout2642, sout2679, 
sout2807, sout2856, sout2866, sout2869, stan1318, sude1239, surs1246, tahi1242, tain1252, taki1248, tawa1275, tean1237, teop1238, tiga1245, tigr1271, tiri1258, toab1237, toba1266, toke1240, tong1325, tswa1253, tuam1242, tuml1238, tumz1238, tung1290, tuva1244, ulit1238, urav1235, urip1239, vinm1237, waim1251, wall1257, wata1253, west2500, west2519, woga1249, wole1240, xamt1239, xara1244, xibe1242, yabe1254, yess1239, yima1243, zulu1248}.

	Africa	Eurasia	America	Papunesia	Sum
No	28	105	9	9	151
Yes	9	18	1	1	29
Sum	37	123	10	10	180

Figure 83. Distribution of tone2.

Figure 84. Map of tone2.

Figure 85. Relationship between tone2, ASPM-D and MCPH1-D.

Please note that the distribution of this variable is very skewed (29 ‘Yes’ vs 151 ‘No’), so the results should be interpreted with caution.

Regressions

`glmer`

All data

null model: R² = 0.0%, ICC = 95.6%
macroarea: R² = 2.0%, p_{macroarea/null} = 0.5
ASPM:
- by itself: R² = 1.3%, β = -0.87 ± 0.69, p_ASPM/null = 0.19
- quadratic: R² = 36.6%, β_ASPM2 = -3.46 ± 2.31, p_ASPM2/ASPM = 0.049
- with macroarea: R² = 2.6%, p_{macroarea/ASPM} = 0.8, p_{ASPM/macroarea} = 0.55
MCPH1:
- by itself: R² = 1.5%, β = -1.01 ± 0.73, p_MCPH1/null = 0.16
- quadratic: R² = 3.0%, β_MCPH12 = -1.17 ± 0.81, p_MCPH12/MCPH1 = 0.22
- with macroarea: R² = 2.4%, p_{macroarea/MCPH1} = 0.89, p_{MCPH1/macroarea} = 0.62
both alleles (no macroarea):
- ASPM + MCPH1: R² = 2.0%, β_ASPM = -0.56 ± 0.78, p_ASPM/MCPH1 = 0.47, β_MCPH1 = -0.68 ± 0.81, p_MCPH1/ASPM = 0.39, p_{ASPM+MCPH1/null} = 0.29
- interaction: R² = 1.5%, p_{ASPM:MCPH1/ASPM+MCPH1} = 0.76

Randomization

Regressions with randomizations for *tone2*.
Permute within	Macroarea	Permute	AIC	Signif.	p_ASPM-D	β_ASPM-D	p_MCPH1-D	β_MCPH1-D
unrestricted	none	tone	0%	4%	4%	1%	6%	0%
unrestricted	none	alleles-together	31%	7%	6%	10%	6%	6%
unrestricted	none	alleles-independent	32%	7%	7%	9%	6%	4%
unrestricted	fixef	tone	0%	6%	6%	6%	7%	23%
unrestricted	fixef	alleles-together	84%	7%	7%	17%	7%	25%
unrestricted	fixef	alleles-independent	83%	9%	8%	16%	7%	21%
macroareas	none	tone	0%	4%	3%	1%	10%	1%
macroareas	none	alleles-together	40%	7%	5%	21%	6%	42%
macroareas	none	alleles-independent	44%	8%	6%	28%	10%	45%
macroareas	fixef	tone	0%	4%	4%	7%	4%	22%
macroareas	fixef	alleles-together	80%	8%	8%	28%	7%	38%
macroareas	fixef	alleles-independent	80%	9%	8%	25%	8%	37%
families	none	tone	31%	4%	4%	28%	1%	16%
families	none	alleles-together	20%	4%	5%	29%	1%	15%
families	none	alleles-independent	24%	4%	6%	25%	3%	22%
families	fixef	tone	45%	8%	9%	54%	4%	43%
families	fixef	alleles-together	80%	4%	6%	43%	3%	18%
families	fixef	alleles-independent	80%	5%	7%	34%	4%	24%
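
The ‘Permute within’ and ‘Permute’ columns describe how the data were shuffled before the regressions were refitted. The Python sketch below shows one plausible reading of these labels. The function names, the toy data, and the interpretation of ‘alleles-together’ (one shared permutation for both alleles) versus ‘alleles-independent’ (a separate permutation for each allele) are assumptions made for illustration, not the code used by the script.

```python
import random
from collections import defaultdict

def group_indices(groups):
    """Map each group label to the list of row indices that belong to it."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    return by_group

def permute_within(by_group, rng):
    """Return a mapping row -> source row that shuffles rows only inside each group."""
    mapping = {}
    for members in by_group.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        mapping.update(zip(members, shuffled))
    return mapping

def reorder(mapping, values):
    """Apply a row mapping to a column of values."""
    return [values[mapping[i]] for i in range(len(values))]

# Hypothetical toy data: six languages in three families.
family = ["A", "A", "A", "B", "B", "C"]
aspm = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
mcph1 = [0.15, 0.25, 0.35, 0.45, 0.55, 0.65]

rng = random.Random(1)
by_family = group_indices(family)

# 'alleles-together' (assumed meaning): one permutation moves both alleles jointly.
shared = permute_within(by_family, rng)
aspm_together = reorder(shared, aspm)
mcph1_together = reorder(shared, mcph1)

# 'alleles-independent' (assumed meaning): each allele gets its own permutation.
aspm_indep = reorder(permute_within(by_family, rng), aspm)
mcph1_indep = reorder(permute_within(by_family, rng), mcph1)
```

Under this reading, the ‘unrestricted’ rows correspond to placing all observations in a single group, and the ‘macroareas’ rows to grouping by macroarea instead of family.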

Restricted sampling

Figure 86. Results for 1000 restricted samplings. For ASPM-D (left): 99.6% of βs are negative when regressing tone on ASPM alone (one-sided t-test < 0: t(999) = -77.3, mean = -0.61, p = 0), 99.8% when controlling for the macroarea (t(999) = -75.6, mean = -0.96, p = 0), and 99.9% when controlling for both macroarea and MCPH1 (t(999) = -70.9, mean = -1.08, p = 0). For MCPH1-D (right): 72.9% of βs are negative when regressing tone on MCPH1 alone (one-sided t-test < 0: t(999) = -21.7, mean = -0.16, p = 7.8e-86), 42.9% when controlling for the macroarea (t(999) = 9.6, mean = 0.30, p = 1), and 35.1% when controlling for both macroarea and ASPM (t(999) = 17.1, mean = 0.67, p = 1).

`brms`

ASPM only:
- β = -1.27, 89%HDI = [-2.73, 0.17]
- posterior probability p(β<0) = 0.93 (evidence ratio = 14), p(β=0) = 0.57 (evidence ratio = 1.4)
- ROPE = [-0.18, 0.18], % HDI inside ROPE = 6.1%; p_ROPE = 0.055
- comparison ‘null’ vs ‘ASPM’: [B= L> W=(53%:47%) K=]: anecdotal evidence for null against ASPM (BF=1.46), LOO=1.84 [SE=1.14], WAIC=0.10 [SE=1.16], KFOLD=-0.80 [SE=2.10]
MCPH1 only:
- β = -0.91, 89%HDI = [-2.38, 0.58]
- posterior probability p(β<0) = 0.85 (evidence ratio = 5.5), p(β=0) = 0.7 (evidence ratio = 2.4)
- ROPE = [-0.18, 0.18], % HDI inside ROPE = 10.9%; p_ROPE = 0.097
- comparison ‘null’ vs ‘MCPH1’: [B= L= W=(41%:59%) K<]: anecdotal evidence for null against MCPH1 (BF=1.63), LOO=0.25 [SE=0.64], WAIC=-0.36 [SE=0.73], KFOLD=-2.43 [SE=1.67]
both alleles:
- comparison ‘null’ vs ‘both’: [B> L= W=(30%:70%) K=]: moderate evidence for null against both (BF=3.66), LOO=0.13 [SE=1.39], WAIC=-0.87 [SE=1.44], KFOLD=-1.33 [SE=2.11]
- interaction:
  - posterior probability p(β=0) = 0.75 (evidence ratio = 3)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 15.1%; p_ROPE = 0.134
  - comparison ‘no interaction’ vs ‘with interaction’: [B> L> W=(57%:43%) K>]: moderate evidence for no interaction against with interaction (BF=3.65), LOO=1.16 [SE=1.00], WAIC=0.29 [SE=0.39], KFOLD=2.31 [SE=1.26]
- ASPM (partial):
  - β = -1.13, 89%HDI = [-2.71, 0.60]
  - posterior probability p(β<0) = 0.87 (evidence ratio = 6.8), p(β=0) = 0.66 (evidence ratio = 1.9)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 9%; p_ROPE = 0.08
- MCPH1 (partial):
  - β = -0.58, 89%HDI = [-2.21, 0.99]
  - posterior probability p(β<0) = 0.73 (evidence ratio = 2.7), p(β=0) = 0.75 (evidence ratio = 3)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 13.7%; p_ROPE = 0.122
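
The ROPE of [-0.18, 0.18] used above matches a common convention for logistic models, in which a negligible standardized effect of ±0.1 is converted to the log-odds scale by multiplying it by π/√3 (the standard deviation of the standard logistic distribution). The sketch below assumes this convention and a logit link for the binary tone2 outcome; how the ROPE was actually chosen is not stated in this section.

```python
import math

def logistic_rope(negligible_std_effect=0.1):
    """ROPE limits on the log-odds scale for a given negligible standardized effect.

    Assumption: the standardized effect is rescaled by pi / sqrt(3),
    the standard deviation of the standard logistic distribution.
    """
    half_width = negligible_std_effect * math.pi / math.sqrt(3)
    return (-half_width, half_width)

low, high = logistic_rope()
print(f"ROPE = [{low:.2f}, {high:.2f}]")
```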

Figure 87. Posterior distributions (with 50% probability mass highlighted) versus 0.0 (the vertical line) for ASPM-D (left) and MCPH1-D (right).

Figure 88. Conditional effects of ASPM-D (left) and MCPH1-D (right).

Figure 89. Posterior predictive checks for ASPM-D (left) and MCPH1-D (right).

Figure 90. Confusion matrices for ASPM-D (left) and MCPH1-D (right).

Mediation and path analysis

Mediation analysis

`(g)lm`

All data

For ASPM-D:

total effect (TE) of being in Africa on tone: 0.14 (-0.01, 0.30), p=0.078, decomposed into:
average direct effect (ADE): -0.05 (-0.20, 0.10), p=0.43, and
average indirect effect (ACME) mediated by ASPM-D: 0.19 (0.08, 0.31), p=0.004, mediating 133.8% (-419.6%, 802.0%), p=0.082 of the effect, resulting from:
- effect of being in Africa on ASPM-D: -1.34 ±0.16, p=3.7e-15, and
- effect of ASPM-D on tone: -1.03 ±0.31, p=0.0011.

For MCPH1-D:

TE: 0.11 (-0.02, 0.27), p=0.12, decomposed into:
ADE: 0.12 (-0.21, 0.45), p=0.47, and
ACME: -0.01 (-0.29, 0.29), p=0.9, mediating -11.4% (-804.5%, 1112.4%), p=0.93 of the effect, resulting from:
- effect of being in Africa on MCPH1-D: -2.19 ±0.09, p=1.2e-61, and
- effect of MCPH1-D on tone: 0.03 ±0.45, p=0.94.
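
The decomposition reported above follows the usual identities TE = ADE + ACME and proportion mediated = ACME / TE. The sketch below recomputes both from the rounded point estimates, so its results differ slightly from the reported 133.8% and -11.4%, which were presumably computed from unrounded values; the function name is made up for illustration.

```python
def decompose(ade, acme):
    """Return the total effect and the proportion of it that is mediated."""
    total = ade + acme
    proportion_mediated = acme / total
    return total, proportion_mediated

# Rounded point estimates copied from the text above.
te_aspm, prop_aspm = decompose(ade=-0.05, acme=0.19)
te_mcph1, prop_mcph1 = decompose(ade=0.12, acme=-0.01)

print(f"ASPM-D:  TE = {te_aspm:.2f}, proportion mediated = {prop_aspm:.0%}")
print(f"MCPH1-D: TE = {te_mcph1:.2f}, proportion mediated = {prop_mcph1:.0%}")
```

Because the proportion divides by a total effect that is close to zero, small changes in TE move it a lot, which is in line with the very wide intervals reported for the mediated percentage.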

Restricted sampling

Figure 91. Mediation analysis for 1000 restricted samples (i.e., picking one random language per family). The leftmost panels show the distribution of point estimates of the Total Effect (TE), the Direct Effect (ADE) and the Indirect Effect (ACME) for ASPM and MCPH1; the middle panels show the distribution of the p-values for the same effects, while the rightmost panels show the distribution of the regression slopes (β) for the two alleles, top: for the regression of the allele frequency on within vs outside Africa, and bottom: for the regression of tone on the allele while controlling for within vs outside Africa. The black vertical lines show: 0.0 (solid), 0.05 (dashed) and 0.10 (dotted).
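
A restricted sample, as described in the caption above, keeps one randomly chosen language per family. The following Python sketch illustrates the idea on made-up data; the row layout, the names, and the seed are assumptions, and the script's own implementation may differ.

```python
import random
from collections import defaultdict

def restricted_sample(rows, rng):
    """Pick one random row (language) from each family."""
    by_family = defaultdict(list)
    for row in rows:
        by_family[row["family"]].append(row)
    return [rng.choice(members) for members in by_family.values()]

# Hypothetical toy data: five languages in three families.
rows = [
    {"language": "lang1", "family": "FamA"},
    {"language": "lang2", "family": "FamA"},
    {"language": "lang3", "family": "FamB"},
    {"language": "lang4", "family": "FamB"},
    {"language": "lang5", "family": "FamC"},
]

rng = random.Random(42)
samples = [restricted_sample(rows, rng) for _ in range(1000)]
```

Each of the 1000 samples then contains exactly as many observations as there are families, and the analysis is rerun on every sample.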

For ASPM-D:

TE: mean = 0.11, median = 0.12; 0.4% significant at α-level 0.05 and 2.3% significant at α-level 0.10; 89.6% > 0.0; one-sample one-sided t-test vs 0: t(999) = 39.7, p = 5.9e-208;
ADE: mean = 0.04, median = 0.039; 0.0% significant at α-level 0.05 and 0.0% significant at α-level 0.10; 67.3% > 0.0; one-sample one-sided t-test vs 0: t(999) = 16.0, p = 7.9e-52;
ACME: mean = 0.072, median = 0.07; 0.0% significant at α-level 0.05 and 0.1% significant at α-level 0.10; 100.0% > 0.0; one-sample one-sided t-test vs 0: t(999) = 78.5, p = 0;
β(Africa → allele): mean = -0.88, median = -0.89; 88.6% significant at α-level 0.05 and 98.7% significant at α-level 0.10; 100.0% < 0.0; one-sample one-sided t-test vs 0: t(999) = -249.8, p = 0;
β(allele → tone | Africa): mean = -0.58, median = -0.57; 0.0% significant at α-level 0.05 and 0.4% significant at α-level 0.10; 100.0% < 0.0; one-sample one-sided t-test vs 0: t(999) = -79.8, p = 0.

For MCPH1-D:

TE: mean = 0.11, median = 0.11; 0.2% significant at α-level 0.05 and 2.3% significant at α-level 0.10; 82.1% > 0.0; one-sample one-sided t-test vs 0: t(999) = 37.1, p = 1.3e-190;
ADE: mean = 0.13, median = 0.14; 0.0% significant at α-level 0.05 and 0.7% significant at α-level 0.10; 75.0% > 0.0; one-sample one-sided t-test vs 0: t(999) = 24.8, p = 2.3e-106;
ACME: mean = -0.028, median = -0.034; 0.0% significant at α-level 0.05 and 0.3% significant at α-level 0.10; 43.0% > 0.0; one-sample one-sided t-test vs 0: t(999) = -5.4, p = 1;
β(Africa → allele): mean = -2.4, median = -2.4; 100.0% significant at α-level 0.05 and 100.0% significant at α-level 0.10; 100.0% < 0.0; one-sample one-sided t-test vs 0: t(999) = -927.9, p = 0;
β(allele → tone | Africa): mean = 0.26, median = 0.21; 0.0% significant at α-level 0.05 and 0.1% significant at α-level 0.10; 38.0% < 0.0; one-sample one-sided t-test vs 0: t(999) = 11.3, p = 1.

`brms`

Figure 92. Graphical representation of the Bayesian mediation analysis for ASPM-D showing the means of the effects and the actual partial regression coefficients, with their 89% HDIs and p-ROPEs. The colors reflect the sign of the mean estimate (blue=negative, red=positive, gray=(p-ROPE >= 0.05)); solid=(0 not in the HDI), dashed=(0 is in the HDI).

Figure 93. Graphical representation of the Bayesian mediation analysis for MCPH1-D showing the means of the effects and the actual partial regression coefficients, with their 89% HDIs and p-ROPEs. The colors reflect the sign of the mean estimate (blue=negative, red=positive, gray=(p-ROPE >= 0.05)); solid=(0 not in the HDI), dashed=(0 is in the HDI).

Figure 94. Graphical representation of the Bayesian mediation analysis for both ASPM-D and MCPH1-D showing the means of the effects and the actual partial regression coefficients, with their 89% HDIs and p-ROPEs. The colors reflect the sign of the mean estimate (blue=negative, red=positive, gray=(p-ROPE >= 0.05)); solid=(0 not in the HDI), dashed=(0 is in the HDI).

Path analysis

All data

Coding Africa and tone2 numerically, the model fit is: χ²(1)=0.36, p=0.55; CFI=1.00, TLI=1.01, NNFI=1.01 and RFI=0.99.

Figure 95. Path analysis model with standardised coefficients and significance stars. Here, we coded tone and macroarea (Africa vs non-Africa) as numeric binary (tone_complex_num with Yes=1 and Africa_num with in Africa=1); ASPM_z is ASPM-D and MCPH1_z is MCPH1-D.

## lavaan 0.6-8 ended normally after 25 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         8
##                                                       
##   Number of observations                           180
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.361
##   Degrees of freedom                                 1
##   P-value (Chi-square)                           0.548
## 
## Model Test Baseline Model:
## 
##   Test statistic                               354.897
##   Degrees of freedom                                 6
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.011
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -407.836
##   Loglikelihood unrestricted model (H1)       -407.655
##                                                       
##   Akaike (AIC)                                 831.671
##   Bayesian (BIC)                               857.215
##   Sample-size adjusted Bayesian (BIC)          831.879
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.166
##   P-value RMSEA <= 0.05                          0.630
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.006
## 
## Parameter Estimates:
## 
##   Standard errors                           Robust.sem
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                      Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##   tone_complex_num ~                                                      
##     Africa_num         -0.051    0.139   -0.366    0.714   -0.322    0.221
##     ASPM_z             -0.108    0.026   -4.124    0.000   -0.159   -0.056
##     MCPH1_z            -0.005    0.050   -0.093    0.926   -0.102    0.093
##   ASPM_z ~                                                                
##     Africa_num         -1.338    0.101  -13.315    0.000   -1.535   -1.141
##   MCPH1_z ~                                                               
##     Africa_num         -2.189    0.072  -30.443    0.000   -2.330   -2.048
##    Std.lv  Std.all
##                   
##    -0.051   -0.056
##    -0.108   -0.292
##    -0.005   -0.013
##                   
##    -1.338   -0.542
##                   
##    -2.189   -0.887
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##    .tone_complx_nm    0.125    0.016    7.848    0.000    0.094    0.157
##    .ASPM_z            0.702    0.072    9.767    0.000    0.561    0.843
##    .MCPH1_z           0.212    0.037    5.748    0.000    0.140    0.284
##    Std.lv  Std.all
##     0.125    0.927
##     0.702    0.706
##     0.212    0.213
## 
## R-Square:
##                    Estimate
##     tone_complx_nm    0.073
##     ASPM_z            0.294
##     MCPH1_z           0.787
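
In a linear path model of this kind, the indirect effect of Africa on tone through an allele is the product of the two path coefficients along that route, and the total effect is the direct effect plus all indirect effects. The sketch below applies this rule to the unstandardized estimates printed in the ‘Regressions’ block above. The derived quantities are not part of the lavaan output and are shown only as an illustration.

```python
# Unstandardized estimates copied from the 'Regressions' block above.
africa_to_tone = -0.051   # direct path Africa_num -> tone_complex_num
aspm_to_tone = -0.108     # ASPM_z -> tone_complex_num
mcph1_to_tone = -0.005    # MCPH1_z -> tone_complex_num
africa_to_aspm = -1.338   # Africa_num -> ASPM_z
africa_to_mcph1 = -2.189  # Africa_num -> MCPH1_z

# Indirect effect = product of the coefficients along each mediated route.
indirect_aspm = africa_to_aspm * aspm_to_tone
indirect_mcph1 = africa_to_mcph1 * mcph1_to_tone

# Total effect = direct effect + both indirect effects.
total = africa_to_tone + indirect_aspm + indirect_mcph1

print(f"indirect via ASPM-D:  {indirect_aspm:.3f}")
print(f"indirect via MCPH1-D: {indirect_mcph1:.3f}")
print(f"total effect:         {total:.3f}")
```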

Coding Africa and tone2 as ordered binary factors, the model fit is: χ²(1)=0.98, p=0.32; CFI=1.00, TLI=1.01, NNFI=1.01 and RFI=0.79.

Figure 96. Path analysis model with standardised coefficients and significance stars. Here, we coded tone and macroarea (Africa vs non-Africa) as ordered binary factors (tone_complex_ord with No < Yes, and Africa_ord with outside Africa < in Africa); ASPM_z is ASPM-D and MCPH1_z is MCPH1-D.

Restricted sampling

Here I use only the numerically-coded model.

Figure 97. Path analysis for 1000 restricted samples (i.e., picking one random language per family). The leftmost row of two plots shows the coefficient estimates and the p-values, respectively, for the five paths in the model (see the path plots above). The rightmost plot shows the various fit indices. The black horizontal lines show: 0.0 (solid), 0.05 (dashed) and 1.0 (dotted).

It can be seen that:

the model fits are:
- 97.6% of the p-values are not significant
- mean(CFI) = 0.99, median(CFI) = 1, sd(CFI) = 0.01, IQR(CFI) = 0.01
- mean(TLI) = 0.98, median(TLI) = 0.99, sd(TLI) = 0.1, IQR(TLI) = 0.15
- mean(NNFI) = 0.98, median(NNFI) = 0.99, sd(NNFI) = 0.1, IQR(NNFI) = 0.15
- mean(RFI) = 0.9, median(RFI) = 0.92, sd(RFI) = 0.09, IQR(RFI) = 0.13
Africa → ASPM-D: mean = -0.89, median = -0.9, sd = 0.11, IQR = 0.15, 100.0% < 0; 99.8% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -2.5e+02, p = 0;
Africa → MCPH1-D: mean = -2.4, median = -2.4, sd = 0.079, IQR = 0.11, 100.0% < 0; 100.0% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -9.6e+02, p = 0;
Africa → tone2: mean = 0.039, median = 0.021, sd = 0.26, IQR = 0.37, 53.6% > 0; 0.0% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = 4.7, p = 1.2e-06;
ASPM-D → tone2: mean = -0.071, median = -0.071, sd = 0.027, IQR = 0.037, 99.8% < 0; 5.9% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -83, p = 0;
MCPH1-D → tone2: mean = -0.00016, median = -0.0016, sd = 0.1, IQR = 0.14, 50.8% < 0; 0.0% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -0.051, p = 0.48.
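
The t statistics in this list can be recovered, up to rounding, from the reported mean and standard deviation over the 1000 restricted samples, using t = mean / (sd / √n). A minimal sketch (the function name is made up for illustration):

```python
import math

def one_sample_t(mean, sd, n, mu0=0.0):
    """t statistic of a one-sample t-test of `mean` against `mu0`."""
    return (mean - mu0) / (sd / math.sqrt(n))

# Rounded summary values copied from the list above (n = 1000 samples).
t_africa_tone = one_sample_t(mean=0.039, sd=0.26, n=1000)
t_aspm_tone = one_sample_t(mean=-0.071, sd=0.027, n=1000)

print(f"Africa -> tone2: t(999) = {t_africa_tone:.1f}")
print(f"ASPM-D -> tone2: t(999) = {t_aspm_tone:.0f}")
```

Since n here is the number of resamplings and not the number of languages, even a small mean coefficient yields a very large t statistic, so these tests mainly describe how consistent the sign of the coefficient is across samples.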

Machine Learning techniques

Decision trees

When using the frequency of the two alleles and the macroarea as predictors, the decision tree is trivial: it uniformly predicts just the majority value “No.”

Figure 98. Decision tree on the full data using the two alleles and macroarea.

accuracy = 83.9%, sensitivity = NA%, specificity = 83.9%, precision = 0.0%, and recall = NA%.
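
A tree that always predicts “No” is correct exactly for the observations coded “No,” so its accuracy equals the share of the majority class. The sketch below reproduces the 83.9% from the counts in the tone2 table (151 “No” and 29 “Yes”), assuming the tree was evaluated on those same 180 observations.

```python
# Class counts taken from the tone2 table (Sum column).
counts = {"No": 151, "Yes": 29}

# A trivial classifier predicts the most frequent class for every observation.
majority = max(counts, key=counts.get)
baseline_accuracy = counts[majority] / sum(counts.values())

print(f"always predicting {majority!r}: accuracy = {baseline_accuracy:.1%}")
```

Any classifier on these data needs to exceed this baseline before its accuracy says anything about the predictors.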

On the 100 training/testing sets: accuracy = 82.9% ±5.8%, sensitivity = 15.8% ±9.0%, specificity = 83.5% ±5.3%, precision = 0.8% ±4.4%, recall = 15.8% ±9.0%.

Figure 99. The success of generalising to the testing sets from the training sets (yellow boxplots) compared to the success on the full data (red segments).

Random forests

Including macroarea

random forests: accuracy = 84.2% ±0.5%, sensitivity = 58.4% ±11.2%, specificity = 84.9% ±0.4%, precision = 8.8% ±2.6%, recall = 58.4% ±11.2%
conditional random forests: accuracy = 87.2% ±0.7%, sensitivity = 97.5% ±4.8%, specificity = 86.9% ±0.7%, precision = 21.2% ±5.2%, recall = 97.5% ±4.8%

Figure 100. The success of the two random forest methods on the full data.

Figure 101. Variable importance using three methods: mean decrease in accuracy, mean decrease of the Gini coefficient, and unconditional importance.

Excluding macroarea

random forests: accuracy = 82.2% ±1.0%, sensitivity = 42.8% ±4.4%, specificity = 87.3% ±0.6%, precision = 30.3% ±3.5%, recall = 42.8% ±4.4%
conditional random forests: accuracy = 87.3% ±0.2%, sensitivity = 81.2% ±3.1%, specificity = 87.7% ±0.0%, precision = 27.6% ±0.0%, recall = 81.2% ±3.1%

Figure 102. The success of the two random forest methods on the full data.

Figure 103. Variable importance using three methods: mean decrease in accuracy, mean decrease of the Gini coefficient, and unconditional importance.

Tone counts

Here the imputed counts are rounded to the nearest integer; please see below for the analysis using the actual predicted values.

I kept only the entries with non-missing data for the tone counts, ASPM-D and MCPH1-D, and if there is more than one possible language or allele frequency for a given sample, I kept only those entries that have different tone or allele data. The resulting dataset has 184 observations, distributed among 121 unique Glottolog codes in 35 families (ranging from a minimum of 1 language per family to a maximum of 47, with a mean of 5.3 and a median of 2 languages per family) and 4 macroareas.

There are 156:121:121 unique samples:(meta)populations:languages retained, dropping 19:8:200 = {FINRISK, GenDan, GenNed5, gnomAD_asj, gnomAD_bgr, gnomAD_est, gnomAD_fin, gnomAD_jpn, gnomAD_kor, gnomAD_swe, gnomADexomes_AshkenaziJewish, gnomADgenomes_AshkenaziJewish, KRGDB, Qatari, SA001471N, SA001477T, SA001487U, SA001491P, SA001681Q} : {Bulgarian, Burunge, Dutch, Hazara, Mozabite, Oroqen, Qatari, Xibe} : {adze1240, ajie1238, amar1272, ambu1247, anei1239, apma1241, arak1252, arib1241, arop1243, aros1241, aulu1238, awtu1239, ayiw1239, baba1268, bahi1254, bann1247, bign1238, bili1260, boik1241, bulg1262, buru1320, caro1242, cham1313, chek1238, chuu1238, dehu1237, dumb1241, dutc1256, east2443, east2447, efee1239, fiji1243, futu1245, gapa1238, geez1241, gela1263, gilb1244, gulf1241, guma1254, gyel1242, hali1244, hang1263, hano1246, haza1239, hoav1238, iaai1238, iatm1242, idak1243, idun1242, iris1253, iwam1256, juho1239, kaia1245, kair1263, kamb1297, kapi1249, kara1486, kaul1240, kela1255, kele1258, kili1267, kire1240, koko1269, kosr1238, kuan1248, kuma1276, kung1261, kwai1243, kwam1251, kwam1252, kwom1262, labu1248, lala1268, lame1260, lauu1247, lena1238, lewo1242, long1395, loni1238, lonw1238, louu1245, lusi1240, maee1241, mais1250, male1289, malo1243, mana1295, mana1298, maor1246, mars1254, masa1299, matu1261, mbal1255, mbul1263, mehe1243, meke1243, mele1250, mina1269, ming1252, moch1256, moki1238, moks1248, mono1273, motl1237, motu1246, mudu1242, muri1260, muso1238, muss1246, muyu1244, naka1262, nali1244, nami1256, natu1246, naur1243, ndon1254, neha1247, neng1238, ngan1300, niua1240, niue1239, nort2646, nort2836, nort2845, nuku1260, onto1237, oroq1238, paam1238, pate1247, patp1243, pile1238, ping1243, pohn1238, port1285, pulu1242, qima1242, raoo1244, rapa1244, renn1242, rotu1241, rovi1238, russ1264, saaa1240, saam1283, saka1289, sali1295, samo1305, sapo1253, scot1243, siar1238, siee1239, sina1266, sioo1240, sobe1238, sons1242, sout2642, sout2679, sout2807, sout2856, 
sout2866, sout2869, stan1318, sude1239, surs1246, tahi1242, taki1248, tawa1275, tean1237, teop1238, tiga1245, tigr1271, tiri1258, toab1237, toba1266, toke1240, tong1325, tswa1253, tuam1242, tuml1238, tumz1238, tung1290, tuva1244, ulit1238, urav1235, urip1239, vinm1237, waim1251, wall1257, wata1253, west2500, west2519, woga1249, wole1240, xamt1239, xara1244, xibe1242, yabe1254, yess1239, yima1243, zulu1248}.

	Africa	Eurasia	America	Papunesia	Sum
0	9	98	4	7	118
1	10	6	5	1	22
2	16	3	0	2	21
3	2	5	0	0	7
4	0	8	1	0	9
5	1	4	0	0	5
6	0	2	0	0	2
Sum	38	126	10	10	184

Figure 104. Distribution of tone counts.

Figure 105. Distribution of tone counts across the world.

Figure 106. Relationship between tone counts (colors) and the two alleles (frequency) by macroarea.

Regressions

I used a mixed-effects Poisson model.

`glmer`

All data

null model: R² = 0.0%, ICC = 100.0%
the Poisson model is not overdispersed: χ²(182) = 112.6, p = 1
macroarea: R² = 23.8%, p_{macroarea/null} = 0.013
ASPM:
- by itself: R² = 7.6%, β = -0.37 ± 0.19, p_ASPM/null = 0.061
- quadratic: R² = 17.9%, β_ASPM2 = -0.42 ± 0.21, p_ASPM2/ASPM = 0.12
- with macroarea: R² = 24.8%, p_{macroarea/ASPM} = 0.058, p_{ASPM/macroarea} = 0.66
MCPH1:
- by itself: R² = 9.9%, β = -0.46 ± 0.19, p_MCPH1/null = 0.016
- quadratic: R² = 10.9%, β_MCPH12 = -0.45 ± 0.19, p_MCPH12/MCPH1 = 0.19
- with macroarea: R² = 24.1%, p_{macroarea/MCPH1} = 0.15, p_{MCPH1/macroarea} = 0.64
both alleles (no macroarea):
- ASPM + MCPH1: R² = 15.7%, β_ASPM = -0.27 ± 0.20, p_ASPM/MCPH1 = 0.18, β_MCPH1 = -0.37 ± 0.19, p_MCPH1/ASPM = 0.043, p_{ASPM+MCPH1/null} = 0.022
- interaction: R² = 15.4%, p_{ASPM:MCPH1/ASPM+MCPH1} = 0.86
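
The overdispersion check reported above can be read through the dispersion ratio, i.e., the χ² statistic divided by its degrees of freedom: values near or below 1 give no sign of overdispersion. The sketch below computes this ratio from the reported numbers. That the statistic is a sum of squared Pearson residuals is an assumption, as the exact test used is not shown in this section.

```python
def dispersion_ratio(chi_sq, df):
    """Ratio of the chi-squared statistic to its degrees of freedom."""
    return chi_sq / df

# Values reported above for the Poisson model: chi-squared(182) = 112.6.
ratio = dispersion_ratio(chi_sq=112.6, df=182)
print(f"dispersion ratio = {ratio:.2f}")
```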

Randomization

We performed 1000 independent replications:

Regressions with randomizations for tone *counts*.
Permute within	Macroarea	Permute	AIC	Signif.	p_ASPM-D	β_ASPM-D	p_MCPH1-D	β_MCPH1-D
unrestricted	none	tone	0%	18%	14%	4%	13%	2%
unrestricted	none	alleles-together	1%	2%	3%	0%	3%	0%
unrestricted	none	alleles-independent	1%	3%	4%	0%	3%	0%
unrestricted	fixef	tone	0%	23%	16%	33%	19%	30%
unrestricted	fixef	alleles-together	81%	2%	3%	16%	3%	12%
unrestricted	fixef	alleles-independent	81%	4%	3%	13%	4%	6%
macroareas	none	tone	0%	44%	19%	12%	31%	7%
macroareas	none	alleles-together	18%	36%	8%	6%	34%	20%
macroareas	none	alleles-independent	20%	34%	12%	7%	37%	19%
macroareas	fixef	tone	0%	31%	23%	33%	21%	35%
macroareas	fixef	alleles-together	79%	3%	4%	23%	4%	26%
macroareas	fixef	alleles-independent	81%	4%	4%	24%	4%	26%
families	none	tone	24%	19%	14%	28%	8%	8%
families	none	alleles-together	9%	16%	12%	32%	4%	4%
families	none	alleles-independent	10%	20%	20%	40%	8%	8%
families	fixef	tone	18%	7%	9%	63%	2%	54%
families	fixef	alleles-together	83%	4%	8%	61%	2%	5%
families	fixef	alleles-independent	82%	5%	7%	59%	2%	11%

Restricted sampling

Figure 107. Results for 1000 restricted samplings. For ASPM-D (left): 100% of βs are negative when regressing tone on ASPM alone (one-sided t-test < 0: t(999) = -81.5, mean = -0.48, p = 0), 96.7% when controlling for the macroarea (t(999) = -52.9, mean = -0.48, p = 4.6e-292), and 96.7% when controlling for both macroarea and MCPH1 (t(999) = -53.5, mean = -0.52, p = 5e-296). For MCPH1-D (right): 99.2% of βs are negative when regressing tone on MCPH1 alone (one-sided t-test < 0: t(999) = -70.7, mean = -0.25, p = 0), 50.5% when controlling for the macroarea (t(999) = 2.8, mean = 0.05, p = 1), and 38.2% when controlling for both macroarea and ASPM (t(999) = 12.6, mean = 0.24, p = 1).

`brms`

ASPM only:
- β = -0.25, 89%HDI = [-0.64, 0.17]
- posterior probability p(β<0) = 0.84 (evidence ratio = 5.2), p(β=0) = 0.88 (evidence ratio = 7.5)
- ROPE = [-0.10, 0.10], % HDI inside ROPE = 21%; p_ROPE = 0.187
- comparison ‘null’ vs ‘ASPM’: [B> L= W>(71%:29%) K>]: moderate evidence for null against ASPM (BF=4.35), LOO=1.00 [SE=1.31], WAIC=0.89 [SE=0.89], KFOLD=4.18 [SE=3.39]
MCPH1 only:
- β = -0.24, 89%HDI = [-0.65, 0.25]
- posterior probability p(β<0) = 0.8 (evidence ratio = 4), p(β=0) = 0.89 (evidence ratio = 8.1)
- ROPE = [-0.10, 0.10], % HDI inside ROPE = 20.4%; p_ROPE = 0.182
- comparison ‘null’ vs ‘MCPH1’: [B> L> W>(65%:35%) K>]: moderate evidence for null against MCPH1 (BF=6.59), LOO=1.00 [SE=0.83], WAIC=0.63 [SE=0.61], KFOLD=1.49 [SE=1.40]
both alleles:
- comparison ‘null’ vs ‘both’: [B>> L> W>(83%:17%) K>>]: very strong evidence for null against both (BF=46.8), LOO=1.99 [SE=1.23], WAIC=1.58 [SE=1.08], KFOLD=9.05 [SE=3.68]
- interaction:
  - posterior probability p(β=0) = 0.93 (evidence ratio = 13)
  - ROPE = [-0.10, 0.10], % HDI inside ROPE = 33.8%; p_ROPE = 0.301
  - comparison ‘no interaction’ vs ‘with interaction’: [B>> L> W>>(68%:32%) K>]: strong evidence for no interaction against with interaction (BF=17.1), LOO=0.63 [SE=0.59], WAIC=0.77 [SE=0.30], KFOLD=2.09 [SE=1.95]
- ASPM (partial):
  - β = -0.22, 89%HDI = [-0.66, 0.16]
  - posterior probability p(β<0) = 0.81 (evidence ratio = 4.2), p(β=0) = 0.9 (evidence ratio = 8.6)
  - ROPE = [-0.10, 0.10], % HDI inside ROPE = 24.2%; p_ROPE = 0.216
- MCPH1 (partial):
  - β = -0.21, 89%HDI = [-0.64, 0.23]
  - posterior probability p(β<0) = 0.78 (evidence ratio = 3.5), p(β=0) = 0.89 (evidence ratio = 8.1)
  - ROPE = [-0.10, 0.10], % HDI inside ROPE = 22.4%; p_ROPE = 0.199
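
The posterior probabilities and evidence ratios listed above are two views of the same quantity if the evidence ratio is taken to be the odds of the hypothesis against its alternative, so that evidence ratio = p / (1 - p) and p = ratio / (1 + ratio). The sketch below assumes this relation; because the reported probabilities are rounded, recomputed ratios can differ slightly from the printed ones (e.g., 0.88 gives about 7.3 where 7.5 is reported).

```python
def evidence_ratio(posterior_probability):
    """Odds of a hypothesis against its alternative."""
    return posterior_probability / (1.0 - posterior_probability)

def posterior_probability(ratio):
    """Inverse transformation: probability implied by an evidence ratio."""
    return ratio / (1.0 + ratio)

# Rounded values copied from the lists above.
ratio_mcph1_negative = evidence_ratio(0.80)        # reported evidence ratio: 4
probability_no_interaction = posterior_probability(13)  # reported p: 0.93

print(f"p = 0.80 -> evidence ratio = {ratio_mcph1_negative:.1f}")
print(f"evidence ratio = 13 -> p = {probability_no_interaction:.2f}")
```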

Figure 108. Posterior distributions (with 50% probability mass highlighted) versus 0.0 (the vertical line) for ASPM-D (left) and MCPH1-D (right).

Figure 109. Conditional effects of ASPM-D (left) and MCPH1-D (right).

Figure 110. Posterior predictive checks for ASPM-D (left) and MCPH1-D (right).

Mediation and path analysis

Mediation analysis

`(g)lm`

All data

For ASPM-D:

total effect (TE) of being in Africa on tone: 0.94 (0.40, 1.72), p=0, decomposed into:
average direct effect (ADE): -0.16 (-0.69, 0.30), p=0.48, and
average indirect effect (ACME) mediated by ASPM-D: 1.11 (0.63, 1.79), p=0, mediating 117.0% (76.3%, 223.9%), p=0 of the effect, resulting from:
- effect of being in Africa on ASPM-D: -1.34 ±0.15, p=1.6e-15, and
- effect of ASPM-D on tone: -0.73 ±0.12, p=4.2e-10.

For MCPH1-D:

TE: 0.69 (0.32, 1.13), p=0, decomposed into:
ADE: 0.44 (-0.44, 1.38), p=0.32, and
ACME: 0.25 (-0.57, 1.06), p=0.53, mediating 36.6% (-95.5%, 197.0%), p=0.53 of the effect, resulting from:
- effect of being in Africa on MCPH1-D: -2.18 ±0.09, p=3e-62, and
- effect of MCPH1-D on tone: -0.12 ±0.17, p=0.5.

Restricted sampling

Figure 111. Mediation analysis for 1000 restricted samples (i.e., picking one random language per family). The leftmost panels show the distribution of point estimates of the Total Effect (TE), the Direct Effect (ADE) and the Indirect Effect (ACME) for ASPM and MCPH1; the middle panels show the distribution of the p-values for the same effects, while the rightmost panels show the distribution of the regression slopes (β) for the two alleles, top: for the regression of the allele frequency on within vs outside Africa, and bottom: for the regression of tone on the allele while controlling for within vs outside Africa. The black vertical lines show: 0.0 (dotted), 0.05 (solid) and 0.10 (dashed).

For ASPM-D:

TE: mean = 1.3, median = 1.3; 59.2% significant at α-level 0.05 and 72.5% significant at α-level 0.10; 100.0% > 0.0; one-sample one-sided t-test vs 0: t(999) = 79.3, p = 0;
ADE: mean = 0.73, median = 0.74; 19.4% significant at α-level 0.05 and 33.9% significant at α-level 0.10; 97.4% > 0.0; one-sample one-sided t-test vs 0: t(999) = 58.4, p = 0;
ACME: mean = 0.54, median = 0.5; 21.5% significant at α-level 0.05 and 44.6% significant at α-level 0.10; 98.4% > 0.0; one-sample one-sided t-test vs 0: t(999) = 53.7, p = 2.6e-297;
β(Africa → allele): mean = -0.89, median = -0.9; 87.0% significant at α-level 0.05 and 98.9% significant at α-level 0.10; 100.0% < 0.0; one-sample one-sided t-test vs 0: t(999) = -237.7, p = 0;
β(allele → tone | Africa): mean = -0.38, median = -0.36; 34.1% significant at α-level 0.05 and 51.0% significant at α-level 0.10; 98.2% < 0.0; one-sample one-sided t-test vs 0: t(999) = -66.8, p = 0.

For MCPH1-D:

TE: mean = 1.1, median = 1.1; 57.6% significant at α-level 0.05 and 70.0% significant at α-level 0.10; 100.0% > 0.0; one-sample one-sided t-test vs 0: t(999) = 83.9, p = 0;
ADE: mean = 3.9, median = 2; 16.4% significant at α-level 0.05 and 26.1% significant at α-level 0.10; 83.2% > 0.0; one-sample one-sided t-test vs 0: t(999) = 17.2, p = 1.3e-58;
ACME: mean = -2.8, median = -0.89; 5.2% significant at α-level 0.05 and 11.9% significant at α-level 0.10; 35.0% > 0.0; one-sample one-sided t-test vs 0: t(999) = -12.3, p = 1;
β(Africa → allele): mean = -2.4, median = -2.4; 100.0% significant at α-level 0.05 and 100.0% significant at α-level 0.10; 100.0% < 0.0; one-sample one-sided t-test vs 0: t(999) = -915.3, p = 0;
β(allele → tone | Africa): mean = 0.13, median = 0.1; 5.7% significant at α-level 0.05 and 11.9% significant at α-level 0.10; 40.4% < 0.0; one-sample one-sided t-test vs 0: t(999) = 10.3, p = 1.

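Given the low sample size (N = 35 unique families), relatively few effect sizes are big enough to be significant; however, the effect of the allele on tone when controlling for Africa, β(allele → tone | Africa), is significant much more often for ASPM-D than for MCPH1-D: 34.1% vs 5.7% (6.0 times) for α-level 0.05, and 51.0% vs 11.9% (4.3 times) for α-level 0.10. The same pattern holds for the indirect effects (ACME): 21.5% vs 5.2% (4.1 times) for α-level 0.05, and 44.6% vs 11.9% (3.7 times) for α-level 0.10.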

`brms`

Figure 112. Graphical representation of the Bayesian mediation analysis for ASPM-D showing the means of the effects and the actual partial regression coefficients, with their 89% HDIs and p-ROPEs. The colors reflect the sign of the mean estimate (blue=negative, red=positive, gray=(p-ROPE >= 0.05)); solid=(0 not in the HDI), dashed=(0 is in the HDI).

Figure 113. Graphical representation of the Bayesian mediation analysis for MCPH1-D showing the means of the effects and the actual partial regression coefficients, with their 89% HDIs and p-ROPEs. The colors reflect the sign of the mean estimate (blue=negative, red=positive, gray=(p-ROPE >= 0.05)); solid=(0 not in the HDI), dashed=(0 is in the HDI).

Figure 114. Graphical representation of the Bayesian mediation analysis for both ASPM-D and MCPH1-D showing the means of the effects and the actual partial regression coefficients, with their 89% HDIs and p-ROPEs. The colors reflect the sign of the mean estimate (blue=negative, red=positive, gray=(p-ROPE >= 0.05)); solid=(0 not in the HDI), dashed=(0 is in the HDI).

Path analysis

Please note that path analysis uses a linear model (so not a Poisson one) for the tone counts; also I only use the numeric coding for Africa.

All data

Coding Africa numerically, the model fits the data very well (χ²(1)=0.29, p=0.59; CFI=1.00, TLI=1.01, NNFI=1.01 and RFI=1.00):

Figure 115. Path analysis model with standardised coefficients and significance stars. Here, macroarea (Africa vs non-Africa) is coded as numeric binary (Africa_num with in Africa=1); ASPM_z is ASPM-D and MCPH1_z is MCPH1-D.

## lavaan 0.6-8 ended normally after 28 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         8
##                                                       
##   Number of observations                           184
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.292
##   Degrees of freedom                                 1
##   P-value (Chi-square)                           0.589
## 
## Model Test Baseline Model:
## 
##   Test statistic                               369.194
##   Degrees of freedom                                 6
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.012
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -663.138
##   Loglikelihood unrestricted model (H1)       -662.992
##                                                       
##   Akaike (AIC)                                1342.276
##   Bayesian (BIC)                              1367.995
##   Sample-size adjusted Bayesian (BIC)         1342.657
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.159
##   P-value RMSEA <= 0.05                          0.666
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.005
## 
## Parameter Estimates:
## 
##   Standard errors                           Robust.sem
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper   Std.lv  Std.all
##   n_tones ~
##     Africa_num       -0.265    0.601   -0.441    0.659   -1.443    0.913   -0.265   -0.075
##     ASPM_z           -0.490    0.114   -4.291    0.000   -0.713   -0.266   -0.490   -0.342
##     MCPH1_z          -0.131    0.229   -0.571    0.568   -0.580    0.319   -0.131   -0.091
##   ASPM_z ~
##     Africa_num       -1.338    0.099  -13.477    0.000   -1.533   -1.144   -1.338   -0.543
##   MCPH1_z ~
##     Africa_num       -2.180    0.071  -30.517    0.000   -2.320   -2.040   -2.180   -0.885
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper   Std.lv  Std.all
##    .n_tones           1.790    0.273    6.566    0.000    1.256    2.324    1.790    0.879
##    .ASPM_z            0.701    0.071    9.837    0.000    0.561    0.841    0.701    0.705
##    .MCPH1_z           0.216    0.036    5.943    0.000    0.145    0.287    0.216    0.217
## 
## R-Square:
##                    Estimate
##     n_tones           0.121
##     ASPM_z            0.295
##     MCPH1_z           0.783
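As a consistency check, several of the fit statistics in the lavaan output can be recomputed by hand from the log-likelihoods and the baseline test. The sketch below uses only numbers copied from the output above. The formulas are the standard textbook definitions, which I assume (but have not verified here) match lavaan's defaults for this estimator.

```python
import math

# Values copied from the lavaan output above.
ll_h0 = -663.138                   # log-likelihood of the user model (H0)
ll_h1 = -662.992                   # log-likelihood of the unrestricted model (H1)
n_par = 8                          # number of model parameters
n_obs = 184                        # number of observations
chisq_base, df_base = 369.194, 6   # baseline model test
df_model = 1                       # degrees of freedom of the user model

# Likelihood-ratio test of the user model against the unrestricted model.
chisq = 2 * (ll_h1 - ll_h0)

# With 1 degree of freedom the chi-square upper tail has a closed form.
p_value = math.erfc(math.sqrt(chisq / 2))

# Information criteria.
aic = -2 * ll_h0 + 2 * n_par
bic = -2 * ll_h0 + n_par * math.log(n_obs)

# Incremental fit indices relative to the baseline model.
cfi = 1 - max(chisq - df_model, 0) / max(chisq_base - df_base, chisq - df_model, 0)
tli = (chisq_base / df_base - chisq / df_model) / (chisq_base / df_base - 1)

print(f"chisq = {chisq:.3f}, p = {p_value:.3f}")
print(f"AIC = {aic:.3f}, BIC = {bic:.3f}")
print(f"CFI = {cfi:.3f}, TLI = {tli:.3f}")
```

A TLI slightly above 1 is expected here: the index is not truncated, and the model χ² is smaller than its degrees of freedom.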

Restricted sampling

Figure 116. Path analysis for 1000 restricted samples (i.e., picking one random language per family). The two plots on the left show the coefficient estimates and the p-values, respectively, for the five paths in the model (see the path plots above). The plot on the right shows the various fit indices. The black horizontal lines show: 0.0 (solid), 0.05 (dashed) and 1.0 (dotted).

It can be seen that:

the model fits:
- 97.9% of the p-values are not significant
- mean(CFI) = 0.99, median(CFI) = 1, sd(CFI) = 0.01, IQR(CFI) = 0.01
- mean(TLI) = 0.99, median(TLI) = 1.01, sd(TLI) = 0.09, IQR(TLI) = 0.15
- mean(NNFI) = 0.99, median(NNFI) = 1.01, sd(NNFI) = 0.09, IQR(NNFI) = 0.15
- mean(RFI) = 0.91, median(RFI) = 0.93, sd(RFI) = 0.08, IQR(RFI) = 0.13
Africa → ASPM-D: mean = -0.88, median = -0.89, sd = 0.12, IQR = 0.16, 100.0% < 0; 99.9% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -2.3e+02, p = 0
Africa → MCPH1-D: mean = -2.4, median = -2.4, sd = 0.083, IQR = 0.12, 100.0% < 0; 100.0% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -9.1e+02, p = 0
Africa → tone counts: mean = 0.77, median = 0.83, sd = 1.1, IQR = 1.6, 74.9% > 0; 10.0% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = 22, p = 2e-86
ASPM-D → tone counts: mean = -0.32, median = -0.31, sd = 0.16, IQR = 0.22, 98.7% < 0; 22.5% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = -63, p = 0
MCPH1-D → tone counts: mean = 0.019, median = 0.034, sd = 0.45, IQR = 0.65, 47.4% < 0; 3.3% significant at α-level 0.05; one-sample one-sided t-test vs 0: t(999) = 1.3, p = 0.91
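The t statistics in the list above can be approximately recovered from the reported means and standard deviations, since a one-sample t-test against 0 reduces to t = mean / (sd / √n) with n = 1000 restricted samples. The following sketch uses the rounded summaries printed above, so it only reproduces the reported values up to rounding.

```python
import math

def one_sample_t(mean, sd, n, mu0=0.0):
    """t statistic of a one-sample t-test computed from summary statistics."""
    return (mean - mu0) / (sd / math.sqrt(n))

n_samples = 1000  # number of restricted samples

# (mean, sd) of each path coefficient across the samples, as reported (rounded).
paths = {
    "Africa -> ASPM-D": (-0.88, 0.12),
    "Africa -> MCPH1-D": (-2.4, 0.083),
    "Africa -> tone counts": (0.77, 1.1),
    "ASPM-D -> tone counts": (-0.32, 0.16),
    "MCPH1-D -> tone counts": (0.019, 0.45),
}

t_stats = {name: one_sample_t(m, s, n_samples) for name, (m, s) in paths.items()}

for name, t in t_stats.items():
    print(f"{name}: t({n_samples - 1}) = {t:.3g}")
```

With 1000 samples, even small mean effects give very large t values, so the proportion of individually significant samples is the more informative figure for each path.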

Unrounded (raw) imputed counts

I kept only the entries with non-missing data for the tone counts, ASPM-D and MCPH1-D; where a given sample has more than one possible language or set of allele frequencies, I kept only those entries that differ in their tone or allele data. The resulting dataset has 200 observations, distributed among 136 unique Glottolog codes in 37 families (ranging from a minimum of 1 language per family to a maximum of 51, with a mean of 5.4 and a median of 2 languages per family) and 4 macroareas.

Figure 117. Distribution of tone counts (unrounded).

Figure 118. Distribution of tone counts (unrounded) across the world.

Figure 119. Relationship between tone counts (unrounded; colors) and the two alleles (frequency) by macroarea.

Power analysis

I use simulations for power analysis (as implemented by package simr), focusing on the effect of ASPM-D on tone1 using glmer, i.e., a mixed-effects logistic regression with ASPM-D and macroarea as fixed effects and language family as a random effect.
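The logic of simulation-based power analysis can be illustrated with a deliberately simplified sketch. It is not the simr/glmer analysis: it drops the family random effect and the macroarea covariate, assumes a standardised predictor, and fits a plain logistic regression by Newton-Raphson. All function names are hypothetical, and the resulting numbers are not comparable to those reported here. What it shows is the general recipe: simulate data under an assumed effect, refit, count how often the effect is detected, and attach a confidence interval to that proportion.

```python
import math
import random

def fit_logistic(x, y, n_iter=25):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x; returns (b1, se of b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. the intercept
            g1 += (yi - p) * xi     # gradient w.r.t. the slope
            h00 += w                # Fisher information entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Information from the last iteration (effectively converged by then).
    return b1, math.sqrt(h00 / det)

def simulate_power(beta, n, n_sim=500, z_crit=1.96, seed=1):
    """Share of simulated datasets with a significant Wald test on the slope."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-beta * xi)) else 0 for xi in x]
        b1, se = fit_logistic(x, y)
        if abs(b1 / se) > z_crit:
            hits += 1
    power = hits / n_sim
    half = 1.96 * math.sqrt(power * (1.0 - power) / n_sim)  # normal-approximation CI
    return power, max(0.0, power - half), min(1.0, power + half)

# Assumed effect size and sample size, loosely inspired by this section.
power, ci_low, ci_high = simulate_power(beta=-0.4, n=181, n_sim=200)
print(f"power = {power:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

Because this sketch ignores the clustering of languages within families, it will overstate the power relative to a model with a family random effect.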

Observed power

The observed effect size of ASPM-D is β_ASPM-D = -0.4, p_ASPM-D = 0.41, with an ICC = 68.4% on 35 level-2 groups (families) and 181 observations (languages/samples). The observed (post-hoc) power 1 - β = %, 95%CI = .

Changing the number of languages

If we keep the families but change the number of languages per family:

Figure 120. Estimated power (with 95% CI) when changing the number of languages but keeping everything else constant.

Changing the number of families

If we change the number of families:

Figure 121. Estimated power (with 95% CI) when changing the number of language families but keeping everything else constant.

Changing the number of families and languages

If we change the number of families and the number of languages per family:

Figure 122. Estimated power when changing the number of language families and the number of languages per family, but keeping everything else constant. Color is proportional to power and the shape shows if the power is > 80%. The two vertical dotted lines are the approximate number of families in Ethnologue (blue, ~150) and Glottolog (black, ~420). The horizontal dotted lines are summaries of the number of languages in Glottolog: the mean (red, ~20), the median (black, 2) and the median excluding isolates (blue, 5); not shown is the maximum (~1400 in Atlantic-Congo).

Appendix I: Gaussian Process

Here I model language contact with a 2D Gaussian Process as suggested in, for example, McElreath (2020), using brms’s gp(). tone is regressed on ASPM-D and MCPH1-D with language family and (meta)population as (nested) random effects, and a 2D Gaussian process separately for each macroarea.
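To make the role of the Gaussian process concrete, the sketch below computes an exponentiated-quadratic covariance matrix, k(x, x') = sdgp² · exp(−‖x − x'‖² / (2 · lscale²)), which I take to be the default kernel of brms's gp(). The coordinates and parameter values are hypothetical and serve only to show that nearby languages receive a high covariance and distant ones a covariance near zero.

```python
import math

def exp_quad_kernel(points, sdgp, lscale):
    """Covariance matrix k(x, x') = sdgp^2 * exp(-||x - x'||^2 / (2 * lscale^2))."""
    n = len(points)
    cov = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2  # squared Euclidean distance
            cov[i][j] = sdgp ** 2 * math.exp(-d2 / (2 * lscale ** 2))
    return cov

# Hypothetical, already rescaled 2D coordinates for three languages:
# the first two are close neighbours, the third is far away.
coords = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
K = exp_quad_kernel(coords, sdgp=1.0, lscale=0.5)

for row in K:
    print(" ".join(f"{value:.3f}" for value in row))
```

Here sdgp scales the overall amount of spatially structured variation and lscale controls how quickly the covariance decays with distance; these are the two hyperparameters mentioned in the captions of the posterior plots in this appendix.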

tone1

ASPM only:
- β = -0.88, 89%HDI = [-1.63, -0.06]
- posterior probability p(β<0) = 0.96 (evidence ratio = 25), p(β=0) = 0.59 (evidence ratio = 1.4)
- ROPE = [-0.18, 0.18], % HDI inside ROPE = 3.5%; p_ROPE = 0.065
- comparison ‘null’ vs ‘ASPM’: [B= L= W=(33%:67%) K=]: anecdotal evidence for null against ASPM (BF=1.3), LOO=-0.05 [SE=2.69], WAIC=-0.69 [SE=2.74], KFOLD=-1.21 [SE=3.41]
MCPH1 only:
- β = -1.03, 89%HDI = [-1.63, -0.45]
- posterior probability p(β<0) = 0.99 (evidence ratio = 1.7e+02), p(β=0) = 0.15 (evidence ratio = 0.17)
- ROPE = [-0.18, 0.18], % HDI inside ROPE = 0%; p_ROPE = 0.011
- comparison ‘null’ vs ‘MCPH1’: [B< L< W<(10%:90%) K=]: moderate evidence for MCPH1 against null (BF=0.301), LOO=-2.13 [SE=1.73], WAIC=-2.22 [SE=1.73], KFOLD=-2.02 [SE=2.70]
both alleles:
- comparison ‘null’ vs ‘both’: [B= L= W=(9%:91%) K=]: anecdotal evidence for both against null (BF=0.704), LOO=-1.87 [SE=2.93], WAIC=-2.34 [SE=2.95], KFOLD=-2.34 [SE=3.50]
- interaction:
  - posterior probability p(=0) = 0.87 (evidence ratio = 6.6)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 31.5%; p_ROPE = 0.28
  - comparison ‘no interaction’ vs ‘with interaction’: [B> L>> W>(76%:24%) K>]: moderate evidence for no interaction against with interaction (BF=5.75), LOO=1.49 [SE=0.73], WAIC=1.16 [SE=0.66], KFOLD=2.70 [SE=2.09]
- ASPM (partial):
  - β = -0.58, 89%HDI = [-1.35, 0.20]
  - posterior probability p(β<0) = 0.88 (evidence ratio = 7.4), p(β=0) = 0.76 (evidence ratio = 3.2)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 15.7%; p_ROPE = 0.14
- MCPH1 (partial):
  - β = -0.85, 89%HDI = [-1.47, -0.25]
  - posterior probability p(β<0) = 0.99 (evidence ratio = 77), p(β=0) = 0.41 (evidence ratio = 0.69)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 0%; p_ROPE = 0.032
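The posterior probabilities and evidence ratios reported in these lists are two views of the same quantity: an evidence ratio is the odds p / (1 − p) of a hypothesis against its alternative, and conversely p = ER / (1 + ER). For the point hypotheses (β = 0), this conversion assumes equal prior odds. The sketch below converts in both directions; the small mismatches with the listed values arise because the listed probabilities are rounded to two decimals.

```python
def evidence_ratio(post_prob):
    """Odds p / (1 - p) of a hypothesis against its alternative."""
    return post_prob / (1.0 - post_prob)

def posterior_prob(ratio):
    """Posterior probability implied by an evidence ratio."""
    return ratio / (1.0 + ratio)

# Pairs (posterior probability, evidence ratio) taken from the tone1 results.
reported = [(0.59, 1.4), (0.76, 3.2), (0.96, 25), (0.99, 1.7e2), (0.99, 77)]

for p, er in reported:
    print(f"p = {p}: odds = {evidence_ratio(p):.2f}; "
          f"ER = {er}: implied p = {posterior_prob(er):.3f}")
```

Going from the evidence ratio to the probability is the more stable direction: probabilities near 1 (such as 0.99) correspond to a wide range of odds, which is why the same rounded probability appears with evidence ratios of both 77 and 1.7e+02.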

Figure 123. Posterior distributions (with 50% probability mass highlighted) versus 0.0 (the vertical line) for ASPM-D (left) and MCPH1-D (right). Please note that I have cut the x-axis at 2.5 as the distributions of the sdgp and lscale have a few extreme outliers which would make the plots impossible to see.

Figure 124. Conditional effects of ASPM-D (left) and MCPH1-D (right).

Figure 125. Posterior predictive checks for ASPM-D (left) and MCPH1-D (right).

Figure 126. Confusion matrices for ASPM-D (left) and MCPH1-D (right).

tone2

ASPM only:
- β = -1.15, 89%HDI = [-2.12, -0.16]
- posterior probability p(β<0) = 0.97 (evidence ratio = 33), p(β=0) = 0.5 (evidence ratio = 1)
- ROPE = [-0.18, 0.18], % HDI inside ROPE = 0.6%; p_ROPE = 0.044
- comparison ‘null’ vs ‘ASPM’: [B= L= W=(41%:59%) K=]: anecdotal evidence for ASPM against null (BF=0.767), LOO=-0.34 [SE=1.58], WAIC=-0.34 [SE=1.59], KFOLD=-0.07 [SE=1.64]
MCPH1 only:
- β = -0.63, 89%HDI = [-1.27, -0.04]
- posterior probability p(β<0) = 0.95 (evidence ratio = 19), p(β=0) = 0.68 (evidence ratio = 2.1)
- ROPE = [-0.18, 0.18], % HDI inside ROPE = 6.5%; p_ROPE = 0.099
- comparison ‘null’ vs ‘MCPH1’: [B= L= W=(55%:45%) K=]: anecdotal evidence for null against MCPH1 (BF=2.14), LOO=0.06 [SE=1.58], WAIC=0.19 [SE=1.58], KFOLD=0.88 [SE=1.88]
both alleles:
- comparison ‘null’ vs ‘both’: [B> L= W=(45%:55%) K=]: moderate evidence for null against both (BF=4.38), LOO=-0.23 [SE=1.82], WAIC=-0.19 [SE=1.81], KFOLD=0.17 [SE=1.92]
- interaction:
  - posterior probability p(=0) = 0.83 (evidence ratio = 4.7)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 22.4%; p_ROPE = 0.2
  - comparison ‘no interaction’ vs ‘with interaction’: [B> L> W>(62%:38%) K>>]: moderate evidence for no interaction against with interaction (BF=6.35), LOO=0.53 [SE=0.37], WAIC=0.48 [SE=0.37], KFOLD=2.63 [SE=1.03]
- ASPM (partial):
  - β = -0.88, 89%HDI = [-1.96, 0.15]
  - posterior probability p(β<0) = 0.91 (evidence ratio = 10), p(β=0) = 0.68 (evidence ratio = 2.1)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 9.9%; p_ROPE = 0.092
- MCPH1 (partial):
  - β = -0.42, 89%HDI = [-1.04, 0.21]
  - posterior probability p(β<0) = 0.86 (evidence ratio = 6.3), p(β=0) = 0.83 (evidence ratio = 4.7)
  - ROPE = [-0.18, 0.18], % HDI inside ROPE = 24.1%; p_ROPE = 0.214
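The verbal labels attached to the Bayes factors in the model comparisons of this appendix are consistent with the conventional cut-offs of 3 and 10 (anecdotal below 3, moderate from 3 to 10, strong above 10), applied to BF when it favours the first model and to 1/BF when it favours the second. The sketch below encodes that reading. The cut-offs are inferred from the reported labels rather than stated in the text, the categories above 30 are an assumption based on common usage, and the function name is hypothetical.

```python
def describe_bf(bf, first="null", second="alternative"):
    """Verbal label for a Bayes factor of `first` against `second`."""
    if bf >= 1:
        favoured, other, strength = first, second, bf
    else:
        favoured, other, strength = second, first, 1.0 / bf
    if strength < 3:
        label = "anecdotal"
    elif strength < 10:
        label = "moderate"
    elif strength < 30:
        label = "strong"
    else:
        label = "very strong"  # assumed: not attested in the results above
    return f"{label} evidence for {favoured} against {other}"

# Bayes factors taken from the comparisons reported in this appendix.
print(describe_bf(1.3, "null", "ASPM"))
print(describe_bf(0.301, "null", "MCPH1"))
print(describe_bf(14, "null", "both"))
```

Note that a BF below 1 flips the direction of the evidence, so BF = 0.301 for 'null' vs 'MCPH1' is read as 1/0.301 ≈ 3.3 in favour of MCPH1.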

Figure 127. Posterior distributions (with 50% probability mass highlighted) versus 0.0 (the vertical line) for ASPM-D (left) and MCPH1-D (right). Please note that I have cut the x-axis at 2.5 as the distributions of the sdgp and lscale have a few extreme outliers which would make the plots impossible to see.

Figure 128. Conditional effects of ASPM-D (left) and MCPH1-D (right).

Figure 129. Posterior predictive checks for ASPM-D (left) and MCPH1-D (right).

Figure 130. Confusion matrices for ASPM-D (left) and MCPH1-D (right).

Tone counts

ASPM only:
- β = -0.22, 89%HDI = [-0.66, 0.24]
- posterior probability p(β<0) = 0.78 (evidence ratio = 3.5), p(β=0) = 0.9 (evidence ratio = 8.9)
- ROPE = [-0.10, 0.10], % HDI inside ROPE = 23.7%; p_ROPE = 0.211
- comparison ‘null’ vs ‘ASPM’: [B> L>> W>>(90%:10%) K=]: moderate evidence for null against ASPM (BF=9.37), LOO=2.40 [SE=1.07], WAIC=2.20 [SE=1.01], KFOLD=1.54 [SE=1.62]
MCPH1 only:
- β = -0.41, 89%HDI = [-0.77, -0.08]
- posterior probability p(β<0) = 0.95 (evidence ratio = 21), p(β=0) = 0.7 (evidence ratio = 2.3)
- ROPE = [-0.10, 0.10], % HDI inside ROPE = 1.2%; p_ROPE = 0.058
- comparison ‘null’ vs ‘MCPH1’: [B> L> W>(93%:7%) K=]: moderate evidence for null against MCPH1 (BF=3.35), LOO=2.30 [SE=1.36], WAIC=2.53 [SE=1.30], KFOLD=1.72 [SE=1.79]
both alleles:
- comparison ‘null’ vs ‘both’: [B>> L> W>(97%:3%) K>]: strong evidence for null against both (BF=14), LOO=3.61 [SE=2.03], WAIC=3.47 [SE=1.89], KFOLD=3.27 [SE=2.59]
- interaction:
  - posterior probability p(=0) = 0.93 (evidence ratio = 12)
  - ROPE = [-0.10, 0.10], % HDI inside ROPE = 33.7%; p_ROPE = 0.3
  - comparison ‘no interaction’ vs ‘with interaction’: [B> L>> W>>(68%:32%) K>>]: moderate evidence for no interaction against with interaction (BF=6.73), LOO=1.03 [SE=0.44], WAIC=0.74 [SE=0.37], KFOLD=4.04 [SE=1.72]
- ASPM (partial):
  - β = -0.27, 89%HDI = [-0.67, 0.17]
  - posterior probability p(β<0) = 0.84 (evidence ratio = 5.4), p(β=0) = 0.88 (evidence ratio = 7.5)
  - ROPE = [-0.10, 0.10], % HDI inside ROPE = 19.5%; p_ROPE = 0.174
- MCPH1 (partial):
  - β = -0.4, 89%HDI = [-0.72, -0.11]
  - posterior probability p(β<0) = 0.97 (evidence ratio = 29), p(β=0) = 0.67 (evidence ratio = 2)
  - ROPE = [-0.10, 0.10], % HDI inside ROPE = 0%; p_ROPE = 0.049

Figure 131. Posterior distributions (with 50% probability mass highlighted) versus 0.0 (the vertical line) for ASPM-D (left) and MCPH1-D (right). Please note that I have cut the x-axis at 2.5 as the distributions of the sdgp and lscale have a few extreme outliers which would make the plots impossible to see.

Figure 132. Conditional effects of ASPM-D (left) and MCPH1-D (right).

Figure 133. Posterior predictive checks for ASPM-D (left) and MCPH1-D (right).

Appendix II: Sensitivity to the prior

Here I explore the sensitivity to the prior of the brms models, focusing on each “derived” allele independently.