Supplementary materials to 'The effect of dictionary omissions on phylogenies computationally inferred from lexical data' by Igor Yanovich.
File S1: R code for generating pseudo-datasets for studying dictionary omission.
File S2: Python 2.7 code for preparing likelihood-adjusted data for BEAST.
File S3: NEXUS file with the Lezgian data from Kassian (2015), recoded as described in Section 2.1.
File S4: Posterior clade probabilities from a MrBayes analysis analogous to Fig. 5(a).
The analyses reported in Fig. 5 were performed using BEAST, while those in Fig. 6 were performed using MrBayes. Here, I provide part of the log of a MrBayes analysis with mostly the same settings as the BEAST analysis reported in Fig. 5(a). (The two differ in the wide prior placed on the speciation rate.) Comparing the clade posteriors in File S4 (the MrBayes output) with those in Fig. 5(a) shows that both software packages sample from the same posterior, as expected. Furthermore, the tree height is almost exactly the same (BEAST: [0.1191, 0.1332] 95% HPDI, MrBayes: [0.1196, 0.1333] 95% HPDI), though the inferred speciation rates differ (BEAST: [6.8348, 18.204] 95% HPDI, MrBayes: [3.9682, 10.5462] 95% HPDI). The difference in the speciation rate goes in the direction we expect given the influence of the prior: the exponential(1) prior in the MrBayes analysis depresses that rate relative to the uniform prior in the BEAST analysis.
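
For readers who want to recompute intervals like the 95% HPDIs quoted above from a sampled parameter trace, the following is a minimal sketch, not the code used for the paper. It assumes the trace has already been read into a plain list of numbers with burn-in removed. The function name hpd_interval and the toy values are hypothetical. It finds the narrowest window of sorted samples that contains the requested share of them, which is the usual sample-based approximation for a unimodal posterior.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return (low, high) of the narrowest interval holding `mass` of the samples.

    Hypothetical helper: assumes `samples` is a non-empty sequence of numbers
    drawn from a unimodal posterior, with burn-in already discarded.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")

    xs = sorted(samples)
    n = len(xs)
    # Number of samples each candidate window must contain.
    # The small tolerance guards against floating-point error in mass * n.
    k = max(1, int(math.ceil(mass * n - 1e-9)))

    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    best_low, best_high = xs[0], xs[k - 1]
    for i in range(1, n - k + 1):
        low, high = xs[i], xs[i + k - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


# Toy illustration with made-up numbers (not values from the analyses):
toy_trace = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9,
             5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0]
toy_low, toy_high = hpd_interval(toy_trace, mass=0.5)
```

With a real trace, one would pass the post-burn-in column for tree height or speciation rate and keep the default mass of 0.95. The intervals reported by BEAST and MrBayes tooling may differ slightly from this sketch because of differences in burn-in handling and tie-breaking.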