Measuring natural selection on multivariate phenotypic traits: a protocol for verifiable and reproducible analyses of natural selection: supplementary material

The use of multiple regression analysis to quantify the regime and strength of natural selection in nature has been an influential approach in evolutionary biology over the last 36 years. However, many studies fail to report the protocol used to estimate selection coefficients (selection gradients) and the specific model assumptions, which prevents others from verifying and reproducing those estimates. We present a brief overview of Lande and Arnold’s approach and a step-by-step R routine to help researchers perform a verifiable and reproducible regression analysis of natural selection. The steps involved in the analysis include: (1) assessing collinearity between phenotypic traits, (2) testing normality of model residuals, and (3) testing multivariate normality of phenotypic traits. We also performed a series of simulations to test the effect of non-symmetrical (skewed) phenotypic traits on the estimation of linear selection gradients. These showed that, in the quadratic model, the bias in the linear gradient increased with the skewness of the phenotypic traits, whereas the linear gradient from a model with only linear terms was nearly independent of trait skewness. If none of the above assumptions are met, selection gradients need to be estimated from two separate equations, and standard errors must be computed using other methods (e.g. bootstrapping). We expect that the procedure outlined here, together with the availability of the analysis code, will promote the verifiability and reproducibility of Lande and Arnold’s approach in the study of microevolution.
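
For orientation, the "two separate equations" can be sketched in the conventional notation of Lande and Arnold's regression approach. The symbols are assumptions introduced here, not notation taken from this abstract: w is relative fitness, z_i are the n standardized phenotypic traits, α is the intercept, β_i are the linear (directional) selection gradients, γ_ii and γ_ij are the quadratic and correlational selection gradients, and ε is the residual error.

```latex
% Linear gradients: estimated from a model with linear terms only
w = \alpha + \sum_{i=1}^{n} \beta_i z_i + \varepsilon

% Nonlinear gradients: estimated from the full second-order model
w = \alpha + \sum_{i=1}^{n} \beta_i z_i
      + \frac{1}{2} \sum_{i=1}^{n} \gamma_{ii} z_i^{2}
      + \sum_{i<j} \gamma_{ij} z_i z_j + \varepsilon
```

Under this formulation the β_i are taken from the first equation and the γ terms from the second. Because of the factor 1/2, the coefficient that standard regression software reports for a squared term z_i^2 corresponds to γ_ii / 2 and has to be doubled to obtain γ_ii.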