We inferred whether species richness and climate and environmental conditions influenced speciation and extinction rates and if the effects differ among time strata. For this, we employed the univariate birth–death shift model (Silvestro & Schnitzler 2018) with linear effects implemented in PyRate. Testing for diversity-dependent diversification also required the constraint of the correlation of diversity with speciation and extinction rates to be less or greater than zero, respectively. Diversity refers here to the sum of endemic and non‑endemic species. Prior to the analyses, all covariates were scaled to a mean of zero and unit variance to compare their correlation strength γ (see covariates). All unifactorial diversification models were ranked by their fit as quantified through Bayes Factors obtained by thermodynamic integration. This approach down-weighted the contribution of the data on posterior likelihoods via 10 stepping‑stones, each using 4,000,000 MCMC iterations, sampling every 10,000 iterations after discarding 20,000 iterations as burn-in.
# Load required packages
library(coda)
library(foreach)
library(doParallel)
# Auxiliary functions
source("AuxiliaryFunctions/runPyRateCont.R")
source("AuxiliaryFunctions/calcBayesFactor.R")
source("AuxiliaryFunctions/plotDrivers.R")
Due to computational limits we will only extract 2 random sets out of the speciation and extinction times inferred by each of the 2 BDS analyses.
First, we set up the files NoShift
and Shift
as template to instruct PyRateContinuous (Silvestro & Schnitzler 2018) on the covariate and whether there are shift in rates or not.
BDS <- 1:2 # Number of BDS analyses
RanSamp <- 2 # Random samples of speciation and extinction times
Fix <- paste(paste0("Fix_", BDS), rep(1:RanSamp, each = length(BDS)), sep = "_")
Fix <- sort(Fix)
Covars <- c("DD",
"ArbPol", "ArbPolabschange",
"Ca", "Caabschange",
"DecOaks", "DecOaksabschange",
"Ins41", "Ins41abschange",
"Isotope", "Isotopeabschange",
"K", "Kabschange",
"Grainsize", "Grainsizeabschange",
"LR04", "LR04abschange",
"Medstack", "Medstackabschange",
"Quartz", "Quartzabschange",
"TIC", "TICabschange",
"TOC", "TOCabschange")
Comb <- expand.grid(Fix, Covars)
colnames(Comb) <- c("Fix", "Covariate")
Comb$Model <- "1"
NoShift <- Comb
NoShift$stimesL <- c("13.66")
NoShift$stimesM <- c("13.66")
NoShiftNull <- data.frame(Fix = Fix,
Covariate = "Grainsize",
Model = "-1",
stimesL = "13.66",
stimesM = "13.66")
NoShift <- rbind(NoShiftNull, NoShift)
Shift <- Comb
Shift$stimesL <- c("13.66 12.46 10.48")
Shift$stimesM <- c("13.66 11.95")
ShiftNull <- data.frame(Fix = Fix,
Covariate = "Grainsize",
Model = "-1",
# Shifts from BD analysis
stimesL = c("13.66 12.46 10.48"),
stimesM = c("13.66 11.95"))
Shift <- rbind(ShiftNull, Shift)
The function runPyRateCont
is a wrapper around PyRateContinuous and uses the NoShift
and Shift
templates to infer the covariate influence on speciation and extinction rates in parallel on two CPU cores.
For this reproducible example we will run less iterations than 4,000,000 MCMC iterations and sample more often than every 10,000 iterations.
Covariate influence without shifts.
Files <- list.files("Results/BD/", ".txt")
file.copy(file.path("Results/BD", Files),
"Results/Covariates/NoShift", overwrite = TRUE)
## [1] TRUE TRUE TRUE TRUE
# Run on 2 cores in parallel
Ncores <- 2
registerDoParallel(Ncores)
PyRateCont <- foreach(iter = 1:nrow(NoShift),
.inorder = FALSE) %dopar% runPyRateCont(iter,
NoShift,
Path = "Covariates/NoShift",
n = 10000, s = 100)
stopImplicitCluster()
Covariate influence with shifts.
Files <- list.files("Results/BD/", ".txt")
file.copy(file.path("Results/BD", Files),
"Results/Covariates/Shift", overwrite = TRUE)
## [1] TRUE TRUE TRUE TRUE
# Run on 2 cores in parallel
Ncores <- 2
registerDoParallel(Ncores)
PyRateCont <- foreach(iter = 1:nrow(NoShift),
.inorder = FALSE) %dopar% runPyRateCont(iter,
Shift,
Path = "Covariates/Shift",
n = 10000, s = 100)
stopImplicitCluster()
All unifactorial diversification models were ranked by their fit as quantified through Bayes Factors obtained by thermodynamic integration.
system( paste0("python2 PyRate/PyRateContinuous2.py",
" -mL Results/Covariates/NoShift/",
" -b 10") )
system( paste0("python2 PyRate/PyRateContinuous2.py",
" -mL Results/Covariates/Shift/",
" -b 10") )
# Remove the cold chains from HDD
ColdChains <- list.files("Results/Covariates/NoShift/", pattern = "cold")
file.remove(paste0("Results/Covariates/NoShift/", ColdChains))
ColdChains <- list.files("Results/Covariates/Shift/", pattern = "cold")
file.remove(paste0("Results/Covariates/Shift/", ColdChains))
The next chunk ranks the covariates according to the difference in the Bayes Factor K to the best-fit model within replicated datasets and quantifies the uncertainty in the correlation parameter γ between rates and covariates across all replicates through 95% HPD intervals.
Nrep <- length(BDS) * RanSamp
Ncovar <- 26
Nrow <- Nrep * Ncovar
From <- seq(1, Nrow, by = Ncovar)
To <- seq(Ncovar, Nrow, by = Ncovar)
NrowBf <- 4 + Nrow/Ncovar
BfLa <- as.data.frame(matrix(NA_real_, nrow = Ncovar, ncol = NrowBf))
rownames(BfLa) <- c("Arboreal pollen", "Arboreal pollen change",
"Ca", "Ca change",
"Diversity",
"Deciduous oaks", "Deciduous oaks change",
"Constant",
"Grain size", "Grain size change",
"Summer insolation", "Summer insolation change",
"Isotope", "Isotope change",
"K", "K change",
"LR04", "LR04 change",
"Medstack", "Medstack change",
"Quartz", "Quartz change",
"TIC", "TIC change",
"TOC", "TOC change")
BfLa[, NrowBf] <- c(rep("red", 2),
rep("blue", 2),
"blue",
rep("red", 2),
"grey",
rep("red", 2),
rep("blue", 2),
rep("blue", 2),
rep("blue", 2),
rep("green", 2),
rep("green", 2),
rep("blue", 2),
rep("blue", 2),
rep("blue", 2))
BfMu2 <- BfMu1 <- BfMu <- BfLa3 <- BfLa2 <- BfLa1 <- BfLa
DdMaLi <- read.table("Results/Covariates/NoShift/marginal_likelihoods.txt",
header = TRUE, sep = "\t", check.names = FALSE, fill = NA)
for(i in 1:Nrep){
DdMaLiTmp <- DdMaLi[From[i]:To[i], ]
BfLa[, i] <- calcBayesFactor(DdMaLiTmp$lik_L_13-0)
BfMu[, i] <- calcBayesFactor(DdMaLiTmp$lik_M_13-0)
}
BfLa[, NrowBf - 3] <- rowMeans(BfLa[, 1:(NrowBf - 4)])
BfMu[, NrowBf - 3] <- rowMeans(BfMu[, 1:(NrowBf - 4)])
for(i in 1:Ncovar){
S <- seq(i, nrow(DdMaLi), by = Ncovar)
File <- DdMaLi$file_name[S]
Gmtmp <- Gltmp <- matrix(NA_real_,
nrow = 90,
ncol = length(File))
for(y in 1:length(File)){
ContLog <- read.table(paste0("Results/Covariates/NoShift/", File[y], ".log"),
header = TRUE, sep = "\t")
ContLog <- ContLog[ContLog$beta == 1, ]
ContLog <- ContLog[-c(1:round(nrow(ContLog)*0.1)), ] # Remove 10% burnin
Gltmp[, y] <- ContLog[, 14]
Gmtmp[, y] <- ContLog[, 16]
}
BfLa[i, (NrowBf - 2):(NrowBf - 1)] <- HPDinterval(coda:::mcmc(c(Gltmp)))
BfMu[i, (NrowBf - 2):(NrowBf - 1)] <- HPDinterval(coda:::mcmc(c(Gmtmp)))
}
DdMaLi <- read.table("Results/Covariates/Shift/marginal_likelihoods.txt",
header = TRUE, sep = "\t", check.names = FALSE, fill = NA)
for(i in 1:Nrep){
DdMaLiTmp <- DdMaLi[From[i]:To[i], ]
BfLa1[, i] <- calcBayesFactor(DdMaLiTmp$lik_L_13-12)
BfLa2[, i] <- calcBayesFactor(DdMaLiTmp$lik_L_12-10)
BfLa3[, i] <- calcBayesFactor(DdMaLiTmp$lik_L_10-0)
BfMu1[, i] <- calcBayesFactor(DdMaLiTmp$lik_M_13-11)
BfMu2[, i] <- calcBayesFactor(DdMaLiTmp$lik_M_11-0)
}
BfLa1[, NrowBf - 3] <- rowMeans(BfLa1[, 1:(NrowBf - 4)])
BfLa2[, NrowBf - 3] <- rowMeans(BfLa2[, 1:(NrowBf - 4)])
BfLa3[, NrowBf - 3] <- rowMeans(BfLa3[, 1:(NrowBf - 4)])
BfMu1[, NrowBf - 3] <- rowMeans(BfMu1[, 1:(NrowBf - 4)])
BfMu2[, NrowBf - 3] <- rowMeans(BfMu2[, 1:(NrowBf - 4)])
for(i in 1:Ncovar){
S <- seq(i, nrow(DdMaLi), by = Ncovar)
File <- DdMaLi$file_name[S]
Gm2tmp <- Gm1tmp <- Gl3tmp <- Gl2tmp <- Gl1tmp <- matrix(NA_real_,
nrow = 90,
ncol = length(File))
for(y in 1:length(File)){
ContLog <- read.table(paste0("Results/Covariates/Shift/", File[y], ".log"),
header = TRUE, sep = "\t")
ContLog <- ContLog[ContLog$beta == 1, ]
ContLog <- ContLog[-c(1:round(nrow(ContLog)*0.1)), ] # Remove 10% burnin
Gl1tmp[, y] <- ContLog[, 20]
Gl2tmp[, y] <- ContLog[, 21]
Gl3tmp[, y] <- ContLog[, 22]
Gm1tmp[, y] <- ContLog[, 23]
Gm2tmp[, y] <- ContLog[, 24]
}
BfLa1[i, (NrowBf - 2):(NrowBf - 1)] <- HPDinterval(coda:::mcmc(c(Gl1tmp)))
BfLa2[i, (NrowBf - 2):(NrowBf - 1)] <- HPDinterval(coda:::mcmc(c(Gl2tmp)))
BfLa3[i, (NrowBf - 2):(NrowBf - 1)] <- HPDinterval(coda:::mcmc(c(Gl3tmp)))
BfMu1[i, (NrowBf - 2):(NrowBf - 1)] <- HPDinterval(coda:::mcmc(c(Gm1tmp)))
BfMu2[i, (NrowBf - 2):(NrowBf - 1)] <- HPDinterval(coda:::mcmc(c(Gm2tmp)))
}
BfLa <- BfLa[order(BfLa[, NrowBf - 3]), ]
BfMu <- BfMu[order(BfMu[, NrowBf - 3]), ]
BfLa1 <- BfLa1[order(BfLa1[, NrowBf - 3]), ]
BfLa2 <- BfLa2[order(BfLa2[, NrowBf - 3]), ]
BfLa3 <- BfLa3[order(BfLa3[, NrowBf - 3]), ]
BfMu1 <- BfMu1[order(BfMu1[, NrowBf - 3]), ]
BfMu2 <- BfMu2[order(BfMu2[, NrowBf - 3]), ]
Plots to summarize the results of the phenomenological diversification analyses of covariate influence on speciation and extinction rates.
plotDrivers(BfLa, Klim = 180,
GammaLab = expression(gamma [lambda]),
Mtext = "Speciation period full (1.36 - 0 Myr ago)",
SubPlot = "a")
plotDrivers(BfLa1, Klim = 50,
GammaLab = expression(gamma [lambda]),
Mtext = "Speciation period SP1 (1.36 - 1.24 Myr ago)",
SubPlot = "b")
plotDrivers(BfLa2, Klim = 10,
GammaLab = expression(gamma [lambda]),
Mtext = "Speciation period SP2 (1.24 - 1.05 Myr ago)",
SubPlot = "c")
plotDrivers(BfLa3, Klim = 10,
GammaLab = expression(gamma [lambda]),
Mtext = "Speciation period SP3 (1.05 - 0 Myr ago)",
SubPlot = "d")
plotDrivers(BfMu, Klim = 25,
GammaLab = expression(gamma [mu]),
Mtext = "Extinction period full (1.36 - 0 Myr ago)",
SubPlot = "e")
plotDrivers(BfMu1, Klim = 10,
GammaLab = expression(gamma [mu]),
Mtext = "Extinction period EP1 (1.36 - 1.19 Myr ago)",
SubPlot = "f")
plotDrivers(BfMu2, Klim = 10,
GammaLab = expression(gamma [mu]),
Mtext = "Extinction period EP2 (1.19 - 0 Myr ago)",
SubPlot = "g")
Comparison of relative model support and correlation strength of speciation and extinction rates with climate, environmental and diversity proxies. Plots summarize the results of the phenomenological diversification analyses of covariate influence over the entire period of 1.36 Myr (a, e) and within the three speciation (b–d) and two extinction rate periods (f, g). For environmental and climate proxies both the influence of the change of parameters per 1,000 years and the influence of the total value was tested. Left panels of the plots show the ranking of covariates according to the difference in the Bayes Factor K to the best-fit model within replicated datasets (dots), thereby integrating the uncertainty in speciation and extinction times. Lines in the right panels quantify the uncertainty in the correlation parameter γ between rates and covariates across all replicates through 95% HPD intervals. Covariates with a correlation significantly different from zero do not intersect the grey dashed line. Covariates shown to the left of the dashed line (–) are negatively correlated, to the right of the dashed line (+) are positively correlated.
Sys.time()
## [1] "2020-01-05 11:52:05 CET"
sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.6 LTS
##
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] doParallel_1.0.15 iterators_1.0.12 foreach_1.4.7 coda_0.19-3
## [5] knitr_1.25
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.2 codetools_0.2-16 lattice_0.20-38 digest_0.6.21
## [5] grid_3.6.2 magrittr_1.5 evaluate_0.14 rlang_0.4.0
## [9] stringi_1.4.3 rmarkdown_1.15 tools_3.6.2 stringr_1.4.0
## [13] xfun_0.10 yaml_2.2.0 compiler_3.6.2 htmltools_0.4.0