Network PCM Analysis in Xiphophorus


File History

v2

Implementing two changes:

  • Correction of functions computing hybrid shift detection test power.

This version fixes an error in the implementation of the non-central coefficients of the Fisher distributions as expressed in formula (9) and (10) in [1].

Using the notations from the article, theoretical formulas are (see [1], p.817, second column) :

$$ \left\{ \begin{aligned} \Delta_{10}^2(\mathbf{R}, \bar{\mathbf{N}}, \mathbf{C}^{\text{tip}}) & = \left\lVert(\mathbf{I} - \text{Proj}_{\mathbf{R}})\bar{\mathbf{N}}\right\lVert_{(\mathbf{C}^{\text{tip}})^{-1}}^2\\ \Delta_{20}^2(\mathbf{d}, \mathbf{R}, \bar{\mathbf{N}}, \mathbf{N}, \mathbf{C}^{\text{tip}}) & = \left\lVert(\mathbf{I} - \text{Proj}_{\mathbf{R}})\mathbf{N}\mathbf{d}\right\lVert_{(\mathbf{C}^{\text{tip}})^{-1}}^2 \;. \end{aligned} \right. $$

where the projection is taken with respect to the metric defined by $(\mathbf{C}^{\text{tip}})^{-1}$, i.e. according to the formula:

$$ \text{Proj}_{\mathbf{M}} \mathbf{X} = \text{argmin}_{\mathbf{U}\in\text{Span}(\mathbf{M})} \left\lVert\mathbf{X} - \mathbf{U}\right\lVert_{(\mathbf{C}^{\text{tip}})^{-1}}^2 = \mathbf{M}(\mathbf{M}^T(\mathbf{C}^{\text{tip}})^{-1}\mathbf{M})^{-1}\mathbf{M}^T(\mathbf{C}^{\text{tip}})^{-1}\mathbf{X}\; . $$

However, in the previous version of this notebook, the projection was taken with respect to the standard metric defined by the identity matrix.

Functions power10 and power20 of this version implement the correct formula.
Functions power10_OLD and power20_OLD are the previously used (erroneous) versions, kept for comparison.

The only impact this error has on the article is to change the estimated empirical power of the test $H_2$ versus $H_0$ for the sword index on the network with three hybridisation events from $0.47$ to $0.30$ (see below, and [1], p.812, first column, top).

This error has no impact on the theoretical power curves plotted in the article, that remain correct.

  • Update syntax and packages for compatibility with Julia 1.4.1.

v1

Initial release, along with the article:
[1] Paul Bastide, Claudia Solís-Lemus, Ricardo Kriebel, K William Sparks, Cécile Ané, Phylogenetic Comparative Methods on Phylogenetic Networks with Reticulations, Systematic Biology, Volume 67, Issue 5, September 2018, Pages 800–820 doi:10.1093/sysbio/syy033.


IJulia notebook compilation instructions

This IJulia notebook (ipynb) file describes the PCM analysis conducted on the fish genus Xiphophorus. It was compiled into this html file using the Julia package IJulia. The interested reader can consult both:

  • The .html file contains a summary of the most important commands and results. It is self-sufficient.
  • The .ipynb file can be opened with the IJulia notebook command, so that the commands included in "chunks" of code can be executed. The default keystroke to execute a line of code from the .ipynb file into the IJulia environment is Control + Enter.

To launch the IJulia environement, the following commands can be executed. (See the IJulia webpage for more information.)

In [1]:
# import Pkg
# Pkg.add("IJulia")
# using IJulia; notebook(detached=true)

The PhyloNetworks package

All the analyses presented here were done using functions from the PhyloNetworks package (Solís-Lemus et al, 2017). A complete documentation is available online. To run the examples below, we need to install both PhyloNetworksand its companion package PhyloPlots, which contains all the network plotting features.

In [2]:
# import Pkg
# Pkg.add("PhyloNetworks")
# Pkg.add("PhyloPlots")
# Pkg.add("RCall")

after packages are installed, we can use them:

In [3]:
using PhyloNetworks
using PhyloPlots
using RCall

The specific versions of packages installed while runing this notebook are printed out at the end of this document, along with session informations.

Data Import

In this section, we import and format the data from the online supplementary files provided with this article. These commands all assume that the associated files are in the same working directory as the .ipynb file.

We fist read in the calibrated networks.

In [4]:
topologies = readMultiTopology("xiphophorus_networks_calibrated.tre");
tree = topologies[1];
net1 = topologies[2];
net3 = topologies[3];

All these networks share the same species. Next, we get the tip labels to use in matching and reducing the data.

In [5]:
taxa_net = tipLabels(tree);

We then import the morphology data.

In [6]:
using DataFrames
using CSV
dat = CSV.read("xiphophorus_morphology_Cui_etal_2013.csv",
               types=[Union{Missing, String}, Union{Missing, Float64}, Union{Missing, Float64}],
               copycols=true)
thread = 1 warning: only found 2 / 3 columns on data row: 24. Filling remaining columns with `missing`
Out[6]:

24 rows × 3 columns

tipNamessword_indexpreference
String⍰Float64⍰Float64⍰
1Xalvarezi0.65missing
2Xandersi0.35missing
3Xbirchmanni0.275-0.33
4Xclemenciae0.5640.44
5Xcontinens0.3missing
6Xcortezi0.37missing
7Xcouchianus0.275missing
8Xevelynae0.275missing
9Xgordoni0.275missing
10Xhellerii0.6410.91
11Xmaculatus0.2750.24
12Xmalinche0.37-0.24
13Xmayae0.7missing
14Xmeyeri0.275missing
15Xmilleri0.275missing
16Xmontezumae1.030.19
17Xmonticolus0.517missing
18Xmultilineatus0.4-0.08
19Xnezahualcoyotl0.48missing
20Xnigrensis0.37-0.1
21Xpygmaeus0.3-0.02
22Xsignum0.6missing
23Xvariatus0.2750.28
24Xxiphidium0.3missing

and delete the rows of data for the taxa not in the tree/networks.

In [7]:
for i in reverse(1:nrow(dat))
    j = findfirst(taxa_net .== dat[!, :tipNames][i])
    if isa(j, Nothing) # taxon not found in network
        println("taxon ", dat[!, :tipNames][i], " (row $i) not found in network")
        deleterows!(dat, i)
    end
end
taxon Xnezahualcoyotl (row 19) not found in network

Study of the Sword Index

The first trait in the data we can study is the sword index. We conduct several PCM analyses using the three calibrated networks imported in the previous section.

Ancestral State Reconstruction

We perform an ancestral state reconstruction assuming that the trait evolved like a BM on the network at hand. We start by taking a subset of the data with only the sword index.

In [8]:
dat_si = dat[:, [:sword_index, :tipNames]]
Out[8]:

23 rows × 2 columns

sword_indextipNames
Float64⍰String⍰
10.65Xalvarezi
20.35Xandersi
30.275Xbirchmanni
40.564Xclemenciae
50.3Xcontinens
60.37Xcortezi
70.275Xcouchianus
80.275Xevelynae
90.275Xgordoni
100.641Xhellerii
110.275Xmaculatus
120.37Xmalinche
130.7Xmayae
140.275Xmeyeri
150.275Xmilleri
161.03Xmontezumae
170.517Xmonticolus
180.4Xmultilineatus
190.37Xnigrensis
200.3Xpygmaeus
210.6Xsignum
220.275Xvariatus
230.3Xxiphidium

For each network, we plot both the predictions, and the 95% prediction intervals.

For h=0 (tree)

In [9]:
ASR_si_tree = ancestralStateReconstruction(dat_si, tree)
┌ Warning: These prediction intervals show uncertainty in ancestral values,
│ assuming that the estimated variance rate of evolution is correct.
│ Additional uncertainty in the estimation of this variance rate is
│ ignored, so prediction intervals should be larger.
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:2153
Out[9]:
ReconstructedStates:
────────────────────────────────────────────
  Node index     Pred.      Min.  Max. (95%)
────────────────────────────────────────────
       -11.0  0.276391  0.206774    0.346007
       -10.0  0.276391  0.206774    0.346007
        -9.0  0.298354  0.144771    0.451936
        -8.0  0.306009  0.157562    0.454455
        -7.0  0.306009  0.157562    0.454455
        -6.0  0.3625    0.193859    0.531142
        -5.0  0.386975  0.222837    0.551114
        -4.0  0.402965  0.239478    0.566451
       -15.0  0.413353  0.291228    0.535478
       -14.0  0.413353  0.291228    0.535478
       -13.0  0.486198  0.355338    0.617057
       -17.0  0.371438  0.281364    0.461513
       -18.0  0.323971  0.246286    0.401657
       -16.0  0.369298  0.277981    0.460616
       -12.0  0.486198  0.355338    0.617057
        -3.0  0.405834  0.241743    0.569925
       -20.0  0.520259  0.313881    0.726637
       -23.0  0.626821  0.461853    0.79179
       -22.0  0.604872  0.443701    0.766044
       -21.0  0.604872  0.443701    0.766044
       -19.0  0.497234  0.301677    0.692791
        -2.0  0.461352  0.261596    0.661108
        22.0  0.65      0.65        0.65
         8.0  0.35      0.35        0.35
        13.0  0.275     0.275       0.275
        18.0  0.564     0.564       0.564
        17.0  0.3       0.3         0.3
        11.0  0.37      0.37        0.37
         3.0  0.275     0.275       0.275
         5.0  0.275     0.275       0.275
         1.0  0.275     0.275       0.275
        21.0  0.641     0.641       0.641
         9.0  0.275     0.275       0.275
        12.0  0.37      0.37        0.37
        23.0  0.7       0.7         0.7
         2.0  0.275     0.275       0.275
         6.0  0.275     0.275       0.275
        10.0  1.03      1.03        1.03
        19.0  0.517     0.517       0.517
        15.0  0.4       0.4         0.4
        14.0  0.37      0.37        0.37
        16.0  0.3       0.3         0.3
        20.0  0.6       0.6         0.6
         4.0  0.275     0.275       0.275
         7.0  0.3       0.3         0.3
────────────────────────────────────────────
In [10]:
R"par(mar=c(.5,.5,.5,.5))"    # to reduce margins of future plots
plot(tree, :RCall, nodeLabel=expectationsPlot(ASR_si_tree), useEdgeLength=true, xlim=[0,25]);
plot(tree, :RCall, nodeLabel=predintPlot(ASR_si_tree), useEdgeLength=true, xlim=[-3,26]);

For h=1

In [11]:
ASR_si_net1 = ancestralStateReconstruction(dat_si, net1)
┌ Warning: These prediction intervals show uncertainty in ancestral values,
│ assuming that the estimated variance rate of evolution is correct.
│ Additional uncertainty in the estimation of this variance rate is
│ ignored, so prediction intervals should be larger.
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:2153
Out[11]:
ReconstructedStates:
────────────────────────────────────────────
  Node index     Pred.      Min.  Max. (95%)
────────────────────────────────────────────
       -11.0  0.276519  0.197206    0.355831
       -10.0  0.277325  0.18959     0.365061
        -9.0  0.299276  0.144417    0.454134
        -8.0  0.306545  0.156663    0.456427
        -7.0  0.306545  0.156663    0.456427
       -12.0  0.337105  0.120091    0.554119
        -6.0  0.358028  0.18506     0.530995
        -5.0  0.394332  0.223935    0.564729
        -4.0  0.411969  0.242323    0.581615
       -17.0  0.418841  0.296304    0.541377
       -16.0  0.41891   0.296385    0.541434
       -15.0  0.478543  0.351498    0.605589
       -19.0  0.370751  0.278553    0.462949
       -20.0  0.321836  0.244499    0.399173
       -18.0  0.370692  0.278469    0.462916
       -14.0  0.478529  0.351479    0.605578
         8.0  0.478529  0.351479    0.605578
        -3.0  0.419503  0.248418    0.590589
       -22.0  0.520979  0.312041    0.729916
       -25.0  0.628202  0.461306    0.795099
       -24.0  0.605608  0.442327    0.768889
       -23.0  0.605608  0.442327    0.768889
       -21.0  0.498074  0.301047    0.695101
        -2.0  0.462842  0.263392    0.662293
        23.0  0.65      0.65        0.65
         9.0  0.35      0.35        0.35
        13.0  0.275     0.275       0.275
        19.0  0.564     0.564       0.564
        18.0  0.3       0.3         0.3
        12.0  0.37      0.37        0.37
         3.0  0.275     0.275       0.275
         5.0  0.275     0.275       0.275
         1.0  0.275     0.275       0.275
        22.0  0.641     0.641       0.641
        10.0  0.275     0.275       0.275
        14.0  0.37      0.37        0.37
        24.0  0.7       0.7         0.7
         2.0  0.275     0.275       0.275
         6.0  0.275     0.275       0.275
        11.0  1.03      1.03        1.03
        20.0  0.517     0.517       0.517
        16.0  0.4       0.4         0.4
        15.0  0.37      0.37        0.37
        17.0  0.3       0.3         0.3
        21.0  0.6       0.6         0.6
         4.0  0.275     0.275       0.275
         7.0  0.3       0.3         0.3
────────────────────────────────────────────
In [12]:
R"par(mar=c(.5,.5,.5,.5))"
plot(net1, :RCall, nodeLabel=expectationsPlot(ASR_si_net1), useEdgeLength=true, xlim=[0,25]);
plot(net1, :RCall, nodeLabel=predintPlot(ASR_si_net1), useEdgeLength=true, xlim=[-3,26]);

For h=3

In [13]:
ASR_si_net3 = ancestralStateReconstruction(dat_si, net3)
┌ Warning: These prediction intervals show uncertainty in ancestral values,
│ assuming that the estimated variance rate of evolution is correct.
│ Additional uncertainty in the estimation of this variance rate is
│ ignored, so prediction intervals should be larger.
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:2153
Out[13]:
ReconstructedStates:
────────────────────────────────────────────
  Node index     Pred.      Min.  Max. (95%)
────────────────────────────────────────────
       -11.0  0.276494  0.205575    0.347413
       -10.0  0.276494  0.205575    0.347413
        -9.0  0.300094  0.136572    0.463615
        -8.0  0.313992  0.157436    0.470548
       -12.0  0.320201  0.118408    0.521994
        -7.0  0.325285  0.171393    0.479176
        -6.0  0.325285  0.171393    0.479176
        -5.0  0.39221   0.218505    0.565916
        -4.0  0.409501  0.236586    0.582416
       -19.0  0.419009  0.299734    0.538284
        15.0  0.419009  0.299734    0.538284
       -17.0  0.411545  0.28576     0.537331
       -16.0  0.477985  0.350638    0.605333
       -21.0  0.370972  0.27952     0.462425
       -22.0  0.323785  0.244753    0.402817
       -20.0  0.368759  0.276116    0.461402
       -15.0  0.477985  0.350638    0.605333
       -14.0  0.477985  0.350638    0.605333
         7.0  0.477985  0.350638    0.605333
        -3.0  0.424496  0.249015    0.599977
       -24.0  0.521115  0.311506    0.730723
        22.0  0.609145  0.432659    0.785631
       -29.0  0.627545  0.460368    0.794723
       -27.0  0.605926  0.443486    0.768366
       -26.0  0.605926  0.443486    0.768366
       -25.0  0.605926  0.443486    0.768366
       -23.0  0.499063  0.302243    0.695883
        -2.0  0.463689  0.264182    0.663195
        25.0  0.65      0.65        0.65
         9.0  0.35      0.35        0.35
        14.0  0.275     0.275       0.275
        20.0  0.564     0.564       0.564
        19.0  0.3       0.3         0.3
        12.0  0.37      0.37        0.37
         3.0  0.275     0.275       0.275
         5.0  0.275     0.275       0.275
         1.0  0.275     0.275       0.275
        24.0  0.641     0.641       0.641
        10.0  0.275     0.275       0.275
        13.0  0.37      0.37        0.37
        26.0  0.7       0.7         0.7
         2.0  0.275     0.275       0.275
         8.0  0.275     0.275       0.275
        11.0  1.03      1.03        1.03
        21.0  0.517     0.517       0.517
        17.0  0.4       0.4         0.4
        16.0  0.37      0.37        0.37
        18.0  0.3       0.3         0.3
        23.0  0.6       0.6         0.6
         4.0  0.275     0.275       0.275
         6.0  0.3       0.3         0.3
────────────────────────────────────────────
In [14]:
R"par(mar=c(.5,.5,.5,.5))"
plot(net3, :RCall, nodeLabel=expectationsPlot(ASR_si_net3), useEdgeLength=true, xlim=[0,25]);
plot(net3, :RCall, nodeLabel=predintPlot(ASR_si_net3), useEdgeLength=true, xlim=[-3,26]);

Phylogenetic signal with Pagel's lambda

For h=0 (tree)

In [15]:
using GLM
lambda_si_tree = phyloNetworklm(@formula(sword_index ~ 1), dat, tree, model="lambda")
┌ Info: Maximum lambda value to maintain positive branch lengths: 1.07157
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1381
Out[15]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: sword_index ~ 1

Model: lambda

Parameter(s) Estimates:
Sigma2: 0.00294618
Lambda: 1.07157

Coefficients:
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
(Intercept)  0.462841    0.103614  4.46698    0.0002   0.247959   0.677723
──────────────────────────────────────────────────────────────────────────
Log Likelihood: 14.9907546797
AIC: -23.9815093594
In [16]:
round(lambda_estim(lambda_si_tree), digits = 2)
Out[16]:
1.07

For h=1

In [17]:
lambda_si_net1 = phyloNetworklm(@formula(sword_index ~ 1), dat, net1, model="lambda")
┌ Info: Maximum lambda value to maintain positive branch lengths: 1.068
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1381
Out[17]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: sword_index ~ 1

Model: lambda

Parameter(s) Estimates:
Sigma2: 0.00303513
Lambda: 1.068

Coefficients:
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
(Intercept)  0.464573    0.103264   4.4989    0.0002   0.250417   0.678729
──────────────────────────────────────────────────────────────────────────
Log Likelihood: 11.9093110604
AIC: -17.8186221209
In [18]:
round(lambda_estim(lambda_si_net1), digits = 2)
Out[18]:
1.07

For h=3

In [19]:
lambda_si_net3 = phyloNetworklm(@formula(sword_index ~ 1), dat, net3, model="lambda")
┌ Info: Maximum lambda value to maintain positive branch lengths: 1.07157
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1381
Out[19]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: sword_index ~ 1

Model: lambda

Parameter(s) Estimates:
Sigma2: 0.00307506
Lambda: 1.07157

Coefficients:
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
(Intercept)  0.465531    0.103858  4.48238    0.0002   0.250142   0.680919
──────────────────────────────────────────────────────────────────────────
Log Likelihood: 14.6769431224
AIC: -23.3538862449
In [20]:
round(lambda_estim(lambda_si_net3), digits = 2)
Out[20]:
1.07

Transgressive Evolution

Transgressive evolution can only be tested for networks (h>0). The method relies on the construction of a regressor matrix that allows for a shift below each hybrid.

For h=1

With a single reticulation event, we cannot test for heterogeneous effects of transgressive evolution. We can fit 2 models only: one with no transgressive evolution (H0), and another with a transgressive evolution effect after the 1 reticulation event (H1). Comparing these 2 models will be done with the F test for a (common) transgressive evolution effect.

We first construct the regressor matrix associated with hybrids.

In [21]:
regNet1 = regressorHybrid(net1)
┌ Warning: `names!(df::AbstractDataFrame, vals::Vector{Symbol}; makeunique::Bool = false)` is deprecated, use `rename!(df, vals, makeunique = makeunique)` instead.
│   caller = regressorShift(::Array{PhyloNetworks.Node,1}, ::HybridNetwork, ::PhyloNetworks.MatrixTopologicalOrder) at traits.jl:594
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:594
Out[21]:

23 rows × 3 columns

shift_34tipNamessum
Float64StringFloat64
10.0Xgordoni0.0
20.0Xmeyeri0.0
30.0Xcouchianus0.0
40.0Xvariatus0.0
50.0Xevelynae0.0
60.0Xmilleri0.0
70.0Xxiphidium0.0
80.0Xandersi0.0
90.0Xmaculatus0.0
101.0Xmontezumae1.0
111.0Xcortezi1.0
121.0Xbirchmanni1.0
131.0Xmalinche1.0
141.0Xnigrensis1.0
151.0Xmultilineatus1.0
161.0Xpygmaeus1.0
171.0Xcontinens1.0
180.0Xclemenciae0.0
190.0Xmonticolus0.0
200.0Xsignum0.0
210.0Xhellerii0.0
220.0Xalvarezi0.0
230.0Xmayae0.0

This matrix has a column corresponding to each edge going out of an hybrid. It indicates the tips descending from each, taking the possible partial descendances introduced by the network into account. It also has an extra column labelled sum, that we will discuss afterward. Here, only the edge with number 34 is below a reticulation, as we can see from plotting the network using edge numbers (where we do not use the branch lengths, so that the topology appears more clearly). The associated predictor is labelled shift_34 (see above).

In [22]:
R"par(mar=c(.5,.5,.5,.5))"
plot(net1, :RCall, showEdgeNumber=true, xlim=[0,15]);

We can then join the regressor matrix with the data, in order to use the phyloNetworklm function. We call this function twice. First, we fit the data against a simple intercept (H0). Then, we fit the data against the regressor induced by the shift after the hybrid (H1). Once this is done, we simply call the GLM function ftest to perform the F test, including the models from the most to the least complex.

In [23]:
dat1 = join(dat, regNet1, on = :tipNames);
fit_net1_0 = phyloNetworklm(@formula(sword_index ~ 1), dat1, net1)
fit_net1_1 = phyloNetworklm(@formula(sword_index ~ shift_34), dat1, net1)
ftest(fit_net1_1, fit_net1_0)
Out[23]:
F-test: 2 models fitted on 23 observations
───────────────────────────────────────────────────────────────
     DOF  ΔDOF     SSR    ΔSSR      R²      ΔR²      F*   p(>F)
───────────────────────────────────────────────────────────────
[1]    3        0.0618          0.1747                         
[2]    2    -1  0.0631  0.0013  0.1576  -0.0171  0.4353  0.5166
───────────────────────────────────────────────────────────────

Note that, for conventional reasons, the ftest function takes the most complex model as the first one. This means that, in the table of results, the models are actually named in a reverse order, so that "Model 2" is actually our model under H0, and "Model 1" the one under H1.

For h=3

Here with 3 reticulations, we can consider 3 models: no transgressive evolution (H0), common transgressive evolution (H1), and heterogeneous transgressive evolution (H2). We can run F tests to compare H1 vs H0, and H2 vs H1.

We construct the regressor matrix the same way as before.

In [24]:
regNet3 = regressorHybrid(net3)
Out[24]:

23 rows × 5 columns

shift_24shift_37shift_45tipNamessum
Float64Float64Float64StringFloat64
10.00.00.0Xgordoni0.0
20.00.00.0Xmeyeri0.0
30.00.00.0Xcouchianus0.0
40.00.00.0Xvariatus0.0
50.00.00.0Xevelynae0.0
60.00.00.0Xxiphidium0.0
70.00.00.0Xmilleri0.0
80.00.00.0Xandersi0.0
90.00.00.0Xmaculatus0.0
100.01.00.0Xmontezumae1.0
110.01.00.0Xcortezi1.0
121.01.00.0Xmalinche2.0
131.01.00.0Xbirchmanni2.0
140.01.00.0Xnigrensis1.0
150.01.00.0Xmultilineatus1.0
160.01.00.0Xpygmaeus1.0
170.01.00.0Xcontinens1.0
180.00.00.0Xclemenciae0.0
190.00.00.0Xmonticolus0.0
200.00.00.0Xsignum0.0
210.00.01.0Xhellerii1.0
220.00.00.0Xalvarezi0.0
230.00.00.0Xmayae0.0

Here, 3 columns are created, shift_24, shift_37 and shift_45, associated with the 3 hybridization events occurring in the network. The extra column sum is just the sum of these 3 predictors. It is useful to perform the fit of model H1, where we only consider a homogeneous transgressive evolution effect. For the third fit (H2), we need to consider all three predictors independently. The results of the F tests can hence be obtained as follow.

In [25]:
dat3 = join(dat, regNet3, on = :tipNames);
fit_net3_0 = phyloNetworklm(@formula(sword_index ~ 1), dat3, net3)
fit_net3_1 = phyloNetworklm(@formula(sword_index ~ sum), dat3, net3)
fit_net3_2 = phyloNetworklm(@formula(sword_index ~ shift_24 + shift_37 + shift_45),
                            dat3, net3)
ftest(fit_net3_2, fit_net3_1, fit_net3_0)
Out[25]:
F-test: 3 models fitted on 23 observations
───────────────────────────────────────────────────────────────
     DOF  ΔDOF     SSR    ΔSSR      R²      ΔR²      F*   p(>F)
───────────────────────────────────────────────────────────────
[1]    5        0.0544          0.2772                         
[2]    3    -2  0.0622  0.0078  0.1732  -0.1040  1.3667  0.2789
[3]    2    -1  0.0633  0.0011  0.1586  -0.0146  0.3707  0.5492
───────────────────────────────────────────────────────────────

As above, beware that the numbering of the models in the table of result (most to least complex) is reversed compared to our numbering (least to most complex). Hence, here, "Model 3" is actually our model under H0, "Model 2" the one under H1, and "Model 1" the one under H2.

Transgressive Evolution: Power

We can comute the power associated with the tests above assuming that the network, variance parameter and size of the effects are correctly estimated.

We first define two functions computing the power of the homogeneous vs no effect (power10) and heterogeneous vs no effect (power20), when the p-value is called significant if lower than α=0.05 (but change the "level" argument below to change α).

In [26]:
using Distributions
using LinearAlgebra

function power10_OLD(b::Real, sigma2::Real, net::HybridNetwork; level=0.95)
    # Characteristics of the network
    n = length(tipLabels(net))
    V = sharedPathMatrix(net)[:Tips]
    dfr_hybrid = regressorHybrid(net)
    # Compute the non-central coefficient
    X = ones(n)
    Z = convert(Array, dfr_hybrid[!, :sum])
    ncc10 = (I - 1/n .* X*X') * Z                ## WRONG PROJECTION HERE
    ncc10 = ncc10' * V^(-1) * ncc10
    ncc10 = b^2/sigma2 * ncc10[1]
    # Compute power
    d1 = 2 - 1
    d2 = n - 2
    q95 = quantile(FDist(d1, d2), level)
    pow95 = ccdf(NoncentralF(d1, d2, ncc10), q95)
end

function power10(b::Real, sigma2::Real, net::HybridNetwork; level=0.95)
    # Characteristics of the network
    n = length(tipLabels(net))
    V = sharedPathMatrix(net)[:Tips]
    dfr_hybrid = regressorHybrid(net)
    # Compute the non-central coefficient
    X = ones(n)
    Z = convert(Array, dfr_hybrid[!, :sum])
    Vinv = V^(-1)
    ncc10 = (I -  X*(X'*Vinv*X)^(-1)*X'*Vinv) * Z ## CORRECT PROJECTION HERE
    ncc10 = ncc10' * Vinv * ncc10
    ncc10 = b^2/sigma2 * ncc10[1]
    # Compute power
    d1 = 2 - 1
    d2 = n - 2
    q95 = quantile(FDist(d1, d2), level)
    pow95 = ccdf(NoncentralF(d1, d2, ncc10), q95)
end

function power20_OLD(B::Vector, sigma2::Real, net::HybridNetwork; level=0.95)
    # Characteristics of the network
    n = length(tipLabels(net))
    V = sharedPathMatrix(net)[:Tips]
    dfr_hybrid = regressorHybrid(net)
    # Regression matrices
    X = [ones(n) convert(Array, dfr_hybrid[!, :sum])]
    Z = convert(Array, dfr_hybrid[:, filter(x -> !(x in [:sum, :tipNames]), names(dfr_hybrid))])
    nh = size(Z,2)
    # noncentral coef
    ncc20 = (I - X*(X'*X)^(-1)*X') * Z * B           ## WRONG PROJECTION HERE
    ncc20 = ncc20' * V^(-1) * ncc20
    ncc20 = 1/sigma2 * ncc20[1]
    # power
    d1 = nh - 1
    d2 = n - nh - 1
    q95 = quantile(FDist(d1, d2), level)
    pow95 = ccdf(NoncentralF(d1, d2, ncc20), q95)
end

function power20(B::Vector, sigma2::Real, net::HybridNetwork; level=0.95)
    # Characteristics of the network
    n = length(tipLabels(net))
    V = sharedPathMatrix(net)[:Tips]
    dfr_hybrid = regressorHybrid(net)
    # Regression matrices
    X = [ones(n) convert(Array, dfr_hybrid[!, :sum])]
    Z = convert(Array, dfr_hybrid[:, filter(x -> !(x in [:sum, :tipNames]), names(dfr_hybrid))])
    nh = size(Z,2)
    # noncentral coef
    Vinv = V^(-1)
    ncc20 = (I -  X*(X'*Vinv*X)^(-1)*X'*Vinv) * Z * B ## CORRECT PROJECTION HERE
    ncc20 = ncc20' * Vinv * ncc20
    ncc20 = 1/sigma2 * ncc20[1]
    # power
    d1 = nh - 1
    d2 = n - nh - 1
    q95 = quantile(FDist(d1, d2), level)
    pow95 = ccdf(NoncentralF(d1, d2, ncc20), q95)
end
Out[26]:
power20 (generic function with 1 method)
In [27]:
power10(coef(fit_net1_1)[2], sigma2_estim(fit_net1_1), net1)
Out[27]:
0.10110837769630487
In [28]:
power10_OLD(coef(fit_net1_1)[2], sigma2_estim(fit_net1_1), net1)
Out[28]:
0.10556669080240755
In [29]:
power10(coef(fit_net3_1)[2], sigma2_estim(fit_net3_1), net3)
Out[29]:
0.09338159706423421
In [30]:
power10_OLD(coef(fit_net3_1)[2], sigma2_estim(fit_net3_1), net3)
Out[30]:
0.09414888020374224
In [31]:
power20(coef(fit_net3_2)[2:4], sigma2_estim(fit_net3_2), net3)
Out[31]:
0.3049694870749543
In [32]:
power20_OLD(coef(fit_net3_2)[2:4], sigma2_estim(fit_net3_2), net3)
Out[32]:
0.46566529934158374

The power to detect homogeneous effect is very small (around 10%), when using parameters estimated from the sword index data. This is consistent with the graph presented in the main text, as one need to use the normalized coefficient $$\sqrt{\frac{b^2}{\sigma^2 h}}$$ where h is the total heigh of the tree. Here, for net1, this coefficient is about 0.2 (computed below).

In [33]:
sqrt(coef(fit_net3_1)[2]^2/sigma2_estim(fit_net3_1)/maximum(getNodeAges(net3)))
Out[33]:
0.22104398918753074

However, the power to detect heterogeneous effects is a bit better, around 47%.

Female preference (with missing data)

We reproduce these analyses for the second trait, the female preference.

Some taxa present in the network are missing data for female preference. In all this section, we do not remove the taxa with missing data, from the network or from the morphology data frame.

That enables us to:

  • predict the female preference values of taxa that are missing those data, and
  • retain all hybridization events in the network with h=3, even if one of them could get invisible after pruning from the network the taxa (tips) with missing female preference data.

Ancestral State Reconstruction

As above, we assume that the trait evolved like a BM on the network at hand. We start by taking a sub-set of the data with only the female preference.

In [34]:
dat_fp = dat[:, [:preference, :tipNames]]
Out[34]:

23 rows × 2 columns

preferencetipNames
Float64⍰String⍰
1missingXalvarezi
2missingXandersi
3-0.33Xbirchmanni
40.44Xclemenciae
5missingXcontinens
6missingXcortezi
7missingXcouchianus
8missingXevelynae
9missingXgordoni
100.91Xhellerii
110.24Xmaculatus
12-0.24Xmalinche
13missingXmayae
14missingXmeyeri
15missingXmilleri
160.19Xmontezumae
17missingXmonticolus
18-0.08Xmultilineatus
19-0.1Xnigrensis
20-0.02Xpygmaeus
21missingXsignum
220.28Xvariatus
23missingXxiphidium

For each network, we plot both the predictions, and the 95% prediction intervals.

For h=0 (tree)

In [35]:
ASR_fp_tree = ancestralStateReconstruction(dat_fp, tree)
┌ Warning: These prediction intervals show uncertainty in ancestral values,
│ assuming that the estimated variance rate of evolution is correct.
│ Additional uncertainty in the estimation of this variance rate is
│ ignored, so prediction intervals should be larger.
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:2153
Out[35]:
ReconstructedStates:
────────────────────────────────────────────────
  Node index       Pred.        Min.  Max. (95%)
────────────────────────────────────────────────
       -11.0   0.266814   -0.29324     0.826868
       -10.0   0.266814   -0.29324     0.826868
        -9.0   0.266814   -0.0962233   0.629851
        -8.0   0.264774   -0.105456    0.635004
        -7.0   0.264774   -0.105456    0.635004
        -6.0   0.257556   -0.0934667   0.608579
        -5.0   0.25525    -0.0735353   0.584035
        -4.0   0.253929   -0.0576969   0.565555
       -15.0  -0.131033   -0.34241     0.0803438
       -14.0  -0.131033   -0.34241     0.0803438
       -13.0  -0.0313643  -0.24038     0.177652
       -17.0  -0.0641475  -0.205714    0.077419
       -18.0  -0.0405959  -0.190904    0.109712
       -16.0  -0.0600681  -0.205315    0.0851791
       -12.0  -0.0313643  -0.24038     0.177652
        -3.0   0.25385    -0.0573344   0.565035
       -20.0   0.418805    0.0205993   0.817011
       -23.0   0.617361    0.16475     1.06997
       -22.0   0.617361    0.223195    1.01153
       -21.0   0.617361    0.223195    1.01153
       -19.0   0.40675     0.0471255   0.766374
        -2.0   0.346724   -0.0150435   0.708492
        22.0   0.617361   -0.0323449   1.26707
         8.0   0.25525    -0.461896    0.972395
        17.0  -0.0405959  -0.280619    0.199427
        11.0  -0.131033   -0.567092    0.305026
         3.0   0.266814   -0.323273    0.856901
         5.0   0.264774   -0.35728     0.886828
         1.0   0.266814   -0.323273    0.856901
        23.0   0.617361   -0.0323449   1.26707
         2.0   0.266814   -0.323273    0.856901
         6.0   0.264774   -0.35728     0.886828
        19.0   0.418805   -0.252746    1.09036
        20.0   0.617361   -0.0323449   1.26707
         7.0   0.257556   -0.443557    0.95867
        13.0  -0.33       -0.33       -0.33
        18.0   0.44        0.44        0.44
        21.0   0.91        0.91        0.91
         9.0   0.24        0.24        0.24
        12.0  -0.24       -0.24       -0.24
        10.0   0.19        0.19        0.19
        15.0  -0.08       -0.08       -0.08
        14.0  -0.1        -0.1        -0.1
        16.0  -0.02       -0.02       -0.02
         4.0   0.28        0.28        0.28
────────────────────────────────────────────────
In [36]:
R"par(mar=c(.5,.5,.5,.5))"
plot(tree, :RCall, nodeLabel=expectationsPlot(ASR_fp_tree), useEdgeLength=true, xlim=[0,25]);
plot(tree, :RCall, nodeLabel=predintPlot(ASR_fp_tree), useEdgeLength=true, xlim=[-3,26]);

Note that imputed trait values are also plotted with their prediction intervals.

For h=1

In [37]:
ASR_fp_net1 = ancestralStateReconstruction(dat_fp, net1)
┌ Warning: These prediction intervals show uncertainty in ancestral values,
│ assuming that the estimated variance rate of evolution is correct.
│ Additional uncertainty in the estimation of this variance rate is
│ ignored, so prediction intervals should be larger.
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:2153
Out[37]:
ReconstructedStates:
─────────────────────────────────────────────────
  Node index       Pred.         Min.  Max. (95%)
─────────────────────────────────────────────────
       -11.0   0.247003   -0.335192     0.829199
       -10.0   0.247003   -0.325673     0.819679
        -9.0   0.247003   -0.129188     0.623195
        -8.0   0.242425   -0.139386     0.624236
        -7.0   0.242425   -0.139386     0.624236
       -12.0   0.183917   -0.354574     0.722408
        -6.0   0.226667   -0.136372     0.589706
        -5.0   0.232813   -0.103485     0.56911
        -4.0   0.235356   -0.083723     0.554435
       -17.0  -0.111826   -0.328238     0.104586
       -16.0  -0.111702   -0.328133     0.10473
       -15.0  -0.0262039  -0.236863     0.184455
       -19.0  -0.060073   -0.209324     0.0891781
       -20.0  -0.0388557  -0.193238     0.115526
       -18.0  -0.0599501  -0.209294     0.0893934
       -14.0  -0.0261911  -0.236855     0.184473
         8.0  -0.0261911  -0.236855     0.184473
        -3.0   0.236037   -0.0823852    0.554459
       -22.0   0.403442   -0.00794844   0.814832
       -25.0   0.605967    0.135106     1.07683
       -24.0   0.605967    0.198525     1.01341
       -23.0   0.605967    0.198525     1.01341
       -21.0   0.381995    0.015619     0.748371
        -2.0   0.316546   -0.0480464    0.681139
        23.0   0.605967   -0.0666756    1.27861
         9.0   0.232813   -0.517716     0.983342
        18.0  -0.0388557  -0.282909     0.205198
        12.0  -0.111702   -0.569847     0.346444
         3.0   0.247003   -0.369728     0.863734
         5.0   0.242425   -0.403913     0.888764
         1.0   0.247003   -0.369728     0.863734
        24.0   0.605967   -0.0666756    1.27861
         2.0   0.247003   -0.369728     0.863734
         6.0   0.242425   -0.403913     0.888764
        20.0   0.403442   -0.29137      1.09825
        21.0   0.605967   -0.0666756    1.27861
         7.0   0.183917   -0.518419     0.886253
        13.0  -0.33       -0.33        -0.33
        19.0   0.44        0.44         0.44
        22.0   0.91        0.91         0.91
        10.0   0.24        0.24         0.24
        14.0  -0.24       -0.24        -0.24
        11.0   0.19        0.19         0.19
        16.0  -0.08       -0.08        -0.08
        15.0  -0.1        -0.1         -0.1
        17.0  -0.02       -0.02        -0.02
         4.0   0.28        0.28         0.28
─────────────────────────────────────────────────
In [38]:
R"par(mar=c(.5,.5,.5,.5))"
plot(net1, :RCall, nodeLabel=expectationsPlot(ASR_fp_net1), useEdgeLength=true, xlim=[0,25]);
plot(net1, :RCall, nodeLabel=predintPlot(ASR_fp_net1), useEdgeLength=true, xlim=[-3,26]);

For h=3

In [39]:
ASR_fp_net3 = ancestralStateReconstruction(dat_fp, net3)
┌ Warning: These prediction intervals show uncertainty in ancestral values,
│ assuming that the estimated variance rate of evolution is correct.
│ Additional uncertainty in the estimation of this variance rate is
│ ignored, so prediction intervals should be larger.
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:2153
Out[39]:
ReconstructedStates:
────────────────────────────────────────────────
  Node index       Pred.        Min.  Max. (95%)
────────────────────────────────────────────────
       -11.0   0.245971   -0.342466    0.834409
       -10.0   0.245971   -0.342466    0.834409
        -9.0   0.245971   -0.13307     0.625012
        -8.0   0.237077   -0.149076    0.623231
       -12.0   0.204949   -0.305731    0.715629
        -7.0   0.232507   -0.150554    0.615569
        -6.0   0.232507   -0.150554    0.615569
        -5.0   0.241842   -0.105104    0.588788
        -4.0   0.243903   -0.0854701   0.573276
       -19.0  -0.114647   -0.329965    0.100672
        15.0  -0.114647   -0.329965    0.100672
       -17.0  -0.12487    -0.352535    0.102796
       -16.0  -0.0307929  -0.242756    0.181171
       -21.0  -0.0639022  -0.212912    0.0851074
       -22.0  -0.04045    -0.198955    0.118055
       -20.0  -0.0597842  -0.212554    0.0929861
       -15.0  -0.0307929  -0.242756    0.181171
       -14.0  -0.0307929  -0.242756    0.181171
         7.0  -0.0307929  -0.242756    0.181171
        -3.0   0.245278   -0.0825556   0.573113
       -24.0   0.406078   -0.0104218   0.822577
        22.0   0.642628    0.245421    1.03983
       -29.0   0.615611    0.148068    1.08315
       -27.0   0.615611    0.211298    1.01992
       -26.0   0.615611    0.211298    1.01992
       -25.0   0.615611    0.211298    1.01992
       -23.0   0.386783    0.017621    0.755946
        -2.0   0.319654   -0.0484843   0.687792
        25.0   0.615611   -0.0630733   1.2943
         9.0   0.241842   -0.520508    1.00419
        19.0  -0.04045    -0.293694    0.212794
        12.0  -0.12487    -0.583777    0.334038
         3.0   0.245971   -0.374301    0.866243
         5.0   0.237077   -0.436113    0.910268
         1.0   0.245971   -0.374301    0.866243
        26.0   0.615611   -0.0630733   1.2943
         2.0   0.245971   -0.374301    0.866243
         8.0   0.232507   -0.462605    0.92762
        21.0   0.406078   -0.300465    1.11262
        23.0   0.615611   -0.0630733   1.2943
         6.0   0.204949   -0.48046     0.890359
        14.0  -0.33       -0.33       -0.33
        20.0   0.44        0.44        0.44
        24.0   0.91        0.91        0.91
        10.0   0.24        0.24        0.24
        13.0  -0.24       -0.24       -0.24
        11.0   0.19        0.19        0.19
        17.0  -0.08       -0.08       -0.08
        16.0  -0.1        -0.1        -0.1
        18.0  -0.02       -0.02       -0.02
         4.0   0.28        0.28        0.28
────────────────────────────────────────────────
In [40]:
R"par(mar=c(.5,.5,.5,.5))"
plot(net3, :RCall, nodeLabel=expectationsPlot(ASR_fp_net3), useEdgeLength=true, xlim=[0,25]);
plot(net3, :RCall, nodeLabel=predintPlot(ASR_fp_net3), useEdgeLength=true, xlim=[-3,26]);

Phylogenetic signal with Pagel's lambda

For h=0 (tree)

In [41]:
lambda_fp_tree = phyloNetworklm(@formula(preference ~ 1), dat, tree, model="lambda")
┌ Info: Maximum lambda value to maintain positive branch lengths: 1.07157
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1381
Out[41]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: preference ~ 1

Model: lambda

Parameter(s) Estimates:
Sigma2: 0.00532966
Lambda: 1.07157

Coefficients:
───────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)   Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
(Intercept)  0.354209    0.165735  2.13719    0.0613  -0.0207107   0.729129
───────────────────────────────────────────────────────────────────────────
Log Likelihood: 0.5481794742
AIC: 4.9036410515
In [42]:
round(lambda_estim(lambda_fp_tree), digits = 2)
Out[42]:
1.07

For h=1

In [43]:
lambda_fp_net1 = phyloNetworklm(@formula(preference ~ 1), dat, net1, model="lambda")
┌ Info: Maximum lambda value to maintain positive branch lengths: 1.068
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1381
Out[43]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: preference ~ 1

Model: lambda

Parameter(s) Estimates:
Sigma2: 0.00576187
Lambda: 1.068

Coefficients:
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
(Intercept)  0.323594     0.16681   1.9399    0.0843  -0.053756   0.700944
──────────────────────────────────────────────────────────────────────────
Log Likelihood: 0.1476355653
AIC: 5.7047288695
In [44]:
round(lambda_estim(lambda_fp_net1), digits = 2)
Out[44]:
1.07

For h=3

In [45]:
lambda_fp_net3 = phyloNetworklm(@formula(preference ~ 1), dat, net3, model="lambda")
┌ Info: Maximum lambda value to maintain positive branch lengths: 1.07157
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1381
Out[45]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: preference ~ 1

Model: lambda

Parameter(s) Estimates:
Sigma2: 0.00596273
Lambda: 1.07157

Coefficients:
───────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)   Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
(Intercept)  0.327211    0.169062  1.93545    0.0849  -0.0552334   0.709655
───────────────────────────────────────────────────────────────────────────
Log Likelihood: 0.1948654889
AIC: 5.6102690221
In [46]:
round(lambda_estim(lambda_fp_net3), digits = 2)
Out[46]:
1.07

Transgressive Evolution

We apply the same methods to study transgressive evolution on both networks. The approach relies on the construction of the regressor matrix, and then on a fit of the different models, on which a F test can be performed.

Here, the regression matrices are the same as above, as the networks are the same. We can hence use the ones we already computed. We just recall the way to do it here for clarity, but will not show the regressor matrices.

For h=1

In [47]:
regNet1 = regressorHybrid(net1)
dat1 = join(dat, regNet1, on = :tipNames);
fit_net1_0 = phyloNetworklm(@formula(preference ~ 1), dat1, net1)
fit_net1_1 = phyloNetworklm(@formula(preference ~ shift_34), dat1, net1)
ftest(fit_net1_1, fit_net1_0)
Out[47]:
F-test: 2 models fitted on 10 observations
───────────────────────────────────────────────────────────────
     DOF  ΔDOF     SSR    ΔSSR      R²      ΔR²      F*   p(>F)
───────────────────────────────────────────────────────────────
[1]    3        0.0359          0.3613                         
[2]    2    -1  0.0550  0.0192  0.0201  -0.3412  4.2730  0.0726
───────────────────────────────────────────────────────────────

For h=3

In [48]:
regNet3 = regressorHybrid(net3)
dat3 = join(dat, regNet3, on = :tipNames);
fit_net3_0 = phyloNetworklm(@formula(preference ~ 1), dat3, net3)
fit_net3_1 = phyloNetworklm(@formula(preference ~ sum), dat3, net3)
fit_net3_2 = phyloNetworklm(@formula(preference ~ shift_24 + shift_37 + shift_45),
                            dat3, net3)
ftest(fit_net3_2, fit_net3_1, fit_net3_0)
Out[48]:
F-test: 3 models fitted on 10 observations
────────────────────────────────────────────────────────────────
     DOF  ΔDOF     SSR    ΔSSR      R²      ΔR²       F*   p(>F)
────────────────────────────────────────────────────────────────
[1]    5        0.0083          0.8560                          
[2]    3    -2  0.0401  0.0319  0.3007  -0.5553  11.5728  0.0087
[3]    2    -1  0.0566  0.0164  0.0147  -0.2860   3.2720  0.1081
────────────────────────────────────────────────────────────────

Here, we notice that the F test indicates a non-significant homogeneous effect (H1 vs H0, i.e. Model 2 vs Model 3 in the table), but a significant heterogeneous effect (H2 vs H1, i.e. Model 1 vs Model 2 in the table). It might be informative to test directly the full model against the null (H2 vs H0). This can be done easily applying the ftest function to the corresponding fits.

In [49]:
ftest(fit_net3_2, fit_net3_0)
Out[49]:
F-test: 2 models fitted on 10 observations
────────────────────────────────────────────────────────────────
     DOF  ΔDOF     SSR    ΔSSR      R²      ΔR²       F*   p(>F)
────────────────────────────────────────────────────────────────
[1]    5        0.0083          0.8560                          
[2]    2    -3  0.0566  0.0483  0.0147  -0.8413  11.6887  0.0064
────────────────────────────────────────────────────────────────

Transgressive Evolution: Power under the estimated parameters

As above, we can compute the power associated with the tests above assuming that the variance paremter and size of the effects are correctly estimated.

In [50]:
power10(coef(fit_net1_1)[2], sigma2_estim(fit_net1_1), net1)
Out[50]:
0.6980243964828112
In [51]:
power10_OLD(coef(fit_net1_1)[2], sigma2_estim(fit_net1_1), net1)
Out[51]:
0.7330605770133714
In [52]:
power10(coef(fit_net3_1)[2], sigma2_estim(fit_net3_1), net3)
Out[52]:
0.644806246039801
In [53]:
power10_OLD(coef(fit_net3_1)[2], sigma2_estim(fit_net3_1), net3)
Out[53]:
0.6522636211166466
In [54]:
power20(coef(fit_net3_2)[2:4], sigma2_estim(fit_net3_2), net3)
Out[54]:
0.9999996245238257
In [55]:
power20_OLD(coef(fit_net3_2)[2:4], sigma2_estim(fit_net3_2), net3)
Out[55]:
0.9999998915439907

Here, the power to detect effect is much higher, going up to almost 1 for heterogeneous effects. This is consistent with the effect of transgressive evolution being higher.

In [56]:
sqrt(coef(fit_net3_1)[2]^2/sigma2_estim(fit_net3_1)/maximum(getNodeAges(net3)))
Out[56]:
0.8478645236087462

Female preference (pruned networks)

We reproduce here the same analysis, but this time removing missing data from the network. Pruning taxa from networks is done below.

In [57]:
"""
    reduceNet!(phynet)

function to match data to a network, to
- delete tips in network if they lack "preference" data, and
- delete rows in data "dat" if not in network.

notes:
- reticulations reducing to a simple loop are simplified (removed)
- object "dat" is used (and modified) as an external variable,
  it needs to be defined beforehand.
"""
function reduceNet!(phynet)
  taxa = tipLabels(phynet)
  for i in 1:phynet.numTaxa
    j = findfirst(dat[:tipNames], taxa[i])
    if j==0 # taxon not found in trait data set
        #println("tipNames ", taxa[i], " not found in trait data")
        deleteleaf!(phynet, taxa[i])
        continue # skips next part: where j needs to be >0
    end
    if ismissing(dat[:preference][j]) # taxon found but has missing trait value
        #println("tipNames ", taxa[i], " lacks trait value")
        deleteleaf!(phynet, taxa[i])
        deleterows!(dat, j)
    end
  end
end
Out[57]:
reduceNet!

We now apply our function to each network:

In [58]:
reduceNet!(tree)
reduceNet!(net1)
reduceNet!(net3)
┌ Warning: `getindex(df::DataFrame, col_ind::ColumnIndex)` is deprecated, use `df[!, col_ind]` instead.
│   caller = reduceNet!(::HybridNetwork) at In[57]:16
└ @ Main ./In[57]:16
MethodError: no method matching findfirst(::WeakRefStrings.StringArray{Union{Missing, String},1}, ::String)
Closest candidates are:
  findfirst(!Matched::Regex, ::AbstractString) at regex.jl:327
  findfirst(::Union{AbstractString, AbstractArray}) at array.jl:1701
  findfirst(!Matched::Function, ::Union{AbstractString, AbstractArray}) at array.jl:1783
  ...

Stacktrace:
 [1] reduceNet!(::HybridNetwork) at ./In[57]:16
 [2] top-level scope at In[58]:1

Once we modified the data and networks to remove taxa with missing preference data, the analysis are almost exactly the same as in the previous section, only the results differ.

Ancestral State Reconstruction

As above, we assume that the trait evolved like a BM on the network at hand. We start by taking a subset of the data with only the female preference.

In [59]:
dat_fp = dat[:, [:preference, :tipNames]]
Out[59]:

23 rows × 2 columns

preferencetipNames
Float64⍰String⍰
1missingXalvarezi
2missingXandersi
3-0.33Xbirchmanni
40.44Xclemenciae
5missingXcontinens
6missingXcortezi
7missingXcouchianus
8missingXevelynae
9missingXgordoni
100.91Xhellerii
110.24Xmaculatus
12-0.24Xmalinche
13missingXmayae
14missingXmeyeri
15missingXmilleri
160.19Xmontezumae
17missingXmonticolus
18-0.08Xmultilineatus
19-0.1Xnigrensis
20-0.02Xpygmaeus
21missingXsignum
220.28Xvariatus
23missingXxiphidium

For each network, we plot both the predictions, and the 95% prediction intervals.

For h=0 (tree)

In [60]:
ASR_fp_tree = ancestralStateReconstruction(dat_fp, tree)
┌ Warning: These prediction intervals show uncertainty in ancestral values,
│ assuming that the estimated variance rate of evolution is correct.
│ Additional uncertainty in the estimation of this variance rate is
│ ignored, so prediction intervals should be larger.
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:2153
Out[60]:
ReconstructedStates:
────────────────────────────────────────────────
  Node index       Pred.        Min.  Max. (95%)
────────────────────────────────────────────────
       -11.0   0.266814   -0.29324     0.826868
       -10.0   0.266814   -0.29324     0.826868
        -9.0   0.266814   -0.0962233   0.629851
        -8.0   0.264774   -0.105456    0.635004
        -7.0   0.264774   -0.105456    0.635004
        -6.0   0.257556   -0.0934667   0.608579
        -5.0   0.25525    -0.0735353   0.584035
        -4.0   0.253929   -0.0576969   0.565555
       -15.0  -0.131033   -0.34241     0.0803438
       -14.0  -0.131033   -0.34241     0.0803438
       -13.0  -0.0313643  -0.24038     0.177652
       -17.0  -0.0641475  -0.205714    0.077419
       -18.0  -0.0405959  -0.190904    0.109712
       -16.0  -0.0600681  -0.205315    0.0851791
       -12.0  -0.0313643  -0.24038     0.177652
        -3.0   0.25385    -0.0573344   0.565035
       -20.0   0.418805    0.0205993   0.817011
       -23.0   0.617361    0.16475     1.06997
       -22.0   0.617361    0.223195    1.01153
       -21.0   0.617361    0.223195    1.01153
       -19.0   0.40675     0.0471255   0.766374
        -2.0   0.346724   -0.0150435   0.708492
        22.0   0.617361   -0.0323449   1.26707
         8.0   0.25525    -0.461896    0.972395
        17.0  -0.0405959  -0.280619    0.199427
        11.0  -0.131033   -0.567092    0.305026
         3.0   0.266814   -0.323273    0.856901
         5.0   0.264774   -0.35728     0.886828
         1.0   0.266814   -0.323273    0.856901
        23.0   0.617361   -0.0323449   1.26707
         2.0   0.266814   -0.323273    0.856901
         6.0   0.264774   -0.35728     0.886828
        19.0   0.418805   -0.252746    1.09036
        20.0   0.617361   -0.0323449   1.26707
         7.0   0.257556   -0.443557    0.95867
        13.0  -0.33       -0.33       -0.33
        18.0   0.44        0.44        0.44
        21.0   0.91        0.91        0.91
         9.0   0.24        0.24        0.24
        12.0  -0.24       -0.24       -0.24
        10.0   0.19        0.19        0.19
        15.0  -0.08       -0.08       -0.08
        14.0  -0.1        -0.1        -0.1
        16.0  -0.02       -0.02       -0.02
         4.0   0.28        0.28        0.28
────────────────────────────────────────────────
In [61]:
R"par(mar=c(.5,.5,.5,.5))"
plot(tree, :RCall, nodeLabel=expectationsPlot(ASR_fp_tree), useEdgeLength=true, xlim=[0,25]);
plot(tree, :RCall, nodeLabel=predintPlot(ASR_fp_tree), useEdgeLength=true, xlim=[-3,26]);

For h=1

In [62]:
ASR_fp_net1 = ancestralStateReconstruction(dat_fp, net1)
┌ Warning: These prediction intervals show uncertainty in ancestral values,
│ assuming that the estimated variance rate of evolution is correct.
│ Additional uncertainty in the estimation of this variance rate is
│ ignored, so prediction intervals should be larger.
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:2153
Out[62]:
ReconstructedStates:
─────────────────────────────────────────────────
  Node index       Pred.         Min.  Max. (95%)
─────────────────────────────────────────────────
       -11.0   0.247003   -0.335192     0.829199
       -10.0   0.247003   -0.325673     0.819679
        -9.0   0.247003   -0.129188     0.623195
        -8.0   0.242425   -0.139386     0.624236
        -7.0   0.242425   -0.139386     0.624236
       -12.0   0.183917   -0.354574     0.722408
        -6.0   0.226667   -0.136372     0.589706
        -5.0   0.232813   -0.103485     0.56911
        -4.0   0.235356   -0.083723     0.554435
       -17.0  -0.111826   -0.328238     0.104586
       -16.0  -0.111702   -0.328133     0.10473
       -15.0  -0.0262039  -0.236863     0.184455
       -19.0  -0.060073   -0.209324     0.0891781
       -20.0  -0.0388557  -0.193238     0.115526
       -18.0  -0.0599501  -0.209294     0.0893934
       -14.0  -0.0261911  -0.236855     0.184473
         8.0  -0.0261911  -0.236855     0.184473
        -3.0   0.236037   -0.0823852    0.554459
       -22.0   0.403442   -0.00794844   0.814832
       -25.0   0.605967    0.135106     1.07683
       -24.0   0.605967    0.198525     1.01341
       -23.0   0.605967    0.198525     1.01341
       -21.0   0.381995    0.015619     0.748371
        -2.0   0.316546   -0.0480464    0.681139
        23.0   0.605967   -0.0666756    1.27861
         9.0   0.232813   -0.517716     0.983342
        18.0  -0.0388557  -0.282909     0.205198
        12.0  -0.111702   -0.569847     0.346444
         3.0   0.247003   -0.369728     0.863734
         5.0   0.242425   -0.403913     0.888764
         1.0   0.247003   -0.369728     0.863734
        24.0   0.605967   -0.0666756    1.27861
         2.0   0.247003   -0.369728     0.863734
         6.0   0.242425   -0.403913     0.888764
        20.0   0.403442   -0.29137      1.09825
        21.0   0.605967   -0.0666756    1.27861
         7.0   0.183917   -0.518419     0.886253
        13.0  -0.33       -0.33        -0.33
        19.0   0.44        0.44         0.44
        22.0   0.91        0.91         0.91
        10.0   0.24        0.24         0.24
        14.0  -0.24       -0.24        -0.24
        11.0   0.19        0.19         0.19
        16.0  -0.08       -0.08        -0.08
        15.0  -0.1        -0.1         -0.1
        17.0  -0.02       -0.02        -0.02
         4.0   0.28        0.28         0.28
─────────────────────────────────────────────────
In [63]:
R"par(mar=c(.5,.5,.5,.5))"
plot(net1, :RCall, nodeLabel=expectationsPlot(ASR_fp_net1), useEdgeLength=true, xlim=[0,25]);
plot(net1, :RCall, nodeLabel=predintPlot(ASR_fp_net1), useEdgeLength=true, xlim=[-3,26]);

For h=3

In [64]:
ASR_fp_net3 = ancestralStateReconstruction(dat_fp, net3)
┌ Warning: These prediction intervals show uncertainty in ancestral values,
│ assuming that the estimated variance rate of evolution is correct.
│ Additional uncertainty in the estimation of this variance rate is
│ ignored, so prediction intervals should be larger.
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:2153
Out[64]:
ReconstructedStates:
────────────────────────────────────────────────
  Node index       Pred.        Min.  Max. (95%)
────────────────────────────────────────────────
       -11.0   0.245971   -0.342466    0.834409
       -10.0   0.245971   -0.342466    0.834409
        -9.0   0.245971   -0.13307     0.625012
        -8.0   0.237077   -0.149076    0.623231
       -12.0   0.204949   -0.305731    0.715629
        -7.0   0.232507   -0.150554    0.615569
        -6.0   0.232507   -0.150554    0.615569
        -5.0   0.241842   -0.105104    0.588788
        -4.0   0.243903   -0.0854701   0.573276
       -19.0  -0.114647   -0.329965    0.100672
        15.0  -0.114647   -0.329965    0.100672
       -17.0  -0.12487    -0.352535    0.102796
       -16.0  -0.0307929  -0.242756    0.181171
       -21.0  -0.0639022  -0.212912    0.0851074
       -22.0  -0.04045    -0.198955    0.118055
       -20.0  -0.0597842  -0.212554    0.0929861
       -15.0  -0.0307929  -0.242756    0.181171
       -14.0  -0.0307929  -0.242756    0.181171
         7.0  -0.0307929  -0.242756    0.181171
        -3.0   0.245278   -0.0825556   0.573113
       -24.0   0.406078   -0.0104218   0.822577
        22.0   0.642628    0.245421    1.03983
       -29.0   0.615611    0.148068    1.08315
       -27.0   0.615611    0.211298    1.01992
       -26.0   0.615611    0.211298    1.01992
       -25.0   0.615611    0.211298    1.01992
       -23.0   0.386783    0.017621    0.755946
        -2.0   0.319654   -0.0484843   0.687792
        25.0   0.615611   -0.0630733   1.2943
         9.0   0.241842   -0.520508    1.00419
        19.0  -0.04045    -0.293694    0.212794
        12.0  -0.12487    -0.583777    0.334038
         3.0   0.245971   -0.374301    0.866243
         5.0   0.237077   -0.436113    0.910268
         1.0   0.245971   -0.374301    0.866243
        26.0   0.615611   -0.0630733   1.2943
         2.0   0.245971   -0.374301    0.866243
         8.0   0.232507   -0.462605    0.92762
        21.0   0.406078   -0.300465    1.11262
        23.0   0.615611   -0.0630733   1.2943
         6.0   0.204949   -0.48046     0.890359
        14.0  -0.33       -0.33       -0.33
        20.0   0.44        0.44        0.44
        24.0   0.91        0.91        0.91
        10.0   0.24        0.24        0.24
        13.0  -0.24       -0.24       -0.24
        11.0   0.19        0.19        0.19
        17.0  -0.08       -0.08       -0.08
        16.0  -0.1        -0.1        -0.1
        18.0  -0.02       -0.02       -0.02
         4.0   0.28        0.28        0.28
────────────────────────────────────────────────
In [65]:
R"par(mar=c(.5,.5,.5,.5))"
plot(net3, :RCall, nodeLabel=expectationsPlot(ASR_fp_net3), useEdgeLength=true, xlim=[0,25]);
plot(net3, :RCall, nodeLabel=predintPlot(ASR_fp_net3), useEdgeLength=true, xlim=[-3,26]);

Phylogenetic signal of with Pagel's lambda

For h=0 (tree)

In [66]:
lambda_fp_pruned_tree = phyloNetworklm(@formula(preference ~ 1), dat, tree, model="lambda")
┌ Info: Maximum lambda value to maintain positive branch lengths: 1.07157
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1381
Out[66]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: preference ~ 1

Model: lambda

Parameter(s) Estimates:
Sigma2: 0.00532966
Lambda: 1.07157

Coefficients:
───────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)   Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
(Intercept)  0.354209    0.165735  2.13719    0.0613  -0.0207107   0.729129
───────────────────────────────────────────────────────────────────────────
Log Likelihood: 0.5481794742
AIC: 4.9036410515
In [67]:
round(lambda_estim(lambda_fp_pruned_tree), digits = 2)
Out[67]:
1.07

For h=1

In [68]:
lambda_fp_pruned_net1 = phyloNetworklm(@formula(preference ~ 1), dat, net1, model="lambda")
┌ Info: Maximum lambda value to maintain positive branch lengths: 1.068
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1381
Out[68]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: preference ~ 1

Model: lambda

Parameter(s) Estimates:
Sigma2: 0.00576187
Lambda: 1.068

Coefficients:
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
(Intercept)  0.323594     0.16681   1.9399    0.0843  -0.053756   0.700944
──────────────────────────────────────────────────────────────────────────
Log Likelihood: 0.1476355653
AIC: 5.7047288695
In [69]:
round(lambda_estim(lambda_fp_pruned_net1), digits = 2)
Out[69]:
1.07

For h=3

In [70]:
lambda_fp_pruned_net3 = phyloNetworklm(@formula(preference ~ 1), dat, net3, model="lambda")
┌ Info: Maximum lambda value to maintain positive branch lengths: 1.07157
└ @ PhyloNetworks /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1381
Out[70]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: preference ~ 1

Model: lambda

Parameter(s) Estimates:
Sigma2: 0.00596273
Lambda: 1.07157

Coefficients:
───────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)   Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
(Intercept)  0.327211    0.169062  1.93545    0.0849  -0.0552334   0.709655
───────────────────────────────────────────────────────────────────────────
Log Likelihood: 0.1948654889
AIC: 5.6102690221
In [71]:
round(lambda_estim(lambda_fp_pruned_net3), digits = 2)
Out[71]:
1.07

Transgressive Evolution

Here, as the networks changed, we do need to re-compute the regression matrices, as shown below.

For h=1

The topology of net1 is not affected by the pruning (the one reticulation is still present), so the analysis remains unchanged.

In [72]:
regNet1 = regressorHybrid(net1)
dat1 = join(dat, regNet1, on = :tipNames);
fit_net1_0 = phyloNetworklm(@formula(preference ~ 1), dat1, net1);
fit_net1_1 = phyloNetworklm(@formula(preference ~ shift_34), dat1, net1);
ftest(fit_net1_1, fit_net1_0)
Out[72]:
F-test: 2 models fitted on 10 observations
───────────────────────────────────────────────────────────────
     DOF  ΔDOF     SSR    ΔSSR      R²      ΔR²      F*   p(>F)
───────────────────────────────────────────────────────────────
[1]    3        0.0359          0.3613                         
[2]    2    -1  0.0550  0.0192  0.0201  -0.3412  4.2730  0.0726
───────────────────────────────────────────────────────────────

For h=3

When pruning the second network, one of the hybridization event was reduced to a simple loop over itself, and hence was dropped from the network. As a result, net3 has now only two hybridization events, as we can see on the following plot.

In [73]:
R"par(mar=c(.5,.5,.5,.5))"
plot(net3, :RCall, showEdgeNumber=true, xlim=[0,15]);

From this topology, we can conclude that only two predictors will be included in the regression matrix, associated with edges 24 and 37. The analysis can hence be conducted the following way.

In [74]:
regNet3 = regressorHybrid(net3)
dat3 = join(dat, regNet3, on = :tipNames);
fit_net3_0 = phyloNetworklm(@formula(preference ~ 1), dat3, net3)
fit_net3_1 = phyloNetworklm(@formula(preference ~ sum), dat3, net3)
fit_net3_2 = phyloNetworklm(@formula(preference ~ shift_24 + shift_37), dat3, net3)
ftest(fit_net3_2, fit_net3_1, fit_net3_0)
ArgumentError: F test is only valid for nested models

Stacktrace:
 [1] ftest(::LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Cholesky{Float64,Array{Float64,2}}}}, ::Vararg{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Cholesky{Float64,Array{Float64,2}}}},N} where N; atol::Float64) at /Users/bastide/.julia/packages/GLM/6V3fS/src/ftest.jl:105
 [2] ftest(::LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Cholesky{Float64,Array{Float64,2}}}}, ::Vararg{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Cholesky{Float64,Array{Float64,2}}}},N} where N) at /Users/bastide/.julia/packages/GLM/6V3fS/src/ftest.jl:88
 [3] ftest(::PhyloNetworkLinearModel, ::Vararg{PhyloNetworkLinearModel,N} where N) at /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1874
 [4] ftest(::StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}, ::Vararg{StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}},N} where N) at /Users/bastide/.julia/packages/PhyloNetworks/zeoh2/src/traits.jl:1869
 [5] top-level scope at In[74]:6

Here, contrary to the results above, we get a significant homogeneous effect, but no evidence for heterogeneous effects.

Test of elevated female preference for males with longer swords

This hypothesis was originally tested by Cui et al. 2013. We can plot the data first:

In [75]:
using Gadfly
Gadfly.plot(dat, x="sword_index", y="preference", Guide.ylabel("female preference"))
ArgumentError: Package Gadfly not found in current path:
- Run `import Pkg; Pkg.add("Gadfly")` to install the Gadfly package.


Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] top-level scope at In[75]:1

For h=0 (tree)

In [76]:
phyloNetworklm(@formula(sword_index ~ preference), dat, tree)
Out[76]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: sword_index ~ 1 + preference

Model: BM

Parameter(s) Estimates:
Sigma2: 0.00393681

Coefficients:
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
(Intercept)  0.268385    0.184286  1.45635    0.1834  -0.156579    0.69335
preference   0.587815    0.311326  1.8881     0.0957  -0.130105    1.30573
──────────────────────────────────────────────────────────────────────────
Log Likelihood: 1.2563320547
AIC: 3.4873358906

For h=1

In [77]:
phyloNetworklm(@formula(sword_index ~ preference), dat, net1)
Out[77]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: sword_index ~ 1 + preference

Model: BM

Parameter(s) Estimates:
Sigma2: 0.00417735

Coefficients:
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
(Intercept)  0.299698    0.17801   1.6836     0.1308  -0.110794    0.71019
preference   0.552902    0.308022  1.79501    0.1104  -0.157398    1.2632
──────────────────────────────────────────────────────────────────────────
Log Likelihood: 1.0466093349
AIC: 3.9067813302

For h=3

In [78]:
phyloNetworklm(@formula(sword_index ~ preference), dat, net3)
Out[78]:
StatsModels.TableRegressionModel{PhyloNetworkLinearModel,Array{Float64,2}}

Formula: sword_index ~ 1 + preference

Model: BM

Parameter(s) Estimates:
Sigma2: 0.00417913

Coefficients:
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error  t value  Pr(>|t|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
(Intercept)  0.29801     0.177351  1.68034    0.1314  -0.110963   0.706982
preference   0.552722    0.303918  1.81866    0.1065  -0.148114   1.25356
──────────────────────────────────────────────────────────────────────────
Log Likelihood: 1.1499249159
AIC: 3.7001501682

On all networks, we see a positive relationship with slopes of 0.5878, 0.5529 and 0.5563. But this relationship is not significant (p=0.096, 0.110, 0.106).

version info

In [79]:
import Dates
Dates.now()
Out[79]:
2020-05-29T09:27:37.154
In [80]:
versioninfo()
Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i5-7267U CPU @ 3.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
In [81]:
import Pkg
Pkg.status()
Status `~/.julia/environments/v1.4/Project.toml`
  [336ed68f] CSV v0.6.1
  [a93c6f00] DataFrames v0.20.2
  [31c24e10] Distributions v0.23.2
  [38e38edf] GLM v1.3.9
  [7073ff75] IJulia v1.21.2
  [4138dd39] JLD v0.10.0
  [33ad39ac] PhyloNetworks v0.11.0
  [c0d5b6db] PhyloPlots v0.2.2
  [6f49c342] RCall v0.13.6
  [3eaba693] StatsModels v0.6.11
  [fd094767] Suppressor v0.2.0