Introduction to Default Models in shadie
import shadie
The original purpose of creating shadie was to develop a repeatable, user-friendly framework for evolutionary simulations using models based on complex plant life cycles. To this end, shadie provides six default models based on the life cycles of the major plant lineages:
- Monoicous Bryophytes: .reproduction.bryophyte_monoicous
- Dioicous Bryophytes: .reproduction.bryophyte_dioicous
- Homosporous Pteridophytes: .reproduction.pteridophyte_homosporous
- Heterosporous Pteridophytes: .reproduction.pteridophyte_heterosporous
- Monoecious Angiosperms: .reproduction.angiosperm_monecious
- Dioecious Angiosperms: .reproduction.angiosperm_dioecious
The parameters in each model vary according to the details of the life cycle, but each parameter is meant to reflect a true biological value or phenomenon. The models are continuously maintained and updated, which means you may request the incorporation of new parameters or even new models. However, this also means that if you use shadie models to conduct research you should always save a copy of the script for your own reference (and for the supplemental files, if you publish!), because the models may change in the future.
Alternation of Generations
The main difference between plant life cycle models in shadie and other standard models is that shadie models incorporate alternation-of-generations: plants switch between a haploid (gametophyte) and a diploid (sporophyte) life stage during the completion of one life cycle.
During each life stage, regardless of ploidy, the individuals can mate, clone, and experience selection. shadie was designed to account for this by modeling the two life stages in alternation, such that a single SLiM generation corresponds to one haploid or one diploid generation. The result is that two SLiM generations are equal to a single "standard" generation, in the population genetics sense. This can be confusing, so shadie also assists with post-simulation processing and analysis to account for the alternation-of-generations model.
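As a rough illustration of this bookkeeping, the conversion between standard generations and SLiM generations can be sketched in plain Python. The helper names below are hypothetical and are not part of the shadie API; the sketch assumes only the two-to-one relationship described above.

```python
# Hypothetical helpers (not part of shadie): they only encode the rule that
# one standard generation (a full life cycle) spans two SLiM generations,
# one for the haploid stage and one for the diploid stage.

def to_slim_generations(standard_generations):
    """Return the number of SLiM generations for a number of full life cycles."""
    return 2 * standard_generations


def to_standard_generations(slim_generations):
    """Return the number of full life cycles covered by a number of SLiM generations."""
    return slim_generations / 2


# Example: 1000 full life cycles expressed in SLiM generations, and back again.
slim_gens = to_slim_generations(1000)
life_cycles = to_standard_generations(slim_gens)
print(slim_gens, life_cycles)
```

Keeping the conversion in one place like this can help avoid off-by-a-factor-of-two mistakes when comparing simulation output against expectations stated in standard generations.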
The main thing to keep in mind when starting out with shadie is the alternation-of-generations dynamic. This is the most powerful aspect of using shadie, but can also lead to mistakes. shadie is designed to take care of as much of this aspect of modeling as possible, so it has some standard behaviors that the user should be aware of:
- The sim_time parameter of shadie is in standard generations, i.e. full life cycles. When shadie writes out the Eidos model it automatically accounts for alternation-of-generations, so the number of cycles in the model is double your parameter setting: the model needs to run through twice as many SLiM generations, one for every haploid stage and one for every diploid stage.
- shadie takes separate parameters for the gametophyte (1N) and sporophyte (2N) generations, but if you do not provide separate values it will make some assumptions.
  - To remind you of the alternation-of-generations, any plant model in shadie requires explicit population sizes for both the gametophyte (gam_pop_size) and the sporophyte (spo_pop_size).
    - To keep the population size constant, gam_pop_size should be twice spo_pop_size; this keeps the number of haplotypes constant. However, note that shadie does not force this population size, so other parameters may limit the population size to less than the value you set.
  - If you provide a mutation rate in the initialize() function, this rate is split between the sporophyte and gametophyte generations so that the overall mutation rate per standard generation is equal to the parameter setting and the mutation rates in each life stage are equal to each other. You can set different mutation rates in each life stage by using the spo_mutation_rate and gam_mutation_rate parameters.
- shadie sets default values for most parameters, allowing the user to run simulations without having to specify many values.
  - The default parameter values are not 0.0/neutral, as might be assumed, but are set to "realistic" values (see Sorojsrisom et al., 2022 for more detail regarding how these values were chosen).
  - The minimum requirements for running a default model in shadie are to pass a chromosome object and a population size for each life stage.
  - This is meant to help new users get started quickly, but is not recommended for any serious use of shadie for research purposes.
# the absolute minimum settings required to run a default model in shadie
default_chrom = shadie.chromosome.default()

with shadie.Model() as bryo_model:
    # only a chromosome is passed; all other settings use shadie defaults
    bryo_model.initialize(
        chromosome=default_chrom,
    )
    # a population size is required for each life stage
    bryo_model.reproduction.bryophyte_dioicous(
        spo_pop_size=1000,
        gam_pop_size=2000,
    )
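The population sizes used in this example (1000 sporophytes, 2000 gametophytes) follow the rule that the gametophyte population should be twice the sporophyte population. The arithmetic behind that rule can be sketched as follows; the function name is hypothetical and not part of shadie, and the sketch assumes each sporophyte carries two haplotypes and each gametophyte carries one.

```python
# Hypothetical helper (not part of shadie): counts haplotypes in a population,
# assuming diploid sporophytes (ploidy 2) and haploid gametophytes (ploidy 1).

def haplotype_count(pop_size, ploidy):
    """Return the number of haplotypes carried by pop_size individuals."""
    return pop_size * ploidy


spo_haplotypes = haplotype_count(1000, ploidy=2)  # diploid sporophytes
gam_haplotypes = haplotype_count(2000, ploidy=1)  # haploid gametophytes

# The two life stages carry the same number of haplotypes when
# gam_pop_size is twice spo_pop_size.
print(spo_haplotypes, gam_haplotypes)
```

As noted earlier, shadie does not force these sizes, so the realized population in a run may still fall below the values passed in.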
If you print the script for the model you just specified (see below), you will see that there are many more parameter values specified than you may have expected! These include a simulation time of 1000 full life cycles and a default output file name of "shadie.trees" (which will save to the current working directory). shadie also uses a default recombination rate of 1e-9 and an overall mutation rate of 1e-8 (i.e. sporophyte and gametophyte mutation rates both = 5e-9).
The rest of the parameters are model-specific. The default behavior in shadie automatically sets realistic parameter values for all the default models.
#uncomment line below to view the Eidos script
#print(bryo_model.script)
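The default mutation rates quoted above (an overall rate of 1e-8, with 5e-9 in each life stage) are consistent with an even split of the overall rate across the two life stages. A minimal sketch of that arithmetic is given here; the function is hypothetical and only illustrates the described behavior, not shadie's internal implementation.

```python
# Hypothetical helper (not part of shadie): splits an overall per-life-cycle
# mutation rate evenly between the sporophyte and gametophyte stages, which
# is the behavior described for the case where only one rate is provided.

def split_mutation_rate(overall_rate):
    """Return (sporophyte_rate, gametophyte_rate) for an even split."""
    per_stage = overall_rate / 2
    return per_stage, per_stage


spo_rate, gam_rate = split_mutation_rate(1e-8)

# The two per-stage rates are equal and sum to the overall rate.
print(spo_rate, gam_rate, spo_rate + gam_rate)
```

If the two life stages should mutate at different rates, the spo_mutation_rate and gam_mutation_rate parameters mentioned earlier are the documented way to set them separately.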
Model Architecture
There are a few more things that shadie is doing under the hood that are useful to keep in mind. By default, shadie uses recapitation, a method that combines forward-time and backward-time (coalescent) simulations to greatly improve the efficiency of evolutionary simulations. You can read more about recapitation in this article by Haller et al. 2018.
For our purposes, this means that simulations in shadie do not require a neutral burn-in period, because the ancestry simulation will be conducted using a coalescent simulation in msprime. In other words, the burn-in will be added after the actual simulation has already occurred. You can read more about this process in the pyslim docs.
Additionally, the selected part of the simulation is conducted in SLiM (using the Eidos model generated by shadie) and the neutral part of the simulation is conducted in msprime after the SLiM simulation is finished and the tree sequence has been generated.
For this reason, shadie removes any neutral mutations from your chromosome model by default, and SLiM will not place neutral mutations during the simulation. Additionally, any purely neutral genomic elements will be removed (they will literally appear as holes in SLiMgui when you visualize the chromosome; don't be alarmed by this). This is all architectural management by shadie to improve the efficiency of the simulation and avoid modeling neutral mutations.
Because of this behavior, a critical step for using shadie is to recapitate and mutate the tree sequence file using the shadie.postsim module. Details for this process will follow in the next tutorial.
But I want to model neutral mutations in SLiM!
Not to worry, if you want to override this default behavior in shadie you can pass the argument skip_neutral_mutations=False in the initialize() function, as in the example below:
default_chrom = shadie.chromosome.default()

with shadie.Model() as bryo_model:
    # keep neutral mutations in the SLiM simulation instead of skipping them
    bryo_model.initialize(
        chromosome=default_chrom,
        skip_neutral_mutations=False,
    )
    bryo_model.reproduction.bryophyte_dioicous(
        spo_pop_size=1000,
        gam_pop_size=2000,
    )