import shadie
#re-make the chromosome object
chrom = shadie.chromosome.explicit({
(0, 200_000): shadie.NONCDS,
(200_001, 400_000): shadie.EXON,
(400_001, 600_000): shadie.INTRON,
(600_001, 800_000): shadie.EXON,
(800_001, 1_000_000): shadie.NONCDS,
})
#load the simulation from tutorial #5
post = shadie.postsim.OneSim("tut5.trees", chrom)
🌿 INFO | one_sim.py | shadie assumes sims were run without a burn-in and without neutral mutations and will recapitate (add ancestry simulation) and mutate (add neutral mutations) any loaded sims by default. Current settings - Recapitate: True Add neutral mutations: True 🌿 INFO | one_sim.py | Simulation currently has 7 existing mutations of type(s) {2, 3}. 🌿 INFO | one_sim.py | Mutating using mutation rate: 1e-08. Keeping 67 existing mutations of type(s) {0, 2, 3}.
Once you have recapitated and mutated your tree sequence, it may make sense to save it so that you don't have to re-do that process every time you come back. It can be helpful to keep the original simulation, so it is recommended to save as a new file:
#save using the treesequence method dump()
post.tree_sequence.dump("tut5_recapitated.trees")
And if you wanted to load a postsim analysis object in shadie from that recapitated file, you could do so using the method below. Remember that you need to suppress recapitation and mutation, which is the automatic behavior of the OneSim()
function.
post_recapitated = shadie.postsim.OneSim("tut5.trees", chrom,
recapitate=False,
add_neutral_mutations=False
)
🌿 INFO | one_sim.py | shadie assumes sims were run without a burn-in and without neutral mutations and will recapitate (add ancestry simulation) and mutate (add neutral mutations) any loaded sims by default. Current settings - Recapitate: False Add neutral mutations: False
post.tree_sequence.population(0), #the SLiM simulation
(Population(id=0, metadata={'name': 'p0', 'slim_id': 0}),)
post.tree_sequence.population(1) #the msprime ancestry simulation
Population(id=1, metadata={'description': 'ancestral population simulated by msprime', 'name': 'ancestral'})
You can also look up any of the individuals in the simulation and see what kind of data the tree sequence has stored for it:
post.tree_sequence.individual(15)
Individual(id=15, flags=196608, location=array([0., 0., 0.]), parents=array([-1, -1], dtype=int32), nodes=array([29, 30], dtype=int32), metadata={'pedigree_id': 19933821, 'pedigree_p1': 19904772, 'pedigree_p2': 19904772, 'age': 0, 'subpopulation': 1, 'sex': -1, 'flags': 0})
Visualizations¶
shadie
has a number of plotting functions that utilize toyplot
and allow you to visualize the results of the simulation.
draw_tree_sequence
allows you to sample haplotypes from the end of the simulation and visualize the tree sequence. You will see the chromosome structure at the top (if you hover over the genomic elements you will see some information about what they are) and then below is a representation of the tree sequence for the sampled haplotypes. The chromosome is broke up into chunks due to recombination, so each chunk of the chromosome has a different topology. Usually a random seed is generated by shadie
for the haplotype sampling, but if you wanted to sample the same individuals every time you can specify a number for the seed
and as long as the seed doesn't change, the same haplotypes will be sampled every time.
post.draw_tree_sequence(sample=10, seed=123);
You can also plot the individual trees using the draw_tree
function. You will need to specify the idx
, which corresponds to the chunk of chromosome that you want to see the topology for. If you use the same sample number and seed as above, you will see that you can access the same trees shown in the tree sequence:
post.draw_tree(idx=2, sample=10, seed=123);
Calculating Statistics¶
shadie
can calculate statistics like theta
(diversity) and Tajima's D
on your postsim
object using the function stats()
. You can specify how many haplotypes are sampled for each calculation and how many replicate calculations are performed. These calculations utilise the built-in theta and Tajima's D calculations from tskit.
post.stats(sample=15, reps=50)
mean | CI_5% | CI_95% | |
---|---|---|---|
theta | 0.000001 | 0.000001 | 0.000001 |
D_Taj | -1.654312 | -1.740551 | -1.568073 |
Access Metadata¶
If you forget the simulations settings, you can always access the parameters settings from SLiM in the tree sequence metadata:
post.tree_sequence.metadata["SLiM"]["user_metadata"]
{'file_in': ['None'], 'file_out': ['tut5.trees'], 'full_lifecycles': ['100'], 'gam_archegonia_per': ['1'], 'gam_ceiling': ['10000'], 'gam_clone_rate': ['0.0'], 'gam_clones_per': ['1'], 'gam_female_to_male_ratio': ['0.6666666666666666'], 'gam_maternal_effect': ['0'], 'gam_mutation_rate': ['5e-09'], 'gam_pop_size': ['1000'], 'gam_random_death_chance': ['0'], 'gam_self_rate': ['0.0'], 'gam_self_rate_per_egg': ['0.1'], 'gens_per_lifecycle': ['2'], 'lineage': ['Pteridophyte'], 'mode': ['homosporous'], 'model_source': ['shadie'], 'mutation_rate': ['1e-08'], 'recomb_rate': ['1e-09'], 'slim_gens': ['201'], 'spo_clone_rate': ['0.0'], 'spo_clones_per': ['1'], 'spo_maternal_effect': ['0'], 'spo_mutation_rate': ['5e-09'], 'spo_pop_size': ['500'], 'spo_random_death_chance': ['0'], 'spo_self_rate': ['0.0'], 'spo_self_rate_per_egg': ['0.1'], 'spo_spores_per': ['100']}