gctree.branching_processes.CollapsedTree
- class gctree.branching_processes.CollapsedTree(tree=None, allow_repeats=False)[source]
Bases: object
A collapsed tree, modeled as an infinite type Galton-Watson process run to extinction.
- tree
ete3.TreeNode object with abundance node features
- Parameters:
tree (Optional[TreeNode]) – ete3 tree with abundance node features. If uncollapsed, it will be collapsed along branches with no mutations. Can be omitted on initialization and simulated later. If a tree is provided, names of nodes with abundance 0 will not be preserved.
allow_repeats (bool) – tolerate the existence of nodes with the same genotype after collapse, e.g. in sister clades.
Methods
compare() – Compare this tree to the other tree.
feature_colormap() – Generate a colormap based on a continuous tree feature.
ll() – Log likelihood of branching process parameters \((p, q)\) given tree topology \(T\) and genotype abundances \(A\).
local_branching() – Add local branching statistics (Neher et al. 2014) as tree node features to the ETE tree attribute.
mle() – Maximum likelihood estimate of \((p, q)\).
newick() – Write to newick file.
render() – Render to tree image file.
simulate() – Simulate a collapsed tree as an infinite type Galton-Watson process run to extinction, with branching probability \(p\) and mutation probability \(q\).
support() – Compute support from a list of bootstrap CollapsedTree objects, and add to tree attribute.
write() – Serialize to pickle file.
- ll(p, q)[source]
Log likelihood of branching process parameters \((p, q)\) given tree topology \(T\) and genotype abundances \(A\).
\[\ell(p, q; T, A) = \log\mathbb{P}(T, A \mid p, q)\]
- mle(**kwargs)[source]
Maximum likelihood estimate of \((p, q)\).
\[(p, q) = \arg\max_{p,q\in [0,1]}\ell(p, q)\]
- Parameters:
kwargs – keyword arguments passed along to the branching process likelihood CollapsedTree.ll()
- Return type:
Tuple[float64, float64]
- Returns:
Tuple \((p, q)\) with estimated branching probability and estimated mutation probability
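To illustrate the argmax above, here is a minimal sketch that maximizes a two-parameter log likelihood over a grid on the unit square. The function `toy_log_likelihood` is a hypothetical stand-in (two independent binomial terms), not the branching process likelihood computed by `CollapsedTree.ll()`, and `grid_mle` is an illustrative helper rather than the optimizer the library uses.

```python
import math


def toy_log_likelihood(p, q, k_branch=3, n_branch=10, k_mut=7, n_mut=10):
    """Hypothetical stand-in for ll(p, q): two independent binomial terms."""
    return (
        k_branch * math.log(p) + (n_branch - k_branch) * math.log(1 - p)
        + k_mut * math.log(q) + (n_mut - k_mut) * math.log(1 - q)
    )


def grid_mle(log_likelihood, steps=100):
    """Return the grid point (p, q) with the largest log likelihood."""
    # Skip the endpoints 0 and 1 so that log(0) is never evaluated.
    grid = [i / steps for i in range(1, steps)]
    return max(
        ((p, q) for p in grid for q in grid),
        key=lambda pq: log_likelihood(*pq),
    )


p_hat, q_hat = grid_mle(toy_log_likelihood)
```

For the toy likelihood the maximizer is known in closed form (3/10 and 7/10), which makes the grid result easy to check. A gradient-based optimizer would be the usual choice for a real likelihood.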
- simulate(p, q, root=True)[source]
Simulate a collapsed tree as an infinite type Galton-Watson process run to extinction, with branching probability \(p\) and mutation probability \(q\). Overwrites existing tree attribute.
- Parameters:
p (float64) – branching probability
q (float64) – mutation probability
root (bool) – flag indicating simulation is being run from the root of the tree, so we should update tree attributes (should usually be True)
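The following standalone sketch shows one way such a process can be simulated without the library. It assumes a binary offspring rule that is not stated on this page: each cell divides into two daughters with probability p and otherwise terminates as an observed cell, and each daughter independently mutates to a new genotype with probability q. The function name, the nested-dict tree layout, and the `max_cells` guard are all hypothetical.

```python
import random


def simulate_collapsed_tree(p, q, seed=0, max_cells=100_000):
    """Simulate a collapsed tree as nested dicts with 'abundance' and 'children'."""
    rng = random.Random(seed)
    root = {"abundance": 0, "children": []}
    pending = [root]  # genotypes whose clonal lineage still has to be simulated
    cells = 0
    while pending:
        node = pending.pop()
        dividing = 1  # live cells that carry this node's genotype
        while dividing:
            cells += 1
            if cells > max_cells:
                raise RuntimeError("process did not go extinct within max_cells")
            dividing -= 1
            if rng.random() < p:  # assumed rule: the cell divides in two
                for _ in range(2):
                    if rng.random() < q:  # daughter mutates: new genotype node
                        child = {"abundance": 0, "children": []}
                        node["children"].append(child)
                        pending.append(child)
                    else:  # daughter keeps the parent genotype
                        dividing += 1
            else:  # the cell terminates and counts toward abundance
                node["abundance"] += 1
    return root


tree = simulate_collapsed_tree(p=0.4, q=0.5, seed=1)
```

Under this assumed rule the process goes extinct with probability 1 only when p is at most 1/2, which is why the sketch carries an explicit cap on the number of simulated cells.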
- render(outfile, scale=None, branch_margin=0, node_size=None, idlabel=False, colormap=None, frame=None, position_map=None, chain_split=None, frame2=None, position_map2=None, show_support=False, show_nuc_muts=False)[source]
Render to tree image file.
- Parameters:
outfile (str) – file name to render to, filetype inferred from suffix, .svg for color
scale (Optional[float]) – branch length scale in pixels (set automatically if None)
branch_margin (float) – additional leaf branch separation margin, in pixels, to scale tree width
node_size (Optional[float]) – size of nodes in pixels (set according to abundance if None)
idlabel (bool) – label nodes with seq ids, and write sequences of all nodes to a fasta file with the same base name as outfile
colormap (Optional[Dict]) – dictionary mapping node names to color names or to dictionaries of color frequencies
frame (Optional[int]) – coding frame for annotating amino acid substitutions
position_map (Optional[List]) – mapping of position names for sequence indices, to be used with substitution annotations and the frame argument
chain_split (Optional[int]) – if sequences are a concatenation of two gene sequences, this is the index at which the 2nd one starts (requires frame and frame2 arguments)
frame2 (Optional[int]) – coding frame for 2nd sequence when using chain_split
position_map2 (Optional[List]) – like position_map, but for 2nd sequence when using chain_split
show_support (bool) – annotate bootstrap support if available
show_nuc_muts (bool) – If True, annotate branches with nucleotide mutations. If False, and frame is provided, then branches will be annotated with amino acid mutations.
- feature_colormap(feature, cmap='viridis', vmin=None, vmax=None, scale='linear', **kwargs)[source]
Generate a colormap based on a continuous tree feature.
- Parameters:
feature (str) – feature name (all nodes in tree attribute must have this feature)
cmap (str) – any matplotlib color palette name
vmin (Optional[float]) – minimum value for colormap (default to minimum of the feature over the tree)
vmax (Optional[float]) – maximum value for colormap (default to maximum of the feature over the tree)
scale (str) – linear (default), log, or symlog (must also provide linthresh kwarg)
kwargs – additional keyword arguments for scale transformation
- Return type:
Dict[str, str]
- Returns:
Dictionary of node names to hex color strings, which may be used as the colormap in gctree.CollapsedTree.render()
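As a rough illustration of the returned structure, the sketch below maps per-node feature values to hex color strings on a linear scale. It does not call matplotlib: it interpolates between two fixed RGB endpoints chosen to resemble the extremes of viridis, which is a simplification of a real palette. The function name and its arguments are hypothetical.

```python
def linear_hex_colormap(values, vmin=None, vmax=None,
                        low=(68, 1, 84), high=(253, 231, 37)):
    """Map {node name: feature value} to {node name: '#rrggbb'} on a linear scale."""
    vmin = min(values.values()) if vmin is None else vmin
    vmax = max(values.values()) if vmax is None else vmax
    span = (vmax - vmin) or 1.0  # avoid dividing by zero when all values are equal
    colormap = {}
    for name, value in values.items():
        # Position of the value in [0, 1], clipped to the colormap range.
        t = min(max((value - vmin) / span, 0.0), 1.0)
        rgb = [round(lo + t * (hi - lo)) for lo, hi in zip(low, high)]
        colormap[name] = "#{:02x}{:02x}{:02x}".format(*rgb)
    return colormap


colors = linear_hex_colormap({"naive": 0.0, "seq1": 5.0, "seq2": 10.0})
```

A dictionary of this shape, node names mapped to color strings, is what the colormap argument of render() is documented to accept.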
- write(file_name)[source]
Serialize to pickle file.
- Parameters:
file_name (str) – file name (.p suffix recommended)
- newick(file_name)[source]
Write to newick file.
- Parameters:
file_name (str) – file name (.nk suffix recommended)
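For readers unfamiliar with the Newick format, this small sketch serializes a tree stored as nested dicts into a Newick string. The dict layout (`name`, `dist`, `children`) and the function `to_newick` are hypothetical. The exact output written by the library (for example, which node features it includes) may differ.

```python
def to_newick(node):
    """Serialize a nested-dict tree to Newick, without the trailing semicolon."""
    label = f"{node['name']}:{node['dist']}"
    if not node["children"]:
        return label
    # Children are listed in parentheses before the parent's own label.
    inner = ",".join(to_newick(child) for child in node["children"])
    return f"({inner}){label}"


example_tree = {
    "name": "root",
    "dist": 0,
    "children": [
        {"name": "a", "dist": 1, "children": []},
        {"name": "b", "dist": 2, "children": []},
    ],
}
newick_string = to_newick(example_tree) + ";"
```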
- compare(tree2, method='identity')[source]
Compare this tree to the other tree.
- Parameters:
tree2 (CollapsedTree) – another object of this type
method (str) – comparison type (identity, MRCA, or RF)
- Return type:
- Returns:
tree difference
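To give a sense of what an RF (Robinson-Foulds) style comparison measures, the sketch below counts the clades that appear in exactly one of two rooted, leaf-labeled trees written as nested tuples. This is a generic rooted variant shown under simplifying assumptions. The way this class defines its identity, MRCA, and RF comparisons (for instance, how observed internal genotypes are treated) is not specified on this page, and the helper names are hypothetical.

```python
def clades(tree):
    """Return the leaf set (a frozenset) below every internal node of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):  # a leaf is represented by its name
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def rf_distance(tree1, tree2):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree1) ^ clades(tree2))


t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
distance = rf_distance(t1, t2)
```

Here the two trees share only the root clade, so each contributes two unmatched clades to the distance.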
- support(bootstrap_trees_list, weights=None, compatibility=False)[source]
Compute support from a list of bootstrap CollapsedTree objects, and add to tree attribute.
- Parameters:
bootstrap_trees_list (List[CollapsedTree]) – List of trees
weights (Optional[List[float64]]) – weights for each tree, perhaps for weighting parsimony degenerate trees
compatibility (bool) – counts trees that don’t disconfirm the split.
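The idea behind the weights and compatibility options can be sketched with clades represented as frozensets of leaf names. In this sketch a bootstrap tree supports a clade if it contains that clade, and under the compatibility option it counts whenever none of its clades conflicts with the clade (every pair is nested or disjoint). This is an assumption about the intended semantics, the result is a weighted count rather than a normalized fraction, and the function names are hypothetical.

```python
def is_compatible(clade, tree_clades):
    """True if clade is nested in, contains, or is disjoint from every clade of the tree."""
    return all(
        clade <= other or other <= clade or not (clade & other)
        for other in tree_clades
    )


def clade_support(reference_clades, bootstrap_clade_sets, weights=None,
                  compatibility=False):
    """Weighted count of bootstrap trees supporting each reference clade."""
    if weights is None:
        weights = [1.0] * len(bootstrap_clade_sets)
    support = {}
    for clade in reference_clades:
        total = 0.0
        for weight, tree_clades in zip(weights, bootstrap_clade_sets):
            if compatibility:
                supported = is_compatible(clade, tree_clades)
            else:
                supported = clade in tree_clades
            if supported:
                total += weight
        support[clade] = total
    return support


ab = frozenset("ab")
bootstraps = [
    {frozenset("ab"), frozenset("abc")},  # contains the clade
    {frozenset("bc")},                    # conflicts with the clade
    {frozenset("abc")},                   # lacks the clade but does not conflict
]
strict = clade_support([ab], bootstraps)
relaxed = clade_support([ab], bootstraps, compatibility=True)
```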
- local_branching(tau=1, tau0=1, infinite_root_branch=True, nan_root_lbr=False)[source]
Add local branching statistics (Neher et al. 2014) as tree node features to the ETE tree attribute. After execution, all nodes will have new features LBI (local branching index) and LBR (local branching ratio, below vs. above the node).
- Parameters:
tau – decay timescale for exponential filter
tau0 – effective branch length for branches with zero mutations
infinite_root_branch – calculate assuming the root node has an infinite branch
nan_root_lbr – replace the root LBR value with np.nan
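The statistics above can be sketched with a two-pass message-passing scheme on a tree of nested dicts. The sketch assumes LBI is the total tree length around a node, weighted by an exponential decay with timescale tau, and LBR is the ratio of the contribution from below the node to the contribution from above it. The dict layout and function name are hypothetical, the tau0 adjustment for zero-length branches is omitted, and the library's own computation may differ in details.

```python
import math


def local_branching(root, tau=1.0, infinite_root_branch=True):
    """Return {name: (LBI, LBR)} for a nested-dict tree with 'name', 'dist', 'children'."""
    up = {}    # weighted length of each node's subtree, seen from its parent
    down = {}  # weighted length of the rest of the tree, seen from each node

    def send_up(node):
        below = sum(send_up(child) for child in node["children"])
        decay = math.exp(-node["dist"] / tau)
        # The node's own branch contributes tau * (1 - decay); the subtree is damped.
        up[node["name"]] = tau * (1 - decay) + decay * below
        return up[node["name"]]

    send_up(root)
    # An infinitely long root branch integrates to tau under the exponential filter.
    down[root["name"]] = tau if infinite_root_branch else 0.0
    result = {}
    stack = [root]
    while stack:
        node = stack.pop()
        below = sum(up[child["name"]] for child in node["children"])
        above = down[node["name"]]
        lbr = below / above if above > 0 else math.nan
        result[node["name"]] = (above + below, lbr)
        for child in node["children"]:
            decay = math.exp(-child["dist"] / tau)
            others = above + below - up[child["name"]]  # everything except this child
            down[child["name"]] = tau * (1 - decay) + decay * others
            stack.append(child)
    return result


star = {
    "name": "root",
    "dist": 0.0,
    "children": [
        {"name": "a", "dist": 1.0, "children": []},
        {"name": "b", "dist": 1.0, "children": []},
    ],
}
stats = local_branching(star, tau=1.0, infinite_root_branch=False)
```

Without the infinite root branch, nothing lies above the root, so its LBR is undefined and the sketch reports NaN for it.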