torchdms.model¶
Generalized global epistasis models.
Classes
Conditional – Allows the latent space for the second output feature (i.e. stability) to feed forward into the network for the first output feature (i.e. binding).
ConditionalSequential – Conditional with sequential training: stability then binding.
Escape – A model of viral escape with multiple epitopes, each with its own latent layer.
FullyConnected – A flexible fully connected neural network model.
Independent – Parallel and independent submodules of type torchdms.model.FullyConnected for each of two output dimensions.
Linear – A linear model, expressed as a single layer neural network with no nonlinear activations.
TorchdmsModel – An abstract superclass for our models to combine shared behavior.
- class torchdms.model.TorchdmsModel(input_size, target_names, alphabet, monotonic_sign=None, non_lin_bias=False, output_bias=False)[source]¶
An abstract superclass for our models to combine shared behavior.
- Parameters
input_size (int) – number of one-hot sequence indicators \(L\times |\mathcal{A}|\), for sequence of length \(L\) and amino acid alphabet \(\mathcal{A}\).
target_names (List[str]) – output names (e.g. "expression", "stability").
alphabet (str) – amino acid alphabet \(\mathcal{A}\).
monotonic_sign (Optional[bool]) – 1 or -1 for monotonicity, or None.
- __init__(input_size, target_names, alphabet, monotonic_sign=None, non_lin_bias=False, output_bias=False)[source]¶
- training_style_sequence: List[Callable]¶
list of training style functions that (de)activate gradients in submodules
- abstract property characteristics: Dict¶
Salient characteristics of the model that aren’t represented in the PyTorch description.
- Return type
Dict
- abstract regularization_loss()[source]¶
Compute regularization penalty, a scalar-valued torch.Tensor.
- Return type
Tensor
- abstract to_latent(x, **kwargs)[source]¶
Latent space representation \(Z\)
\[z \equiv \phi(x) \equiv \beta^\intercal x\]
- abstract from_latent_to_output(z, **kwargs)[source]¶
Evaluate the mapping from the latent space to output.
\[y = g(z)\]
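Taken together, a prediction decomposes as \(y = g(\phi(x))\). A minimal sketch of how these two abstract methods compose (that the forward pass simply chains them is an assumption of this sketch, not a guarantee of the implementation):

    import torch

    def predict(model, x):
        # Sketch: compose the two abstract maps defined above.
        # z = beta^T x (to_latent), then y = g(z) (from_latent_to_output).
        z = model.to_latent(x)
        return model.from_latent_to_output(z)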
- abstract fix_gauge(gauge_mask)[source]¶
Perform gauge-fixing procedure on latent space parameters.
- Parameters
gauge_mask (Tensor) – 0/1 mask array, the same shape as the latent space input, with 1s for parameters that should be zeroed.
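As a hedged illustration (the 20-wide one-hot blocks and the input_size attribute are assumptions of this sketch), a mask zeroing the parameter for one reference amino acid at every site might be built like:

    import torch

    # Hypothetical gauge mask: pick the first amino acid of each 20-wide
    # one-hot block as the reference parameter to zero out.
    reference_idx = torch.arange(0, model.input_size, 20)
    gauge_mask = torch.zeros(model.input_size)
    gauge_mask[reference_idx] = 1.0
    model.fix_gauge(gauge_mask)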
- property sequence_length¶
input amino acid sequence length
- numpy_single_mutant_predictions()[source]¶
Single mutant predictions as a numpy array of shape (AAs, sites, outputs).
- Return type
ndarray
- single_mutant_predictions()[source]¶
Single mutant predictions as a list (across output dimensions) of Pandas dataframes.
- Return type
List[DataFrame]
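A brief usage sketch of the two prediction accessors (the indexing comment reflects the documented shape; output order follows target_names):

    preds = model.numpy_single_mutant_predictions()
    # preds[a, s, o]: predicted effect of amino acid a at site s on output o.

    dfs = model.single_mutant_predictions()
    first_target_df = dfs[0]  # one dataframe per entry of target_names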
- monotonic_params_from_latent_space()[source]¶
Yields parameters to be floored to zero in a monotonic model. This is every parameter after the latent space excluding bias parameters.
We follow the convention that the latent layers of a network are named like latent_layer*, and the weights and bias are denoted layer_name*.weight and layer_name*.bias. Layers from nested modules will be prefixed (e.g. with Independent).
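A minimal sketch of that convention, assuming named_parameters() yields parameters in network order (an assumption of this sketch, not a guarantee of the implementation):

    post_latent = False
    for name, param in model.named_parameters():
        # Everything registered after the (possibly prefixed) latent layer,
        # excluding biases, is floored to zero in a monotonic model.
        if "latent_layer" in name:
            post_latent = True
            continue
        if post_latent and not name.endswith(".bias"):
            print(name)  # a parameter subject to the monotonicity floor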
- class torchdms.model.Linear(*args, **kwargs)[source]¶
A linear model, expressed as a single layer neural network with no nonlinear activations.
- Parameters
args – positional arguments, see torchdms.model.TorchdmsModel
kwargs – keyword arguments, see torchdms.model.TorchdmsModel
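Conceptually this is a single affine map from one-hot input to the targets; a hedged plain-PyTorch equivalent with hypothetical sizes:

    import torch
    import torch.nn as nn

    # Illustrative only: one affine layer, no activation, mapping
    # L * |A| one-hot indicators to len(target_names) outputs.
    input_size, num_targets = 500, 2
    linear = nn.Linear(input_size, num_targets)
    y = linear(torch.zeros(1, input_size))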
- property characteristics: Dict¶
Salient characteristics of the model that aren’t represented in the PyTorch description.
- Return type
Dict
- from_latent_to_output(z, **kwargs)[source]¶
Evaluate the mapping from the latent space to output.
\[y = g(z)\]
- regularization_loss()[source]¶
Compute regularization penalty, a scalar-valued torch.Tensor.
- Return type
Tensor
- class torchdms.model.Escape(num_epitopes, *args, beta_l1_coefficient=0, **kwargs)[source]¶
A model of viral escape with multiple epitopes, each with its own latent layer.
The output is assumed to be escape fraction \(\in (0, 1)\). The nonlinearity is fixed as the product of Hill functions for each epitope.
Todo
model math markup, and schematic needed here
- Parameters
num_epitopes – number of epitopes (latent dimensions)
args – base positional arguments, see torchdms.model.TorchdmsModel
beta_l1_coefficient (float) – lasso penalty on latent space \(\beta\) coefficients
kwargs – base keyword arguments, see torchdms.model.TorchdmsModel
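The exact Hill parameterization is not spelled out here, so the following sketch shows only the product structure across epitopes, with a standard sigmoid standing in for the Hill function (an assumption of this sketch):

    import torch

    def hill(z):
        # Stand-in sigmoid-shaped Hill function mapping a latent score to
        # (0, 1); the parameterization torchdms actually uses may differ.
        return torch.sigmoid(z)

    def escape_fraction(z):
        # z: (batch, num_epitopes). The product of per-epitope Hill
        # functions keeps the predicted escape fraction in (0, 1).
        return hill(z).prod(dim=1, keepdim=True)

    y = escape_fraction(torch.randn(4, 3))  # e.g. 4 variants, 3 epitopes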
- property characteristics: Dict¶
Salient characteristics of the model that aren’t represented in the PyTorch description.
- Return type
Dict
- to_latent(x, **kwargs)[source]¶
Latent space representation \(Z\)
\[z \equiv \phi(x) \equiv \beta^\intercal x\]
- from_latent_to_output(z, concentrations=None)[source]¶
Evaluate the mapping from the latent space to output.
\[y = g(z)\]
- regularization_loss()[source]¶
Lasso penalty of single mutant effects, a scalar-valued torch.Tensor.
- class torchdms.model.FullyConnected(layer_sizes, activations, *args, beta_l1_coefficient=0.0, interaction_l1_coefficient=0.0, **kwargs)[source]¶
A flexible fully connected neural network model.
Todo
schematic image
- Parameters
layer_sizes (List[int]) – Sequence of widths for each layer between input and output.
activations (List[Callable]) – Corresponding activation functions for each layer. The first layer with None or nn.Identity() activation is the latent space.
Todo
allowable activation function names?
args – base positional arguments, see torchdms.model.TorchdmsModel
beta_l1_coefficient (float) – lasso penalty on latent space \(\beta\) coefficients
interaction_l1_coefficient (float) – lasso penalty on parameters in the pre-latent interaction layer(s)
kwargs – base keyword arguments, see torchdms.model.TorchdmsModel
Example
With layer_sizes = [10, 2, 10, 10] and activations = ["relu", None, "relu", "relu"] we have a latent space of 2 nodes, feeding into two more dense layers, each with 10 nodes, before the output. Layers before the latent layer form a nonlinear module for site-wise interactions, in this case one layer of 10 nodes. The latent space has skip connections directly from the input layer, so we always model single-mutation effects; a construction sketch follows the __init__ signature below.
- __init__(layer_sizes, activations, *args, beta_l1_coefficient=0.0, interaction_l1_coefficient=0.0, **kwargs)[source]¶
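Following the example above, a hedged construction sketch (the positional arguments after activations are forwarded to torchdms.model.TorchdmsModel; the sizes, target names, and alphabet here are illustrative):

    from torchdms.model import FullyConnected

    input_size = 500                     # hypothetical L * |A|
    alphabet = "ACDEFGHIKLMNPQRSTVWY"    # hypothetical amino acid alphabet
    model = FullyConnected(
        [10, 2, 10, 10],                 # layer_sizes
        ["relu", None, "relu", "relu"],  # activations; None marks the latent layer
        input_size,
        ["binding", "stability"],        # target_names
        alphabet,
    )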
- property characteristics: Dict¶
Salient characteristics of the model that aren’t represented in the PyTorch description.
- Return type
Dict
- fix_gauge(gauge_mask)[source]¶
Perform gauge-fixing procedure on latent space parameters.
- Parameters
gauge_mask – 0/1 mask array, the same shape as the latent space input, with 1s for parameters that should be zeroed.
- to_latent(x, **kwargs)[source]¶
Latent space representation \(Z\)
\[z \equiv \phi(x) \equiv \beta^\intercal x\]
- from_latent_to_output(z, **kwargs)[source]¶
Evaluate the mapping from the latent space to output.
\[y = g(z)\]
- beta_coefficients()[source]¶
Beta coefficients (single mutant effects only, no interaction terms)
- Return type
Tensor
- regularization_loss()[source]¶
Lasso penalty of single mutant effects and pre-latent interaction weights, a scalar-valued torch.Tensor.
- Return type
Tensor
- class torchdms.model.Independent(layer_sizes, activations, *args, beta_l1_coefficients=(0.0, 0.0), interaction_l1_coefficients=(0.0, 0.0), **kwargs)[source]¶
Parallel and independent submodules of type torchdms.model.FullyConnected for each of two output dimensions.
Todo
schematic image
- Parameters
layer_sizes (List[int]) – Sequence of widths for each layer between input and output for both submodules.
activations (List[Callable]) – Corresponding activation functions for each layer for both submodules. The first layer with None or nn.Identity() activation is the latent space.
Todo
allowable activation function names?
args – base positional arguments, see torchdms.model.TorchdmsModel
beta_l1_coefficients (Tuple[float]) – lasso penalties on latent space \(\beta\) coefficients for each submodule
interaction_l1_coefficients (Tuple[float]) – lasso penalties on parameters in the pre-latent interaction layer(s) for each submodule
kwargs – base keyword arguments, see torchdms.model.TorchdmsModel
- __init__(layer_sizes, activations, *args, beta_l1_coefficients=(0.0, 0.0), interaction_l1_coefficients=(0.0, 0.0), **kwargs)[source]¶
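As with FullyConnected above, a hedged construction sketch (argument order follows the signature; sizes and names are illustrative, reusing input_size and alphabet from the earlier sketch):

    from torchdms.model import Independent

    # Two parallel FullyConnected submodules, one per output dimension,
    # each with its own lasso penalty on its beta coefficients.
    model = Independent(
        [10, 2, 10, 10],                 # layer_sizes, shared by both submodules
        ["relu", None, "relu", "relu"],  # activations, shared by both submodules
        input_size,
        ["binding", "stability"],        # two output dimensions
        alphabet,
        beta_l1_coefficients=(0.1, 0.1),
    )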
- property characteristics: Dict¶
Salient characteristics of the model that aren’t represented in the PyTorch description.
- Return type
Dict
- from_latent_to_output(z, **kwargs)[source]¶
Evaluate the mapping from the latent space to output.
\[y = g(z)\]
- to_latent(x, **kwargs)[source]¶
Latent space representation \(Z\)
\[z \equiv \phi(x) \equiv \beta^\intercal x\]
- beta_coefficients()[source]¶
Beta coefficients; skip-connection coefficients only if the submodules have an interaction module.
- Return type
Tensor
- regularization_loss()[source]¶
Compute regularization penalty, a scalar-valued torch.Tensor.
- Return type
Tensor
- class torchdms.model.Conditional(*args, **kwargs)[source]¶
Allows the latent space for the second output feature (i.e. stability) to feed forward into the network for the first output feature (i.e. binding).
This requires a one-dimensional latent space (see torchdms.model.Conditional.from_latent_to_output() to see why).
Todo
schematic image: https://user-images.githubusercontent.com/1173298/89943302-d4524d00-dbd2-11ea-827d-6ad6c238ff52.png
- Parameters
args – positional arguments, see torchdms.model.Independent
kwargs – keyword arguments, see torchdms.model.Independent
- class torchdms.model.ConditionalSequential(*args, **kwargs)[source]¶
Conditional with sequential training: stability then binding.
Todo
schematic image
- Parameters
args – positional arguments, see torchdms.model.Conditional
kwargs – keyword arguments, see torchdms.model.Conditional