Command line interface¶
tdms¶
Train and evaluate neural networks on deep mutational scanning data.
tdms [OPTIONS] COMMAND [ARGS]...
Options
- -v, --version¶
Print version and exit. Note that as per git describe, the SHA is prefixed by a g.
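As a small illustration of the g prefix, the sketch below splits a git describe-style string into its tag, commit count, and SHA. The sample string and the helper name split_describe are made up for illustration; the exact string tdms prints may differ.

```python
import re

# Pattern for `git describe` output of the form <tag>-<commits>-g<sha>.
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<commits>\d+)-g(?P<sha>[0-9a-f]+)$")


def split_describe(version):
    """Split a git-describe style string; hypothetical helper, not part of tdms."""
    match = DESCRIBE_RE.match(version)
    if match is None:
        # No -<commits>-g<sha> suffix, e.g. when the checkout is exactly on a tag.
        return {"tag": version, "commits": 0, "sha": None}
    return {
        "tag": match.group("tag"),
        "commits": int(match.group("commits")),
        "sha": match.group("sha"),  # the leading "g" is not part of the SHA
    }


parts = split_describe("0.1.0-12-g1a2b3c4")  # made-up example string
print(parts)
```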
beta¶
Plot beta coefficients as a heatmap.
tdms beta [OPTIONS] MODEL_PATH DATA_PATH
Options
- --out <out>¶
Required
- --config <config>¶
Read configuration from FILE.
Arguments
- MODEL_PATH¶
Required argument
- DATA_PATH¶
Required argument
cartesian¶
Take the Cartesian product of the variable options in a config file, and put the results in an _output directory.
tdms cartesian [OPTIONS] CHOICE_JSON_PATH
Arguments
- CHOICE_JSON_PATH¶
Required argument
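The expansion can be pictured with itertools.product. This is only a sketch: the layout of the choice file (a JSON object mapping each option name to a list of candidate values), the option values, and the helper name expand_choices are assumptions, not the format tdms actually requires.

```python
import itertools
import json

# Hypothetical choice file contents: option name -> list of candidate values.
choice_json = '{"learning_rate": [0.001, 0.01], "batch_size": [250, 500]}'


def expand_choices(choices):
    """Return one dict per combination of the listed option values."""
    names = sorted(choices)
    value_lists = [choices[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


configs = expand_choices(json.loads(choice_json))
for config in configs:
    print(config)
```

Two options with two values each give four combinations; each combination would correspond to one generated configuration.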
create¶
Create a model.
See the documentation for each model for an example model string.
tdms create [OPTIONS] DATA_PATH OUT_PATH MODEL_STRING
Options
- --monotonic <monotonic>¶
If this option is used, the model is initialized with weights greater than zero. During training, tdms then puts a floor of 0 on all non-bias weights. It also multiplies the output by the value given as the option argument, so use -1.0 if you want the nonlinearity to be monotonically decreasing, or 1.0 if you want it to be increasing.
- --beta-l1-coefficients <beta_l1_coefficients>¶
Coefficients with which to l1-regularize beta coefficients, a comma-separated list of coefficients for each latent dimension.
- --interaction-l1-coefficients <interaction_l1_coefficients>¶
Coefficients with which to l1-regularize site interaction weights, a comma-separated list of coefficients for each latent dimension.
- --non-lin-bias, --no-non-lin-bias¶
- --output-bias, --no-output-bias¶
- --seed <seed>¶
Set random seed. Seed is uninitialized if not set.
- --config <config>¶
Read configuration from FILE.
Arguments
- DATA_PATH¶
Required argument
- OUT_PATH¶
Required argument
- MODEL_STRING¶
Required argument
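The idea behind --monotonic (floor the non-bias weights at 0, then multiply the output by the option value) can be sketched in plain Python. This is an illustration of the principle, not the tdms implementation; the one-hidden-layer network with ReLU units, its weights, and the function names are all hypothetical.

```python
def clamp_weights(weights, floor=0.0):
    """Raise every weight below `floor` up to `floor`."""
    return [max(floor, w) for w in weights]


def nonlinearity(x, hidden_w, hidden_b, out_w, out_b, sign):
    """One-hidden-layer network with ReLU units, scaled by sign (1.0 or -1.0)."""
    hidden = [max(0.0, w * x + b) for w, b in zip(hidden_w, hidden_b)]
    return sign * (sum(w * h for w, h in zip(out_w, hidden)) + out_b)


# Hypothetical weights after a gradient step pushed some of them below zero.
hidden_w = clamp_weights([0.8, -0.3, 1.5])
out_w = clamp_weights([1.2, 0.4, -0.1])
hidden_b = [0.1, -0.2, 0.0]  # biases are not clamped and may be negative
out_b = -0.5

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
increasing = [nonlinearity(x, hidden_w, hidden_b, out_w, out_b, 1.0) for x in xs]
decreasing = [nonlinearity(x, hidden_w, hidden_b, out_w, out_b, -1.0) for x in xs]
print(increasing)
print(decreasing)
```

With non-negative weights and a non-decreasing activation, the network output cannot decrease as the input grows, so the sign of the multiplier alone decides the direction.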
error¶
Evaluate and produce a plot of error.
tdms error [OPTIONS] MODEL_PATH DATA_PATH
Options
- --out <out>¶
Required
- --show-points¶
Show points in addition to LOWESS curves.
- --device <device>¶
- --include-details¶
Include details from config file in error summary.
- --config <config>¶
Read configuration from FILE.
Arguments
- MODEL_PATH¶
Required argument
- DATA_PATH¶
Required argument
evaluate¶
Evaluate the performance of a model.
Dumps a dictionary containing the results.
tdms evaluate [OPTIONS] MODEL_PATH DATA_PATH
Options
- --out <out>¶
Required
- --device <device>¶
- --config <config>¶
Read configuration from FILE.
Arguments
- MODEL_PATH¶
Required argument
- DATA_PATH¶
Required argument
geplot¶
Make a “global epistasis” plot showing the fit to the nonlinearity.
tdms geplot [OPTIONS] MODEL_PATH DATA_PATH
Options
- --steps <steps>¶
- Default
100
- --out <out>¶
Required
- --device <device>¶
- --config <config>¶
Read configuration from FILE.
Arguments
- MODEL_PATH¶
Required argument
- DATA_PATH¶
Required argument
go¶
Run a common sequence of commands: create, train, scatter, and beta.
Then touch a .sentinel file to signal successful completion.
tdms go [OPTIONS]
Options
- --config <config>¶
Read configuration from FILE.
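The order of steps can be sketched as a list of command lines built from the positional arguments documented on this page. How tdms go actually derives paths and options from the config file is not described here, so the paths, the output file names, the MODEL_STRING placeholder, and the sentinel file name below are all assumptions.

```python
import tempfile
from pathlib import Path


def go_steps(data_path, model_path, model_string, out_prefix):
    """List the four tdms invocations in the order the go command names them."""
    return [
        ["tdms", "create", data_path, model_path, model_string],
        ["tdms", "train", model_path, data_path],
        ["tdms", "scatter", "--out", out_prefix + ".scatter.pdf", model_path, data_path],
        ["tdms", "beta", "--out", out_prefix + ".beta.pdf", model_path, data_path],
    ]


# Placeholder paths and model string; substitute real values before running.
steps = go_steps("data.pkl", "model.model", "MODEL_STRING", "run")
for argv in steps:
    # A real driver could pass each argv to subprocess.run(argv, check=True).
    print(" ".join(argv))

# After every step succeeds, mark completion with an empty sentinel file.
with tempfile.TemporaryDirectory() as workdir:
    sentinel = Path(workdir) / "run.sentinel"
    sentinel.touch()
    sentinel_created = sentinel.exists()
```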
heatmap¶
Plot single mutant predictions as a heatmap.
Note/warning: because of the way we have set up the encoding, the heatmap values cannot be interpreted in a straightforward way.
tdms heatmap [OPTIONS] MODEL_PATH
Options
- --out <out>¶
Required
- --config <config>¶
Read configuration from FILE.
Arguments
- MODEL_PATH¶
Required argument
prep¶
Prepare data for training.
IN_PATH should point to a pickled pandas DataFrame containing the string-encoded aa_substitutions column along with any TARGETS you specify. OUT_PREFIX is the prefix of the pickle file to which the prepped data is dumped.
tdms prep [OPTIONS] IN_PATH OUT_PREFIX TARGETS...
Options
- --per-stratum-variants-for-test <per_stratum_variants_for_test>¶
This is the number of variants for each stratum to hold out for testing, with the same number used for validation. The rest of the examples will be used for training the model.
- Default
100
- --skip-stratum-if-count-is-smaller-than <skip_stratum_if_count_is_smaller_than>¶
If the total number of examples for any particular stratum is lower than this number, we throw out the stratum completely.
- Default
250
- --drop-nans¶
Drop all rows that contain a nan.
- --export-dataframe <export_dataframe>¶
Filename prefix for exporting the original dataframe in a .pkl file with an appended in_test column.
- --partition-by <partition_by>¶
Column name containing a feature by which the data should be split into independent datasets for partitioning; e.g. ‘library’.
- --train-on-all-single-mutants¶
Place all single-mutants into training set.
- --dry-run¶
Only print paths and files to be made, rather than actually making them.
- --seed <seed>¶
Set random seed. Seed is uninitialized if not set.
- --config <config>¶
Read configuration from FILE.
Arguments
- IN_PATH¶
Required argument
- OUT_PREFIX¶
Required argument
- TARGETS¶
Required argument(s)
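The hold-out scheme controlled by --per-stratum-variants-for-test and --skip-stratum-if-count-is-smaller-than can be sketched as follows. This is a simplified illustration, not the tdms code: how tdms defines a stratum is not stated here, so the sketch assumes each variant already carries a stratum label, and the function name and toy data are made up.

```python
import random
from collections import defaultdict


def split_by_stratum(variants, per_stratum_for_test=100, min_count=250, seed=0):
    """Assign each variant to 'test', 'val' or 'train', stratum by stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for variant, stratum in variants:
        by_stratum[stratum].append(variant)

    assignment = {}
    for members in by_stratum.values():
        if len(members) < min_count:
            continue  # stratum too small: throw it out completely
        rng.shuffle(members)
        n = per_stratum_for_test
        for variant in members[:n]:
            assignment[variant] = "test"
        for variant in members[n:2 * n]:
            assignment[variant] = "val"  # same number as for testing
        for variant in members[2 * n:]:
            assignment[variant] = "train"  # everything else
    return assignment


# Toy data: 300 variants in stratum 1, 40 in stratum 2 (below the threshold).
variants = [(f"v1_{i}", 1) for i in range(300)] + [(f"v2_{i}", 2) for i in range(40)]
assignment = split_by_stratum(variants)
counts = defaultdict(int)
for label in assignment.values():
    counts[label] += 1
print(dict(counts))
```

With the default values, the 300-variant stratum contributes 100 test, 100 validation, and 100 training examples, while the 40-variant stratum is dropped.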
profiles¶
Plot amino acid and site profiles from low-rank approximation.
tdms profiles [OPTIONS] MODEL_PATH DATA_PATH
Options
- --out <out>¶
Required
- --config <config>¶
Read configuration from FILE.
Arguments
- MODEL_PATH¶
Required argument
- DATA_PATH¶
Required argument
scatter¶
Evaluate and produce a scatter plot of observed vs predicted targets on the test set provided.
tdms scatter [OPTIONS] MODEL_PATH DATA_PATH
Options
- --out <out>¶
Required
- --device <device>¶
- --config <config>¶
Read configuration from FILE.
Arguments
- MODEL_PATH¶
Required argument
- DATA_PATH¶
Required argument
summarize¶
Report various summaries of the data.
tdms summarize [OPTIONS] DATA_PATH
Options
- --out-prefix <out_prefix>¶
If this option is set, make PDF plots summarizing the data.
- --config <config>¶
Read configuration from FILE.
Arguments
- DATA_PATH¶
Required argument
svd¶
Plot singular values of beta matrices.
tdms svd [OPTIONS] MODEL_PATH DATA_PATH
Options
- --out <out>¶
Required
- --config <config>¶
Read configuration from FILE.
Arguments
- MODEL_PATH¶
Required argument
- DATA_PATH¶
Required argument
train¶
Train a model, saving the trained model to its original location.
tdms train [OPTIONS] MODEL_PATH DATA_PATH
Options
- --loss-fn <loss_fn>¶
Loss function for training.
- Default
l1
- --loss-weight-span <loss_weight_span>¶
If this option is used, add a weight to a mean-absolute-deviation loss equal to the exponential of a loss decay times the true score.
- --batch-size <batch_size>¶
Batch size for training.
- Default
500
- --learning-rate <learning_rate>¶
Initial learning rate.
- Default
0.001
- --min-lr <min_lr>¶
Minimum learning rate before early stopping on training.
- Default
1e-05
- --patience <patience>¶
Patience for ReduceLROnPlateau.
- Default
10
- --device <device>¶
Device used to train the neural network.
- Default
cpu
- --independent-starts <independent_starts>¶
Number of independent training starts to use. Each training start gets trained independently and the best start is used for full training.
- Default
5
- --independent-start-epochs <independent_start_epochs>¶
How long to train each independent start. If not set, 10% of the full number of epochs is used.
- --simple-training¶
Ignore all fancy training options: do bare-bones training for a fixed number of epochs. Fail if data contains nans.
- --exp-target <exp_target>¶
Provide a base to be exponentiated by the functional scores of variants. This emphasizes fitting highly functional variants. If on, weight decay will be turned off.
- --beta-rank <beta_rank>¶
What number of dimensions to use in the low-rank reconstructions of betas.
- --epochs <epochs>¶
Number of epochs for full training.
- Default
100
- --site-path <site_path>¶
Path to .JSON file containing both site numbers and site numbers.
- --dry-run¶
Only print paths and files to be made, rather than actually making them.
- --seed <seed>¶
Set random seed. Seed is uninitialized if not set.
- --config <config>¶
Read configuration from FILE.
Arguments
- MODEL_PATH¶
Required argument
- DATA_PATH¶
Required argument
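The weighting described for --loss-weight-span can be sketched as a mean absolute deviation in which each example is weighted by the exponential of a decay constant times its true score. How tdms derives that decay constant from the option value is not stated here, so loss_decay is treated as a given number; the function name, the averaging over examples, and the toy scores are assumptions.

```python
import math


def weighted_mad(predicted, observed, loss_decay):
    """Mean absolute deviation with per-example weight exp(loss_decay * observed)."""
    total = 0.0
    for p, y in zip(predicted, observed):
        weight = math.exp(loss_decay * y)
        total += weight * abs(p - y)
    return total / len(observed)


observed = [-2.0, 0.0]   # toy functional scores
miss_low = [-1.0, 0.0]   # error of 1.0 on the low-scoring variant only
miss_high = [-2.0, 1.0]  # error of 1.0 on the high-scoring variant only

# With loss_decay = 0 every weight is 1, so both mistakes cost the same.
print(weighted_mad(miss_low, observed, loss_decay=0.0))
print(weighted_mad(miss_high, observed, loss_decay=0.0))

# With a positive loss_decay, the mistake on the high-scoring variant costs more.
print(weighted_mad(miss_low, observed, loss_decay=1.0))
print(weighted_mad(miss_high, observed, loss_decay=1.0))
```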
transfer¶
Transfer beta coefficients from one tdms model to another.
tdms transfer [OPTIONS] SOURCE_PATH DEST_PATH
Arguments
- SOURCE_PATH¶
Required argument
- DEST_PATH¶
Required argument
validate¶
Validate that a given data set is sane.
tdms validate [OPTIONS] DATA_PATH
Arguments
- DATA_PATH¶
Required argument