Welcome to Thermite, a CLI generator for Python.
The main features this package provides:
- Run any Python function or class that has type annotations
- Use docstrings as the source of help
- No need to change the signature of existing functions to customize the CLI
- Allow classes as parameter annotations in functions; these are translated into grouped options
- Allow custom classes to be used as type annotations
- Change CLI defaults via YAML or JSON definitions (an easy way to use configuration files with CLIs)
- Extend functionality through a plugin interface (e.g. the help itself is just a plugin)
Installation
The package is available on PyPI, so it can be installed with pip (assuming the distribution name matches the import name):
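# assuming the PyPI distribution is published under the import name
pip install thermite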
Getting started
For any function, class or instance, just pass it to the run function and the package does the rest. A minimal sketch (the add function below is only a hypothetical placeholder; run is the same entry point used in the examples further down):
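from thermite import run


def add(a: int, b: int) -> int:
    """Add two numbers.

    Args:
        a: First summand
        b: Second summand
    """
    return a + b


if __name__ == "__main__":
    run(add)

Invoking the script with --help should then show options generated from the signature, with the docstring entries as help text.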
Motivating example
As a motivating example, I wanted to use a typical piece of code as it is often found in machine learning today: the repo of the SimCLR model, specifically the implementation of the CLI in run.py. It is a well-written repo and, like many others of its kind, provides a command line interface using argparse. This is very common and works well, but comes with certain drawbacks:
- The variables end up inside a namespace, without typing. In general, it is somewhat opaque where the variables are being used.
- The variables are documented in the argparse definitions, not in docstrings. To maintain proper documentation, it would have to be written twice.
- The options are all ungrouped in one big list, which makes it hard to tell which parameter is intended for which part of the algorithm.
With thermite, it is easy to arrange this differently. We can keep the configuration in a dataclass or even in nested sub-dataclasses. In this case, we attach the training code as a method to the dataclass, but we could also just use a function.
As a result, a single line (or a few) turns properly written and documented code into a CLI. The highlight here is the automatic support for classes as parameters, which results in grouped options. Especially for machine learning models, which can have very many parameters, this helps the user see what each parameter is for and encourages the coder to separate parameters by their use.
Below are the help output and the code for both the argparse and the thermite versions.
> python examples/adv/simclr_argparse.py --help
usage: simclr_argparse.py [-h] [-data DIR] [-dataset-name {stl10,cifar10}]
                          [-a ARCH] [-j N] [--epochs N] [-b N] [--lr LR]
                          [--wd W] [--seed SEED] [--disable-cuda]
                          [--fp16-precision] [--out_dim OUT_DIM]
                          [--log-every-n-steps LOG_EVERY_N_STEPS]
                          [--temperature TEMPERATURE] [--n-views N]
                          [--gpu-index GPU_INDEX]

PyTorch SimCLR

options:
  -h, --help            show this help message and exit
  -data DIR             path to dataset
  -dataset-name {stl10,cifar10}
                        dataset name
  -a ARCH, --arch ARCH  model architecture: resnet18 | resnet50 (default:
                        resnet50)
  -j N, --workers N     number of data loading workers (default: 32)
  --epochs N            number of total epochs to run
  -b N, --batch-size N  mini-batch size (default: 256), this is the total
                        batch size of all GPUs on the current node when using
                        Data Parallel or Distributed Data Parallel
  --lr LR, --learning-rate LR
                        initial learning rate
  --wd W, --weight-decay W
                        weight decay (default: 1e-4)
  --seed SEED           seed for initializing training.
  --disable-cuda        Disable CUDA
  --fp16-precision      Whether or not to use 16-bit precision GPU training.
  --out_dim OUT_DIM     feature dimension (default: 128)
  --log-every-n-steps LOG_EVERY_N_STEPS
                        Log every n steps
  --temperature TEMPERATURE
                        softmax temperature (default: 0.07)
  --n-views N           Number of views for contrastive learning training.
  --gpu-index GPU_INDEX
                        Gpu index.
> python examples/adv/simclr.py --help
PyTorch SimCLR training
Usage: examples/adv/simclr.py [OPTIONS] SUBCOMMAND
╭─ Eager Callbacks ────────────────────────────────────────────────────────────────────────────────╮
│ --help Display the help message │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --data Path datasets Path to dataset │
│ --dataset-name Literal['stl10', stl10 Name of the dataset to │
│ 'cifar10'] use │
│ --arch Literal['resnet18', resnet50 Model architectures │
│ 'resnet50'] │
│ --workers int 12 Number of data loading │
│ workers │
│ --epochs int 200 Number of epochs to run │
│ --batch-size int 256 Mini-batch-size. Total of │
│ all GPUs on a node │
│ --learning-rate float 0.0003 Initial learning rate │
│ --weight-decay float 0.0001 Optimizer weight decay │
│ --seed int Seed for initializing │
│ training │
│ --fp16-precision bool False Whether or not to use │
│ 16bit GPU precision │
│ --no-fp16-precision bool │
│ --disable-cuda bool False Disable CUDA │
│ --no-disable-cuda bool │
│ --out-dim int 128 Feature Dimension of │
│ SimCLR projection │
│ --log-every-n-steps int 100 Number of steps between │
│ logging │
│ --temperature float 0.07 Softmax temperature │
│ --n-views int 2 Number of views for │
│ contrastive learning │
│ --gpu-index int 0 Gpu index │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ train Training the model. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
> python examples/adv/simclr_nested.py --help
PyTorch SimCLR training
Usage: examples/adv/simclr_nested.py SUBCOMMAND
╭─ Eager Callbacks ────────────────────────────────────────────────────────────────────────────────╮
│ --help Display the help message │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ ╭─ data ───────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ Data description │ │
│ │ --data-path Path datasets │ │
│ │ --data-name Literal['stl10', 'cifar10'] stl10 │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ ╭─ train_vars ─────────────────────────────────────────────────────────────────────────────────╮ │
│ │ Config for training │ │
│ │ --workers, -j int 12 Number of data loading workers │ │
│ │ --epochs int 200 Number of epochs to run │ │
│ │ --batch-size, -b int 256 Mini-batch-size. Total of all GPUs on a │ │
│ │ node │ │
│ │ --learning-rate, --lr float 0.0003 Initial learning rate │ │
│ │ --weight-decay, --wd float 0.0001 Optimizer weight decay │ │
│ │ --seed int Seed for initializing training │ │
│ │ --log-every-n-steps int 100 Number of steps between logging │ │
│ │ --temperature float 0.07 Softmax temperature │ │
│ │ --n-views int 2 Number of views for contrastive learning │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ ╭─ gpu ────────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ GPU settings │ │
│ │ --gpu-fp16-precision bool False Whether or not to use 16bit GPU precision │ │
│ │ --no-gpu-fp16-precision bool │ │
│ │ --gpu-disable-cuda bool False Disable CUDA │ │
│ │ --no-gpu-disable-cuda bool │ │
│ │ --gpu-index int 0 │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ ╭─ model ──────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ Model description │ │
│ │ --model-arch, -a Literal['resnet18', resnet50 Model architectures │ │
│ │ 'resnet50'] │ │
│ │ --model-out-dim int 128 Feature Dimension of │ │
│ │ SimCLR projection │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ train Training the model. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
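The code behind these CLIs follows: first the original argparse implementation, then the thermite version with a single dataclass, and finally the nested variant.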
"""
Code taken and adapted from:
https://github.com/sthalles/SimCLR/blob/master/run.py
Original code under MIT license.
"""
import argparse

model_names = ["resnet18", "resnet50"]

parser = argparse.ArgumentParser(description="PyTorch SimCLR")
parser.add_argument(
    "-data", metavar="DIR", default="./datasets", help="path to dataset"
)
parser.add_argument(
    "-dataset-name", default="stl10", help="dataset name", choices=["stl10", "cifar10"]
)
parser.add_argument(
    "-a",
    "--arch",
    metavar="ARCH",
    default="resnet18",
    choices=model_names,
    help="model architecture: " + " | ".join(model_names) + " (default: resnet50)",
)
parser.add_argument(
    "-j",
    "--workers",
    default=12,
    type=int,
    metavar="N",
    help="number of data loading workers (default: 32)",
)
parser.add_argument(
    "--epochs", default=200, type=int, metavar="N", help="number of total epochs to run"
)
parser.add_argument(
    "-b",
    "--batch-size",
    default=256,
    type=int,
    metavar="N",
    help="mini-batch size (default: 256), this is the total "
    "batch size of all GPUs on the current node when "
    "using Data Parallel or Distributed Data Parallel",
)
parser.add_argument(
    "--lr",
    "--learning-rate",
    default=0.0003,
    type=float,
    metavar="LR",
    help="initial learning rate",
    dest="lr",
)
parser.add_argument(
    "--wd",
    "--weight-decay",
    default=1e-4,
    type=float,
    metavar="W",
    help="weight decay (default: 1e-4)",
    dest="weight_decay",
)
parser.add_argument(
    "--seed", default=None, type=int, help="seed for initializing training. "
)
parser.add_argument("--disable-cuda", action="store_true", help="Disable CUDA")
parser.add_argument(
    "--fp16-precision",
    action="store_true",
    help="Whether or not to use 16-bit precision GPU training.",
)
parser.add_argument(
    "--out_dim", default=128, type=int, help="feature dimension (default: 128)"
)
parser.add_argument(
    "--log-every-n-steps", default=100, type=int, help="Log every n steps"
)
parser.add_argument(
    "--temperature",
    default=0.07,
    type=float,
    help="softmax temperature (default: 0.07)",
)
parser.add_argument(
    "--n-views",
    default=2,
    type=int,
    metavar="N",
    help="Number of views for contrastive learning training.",
)
parser.add_argument("--gpu-index", default=0, type=int, help="Gpu index.")


def main():
    args = parser.parse_args()
    # production code would follow here


if __name__ == "__main__":
    main()
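The same CLI written with thermite (examples/adv/simclr.py), with all configuration in a single dataclass and the training code attached as a method: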
from dataclasses import dataclass
from pathlib import Path
from typing import Literal

from thermite import run


@dataclass(kw_only=True)
class PytorchSimCLR:
    """
    PyTorch SimCLR training

    Args:
        data: Path to dataset
        dataset_name: Name of the dataset to use
        arch: Model architectures
        workers: Number of data loading workers
        epochs: Number of epochs to run
        batch_size: Mini-batch-size. Total of all GPUs on a node
        learning_rate: Initial learning rate
        weight_decay: Optimizer weight decay
        seed: Seed for initializing training
        fp16_precision: Whether or not to use 16bit GPU precision
        disable_cuda: Disable CUDA
        out_dim: Feature Dimension of SimCLR projection
        log_every_n_steps: Number of steps between logging
        temperature: Softmax temperature
        n_views: Number of views for contrastive learning
        gpu_index: Gpu index
    """

    data: Path = Path("./datasets")
    dataset_name: Literal["stl10", "cifar10"] = "stl10"
    arch: Literal["resnet18", "resnet50"] = "resnet50"
    workers: int = 12
    epochs: int = 200
    batch_size: int = 256
    learning_rate: float = 0.0003
    weight_decay: float = 1e-4
    seed: int
    fp16_precision: bool = False
    disable_cuda: bool = False
    out_dim: int = 128
    log_every_n_steps: int = 100
    temperature: float = 0.07
    n_views: int = 2
    gpu_index: int = 0

    def train(self):
        """Training the model."""
        ...


if __name__ == "__main__":
    run(PytorchSimCLR)
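The nested variant (examples/adv/simclr_nested.py), splitting the configuration into grouped sub-dataclasses and restoring the short option aliases through callbacks: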
from dataclasses import dataclass
from pathlib import Path
from typing import Literal

from thermite import Config, Event, run
from thermite.pp_utils import multi_extend, multi_str_replace, pg_trigger_map


@dataclass(kw_only=True)
class GPU:
    """
    GPU settings

    Args:
        fp16_precision: Whether or not to use 16bit GPU precision
        disable_cuda: Disable CUDA
        index: Gpu index
    """

    fp16_precision: bool = False
    disable_cuda: bool = False
    index: int = 0


@dataclass(kw_only=True)
class Model:
    """
    Model description

    Args:
        arch: Model architectures
        out_dim: Feature Dimension of SimCLR projection
    """

    arch: Literal["resnet18", "resnet50"] = "resnet50"
    out_dim: int = 128


@dataclass(kw_only=True)
class Data:
    """
    Data description

    Args:
        path: Path to dataset
        name: Name of the dataset to use
    """

    path: Path = Path("./datasets")
    name: Literal["stl10", "cifar10"] = "stl10"


@dataclass(kw_only=True)
class Training:
    """
    Config for training

    Args:
        workers: Number of data loading workers
        epochs: Number of epochs to run
        batch_size: Mini-batch-size. Total of all GPUs on a node
        learning_rate: Initial learning rate
        weight_decay: Optimizer weight decay
        seed: Seed for initializing training
        log_every_n_steps: Number of steps between logging
        temperature: Softmax temperature
        n_views: Number of views for contrastive learning
    """

    workers: int = 12
    epochs: int = 200
    batch_size: int = 256
    learning_rate: float = 0.0003
    weight_decay: float = 1e-4
    seed: int
    log_every_n_steps: int = 100
    temperature: float = 0.07
    n_views: int = 2


@dataclass(kw_only=True)
class PytorchSimCLR:
    """
    PyTorch SimCLR training

    Args:
        data: Dataset config
        train_vars: Training config
        gpu: GPU settings
        model: Model description
    """

    data: Data
    train_vars: Training
    gpu: GPU
    model: Model

    def train(self):
        """Training the model."""
        ...


if __name__ == "__main__":
    config = Config()
    config.event_cb_deco(Event.PG_POST_CREATE, PytorchSimCLR)(
        pg_trigger_map(multi_str_replace({"--train-vars-": "--"}))
    )
    config.event_cb_deco(Event.PG_POST_CREATE, PytorchSimCLR)(
        pg_trigger_map(
            multi_extend(
                {
                    "--model-arch": "-a",
                    "--workers": "-j",
                    "--batch-size": "-b",
                    "--learning-rate": "--lr",
                    "--weight-decay": "--wd",
                }
            )
        )
    )
    run(PytorchSimCLR, config=config)
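The two PG_POST_CREATE callbacks post-process the generated parameter groups: the first strips the --train-vars- prefix so the training options keep their plain names, and the second attaches the familiar short aliases (-a, -j, -b, --lr, --wd), as visible in the nested help output above. None of this requires touching the dataclass definitions themselves.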
Customization options
The package offers many customization options, enabled through the plugin system. More information is available on the other pages of this documentation.
Examples of common customizations
For various examples on how to customize the CLI, please see the Table of Contents on the side.
Bash completion
Not yet implemented. The plan is to have a JSON specification of the core of the commands, which bash will then read with only minimal dependencies, so that loading the completion is fast even if the CLI itself is slow to start due to heavy dependencies (e.g. pytorch).
Other CLI generators
There are already lots of CLI generators for Python, many widely used and with great functionality, that have inspired this package. Check them out:
- argparse
- click
- typer
- fire
- docopt