Welcome to Thermite, a CLI generator

These are the main things that this package provides:

  • Run any Python function or class that has type annotations
  • Use docstrings as the source of help
  • Require no changes to the signatures of existing functions for customization
  • Allow classes as parameter annotations in functions, translated into grouped options (see the sketch after this list)
  • Allow custom classes to be used as type annotations
  • Provide the possibility to change the defaults in the CLI using YAML or JSON definitions (an easy way to use configuration files with CLIs)
  • Provide a plugin interface to extend functionality (e.g. the help itself is just a plugin)
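
To illustrate the grouped options, here is a minimal sketch. The dataclass ServerOpts and the function serve are invented for illustration; judging by the nested SimCLR example further below, the fields of ServerOpts should surface as grouped options such as --opts-host and --opts-port:

from dataclasses import dataclass

from thermite import run


@dataclass(kw_only=True)
class ServerOpts:
    """
    Server settings.

    Args:
        host: Host to bind to
        port: Port to listen on
    """

    host: str = "localhost"
    port: int = 8080


def serve(opts: ServerOpts, verbose: bool = False):
    """
    Start a (hypothetical) server.

    Args:
        opts: Server settings
        verbose: Enable verbose output
    """
    ...


if __name__ == "__main__":
    run(serve)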

Installation

The package is available on PyPI, so it can be installed with

pip install thermite

Getting started

For any function, class or instance, just use

from thermite import run

if __name__ == "__main__":
    run(obj)  # obj: any annotated function, class, or instance

and the package does the rest.
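
For example, a function with type annotations and a docstring becomes a CLI without any changes to the function itself. The function repeat below is purely illustrative; its name and parameters are made up for this sketch, and the docstring should double as the --help text:

from thermite import run


def repeat(text: str, times: int = 1):
    """
    Print a piece of text several times.

    Args:
        text: The text to print
        times: How often to print it
    """
    for _ in range(times):
        print(text)


if __name__ == "__main__":
    run(repeat)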

Motivating example

As a motivating example, I wanted to use a typical piece of code as you often find it in machine learning today: this repo of the SimCLR model, and specifically the implementation of the CLI in run.py. It is a well written repo and, like many others of its type, provides a command line interface using argparse. This is very normal and it works well, but it comes with certain drawbacks:

  • The variables end up in a namespace, without typing. In general it is somewhat opaque where the variables are being used.
  • The documentation of the variables lives in the argparse definition, not in docstrings. To maintain proper documentation, it would need to be written twice.
  • The options are all ungrouped in one big list, which makes it hard to distinguish which parameter is intended for which part of the algorithm.

With thermite, it is easy to arrange this differently. We can keep the configuration in a dataclass or even nested sub-dataclasses. In this case, we attach the training code as a method to the dataclass, but we could also just use a function.

As a result, with a single line or just a few lines, properly written and documented code is turned into a CLI. Here we can especially see the highlight of automatically supported classes as parameters, which results in grouped options. Especially for machine learning models, which can have very many parameters, this helps the user distinguish what the parameters are for and encourages the coder to separate parameters by their use.

Help output and code for argparse and thermite:
> python examples/adv/simclr_argparse.py --help
usage: simclr_argparse.py [-h] [-data DIR] [-dataset-name {stl10,cifar10}]
                          [-a ARCH] [-j N] [--epochs N] [-b N] [--lr LR]
                          [--wd W] [--seed SEED] [--disable-cuda]
                          [--fp16-precision] [--out_dim OUT_DIM]
                          [--log-every-n-steps LOG_EVERY_N_STEPS]
                          [--temperature TEMPERATURE] [--n-views N]
                          [--gpu-index GPU_INDEX]

PyTorch SimCLR

options:
  -h, --help            show this help message and exit
  -data DIR             path to dataset
  -dataset-name {stl10,cifar10}
                        dataset name
  -a ARCH, --arch ARCH  model architecture: resnet18 | resnet50 (default:
                        resnet50)
  -j N, --workers N     number of data loading workers (default: 32)
  --epochs N            number of total epochs to run
  -b N, --batch-size N  mini-batch size (default: 256), this is the total
                        batch size of all GPUs on the current node when using
                        Data Parallel or Distributed Data Parallel
  --lr LR, --learning-rate LR
                        initial learning rate
  --wd W, --weight-decay W
                        weight decay (default: 1e-4)
  --seed SEED           seed for initializing training.
  --disable-cuda        Disable CUDA
  --fp16-precision      Whether or not to use 16-bit precision GPU training.
  --out_dim OUT_DIM     feature dimension (default: 128)
  --log-every-n-steps LOG_EVERY_N_STEPS
                        Log every n steps
  --temperature TEMPERATURE
                        softmax temperature (default: 0.07)
  --n-views N           Number of views for contrastive learning training.
  --gpu-index GPU_INDEX
                        Gpu index.
> python examples/adv/simclr.py --help
PyTorch SimCLR training

Usage: examples/adv/simclr.py [OPTIONS] SUBCOMMAND

╭─ Eager Callbacks ────────────────────────────────────────────────────────────────────────────────╮
│   --help     Display the help message                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│   --data                  Path                          datasets     Path to dataset             │
│   --dataset-name          Literal['stl10',              stl10        Name of the dataset to      │
│                           'cifar10']                                 use                         │
│   --arch                  Literal['resnet18',           resnet50     Model architectures         │
│                           'resnet50']                                                            │
│   --workers               int                           12           Number of data loading      │
│                                                                      workers                     │
│   --epochs                int                           200          Number of epochs to run     │
│   --batch-size            int                           256          Mini-batch-size. Total of   │
│                                                                      all GPUs on a node          │
│   --learning-rate         float                         0.0003       Initial learning rate       │
│   --weight-decay          float                         0.0001       Optimizer weight decay      │
│   --seed                  int                                        Seed for initializing       │
│                                                                      training                    │
│   --fp16-precision        bool                          False        Whether or not to use       │
│                                                                      16bit GPU precision         │
│   --no-fp16-precision     bool                                                                   │
│   --disable-cuda          bool                          False        Disable CUDA                │
│   --no-disable-cuda       bool                                                                   │
│   --out-dim               int                           128          Feature Dimension of        │
│                                                                      SimCLR projection           │
│   --log-every-n-steps     int                           100          Number of steps between     │
│                                                                      logging                     │
│   --temperature           float                         0.07         Softmax temperature         │
│   --n-views               int                           2            Number of views for         │
│                                                                      contrastive learning        │
│   --gpu-index             int                           0            Gpu index                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ train  Training the model.                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
> python examples/adv/simclr_nested.py --help
PyTorch SimCLR training

Usage: examples/adv/simclr_nested.py SUBCOMMAND

╭─ Eager Callbacks ────────────────────────────────────────────────────────────────────────────────╮
│   --help     Display the help message                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ ╭─ data ───────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ Data description                                                                             │ │
│ │   --data-path     Path                            datasets                                   │ │
│ │   --data-name     Literal['stl10', 'cifar10']     stl10                                      │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ ╭─ train_vars ─────────────────────────────────────────────────────────────────────────────────╮ │
│ │ Config for training                                                                          │ │
│ │   --workers, -j             int       12         Number of data loading workers              │ │
│ │   --epochs                  int       200        Number of epochs to run                     │ │
│ │   --batch-size, -b          int       256        Mini-batch-size. Total of all GPUs on a     │ │
│ │                                                  node                                        │ │
│ │   --learning-rate, --lr     float     0.0003     Initial learning rate                       │ │
│ │   --weight-decay, --wd      float     0.0001     Optimizer weight decay                      │ │
│ │   --seed                    int                  Seed for initializing training              │ │
│ │   --log-every-n-steps       int       100        Number of steps between logging             │ │
│ │   --temperature             float     0.07       Softmax temperature                         │ │
│ │   --n-views                 int       2          Number of views for contrastive learning    │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ ╭─ gpu ────────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ GPU settings                                                                                 │ │
│ │   --gpu-fp16-precision        bool     False     Whether or not to use 16bit GPU precision   │ │
│ │   --no-gpu-fp16-precision     bool                                                           │ │
│ │   --gpu-disable-cuda          bool     False     Disable CUDA                                │ │
│ │   --no-gpu-disable-cuda       bool                                                           │ │
│ │   --gpu-index                 int      0                                                     │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ ╭─ model ──────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ Model description                                                                            │ │
│ │   --model-arch, -a     Literal['resnet18',           resnet50     Model architectures        │ │
│ │                        'resnet50']                                                           │ │
│ │   --model-out-dim      int                           128          Feature Dimension of       │ │
│ │                                                                   SimCLR projection          │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ train  Training the model.                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
"""
Code taken and adapted from: 
https://github.com/sthalles/SimCLR/blob/master/run.py
Original code under MIT license.
"""
import argparse

model_names = ["resnet18", "resnet50"]


parser = argparse.ArgumentParser(description="PyTorch SimCLR")
parser.add_argument(
    "-data", metavar="DIR", default="./datasets", help="path to dataset"
)
parser.add_argument(
    "-dataset-name", default="stl10", help="dataset name", choices=["stl10", "cifar10"]
)
parser.add_argument(
    "-a",
    "--arch",
    metavar="ARCH",
    default="resnet18",
    choices=model_names,
    help="model architecture: " + " | ".join(model_names) + " (default: resnet50)",
)
parser.add_argument(
    "-j",
    "--workers",
    default=12,
    type=int,
    metavar="N",
    help="number of data loading workers (default: 32)",
)
parser.add_argument(
    "--epochs", default=200, type=int, metavar="N", help="number of total epochs to run"
)
parser.add_argument(
    "-b",
    "--batch-size",
    default=256,
    type=int,
    metavar="N",
    help="mini-batch size (default: 256), this is the total "
    "batch size of all GPUs on the current node when "
    "using Data Parallel or Distributed Data Parallel",
)
parser.add_argument(
    "--lr",
    "--learning-rate",
    default=0.0003,
    type=float,
    metavar="LR",
    help="initial learning rate",
    dest="lr",
)
parser.add_argument(
    "--wd",
    "--weight-decay",
    default=1e-4,
    type=float,
    metavar="W",
    help="weight decay (default: 1e-4)",
    dest="weight_decay",
)
parser.add_argument(
    "--seed", default=None, type=int, help="seed for initializing training. "
)
parser.add_argument("--disable-cuda", action="store_true", help="Disable CUDA")
parser.add_argument(
    "--fp16-precision",
    action="store_true",
    help="Whether or not to use 16-bit precision GPU training.",
)

parser.add_argument(
    "--out_dim", default=128, type=int, help="feature dimension (default: 128)"
)
parser.add_argument(
    "--log-every-n-steps", default=100, type=int, help="Log every n steps"
)
parser.add_argument(
    "--temperature",
    default=0.07,
    type=float,
    help="softmax temperature (default: 0.07)",
)
parser.add_argument(
    "--n-views",
    default=2,
    type=int,
    metavar="N",
    help="Number of views for contrastive learning training.",
)
parser.add_argument("--gpu-index", default=0, type=int, help="Gpu index.")


def main():
    args = parser.parse_args()

    # production code would follow here


if __name__ == "__main__":
    main()
The thermite version with a single flat dataclass (simclr.py):

from dataclasses import dataclass
from pathlib import Path
from typing import Literal

from thermite import run


@dataclass(kw_only=True)
class PytorchSimCLR:
    """
    PyTorch SimCLR training

    Args:
        data: Path to dataset
        dataset_name: Name of the dataset to use
        arch: Model architectures
        workers: Number of data loading workers
        epochs: Number of epochs to run
        batch_size: Mini-batch-size. Total of all GPUs on a node
        learning_rate: Initial learning rate
        weight_decay: Optimizer weight decay
        seed: Seed for initializing training
        fp16_precision: Whether or not to use 16bit GPU precision
        disable_cuda: Disable CUDA
        out_dim: Feature Dimension of SimCLR projection
        log_every_n_steps: Number of steps between logging
        temperature: Softmax temperature
        n_views: Number of views for contrastive learning
        gpu_index: Gpu index
    """

    data: Path = Path("./datasets")
    dataset_name: Literal["stl10", "cifar10"] = "stl10"
    arch: Literal["resnet18", "resnet50"] = "resnet50"
    workers: int = 12
    epochs: int = 200
    batch_size: int = 256
    learning_rate: float = 0.0003
    weight_decay: float = 1e-4
    seed: int
    fp16_precision: bool = False
    disable_cuda: bool = False
    out_dim: int = 128
    log_every_n_steps: int = 100
    temperature: float = 0.07
    n_views: int = 2
    gpu_index: int = 0

    def train(self):
        """Training the model."""
        ...


if __name__ == "__main__":
    run(PytorchSimCLR)
The thermite version with nested dataclasses and customized option names (simclr_nested.py):

from dataclasses import dataclass
from pathlib import Path
from typing import Literal

from thermite import Config, Event, run
from thermite.pp_utils import multi_extend, multi_str_replace, pg_trigger_map


@dataclass(kw_only=True)
class GPU:
    """
    GPU settings

    Args:
        fp16_precision: Whether or not to use 16bit GPU precision
        disable_cuda: Disable CUDA
        index: Gpu index

    """

    fp16_precision: bool = False
    disable_cuda: bool = False
    index: int = 0


@dataclass(kw_only=True)
class Model:
    """
    Model description

    Args:
        arch: Model architectures
        out_dim: Feature Dimension of SimCLR projection

    """

    arch: Literal["resnet18", "resnet50"] = "resnet50"
    out_dim: int = 128


@dataclass(kw_only=True)
class Data:
    """
    Data description

    Args:
        path: Path to dataset
        name: Name of the dataset to use

    """

    path: Path = Path("./datasets")
    name: Literal["stl10", "cifar10"] = "stl10"


@dataclass(kw_only=True)
class Training:
    """
    Config for training

    Args:
        workers: Number of data loading workers
        epochs: Number of epochs to run
        batch_size: Mini-batch-size. Total of all GPUs on a node
        learning_rate: Initial learning rate
        weight_decay: Optimizer weight decay
        seed: Seed for initializing training
        log_every_n_steps: Number of steps between logging
        temperature: Softmax temperature
        n_views: Number of views for contrastive learning

    """

    workers: int = 12
    epochs: int = 200
    batch_size: int = 256
    learning_rate: float = 0.0003
    weight_decay: float = 1e-4
    seed: int
    log_every_n_steps: int = 100
    temperature: float = 0.07
    n_views: int = 2


@dataclass(kw_only=True)
class PytorchSimCLR:
    """
    PyTorch SimCLR training

    Args:
        data: Dataset config
        train_vars: Config for training
        gpu: GPU settings
        model: Model description
    """

    data: Data
    train_vars: Training
    gpu: GPU
    model: Model

    def train(self):
        """Training the model."""
        ...


if __name__ == "__main__":
    config = Config()
    config.event_cb_deco(Event.PG_POST_CREATE, PytorchSimCLR)(
        pg_trigger_map(multi_str_replace({"--train-vars-": "--"}))
    )
    config.event_cb_deco(Event.PG_POST_CREATE, PytorchSimCLR)(
        pg_trigger_map(
            multi_extend(
                {
                    "--model-arch": "-a",
                    "--workers": "-j",
                    "--batch-size": "-b",
                    "--learning-rate": "--lr",
                    "--weight-decay": "--wd",
                }
            )
        )
    )
    run(PytorchSimCLR, config=config)

Customization options

The package offers many different customization options, enabled through the plugin system. More information is available on other pages of this documentation.
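
As a taste, the nested SimCLR example above already uses the Config object and event callbacks to alias and rename options. Below is a minimal sketch of the same pattern; the Settings dataclass is made up for illustration, and only calls that appear in the example above are used:

from dataclasses import dataclass

from thermite import Config, Event, run
from thermite.pp_utils import multi_extend, pg_trigger_map


@dataclass(kw_only=True)
class Settings:
    """
    Illustrative settings object.

    Args:
        verbose: Enable verbose output
    """

    verbose: bool = False

    def show(self):
        """Print the current settings."""
        print(self)


if __name__ == "__main__":
    config = Config()
    # add a short alias -v for the generated --verbose option
    config.event_cb_deco(Event.PG_POST_CREATE, Settings)(
        pg_trigger_map(multi_extend({"--verbose": "-v"}))
    )
    run(Settings, config=config)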

Examples of common customizations

For various examples of how to customize the CLI, please see the Table of Contents on the side.

Bash completion

Not yet implemented. The plan is to have a JSON specification of the core of the commands. This will then be processed by bash with only minimal dependencies, so that loading the completion is fast even if the CLI underneath is slow to start due to heavy dependencies (e.g. pytorch).

Other CLI generators

There are already lots of CLI generators for Python, many of them widely used and with great functionality, that have inspired this package. Check them out:

  • argparse
  • click
  • typer
  • fire
  • docopt