stable_pretraining.methods

The methods module provides 30 ready-to-use LightningModule subclasses, one per self-supervised learning (SSL) algorithm. Each class pre-wires the backbone, loss function, optimizer, and any required callbacks, so you can start training with minimal boilerplate.

All method classes are importable from the top-level namespace. In this snippet, backbone and projector stand for objects you have already constructed:

import stable_pretraining as spt

model = spt.SimCLR(backbone=backbone, projector=projector, temperature=0.1)

Or directly from the sub-package:

from stable_pretraining.methods import SimCLR, BYOL, DINO

See stable_pretraining.forward for the stateless forward-function equivalents and METHODS.md at the repository root for the complete method catalog.

Contrastive Methods

Methods that learn representations by contrasting positive and negative pairs, or by bootstrapping without explicit negatives.
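To make "contrasting positive and negative pairs" concrete, here is a minimal pure-Python sketch of an NT-Xent (InfoNCE-style) loss of the kind SimCLR is known for. It is an illustration only, not the library's implementation: the helper names `cosine` and `nt_xent` are hypothetical, and the default temperature simply mirrors the `temperature=0.1` used in the import example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nt_xent(view_a, view_b, temperature=0.1):
    """NT-Xent loss averaged over all 2N anchors (hypothetical helper).

    view_a[i] and view_b[i] are embeddings of two augmentations of sample i.
    """
    z = view_a + view_b              # concatenate into 2N embeddings
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)      # index of the other view of the same sample
        # Similarities to every other embedding (the positive and all negatives).
        logits = [cosine(z[i], z[j]) / temperature
                  for j in range(2 * n) if j != i]
        # Numerically stable log-sum-exp over those similarities.
        top = max(logits)
        lse = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the positive as the correct "class".
        total += lse - cosine(z[i], z[pos]) / temperature
    return total / (2 * n)

# Two samples, two views each: matching views point in similar directions.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[1.0, 0.1], [0.1, 1.0]]
aligned_loss = nt_xent(view_a, view_b)
# Swapping the second view pairs each anchor with the wrong sample.
mismatched_loss = nt_xent(view_a, [view_b[1], view_b[0]])
```

The loss is small when each embedding is closest to its own second view and grows when the views are mismatched. Bootstrapping methods such as BYOL and SimSiam drop the negatives from this picture entirely.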

SimCLR([encoder_name, projector_dims, ...])

SimCLR: contrastive joint-embedding self-supervised learning.

BYOL([encoder_name, projector_dims, ...])

BYOL: self-supervised learning with an EMA target network.

NNCLR([encoder_name, projector_dims, ...])

NNCLR: SimCLR with a nearest-neighbour queue.

MoCov2([encoder_name, projector_dims, ...])

MoCo v2 with a fixed-size FIFO queue of momentum-encoder keys.

MoCov3([encoder_name, projector_dims, ...])

MoCo v3: ViT-friendly momentum contrastive learning.

SimSiam([encoder_name, projector_dim, ...])

SimSiam: simple siamese SSL with stop-gradient.

PIRL([encoder_name, projector_dim, ...])

PIRL: jigsaw-invariant memory-bank SSL.

TiCO([encoder_name, projector_dims, beta, ...])

TiCO: transformation-invariance and covariance-contrast joint-embedding SSL.
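The MoCov2 entry above mentions a fixed-size FIFO queue of momentum-encoder keys. The sketch below shows only that data-structure idea, using the standard library's `collections.deque`. The `KeyQueue` class is a hypothetical helper for illustration, and it is an assumption that the library's own queue behaves this way.

```python
from collections import deque

class KeyQueue:
    """Fixed-size FIFO store of key embeddings (hypothetical helper)."""

    def __init__(self, max_size):
        # A deque with maxlen silently discards the oldest items when full.
        self._keys = deque(maxlen=max_size)

    def enqueue(self, batch_keys):
        """Append the newest batch of keys; the oldest keys fall off the front."""
        self._keys.extend(batch_keys)

    def negatives(self):
        """Return the currently stored keys, oldest first."""
        return list(self._keys)

queue = KeyQueue(max_size=4)
queue.enqueue([[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]])   # first batch: 3 keys
queue.enqueue([[0.3, 0.7], [0.6, 0.4]])               # second batch: 2 keys
stored = queue.negatives()                            # only the 4 newest remain
```

Keeping keys from earlier batches lets each query be contrasted against more negatives than a single batch contains.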

Feature Redundancy Reduction

Methods that learn representations by reducing redundancy or decorrelating feature dimensions rather than using explicit contrastive pairs.
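As a rough illustration of "decorrelating feature dimensions", the following pure-Python sketch computes a Barlow Twins-style objective: the cross-correlation matrix between two views is pushed toward the identity. The function names and the `off_diag_weight` default are illustrative assumptions, not the library's API or settings.

```python
import math

def standardize(batch):
    """Normalize each feature to zero mean and unit variance across the batch."""
    n, d = len(batch), len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for k in range(d):
        col = [row[k] for row in batch]
        mean = sum(col) / n
        # Fall back to 1.0 for a constant feature to avoid dividing by zero.
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n) or 1.0
        for i in range(n):
            out[i][k] = (col[i] - mean) / std
    return out

def cross_correlation(view_a, view_b):
    """d x d matrix of correlations between features of the two views."""
    a, b = standardize(view_a), standardize(view_b)
    n, d = len(a), len(a[0])
    return [[sum(a[i][p] * b[i][q] for i in range(n)) / n for q in range(d)]
            for p in range(d)]

def redundancy_reduction_loss(view_a, view_b, off_diag_weight=0.005):
    """Pull diagonal correlations toward 1 and off-diagonal ones toward 0."""
    c = cross_correlation(view_a, view_b)
    d = len(c)
    on_diag = sum((1.0 - c[k][k]) ** 2 for k in range(d))
    off_diag = sum(c[p][q] ** 2 for p in range(d) for q in range(d) if p != q)
    return on_diag + off_diag_weight * off_diag

# Features that vary independently across the batch (no redundancy).
decorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
# Both features carry the same signal (fully redundant).
redundant = [[1.0, 1.0], [1.0, 1.0], [-1.0, -1.0], [-1.0, -1.0]]
loss_decorrelated = redundancy_reduction_loss(decorrelated, decorrelated)
loss_redundant = redundancy_reduction_loss(redundant, redundant)
```

No negative pairs appear anywhere in this objective; the penalty on off-diagonal correlations is what discourages collapsed or duplicated features.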

VICReg([encoder_name, projector_dims, ...])

VICReg: variance-invariance-covariance self-supervised learning.

VICRegL([encoder_name, projector_dim, ...])

VICRegL: VICReg with an extra local-feature term.

BarlowTwins([encoder_name, projector_dims, ...])

Barlow Twins: self-supervised learning via cross-correlation redundancy reduction.

WMSE([encoder_name, projector_dims, eps, ...])

W-MSE: whitening + MSE between paired views.

Self-Distillation and Clustering

Methods that use momentum-updated teacher networks, self-distillation, or online clustering to learn representations without negative pairs.
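A momentum-updated (EMA) teacher can be sketched in a few lines: after each student step, every teacher weight moves a small fraction of the way toward the matching student weight. The `ema_update` function and the flat weight lists are hypothetical simplifications; how the library schedules or applies the momentum is not shown here.

```python
def ema_update(teacher, student, momentum=0.99):
    """Return teacher weights updated as t <- momentum * t + (1 - momentum) * s."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher, student)]

teacher = [0.0, 0.0]
student = [1.0, -1.0]
# A deliberately low momentum so the drift toward the student is easy to see.
for _ in range(3):
    teacher = ema_update(teacher, student, momentum=0.5)
```

The teacher receives no gradients; it changes only through this averaging, which is why it provides slowly moving targets for the student.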

DINO([encoder_name, projector_hidden_dim, ...])

DINO: self-distillation with multi-crop and an EMA teacher.

DINOv2([encoder_name, projector_hidden_dim, ...])

DINOv2: DINO + iBOT with Sinkhorn-Knopp on CLS and patch prototypes.

DINOv3([encoder_name, n_register_tokens, ...])

DINOv3: DINOv2 with register tokens + KoLeo.

iBOT([encoder_name, projector_hidden_dim, ...])

iBOT: DINO on CLS + masked patch self-distillation.

SwAV([encoder_name, projector_dims, ...])

SwAV: prototype-based online clustering for SSL.

MSN([encoder_name, projector_hidden_dim, ...])

MSN: masked siamese DINO-style SSL.

Data2Vec([encoder_name, top_k_blocks, ...])

data2vec for vision: predict EMA-teacher block-averaged features.
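The DINOv2 entry above names Sinkhorn-Knopp, which is also the balancing step commonly associated with SwAV-style online clustering. The sketch below shows the idea in pure Python: alternately rescale a sample-by-prototype matrix so that prototypes are used equally and each sample's assignment sums to 1. The function name, `epsilon`, and `n_iters` are illustrative assumptions, and scores are assumed to be bounded (for example, cosine similarities).

```python
import math

def sinkhorn_knopp(scores, epsilon=0.05, n_iters=3):
    """Balanced soft assignments from scores[i][k] (sample i, prototype k)."""
    n, k = len(scores), len(scores[0])
    top = max(max(row) for row in scores)
    # Exponentiate sharpened scores; subtracting the max avoids overflow.
    q = [[math.exp((s - top) / epsilon) for s in row] for row in scores]
    for _ in range(n_iters):
        # Rescale each prototype (column) to receive an equal share, n / k.
        for j in range(k):
            col_sum = sum(q[i][j] for i in range(n))
            for i in range(n):
                q[i][j] *= (n / k) / col_sum
        # Rescale each sample (row) so its assignment sums to 1.
        for i in range(n):
            row_sum = sum(q[i])
            q[i] = [v / row_sum for v in q[i]]
    return q

# Every sample prefers prototype 0; a plain softmax would ignore prototype 1.
scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
assignments = sinkhorn_knopp(scores)
prototype_usage = [sum(row[j] for row in assignments) for j in range(2)]
```

Forcing roughly equal prototype usage is what prevents the trivial solution in which all samples collapse onto a single cluster.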

Masked Image Modeling

Methods that learn representations by reconstructing masked regions of the input, either in pixel space or in a latent feature space.
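The core mechanics of pixel-space masked image modeling can be sketched as two steps: pick a random subset of patches to hide, then score the reconstruction only on those hidden patches. The helper names, the `mask_ratio` default, and the list-of-lists patch format are assumptions for illustration, not the library's interface.

```python
import random

def random_mask(num_patches, mask_ratio=0.75, seed=0):
    """Return the sorted indices of patches chosen to be masked."""
    rng = random.Random(seed)                 # local RNG for reproducibility
    num_masked = int(num_patches * mask_ratio)
    return sorted(rng.sample(range(num_patches), num_masked))

def masked_mse(pred, target, masked_idx):
    """Mean squared error computed over masked patches only."""
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(pred[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count

num_patches = 16
masked = random_mask(num_patches)
target = [[float(i)] * 4 for i in range(num_patches)]   # toy 4-value patches
pred = [[0.0] * 4 for _ in range(num_patches)]          # wrong everywhere...
for i in masked:
    pred[i] = list(target[i])                           # ...except masked patches
loss = masked_mse(pred, target, masked)
```

Visible patches do not contribute to the loss, so in this toy example the reconstruction error is zero even though the unmasked predictions are wrong. Latent-space variants keep the same masking step but compare predicted and target features instead of pixels.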

MAE([model_or_model_name, ...])

MAE: Masked Autoencoders Are Scalable Vision Learners.

BEiT([encoder_name, tokenizer, vocab_size, ...])

BEiT: masked image modeling with a discrete visual tokenizer.

CMAE([encoder_name, patch_size, mask_ratio, ...])

CMAE: MAE pixel loss + EMA contrastive loss.

MaskFeat([encoder_name, patch_size, ...])

MaskFeat: predict per-patch HOG at masked positions.

SimMIM([encoder_name, patch_size, ...])

SimMIM: masked image modeling with direct pixel regression.

MIMRefiner(pretrained_encoder[, ...])

Refine a pretrained MIM encoder with iBOT-style self-distillation.

iGPT([encoder_name, patch_size, image_size, ...])

Autoregressive image GPT (AIM-style next-patch regression).

IJEPA([model_or_model_name, ...])

I-JEPA: Image-based Joint-Embedding Predictive Architecture.

LeJEPA([encoder_name, projector, n_slices, ...])

LeJEPA: multi-view invariance + sliced Epps-Pulley SIGReg.

SALT([encoder_name, predictor_embed_dim, ...])

SALT Stage 2: Static-teacher Asymmetric Latent Training.

Other

NEPA([img_size, patch_size, in_chans, ...])

NEPA: Next-Embedding Predictive Autoregression.