RegistryLogger

class stable_pretraining.registry.RegistryLogger(run_dir: str | Path, run_id: str, *, tags: list[str] | None = None, notes: str | None = None, flush_logs_every_n_steps: int = 50)

Bases: CSVLogger

CSV logger with a filesystem-indexable sidecar.

The sidecar is an atomically-rewritten JSON file that captures the run’s hparams, latest metric values (summary), status, and checkpoint path. It is the source of truth for the registry scanner.
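The description above does not say how the atomic rewrite is achieved. One common approach, shown here as a minimal sketch and not as the library's actual code, is to write the JSON to a temporary file in the same directory and swap it into place with os.replace. The helper name write_sidecar_atomically and the payload keys are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_sidecar_atomically(run_dir: Path, payload: dict) -> Path:
    """Hypothetical helper: rewrite run_dir/sidecar.json without exposing partial writes."""
    run_dir.mkdir(parents=True, exist_ok=True)
    target = run_dir / "sidecar.json"
    # The temp file lives in the same directory (same filesystem) so that
    # os.replace is a rename rather than a copy.
    fd, tmp_name = tempfile.mkstemp(dir=run_dir, prefix=".sidecar-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Key names below are illustrative, not the library's schema.
    path = write_sidecar_atomically(Path(tmp), {"status": "running", "summary": {"loss": 0.42}})
    print(json.loads(path.read_text(encoding="utf-8"))["status"])
```

With this pattern a reader such as a scanner sees either the previous complete file or the new complete file, never a half-written one.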

Parameters:
  • run_dir – Directory this run writes to. CSV logs, sidecar.json and heartbeat all live here.

  • run_id – Unique identifier for this run (typically the SLURM job id or a deterministic hash). Used as the primary key in the registry cache and as the CSV version component.

  • tags – Free-form string tags for grouping runs (e.g. model architecture, experiment name, sweep id). If the SLURM_ARRAY_JOB_ID environment variable is set, "sweep:<id>" is appended automatically, which is convenient for array jobs.

  • notes – Optional free-text description.

  • flush_logs_every_n_steps – How often the CSV is flushed; the sidecar is rewritten on the same cadence. The heartbeat is touched on every log_metrics call (cheap).
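The tag behaviour described for the tags parameter can be sketched as follows. This is an illustration under stated assumptions, not the library's code: resolve_tags is a hypothetical helper, and skipping an already-present sweep tag is a guess, since the description only says the tag is appended.

```python
import os
from typing import List, Mapping, Optional


def resolve_tags(tags: Optional[List[str]] = None,
                 environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Hypothetical helper mirroring the documented SLURM array-job tagging."""
    environ = os.environ if environ is None else environ
    resolved = list(tags or [])
    array_job_id = environ.get("SLURM_ARRAY_JOB_ID")
    if array_job_id:
        sweep_tag = f"sweep:{array_job_id}"
        # De-duplication is an assumption; the docs only say the tag is appended.
        if sweep_tag not in resolved:
            resolved.append(sweep_tag)
    return resolved


print(resolve_tags(["resnet50"], {"SLURM_ARRAY_JOB_ID": "12345"}))
print(resolve_tags(None, {}))
```

Passing the environment as an argument keeps the sketch testable without modifying os.environ.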

after_save_checkpoint(checkpoint_callback: Any) → None

Called after model checkpoint callback saves a new checkpoint.

Parameters:

checkpoint_callback – the model checkpoint callback instance
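As an illustration of what such a hook might read from the callback, the sketch below picks a path from the best_model_path and last_model_path attributes that Lightning's ModelCheckpoint exposes. Which attribute RegistryLogger actually records is not stated here, so the preference order and the helper name checkpoint_path_from are assumptions. A SimpleNamespace stands in for the real callback.

```python
from types import SimpleNamespace
from typing import Any, Optional


def checkpoint_path_from(checkpoint_callback: Any) -> Optional[str]:
    """Hypothetical helper: pick a checkpoint path to record, or None if nothing was saved."""
    # ModelCheckpoint keeps both attributes as empty strings until a
    # checkpoint has been written, so empty values are skipped.
    for attr in ("best_model_path", "last_model_path"):
        path = getattr(checkpoint_callback, attr, "")
        if path:
            return str(path)
    return None


stub = SimpleNamespace(best_model_path="", last_model_path="runs/demo/last.ckpt")
print(checkpoint_path_from(stub))
```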

finalize(status: str) → None

Do any processing that is necessary to finalize an experiment.

Parameters:

status – Status that the experiment finished with (e.g. success, failed, aborted)

log_hyperparams(params: dict[str, Any] | Any, *args: Any, **kw: Any) → None

Record hyperparameters.

Parameters:
  • params – Namespace or Dict containing the hyperparameters

  • args – Optional positional arguments, depends on the specific logger being used

  • kw – Optional keyword arguments, depends on the specific logger being used
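Because params may be either a Namespace or a dict, a logger that stores hyperparameters in a JSON file needs to normalize them first. The following is a minimal sketch of one way to do that. The helper hparams_to_dict is hypothetical, and stringifying values that JSON cannot encode is an assumption, not documented behaviour.

```python
import json
from argparse import Namespace
from pathlib import Path
from typing import Any


def hparams_to_dict(params: Any) -> dict:
    """Hypothetical helper: turn a Namespace or dict into a JSON-friendly dict."""
    if isinstance(params, Namespace):
        params = vars(params)
    elif not isinstance(params, dict):
        # Fall back to the instance __dict__ for other simple objects (an assumption).
        params = dict(getattr(params, "__dict__", {}))
    # default=str stringifies values json cannot encode, such as Path objects.
    return json.loads(json.dumps(params, default=str))


print(hparams_to_dict(Namespace(lr=0.1, optimizer="adamw")))
print(hparams_to_dict({"data_dir": Path("data"), "batch_size": 256}))
```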

log_image(key: str, images: list, step: int | None = None, caption: list | None = None, **_: Any) → None

Save images under {run_dir}/media/<safe_tag>/<step>_<i>.png.

Compatible with Lightning’s WandbLogger.log_image signature, so existing callbacks that gate on hasattr(logger, "log_image") will start writing media to disk without code changes.
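The gating pattern mentioned above looks roughly like this. Both logger classes are stand-ins written for the example, not the real RegistryLogger or WandbLogger, and maybe_log_images is a hypothetical callback body.

```python
from typing import Any


class DiskMediaLogger:
    """Stand-in for a logger that implements log_image (not the real class)."""

    def __init__(self) -> None:
        self.calls = []

    def log_image(self, key: str, images: list, step=None, caption=None, **_: Any) -> None:
        # Record the call instead of writing files, to keep the sketch small.
        self.calls.append((key, len(images), step))


class PlainLogger:
    """Stand-in for a logger without media support."""


def maybe_log_images(logger: Any, key: str, images: list, step: int) -> bool:
    """Hypothetical callback body: only log when the logger supports images."""
    if not hasattr(logger, "log_image"):
        return False
    logger.log_image(key=key, images=images, step=step)
    return True


disk = DiskMediaLogger()
print(maybe_log_images(PlainLogger(), "val/samples", ["a.png"], step=0))
print(maybe_log_images(disk, "val/samples", ["a.png", "b.png"], step=3))
```

A callback written this way needs no changes when the logger is swapped for one that defines log_image.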

Accepts numpy arrays (HWC or CHW, uint8 or float in [0, 1]), PIL images, torch tensors, or paths to existing files. Each entry is also appended to media.jsonl so the registry / web viewer can index events without walking the filesystem.
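The path layout and the media.jsonl index can be illustrated with a short sketch. Several details are assumptions made for the example: the sanitisation rule in safe_tag, the location of media.jsonl directly under run_dir, and the field names in each JSON line. None of these are confirmed by the description above.

```python
import json
import re
import tempfile
from pathlib import Path
from typing import Optional


def safe_tag(key: str) -> str:
    """Assumed sanitisation: replace anything outside [A-Za-z0-9._-] with '_'."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", key).strip("_") or "untitled"


def image_path(run_dir: Path, key: str, step: int, index: int) -> Path:
    """Build {run_dir}/media/<safe_tag>/<step>_<i>.png."""
    return run_dir / "media" / safe_tag(key) / f"{step}_{index}.png"


def append_media_event(run_dir: Path, key: str, step: int, path: Path,
                       caption: Optional[str] = None) -> None:
    """Append one JSON object per line to run_dir/media.jsonl (illustrative fields)."""
    event = {"key": key, "step": step,
             "path": str(path.relative_to(run_dir)), "caption": caption}
    with open(run_dir / "media.jsonl", "a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    for i in range(2):
        append_media_event(run_dir, "val/samples", 10,
                           image_path(run_dir, "val/samples", 10, i))
    print((run_dir / "media.jsonl").read_text(encoding="utf-8"))
```

An append-only JSON Lines file lets a viewer read new events by continuing from its last offset instead of listing the media directory.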

log_metrics(metrics: dict[str, Any], step: int | None = None) → None

Records metrics. This method logs metrics as soon as it receives them.

Parameters:
  • metrics – Dictionary with metric names as keys and measured quantities as values

  • step – Step number at which the metrics should be recorded
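The class parameters state that the heartbeat is touched on every log_metrics call while the CSV flush and sidecar rewrite happen every flush_logs_every_n_steps. The toy class below models that cadence only. It is not the library's implementation: the file name heartbeat and the use of a call counter, not the step value, to decide when to flush are assumptions.

```python
import tempfile
from pathlib import Path
from typing import Optional


class CadenceSketch:
    """Toy model of the documented cadence; not the library's implementation."""

    def __init__(self, run_dir: Path, flush_logs_every_n_steps: int = 50) -> None:
        self.run_dir = run_dir
        self.every = flush_logs_every_n_steps
        self.calls = 0
        self.flushes = 0
        self.run_dir.mkdir(parents=True, exist_ok=True)

    def log_metrics(self, metrics: dict, step: Optional[int] = None) -> None:
        self.calls += 1
        # Heartbeat: a cheap modification-time update on every call.
        (self.run_dir / "heartbeat").touch()
        # The expensive work (CSV flush, sidecar rewrite) runs every N calls.
        if self.calls % self.every == 0:
            self.flushes += 1


with tempfile.TemporaryDirectory() as tmp:
    sketch = CadenceSketch(Path(tmp), flush_logs_every_n_steps=50)
    for step in range(120):
        sketch.log_metrics({"loss": 1.0 / (step + 1)}, step=step)
    print(sketch.calls, sketch.flushes)  # 120 2
```

Separating the two cadences keeps liveness detection fine-grained without paying for a file rewrite on every step.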

log_video(key: str, videos: list, step: int | None = None, caption: list | None = None, fps: int | None = None, format: str | None = None, **_: Any) → None

Save videos under {run_dir}/media/<safe_tag>/<step>_<i>.<ext>.

Inputs may be filesystem paths to already-encoded files (preferred — zero re-encoding cost) or raw bytes. The fps and detected format are recorded in media.jsonl so a viewer can play them back at the right rate.
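For the preferred case of an already-encoded file, the work reduces to a copy plus an index entry. The sketch below illustrates that idea under stated assumptions: store_encoded_video is a hypothetical helper, the format is taken from the file extension with an mp4 fallback, and the media.jsonl field names and location are illustrative.

```python
import json
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def store_encoded_video(run_dir: Path, tag_dir: str, source: Path,
                        step: int, index: int, fps: Optional[int] = None) -> Path:
    """Hypothetical helper: copy an encoded file into the media layout and index it."""
    # Detect the format from the extension; the "mp4" fallback is an assumption.
    ext = source.suffix.lstrip(".").lower() or "mp4"
    dest = run_dir / "media" / tag_dir / f"{step}_{index}.{ext}"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, dest)  # byte-for-byte copy, no re-encoding
    event = {"path": str(dest.relative_to(run_dir)), "step": step,
             "fps": fps, "format": ext}
    with open(run_dir / "media.jsonl", "a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")
    return dest


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    clip = run_dir / "clip.MP4"
    clip.write_bytes(b"placeholder bytes, not a real video")
    stored = store_encoded_video(run_dir, "rollouts", clip, step=5, index=0, fps=30)
    print(stored.name)
```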

save() → None

Save log data.