Subset


class stable_pretraining.data.Subset(dataset: Dataset, indices: Sequence[int])

Bases: Dataset

Subset of a dataset at specified indices.

All attributes and methods of the wrapped dataset are accessible directly on the subset via attribute proxying. For example, if the underlying dataset has a column_names property or a custom_method() method, they are available as subset.column_names and subset.custom_method(), respectively.

Parameters:
  • dataset – The full dataset to wrap.

  • indices – Indices into the full dataset that make up the subset.
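
The attribute proxying described above can be illustrated with a minimal, self-contained sketch. It does not import stable_pretraining and is not the library's implementation: ToyDataset and ProxySubset are hypothetical names, and the use of __getattr__ to forward lookups is an assumption about one common way to get this behaviour in Python.

```python
from typing import Any, Sequence


class ToyDataset:
    """Hypothetical stand-in for a wrapped dataset (not part of the library)."""

    def __init__(self, rows):
        self.rows = list(rows)

    @property
    def column_names(self):
        return ["value"]

    def custom_method(self):
        return "called on the wrapped dataset"

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        return {"value": self.rows[idx]}


class ProxySubset:
    """Illustrative subset wrapper; an assumption, not the library's code."""

    def __init__(self, dataset: Any, indices: Sequence[int]):
        self.dataset = dataset
        self.indices = list(indices)

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, idx):
        # Map the subset-local position to an index in the full dataset.
        return self.dataset[self.indices[idx]]

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal lookup fails, so the
        # subset's own attributes win and everything else falls through
        # to the wrapped dataset.
        if name == "dataset":
            # Avoid infinite recursion if `dataset` is not set yet
            # (for example while the object is being copied or unpickled).
            raise AttributeError(name)
        return getattr(self.dataset, name)


full = ToyDataset([10, 20, 30, 40, 50])
subset = ProxySubset(full, indices=[0, 2, 4])

print(len(subset))             # 3
print(subset[1])               # {'value': 30}
print(subset.column_names)     # ['value']
print(subset.custom_method())  # called on the wrapped dataset
```

In this sketch, len(subset) and subset[i] are answered by the subset itself, while column_names and custom_method() are resolved on the wrapped dataset. An attribute defined on both would resolve to the subset's version, which is worth keeping in mind if the real class behaves the same way.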