RepeatedRandomSampler
- class stable_pretraining.data.RepeatedRandomSampler(data_source_or_len: int | Iterable, n_views: int = 1, replacement: bool = False, seed: int = 0, pass_view_idx: bool = False)
Bases: DistributedSampler
Sampler that repeats each dataset index consecutively for multi-view learning.
Important
This sampler repeats each index n_views times in a row, creating sequences like [0,0,0,0, 1,1,1,1, 2,2,2,2, ...] for n_views=4. This means:
- The DataLoader will load the SAME image multiple times consecutively.
- Each repeated index goes through the transform pipeline separately.
- BATCH SIZE: the batch_size in DataLoader refers to total augmented samples. For example, batch_size=128 with n_views=8 means only 16 unique images, each appearing 8 times with different augmentations.
Designed to work with RoundRobinMultiViewTransform which uses a counter to apply different augmentations to each repeated occurrence of the same image.
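The counter idea can be illustrated with a minimal sketch. RoundRobinSketch below is a hypothetical stand-in written for this example, not the library's RoundRobinMultiViewTransform, and the string methods stand in for real image augmentations. It assumes samples reach the transform in sampler order within a single process, and that the number of transforms equals n_views.

```python
import itertools


class RoundRobinSketch:
    """Hypothetical stand-in, not the library's RoundRobinMultiViewTransform."""

    def __init__(self, transforms):
        self.transforms = list(transforms)
        # Endless cycle over positions 0..len-1; this plays the role of the counter.
        self._counter = itertools.cycle(range(len(self.transforms)))

    def __call__(self, sample):
        # Each call advances the counter and applies the next transform in turn.
        view_idx = next(self._counter)
        return self.transforms[view_idx](sample)


# Toy "augmentations" on strings so the effect is easy to see.
transform = RoundRobinSketch([str.upper, str.lower, str.title])

# Indices in the order a repeating sampler with n_views=3 would emit them.
indices = [0, 0, 0, 1, 1, 1]
dataset = ["cat photo", "dog photo"]
views = [transform(dataset[i]) for i in indices]
print(views)
```

Because the counter advances once per call, the three consecutive occurrences of each index receive three different transforms.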
Example behavior with n_views=3:
Dataset indices: [0, 1, 2, 3, 4]
Sampler output: [0,0,0, 1,1,1, 2,2,2, 3,3,3, 4,4,4]
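The repetition pattern above can be reproduced in plain Python. This is only a sketch: repeated_indices is a hypothetical helper, not part of the library, and it assumes indices are shuffled without replacement before each one is repeated n_views times. The actual sampler may order or shard indices differently.

```python
import random


def repeated_indices(n_items, n_views=1, seed=0):
    """Hypothetical helper: shuffle indices, then repeat each one n_views times in a row."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)  # seeded shuffle, without replacement
    return [idx for idx in order for _ in range(n_views)]


seq = repeated_indices(n_items=5, n_views=3, seed=0)

# Every run of n_views consecutive entries holds a single index.
groups = [seq[i:i + 3] for i in range(0, len(seq), 3)]
assert all(len(set(g)) == 1 for g in groups)

# Batch arithmetic from the note above: batch_size counts augmented samples,
# so the number of unique images per batch is batch_size // n_views.
batch_size, n_views = 128, 8
unique_images_per_batch = batch_size // n_views
print(len(seq), unique_images_per_batch)
```

Choosing batch_size as a multiple of n_views keeps all views of an image in the same batch under this scheme.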
- Parameters:
data_source_or_len (int | Iterable) – dataset to sample from, or its length
n_views (int) – number of times to repeat each index consecutively, default=1
replacement (bool) – samples are drawn on-demand with replacement if True, default=False
seed (int) – random seed for shuffling
pass_view_idx (bool) – whether to pass the view index to the dataset's __getitem__, default=False
Note: For an alternative approach that loads each image once, consider using MultiViewTransform with a standard sampler.
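To make the contrast concrete, here is a rough sketch of the load-once approach. multi_view_sketch and load are hypothetical names invented for this example and do not reflect how MultiViewTransform is implemented. The sketch only shows that a single load can feed several augmentations.

```python
load_count = 0


def load(index):
    """Toy loader that counts how often the 'image' is read."""
    global load_count
    load_count += 1
    return f"image {index}"


def multi_view_sketch(sample, transforms):
    """Hypothetical: build one view per transform from a single loaded sample."""
    return [t(sample) for t in transforms]


# One load, two views; a repeating sampler would load the sample twice instead.
views_once = multi_view_sketch(load(0), [str.upper, str.title])
print(views_once, load_count)
```

The trade-off is that each dataset item then yields several views at once, so batch_size counts unique images rather than augmented samples.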