Linnaeus 5

Task: Image Classification | Classes: 5 | Image Size: 256×256

Overview

The Linnaeus 5 dataset is an image classification benchmark consisting of 8,000 RGB images across 5 classes: berry, bird, dog, flower, and other. Every image is 256×256 pixels. A distinguishing feature of this dataset is its “negative” class (other), which tests whether a model can recognize that an image shows none of the four named object categories.

  • Train: 6,000 images (1,200 per class)

  • Test: 2,000 images (400 per class)
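The split sizes above imply a perfectly balanced dataset, which can be checked with a few lines of code. The following is a minimal sketch, not part of the library: `check_balance` and the synthetic label lists are hypothetical names introduced for illustration, and the labels are generated locally so nothing has to be downloaded. With a real split, one would presumably pass its label column instead.

```python
from collections import Counter

# Images per class in each split, as stated in the overview.
EXPECTED_PER_CLASS = {"train": 1200, "test": 400}
NUM_CLASSES = 5


def check_balance(labels, split):
    """Return True if classes 0..4 each appear exactly the expected number of times."""
    counts = Counter(labels)
    expected = EXPECTED_PER_CLASS[split]
    has_only_known_classes = set(counts) == set(range(NUM_CLASSES))
    return has_only_known_classes and all(
        counts[c] == expected for c in range(NUM_CLASSES)
    )


# Synthetic stand-in labels (hypothetical data, not the real dataset).
fake_train_labels = [c for c in range(NUM_CLASSES) for _ in range(1200)]
fake_test_labels = [c for c in range(NUM_CLASSES) for _ in range(400)]

train_ok = check_balance(fake_train_labels, "train")
test_ok = check_balance(fake_test_labels, "test")
total_images = len(fake_train_labels) + len(fake_test_labels)
print(train_ok, test_ok, total_images)
```

The total of 6,000 + 2,000 images matches the 8,000 images quoted in the overview.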

[Figure: Linnaeus 5 teaser image]

Data Structure

When accessing an example using ds[i], you will receive a dictionary with the following keys:

Key      Type               Description
image    PIL.Image.Image    256×256 RGB color image
label    int                Class label (0-4)
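Since the label is a bare integer, a small lookup helps when inspecting samples. The sketch below is illustrative only. The order in `CLASS_NAMES` is an assumption: the overview lists the classes alphabetically and the usage example pairs 0 with Berry, but the full index-to-name mapping is not spelled out here. `label_name` and `describe_sample` are hypothetical helpers, and the sample dictionary is a stand-in with no real image.

```python
# Assumed index order (not confirmed by the documentation beyond 0 -> berry).
CLASS_NAMES = ["berry", "bird", "dog", "flower", "other"]


def label_name(label):
    """Map an integer class label (0-4) to its assumed class name."""
    label = int(label)  # also accepts integer-like values
    if not 0 <= label < len(CLASS_NAMES):
        raise ValueError(f"label must be in 0-{len(CLASS_NAMES) - 1}, got {label}")
    return CLASS_NAMES[label]


def describe_sample(sample):
    """Check that a sample has the documented keys and summarize its label."""
    missing = {"image", "label"} - set(sample)
    if missing:
        raise KeyError(f"sample is missing keys: {sorted(missing)}")
    return f"label {int(sample['label'])} ({label_name(sample['label'])})"


# Stand-in for ds[i]; a real sample would hold a PIL image under "image".
fake_sample = {"image": None, "label": 0}
summary = describe_sample(fake_sample)
print(summary)
```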

Usage Example

Basic Usage

from stable_datasets.images.linnaeus5 import Linnaeus5

# First run will download + prepare cache, then return the split as a HF Dataset
ds_train = Linnaeus5(split="train")
ds_test = Linnaeus5(split="test")

# If you omit the split (split=None), you get a DatasetDict with all available splits
ds_all = Linnaeus5(split=None)

sample = ds_train[0]
print(sample.keys())  # keys: "image" and "label"
print(f"Label: {sample['label']}")  # e.g., 0 (Berry)

# Optional: make it PyTorch-friendly
ds_train_torch = ds_train.with_format("torch")
ds_test_torch = ds_test.with_format("torch")
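Because the dataset includes a negative class, it can be informative to report accuracy per class rather than as a single number, so that performance on other is visible on its own. The following is a rough sketch under stated assumptions: `per_class_accuracy` is a hypothetical helper, the index 4 for other is assumed from the alphabetical class listing, and the labels and predictions are made-up values rather than results from any model.

```python
from collections import defaultdict

OTHER = 4  # assumed index of the negative class


def per_class_accuracy(y_true, y_pred):
    """Return {class_label: fraction of that class's examples predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return {label: correct[label] / total[label] for label in sorted(total)}


# Made-up labels and predictions, for illustration only.
y_true = [0, 1, 2, 3, 4, 4]
y_pred = [0, 1, 2, 0, 4, 1]

acc = per_class_accuracy(y_true, y_pred)
print(f"accuracy on the negative class: {acc[OTHER]:.2f}")
```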

References

Citation

@article{chaladze2017linnaeus,
  title={Linnaeus 5 dataset for machine learning},
  author={Chaladze, G and Kalatozishvili, L},
  journal={chaladze.com},
  year={2017}
}