Linnaeus 5
Overview
The Linnaeus 5 dataset is a fine-grained image classification benchmark consisting of 8,000 RGB images across 5 classes: berry, bird, dog, flower, and other. Images are provided as 256×256 pixels. A unique feature of this dataset is the inclusion of a “negative” class (other) to test object recognition capabilities.
Train: 6,000 images (1,200 per class)
Test: 2,000 images (400 per class)
Data Structure
When accessing an example using ds[i], you will receive a dictionary with the following keys:
Key |
Type |
Description |
|---|---|---|
|
|
256×256 RGB color image |
|
int |
Class label (0-4) |
Usage Example
Basic Usage
from stable_datasets.images.linnaeus5 import Linnaeus5
# First run will download + prepare cache, then return the split as a HF Dataset
ds_train = Linnaeus5(split="train")
ds_test = Linnaeus5(split="test")
# If you omit the split (split=None), you get a DatasetDict with all available splits
ds_all = Linnaeus5(split=None)
sample = ds_train[0]
print(sample.keys()) # {"image", "label"}
print(f"Label: {sample['label']}") # e.g., 0 (Berry)
# Optional: make it PyTorch-friendly
ds_train_torch = ds_train.with_format("torch")
ds_test_torch = ds_test.with_format("torch")
References
Official website: http://chaladze.com/l5/
Citation
@article{chaladze2017linnaeus,
title={Linnaeus 5 dataset for machine learning},
author={Chaladze, G and Kalatozishvili, L},
journal={chaladze.com},
year={2017}
}