Metric learning models compute embeddings of their inputs such that
semantically similar samples are close in embedding space and
dissimilar samples are far apart.
In this context, samples annotated with the same identity are considered similar,
and the model will be trained to cluster their embeddings.
At inference time, one can perform classification using k-NN, or similarity search
using a vector database.
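As a sketch of the k-NN route, here is a minimal majority-vote classifier over toy 2-D embeddings. The `knn_classify` helper and the clustered example data are illustrative assumptions, not part of any library described here:

```python
import numpy as np

def knn_classify(query, embeddings, identities, k=3):
    """Label a query embedding by majority vote among its k nearest
    training embeddings (Euclidean distance)."""
    dists = np.linalg.norm(embeddings - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels = [identities[i] for i in nearest]
    return max(set(labels), key=labels.count)

# Toy 2-D "embeddings": two tight clusters, one per identity.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
ids = ["id1", "id1", "id1", "id2", "id2", "id2"]

print(knn_classify(np.array([0.05, 0.05]), emb, ids))  # → id1
```

In practice the embeddings would come from the trained model and the identities from the annotations file; a vector database replaces the brute-force distance computation at scale.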
Datasets follow this structure:
endpoint_url/bucket
├── prefix/images/
├── prefix/annotations.yaml
└── prefix/metadata.yaml
Dataset images are placed directly inside images/ (subdirectories are ignored).
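The "subdirectories are ignored" rule can be expressed as a small filter over object keys. `list_dataset_images` is a hypothetical helper for illustration, assuming keys are listed relative to the bucket with the prefix shown above:

```python
def list_dataset_images(keys, prefix="prefix"):
    """Keep only objects directly inside <prefix>/images/;
    anything in a nested subdirectory is ignored."""
    root = f"{prefix}/images/"
    return [k for k in keys
            if k.startswith(root) and "/" not in k[len(root):] and k != root]

keys = ["prefix/images/000.jpg",
        "prefix/images/sub/001.jpg",   # nested -> ignored
        "prefix/annotations.yaml"]     # not an image -> ignored
print(list_dataset_images(keys))       # → ['prefix/images/000.jpg']
```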
The metadata file looks something like
this:
task: metric learning
annotations: annotations.yaml
identities: [id1, id2, id3, id4, id5]
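A loader might sanity-check the parsed metadata before training. The checks below (and the `validate_metadata` name) are an assumed sketch based on the three fields shown, not a documented API:

```python
def validate_metadata(meta):
    """Sketch of checks a loader might run on a parsed metadata.yaml."""
    if meta.get("task") != "metric learning":
        raise ValueError(f"unexpected task: {meta.get('task')!r}")
    if not meta.get("annotations"):
        raise ValueError("missing 'annotations' field")
    identities = meta.get("identities") or []
    if len(set(identities)) != len(identities):
        raise ValueError("duplicate identities")
    return meta

meta = {"task": "metric learning",
        "annotations": "annotations.yaml",
        "identities": ["id1", "id2", "id3", "id4", "id5"]}
validate_metadata(meta)  # passes silently
```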
The annotations field specifies the name of
the file containing the ground truth annotations.
Here's an example of an annotations file:
000.jpg: id4
001.jpg: id2
002.jpg: id5
# ...
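Since the annotations file maps each image filename to an identity, it parses to a plain dict. A possible loading step, assuming PyYAML is available and using the identity list from the metadata example above, could look like this:

```python
import yaml  # PyYAML

annotations_text = """\
000.jpg: id4
001.jpg: id2
002.jpg: id5
"""
identities = ["id1", "id2", "id3", "id4", "id5"]

annotations = yaml.safe_load(annotations_text)  # {filename: identity}

# Every annotated identity should appear in the metadata's identity list.
unknown = {v for v in annotations.values() if v not in identities}
assert not unknown, f"annotations reference unknown identities: {unknown}"
print(annotations["000.jpg"])  # → id4
```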