Tasks

Multilabel classification

Multilabel classification models predict which subset of a predefined set of possible labels best matches the input image.
The special case in which the set contains only one possible label is called "binary classification".
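
As an illustration, a predicted or ground-truth subset of labels is often represented as a multi-hot vector, with one 0/1 entry per possible label. The sketch below shows that encoding; the function name `encode_multi_hot` is hypothetical and this representation is an assumption, not something the dataset format itself requires.

```python
def encode_multi_hot(labels, image_labels):
    """Return a 0/1 list with one entry per possible label.

    `labels` is the full, ordered list of possible labels;
    `image_labels` is the subset that applies to one image.
    Both names are illustrative, not part of any official API.
    """
    index = {name: i for i, name in enumerate(labels)}
    vector = [0] * len(labels)
    for name in image_labels:
        # A KeyError here means the label is not in the predefined set.
        vector[index[name]] = 1
    return vector


all_labels = ["lab1", "lab2", "lab3"]
two_labels = encode_multi_hot(all_labels, ["lab2", "lab3"])
no_labels = encode_multi_hot(all_labels, [])
```

With a single possible label the vector has length 1, which corresponds to the binary classification case.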

Dataset format

Datasets follow this structure:

endpoint_url/bucket
├── prefix/images/
├── prefix/annotations.yaml
└── prefix/metadata.yaml

Dataset images are placed directly inside images/ (subdirectories are ignored).
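
The "directly inside images/" rule can be sketched as a filter over object keys. This is only an illustration: it assumes you already have a list of keys from the bucket, and the function name `direct_image_keys` is hypothetical.

```python
def direct_image_keys(keys, prefix):
    """Keep only keys located directly inside <prefix>/images/.

    Keys inside subdirectories of images/ are dropped, mirroring
    the rule that subdirectories are ignored.
    """
    base = prefix.rstrip("/") + "/images/"
    selected = []
    for key in keys:
        if not key.startswith(base) or key == base:
            continue  # outside images/, or the directory placeholder itself
        remainder = key[len(base):]
        if "/" in remainder:
            continue  # lives in a subdirectory of images/
        selected.append(key)
    return selected


example_keys = [
    "prefix/images/",
    "prefix/images/000.jpg",
    "prefix/images/sub/001.jpg",
    "prefix/annotations.yaml",
]
kept = direct_image_keys(example_keys, "prefix")
```
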
The metadata file looks something like this:

metadata.yaml
task: multilabel classification
annotations: annotations.yaml
labels: [lab1, lab2, lab3]

The annotations field specifies the name of the file containing the ground truth annotations.
Here's an example of an annotations file:

annotations.yaml
000.jpg: [lab2, lab3]
001.jpg: [lab1]
002.jpg: []  # no label
# ...

Only the labels specified in the metadata should be used.
If an image has no label associated with it, it has to be explicitly assigned the empty array [].
Images whose value is null are ignored!
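
The annotation rules above can be checked with a few lines of Python. The sketch below is not part of any official tooling: it assumes the two YAML files have already been parsed into dictionaries (for example with PyYAML's `yaml.safe_load`, which maps `[]` to an empty list and `null` to `None`), and the function name `clean_annotations` is hypothetical.

```python
def clean_annotations(metadata, annotations):
    """Validate parsed annotations against the labels listed in the metadata.

    - Images whose value is None (YAML null) are skipped.
    - Images with an empty list are kept as "no label".
    - A label missing from metadata["labels"] raises ValueError.
    """
    allowed = set(metadata["labels"])
    cleaned = {}
    for image, labels in annotations.items():
        if labels is None:
            continue  # null entries are ignored
        unknown = [name for name in labels if name not in allowed]
        if unknown:
            raise ValueError(f"{image}: labels not in metadata: {unknown}")
        cleaned[image] = list(labels)
    return cleaned


# Dictionaries mirroring the example files, plus one null entry.
metadata = {
    "task": "multilabel classification",
    "annotations": "annotations.yaml",
    "labels": ["lab1", "lab2", "lab3"],
}
annotations = {
    "000.jpg": ["lab2", "lab3"],
    "001.jpg": ["lab1"],
    "002.jpg": [],    # explicitly no label: kept
    "003.jpg": None,  # null: ignored
}
cleaned = clean_annotations(metadata, annotations)
```

Raising an error on unknown labels is a design choice made for this sketch; a stricter or more lenient policy may suit a given pipeline better.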