Semantic segmentation models map each pixel of the input image to a category; the task can be seen as pixel-wise multiclass classification.
A dataset follows this structure:
endpoint_url/bucket
├── prefix/images/
├── prefix/semantic_maps/
└── prefix/metadata.yaml
Dataset images are placed directly inside images/ (subdirectories are ignored).
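The endpoint_url/bucket layout suggests an S3-compatible object store; as a minimal sketch, the images could be listed with boto3 (the endpoint, bucket, and prefix below are placeholders, not values defined by the format):

import boto3

s3 = boto3.client("s3", endpoint_url="https://endpoint_url")
# For large datasets, a paginator should be used instead of a single call.
response = s3.list_objects_v2(Bucket="bucket", Prefix="prefix/images/")
image_keys = []
for obj in response.get("Contents", []):
    rest = obj["Key"][len("prefix/images/"):]
    # Keep only files placed directly inside images/; subdirectories are ignored.
    if rest and "/" not in rest:
        image_keys.append(obj["Key"])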
The metadata file looks something like this:
task: semantic segmentation
annotations: semantic_maps/
categories: [cat1, cat2, cat3, cat4]
colors:
  cat1: [255, 0, 0]   # red
  cat2: [0, 0, 255]   # blue
  cat3: [255, 255, 0] # yellow
  cat4: [0, 255, 0]   # green
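A minimal sketch of reading these fields with PyYAML, assuming the file has been fetched to local disk (the file path and variable names are illustrative):

import yaml

with open("metadata.yaml") as f:
    metadata = yaml.safe_load(f)

categories = metadata["categories"]        # ["cat1", "cat2", "cat3", "cat4"]
colors = {name: tuple(rgb) for name, rgb in metadata["colors"].items()}
annotations_dir = metadata["annotations"]  # "semantic_maps/"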
The annotations field specifies the name of the folder containing the ground-truth annotations. Each annotation shares its base name (the file name without the extension) with the image it is associated with (e.g. semantic_maps/000.png annotates images/000.jpg).
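As a sketch, the pairing can be reconstructed from the file stems on a local copy of the dataset (the .jpg extension is only an example; adjust it to the actual image format):

from pathlib import Path

images_dir = Path("prefix/images")
maps_dir = Path("prefix/semantic_maps")

# images/000.jpg is annotated by semantic_maps/000.png, and so on.
pairs = [(img, maps_dir / f"{img.stem}.png")
         for img in sorted(images_dir.glob("*.jpg"))]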
An annotation is a color PNG image that uses the colors listed in the metadata to mark the semantic regions of its associated input image.
Category colors can be any RGB triplet except for black ([0, 0, 0]), which is reserved for pixels to be ignored in training and validation.
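For training, the color annotation typically has to be turned into a per-pixel index map; a minimal sketch with NumPy and Pillow, where the ignore index -1 is an arbitrary choice, not part of the format:

import numpy as np
from PIL import Image

IGNORE_INDEX = -1  # arbitrary sentinel for black ([0, 0, 0]) pixels

def decode_annotation(path, categories, colors):
    # Map every pixel to the index of its category; pixels whose color
    # matches no category (in particular black) keep IGNORE_INDEX.
    rgb = np.array(Image.open(path).convert("RGB"))
    mask = np.full(rgb.shape[:2], IGNORE_INDEX, dtype=np.int64)
    for index, name in enumerate(categories):
        mask[np.all(rgb == np.array(colors[name]), axis=-1)] = index
    return mask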
Annotation images may have a lower resolution than their associated input image,
but the aspect ratio must match.
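When the resolutions differ, the annotation can be upsampled to the image size with nearest-neighbor interpolation, which preserves the exact category colors (a sketch with Pillow; the file names are illustrative):

from PIL import Image

image = Image.open("images/000.jpg")
annotation = Image.open("semantic_maps/000.png")

# Nearest-neighbor never blends neighboring colors, so every output
# pixel still carries a valid category color (or black).
if annotation.size != image.size:
    annotation = annotation.resize(image.size, resample=Image.NEAREST)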