Keypoint detection models detect instances in an image and produce a set of
keypoint coordinates for each.
A common application of this task is multi-person 2D pose estimation, where instances
are people and keypoints are joints such as the right knee, left elbow, or head.
For every predicted instance, the model outputs the pixel coordinates of each keypoint
together with a "presence" flag for that keypoint. When the flag is false, the keypoint is predicted to be missing or invisible.
Datasets follow this structure:
endpoint_url/bucket
├── prefix/images/
├── prefix/instances.yaml
└── prefix/metadata.yaml
Dataset images are placed directly inside images/ (subdirectories are ignored).
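The "directly inside images/" rule can be sketched as a filter over object keys. This is only an illustration, assuming the keys are available as plain strings relative to the bucket; the helper name direct_image_keys and the sample keys are hypothetical and not part of any documented API.

```python
def direct_image_keys(keys, prefix):
    """Return the keys that sit directly inside <prefix>/images/.

    Hypothetical helper: objects in subdirectories of images/ are skipped,
    mirroring the rule that subdirectories are ignored.
    """
    base = prefix.rstrip("/") + "/images/"
    selected = []
    for key in keys:
        if not key.startswith(base):
            continue  # outside images/ (for example the YAML files)
        name = key[len(base):]
        # A remaining "/" means the object lives in a subdirectory;
        # an empty name is the directory placeholder itself.
        if name and "/" not in name:
            selected.append(key)
    return selected


# Made-up keys, for illustration only.
keys = [
    "prefix/images/000.jpg",
    "prefix/images/001.jpg",
    "prefix/images/extra/002.jpg",  # in a subdirectory, so it is ignored
    "prefix/metadata.yaml",
]
print(direct_image_keys(keys, "prefix"))
```

With these sample keys, only 000.jpg and 001.jpg are kept.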
The metadata file looks like this:
task: keypoint detection
annotations: instances.yaml
keypoints: [kpt1, kpt2, kpt3]
The annotations field specifies the name of
the file containing the ground truth annotations, and the keypoints field
lists the keypoint names used in that file.
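As a rough sanity check, the metadata can be validated once it has been parsed into a Python dict (for instance with a YAML library). The sketch below is not an official validator: the function name check_metadata is hypothetical, and the requirements that the keypoint list be non-empty and free of duplicates are assumptions rather than documented rules.

```python
def check_metadata(metadata):
    """Return a list of problems found in a parsed metadata mapping."""
    problems = []
    if metadata.get("task") != "keypoint detection":
        problems.append("task must be 'keypoint detection'")
    if not isinstance(metadata.get("annotations"), str):
        problems.append("annotations must be a file name")
    names = metadata.get("keypoints")
    if not isinstance(names, list) or not names:
        # Assumption: at least one keypoint name is required.
        problems.append("keypoints must be a non-empty list")
    elif not all(isinstance(name, str) for name in names):
        problems.append("keypoint names must be strings")
    elif len(set(names)) != len(names):
        # Assumption: duplicate names would make annotations ambiguous.
        problems.append("keypoint names must be unique")
    return problems


# Same content as the metadata example, as a parsed dict.
metadata = {
    "task": "keypoint detection",
    "annotations": "instances.yaml",
    "keypoints": ["kpt1", "kpt2", "kpt3"],
}
print(check_metadata(metadata))
```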
Here's an example of an annotations file:
000.jpg:
  - kpt1: [0, 3]    # (x, y) from the top-left
    kpt2: null      # invisible or missing
    kpt3: [6, 7]
  - kpt1: [10, 11]
    kpt2: [12, 13]
    kpt3: [14, 15]
001.jpg: []         # no instance in this image
002.jpg:
  - kpt1: [16, 17]
    # ...
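The annotation structure above can be checked with a short script once the file has been parsed into Python objects. This is a minimal sketch under stated assumptions: the function name check_annotations is hypothetical, the input is the dict a YAML parser would produce (null becomes None), and whether a keypoint may be omitted from an instance altogether is not stated here, so the sketch does not check for that.

```python
def check_annotations(annotations, keypoint_names):
    """Return a list of problems found in a parsed annotations mapping."""
    problems = []
    expected = set(keypoint_names)
    for image, instances in annotations.items():
        if not isinstance(instances, list):
            problems.append(f"{image}: expected a list of instances")
            continue
        for i, instance in enumerate(instances):
            if not isinstance(instance, dict):
                problems.append(f"{image}[{i}]: expected a mapping")
                continue
            unknown = set(instance) - expected
            if unknown:
                problems.append(f"{image}[{i}]: unknown keypoints {sorted(unknown)}")
            for name, point in instance.items():
                if point is None:
                    continue  # null marks an invisible or missing keypoint
                if not (isinstance(point, (list, tuple)) and len(point) == 2):
                    problems.append(f"{image}[{i}].{name}: expected [x, y] or null")
    return problems


# The first two images of the example, as parsed Python objects.
annotations = {
    "000.jpg": [
        {"kpt1": [0, 3], "kpt2": None, "kpt3": [6, 7]},
        {"kpt1": [10, 11], "kpt2": [12, 13], "kpt3": [14, 15]},
    ],
    "001.jpg": [],  # no instance in this image
}
print(check_annotations(annotations, ["kpt1", "kpt2", "kpt3"]))
```

An empty list means no problems were found; each entry otherwise names the image, the instance index, and the offending keypoint.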