Waymo Open Dataset - Perception¶
The Waymo Open Dataset (WOD) is a collective term for publicly available datasets from Waymo. The Perception Dataset, abbreviated as WOD-Perception, is a high-quality dataset targeted for perceptions tasks. With 1150 logs each spanning 20 seconds, the dataset includes about 6.4 hours
Overview
Paper |
Scalability in Perception for Autonomous Driving: Waymo Open Dataset |
Download |
|
Code |
|
License |
Waymo Dataset License Agreement for Non-Commercial Use Apache License 2.0 + Code Specific Licenses |
Available splits |
|
Available Modalities¶
Name |
Available |
Description |
|---|---|---|
Ego Vehicle |
✓ |
State of the ego vehicle, including poses, and vehicle parameters, see |
Map |
(✓) |
The HD-Maps are in 3D, but may have artifacts due to polyline to polygon conversion (see below). For more information, see |
Bounding Boxes |
✓ |
The bounding boxes are available with the |
Traffic Lights |
X |
n/a |
Cameras |
✓ |
Includes 5 cameras, see
|
Lidars |
✓ |
Includes 5 Lidars, see
|
Download¶
To download the Waymo Open Dataset for Perception, please visit the official website and follow the instructions provided there.
You will need to register and download the Perception Dataset V1.4.3.
(We currently do not support V2.0.1 due to the missing maps.)
The expected directory structure after downloading and extracting the dataset is as follows:
$WOD_PERCEPTION_DATA_ROOT
├── testing/
| ├── segment-10084636266401282188_1120_000_1140_000_with_camera_labels.tfrecord
| ├── ...
| └── segment-9806821842001738961_4460_000_4480_000_with_camera_labels.tfrecord
├── training/
| ├── segment-10017090168044687777_6380_000_6400_000_with_camera_labels.tfrecord
| ├── ...
| └── segment-9985243312780923024_3049_720_3069_720_with_camera_labels.tfrecord
└── validation/
├── segment-10203656353524179475_7625_000_7645_000_with_camera_labels.tfrecord
├── ...
└── segment-967082162553397800_5102_900_5122_900_with_camera_labels.tfrecord
You can add the dataset root directory to the environment variable WOD_PERCEPTION_DATA_ROOT for easier access.
export WOD_PERCEPTION_DATA_ROOT=/path/to/wod_perception_dataset_root
Optionally, you can adjust the py123d/script/config/common/default_dataset_paths.yaml accordingly.
Installation¶
The Waymo Open Dataset requires additional dependencies that are included as optional dependencies in py123d. You can install them via:
pip install py123d[waymo]
pip install -e .[waymo]
The optional dependencies (tensorflow-cpu and protobuf) are only needed to convert the dataset or read from the raw TFRecord files.
After conversion, you may use any other py123d installation without these dependencies.
Conversion¶
Local mode — data already downloaded to $WOD_PERCEPTION_DATA_ROOT:
py123d-conversion dataset=wod-perception
Note
The conversion of WOD-Perception by default stores the camera images as jpegs and the Lidar point clouds as binary files in the logs.
Thus, the logs need fairly large disk space. Reading from the raw TFRecord files is also supported, but requires the Waymo Open Dataset specific dependencies (see above) and might be slower.
To change the default behavior, you need to adapt the wod-perception.yaml converter configuration.
Streaming mode — dataset=wod-perception-stream attaches a WODPerceptionDownloader
to the parser; it fetches selected segments from GCS into a temp directory at parser
construction time and cleans up on parser GC. Useful when the full ~1 TB dataset is too
large for local disk and you only need a handful of segments for iteration:
# Authenticate once (the perception bucket is not anonymously readable)
gcloud auth application-default login
# Stream the first segment of the validation split only:
py123d-conversion dataset=wod-perception-stream \
dataset.parser.downloader.num_shards=1 \
'dataset.parser.splits=[wod-perception_val]'
# Stream specific segment indices (keyed by 123D split name):
py123d-conversion dataset=wod-perception-stream \
'dataset.parser.downloader.shard_indices={wod-perception_val: [0, 1, 2]}'
# Persist downloads under a dedicated cache dir instead of a tempdir:
py123d-conversion dataset=wod-perception-stream \
dataset.parser.downloader.num_shards=1 \
dataset.parser.downloader.output_dir=/mnt/scratch/wod_perception_cache
Warning
Perception segments are ~1 GB each — even small values of num_shards
imply multiple GB of download traffic.
Note
Unlike the motion bucket, the perception bucket requires an authenticated GCS
client. If neither gcloud auth application-default login nor
dataset.parser.downloader.credentials_file is configured, listing will fail
with a 403.
To pre-stage data outside the conversion pipeline (or preview which segments would be
downloaded), use py123d-download:
# Preview the first 3 segments of the validation split without downloading:
py123d-download dataset=wod-perception \
'dataset.downloader.splits=[wod-perception_val]' \
dataset.downloader.num_shards=3 \
dataset.downloader.dry_run=true
# Download a single training segment to $WOD_PERCEPTION_DATA_ROOT:
py123d-download dataset=wod-perception \
'dataset.downloader.splits=[wod-perception_train]' \
dataset.downloader.num_shards=1
Dataset Specific Issues¶
Map: The HD-Map in Waymo has bugs …
Citation¶
If you use this dataset in your research, please cite:
@inproceedings{Sun2020CVPR,
title={Scalability in perception for autonomous driving: Waymo open dataset},
author={Sun, Pei and Kretzschmar, Henrik and Dotiwalla, Xerxes and Chouard, Aurelien and Patnaik, Vijaysai and Tsui, Paul and Guo, James and Zhou, Yin and Chai, Yuning and Caine, Benjamin and others},
booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
pages={2446--2454},
year={2020}
}