NCore (PhysicalAI-AV NCore)

Warning

Experimental Dataset Support

The NCore dataset integration is currently under active development and should be considered experimental. Features may be incomplete, APIs may change, and unexpected bugs are possible.

If you encounter any issues, please report them on our GitHub Issues page. Your feedback helps us improve!

NCore is NVIDIA’s PhysicalAI-Autonomous-Vehicles-NCore dataset. It uses the same NVIDIA Hyperion 8.1 sensor platform as the Physical AI AV dataset: 7 f-theta (fisheye) cameras at ~30 fps, a 360° top LiDAR at ~10 Hz, auto-labeled 3D cuboid detections, and rig-to-world egomotion poses. The difference is the storage format: NCore ships in the newer NCore V4 component-based format (indexed-tar zarr archives, .zarr.itar) rather than as raw parquet/mp4 files.

Overview

Download: Hugging Face (gated)

Code: NVIDIA/ncore

License: Please refer to the dataset’s official license terms.

Available splits: ncore_train (NCore ships as a single collection; the split is synthetic)

Available Modalities

  • Ego Vehicle: Available. Rig-to-world poses sampled at ~100 Hz. NCore stores poses only (no velocity or acceleration); py123d’s infer_ego_dynamics: true option derives velocity/acceleration via finite differences during conversion. See EgoStateSE3.

  • Map: Not available for this dataset.

  • Bounding Boxes: Available. Auto-labeled 3D cuboid track observations with the same 10-class taxonomy as Physical AI AV (PhysicalAIAVBoxDetectionLabel). See BoxDetectionsSE3.

  • Traffic Lights: Not available for this dataset.

  • Cameras: Available. Same 7 f-theta (fisheye) cameras as Physical AI AV, see Camera.

  • Lidars: Available. 1 top-mounted 360° LiDAR, see Lidar.
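The Ego Vehicle entry above mentions that velocity and acceleration are derived from poses via finite differences. The following is a minimal sketch of that idea for the translational part only. It assumes central differences in the interior and one-sided differences at the two ends; the function name finite_difference is hypothetical, and py123d's actual scheme (including rotation handling and reference frames) may differ.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def finite_difference(times: Sequence[float], values: Sequence[Vec3]) -> List[Vec3]:
    """Approximate d(values)/dt (hypothetical helper, not py123d API).

    Uses central differences for interior samples and one-sided
    differences for the first and last sample.
    """
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need at least two samples with matching timestamps")
    out: List[Vec3] = []
    for i in range(n):
        lo = max(i - 1, 0)        # previous sample (or self at the start)
        hi = min(i + 1, n - 1)    # next sample (or self at the end)
        dt = times[hi] - times[lo]
        out.append(tuple((values[hi][k] - values[lo][k]) / dt for k in range(3)))
    return out


# Synthetic positions sampled at 100 Hz with constant acceleration of
# 2 m/s^2 along x, i.e. x(t) = t^2.
times = [i * 0.01 for i in range(5)]
positions = [(t * t, 0.0, 0.0) for t in times]

velocities = finite_difference(times, positions)
accelerations = finite_difference(times, velocities)
```

Differencing amplifies pose noise, so a real pipeline may smooth the result; this sketch does not.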

Download

The dataset is gated on Hugging Face. You need (1) a HF account that has accepted the NVIDIA AV dataset license and (2) an HF token exported as HF_TOKEN (or set via dataset.downloader.hf_token). Downloads run through the unified py123d-download CLI:

pip install 'py123d[ncore]'        # pulls in huggingface_hub and nvidia-ncore; quotes keep shells like zsh from globbing the brackets
export HF_TOKEN=hf_...

# Download a 5-clip random subset (~12 GB) to $NCORE_DATA_ROOT
py123d-download dataset=ncore \
    dataset.downloader.num_clips=5 \
    dataset.downloader.sample_random=true

# Or the full dataset (~2.4 TB)
py123d-download dataset=ncore

# Or just one modality + a subset
py123d-download dataset=ncore \
    dataset.downloader.num_clips=20 \
    dataset.downloader.modality=lidar

py123d-download dataset=ncore \
    dataset.downloader.num_clips=20 \
    dataset.downloader.modality=cameras \
    'dataset.downloader.cameras=[camera_front_wide_120fov]'
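When downloads are scripted rather than typed by hand, the overrides shown above can be assembled programmatically. The sketch below only builds the argument list; it uses the override keys that appear in the commands above, while the helper name build_download_command is hypothetical and not part of py123d.

```python
import shlex
from typing import List, Optional, Sequence


def build_download_command(
    num_clips: Optional[int] = None,
    sample_random: bool = False,
    modality: Optional[str] = None,
    cameras: Optional[Sequence[str]] = None,
) -> List[str]:
    """Assemble a py123d-download invocation (hypothetical helper)."""
    cmd = ["py123d-download", "dataset=ncore"]
    if num_clips is not None:
        cmd.append(f"dataset.downloader.num_clips={num_clips}")
    if sample_random:
        cmd.append("dataset.downloader.sample_random=true")
    if modality is not None:
        cmd.append(f"dataset.downloader.modality={modality}")
    if cameras:
        # List override in the same [a,b] form used in the shell example.
        cmd.append("dataset.downloader.cameras=[" + ",".join(cameras) + "]")
    return cmd


cmd = build_download_command(
    num_clips=20,
    modality="cameras",
    cameras=["camera_front_wide_120fov"],
)
# shlex.join quotes the bracketed override so it can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) avoids shell quoting altogether; HF_TOKEN still needs to be set in the environment first.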

The downloaded dataset has the following per-clip structure:

$NCORE_DATA_ROOT
└── clips/
    └── {clip_id}/
        ├── pai_{clip_id}.json                                    (sequence manifest)
        ├── pai_{clip_id}.ncore4.zarr.itar                        (poses, intrinsics, cuboids, masks)
        ├── pai_{clip_id}.ncore4-lidar_top_360fov.zarr.itar       (~1.0 GB)
        └── pai_{clip_id}.ncore4-camera_{name}.zarr.itar          (~150 MB x 7 cameras)
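Before starting a conversion, it can help to check that a clip directory is complete. The sketch below derives the expected file names from the layout above and reports which ones are missing. Both helper names are hypothetical, and it is an assumption that {name} in the camera archive pattern is the camera identifier without the camera_ prefix.

```python
from pathlib import Path
from typing import List, Sequence


def expected_clip_files(clip_id: str, camera_names: Sequence[str]) -> List[str]:
    """File names expected for one clip, following the layout above."""
    prefix = f"pai_{clip_id}"
    files = [
        f"{prefix}.json",                               # sequence manifest
        f"{prefix}.ncore4.zarr.itar",                   # poses, intrinsics, cuboids, masks
        f"{prefix}.ncore4-lidar_top_360fov.zarr.itar",  # LiDAR archive
    ]
    # Assumption: camera_names hold the part after "camera_".
    files += [f"{prefix}.ncore4-camera_{name}.zarr.itar" for name in camera_names]
    return files


def missing_clip_files(data_root: Path, clip_id: str, camera_names: Sequence[str]) -> List[str]:
    """Return the expected files that do not exist under data_root/clips/clip_id."""
    clip_dir = Path(data_root) / "clips" / clip_id
    return [f for f in expected_clip_files(clip_id, camera_names) if not (clip_dir / f).is_file()]


clip_id = "000da9de-0ee5-465a-9a2d-e7e91d3016bb"
files = expected_clip_files(clip_id, ["front_wide_120fov"])
missing = missing_clip_files(Path("/path/that/does/not/exist"), clip_id, ["front_wide_120fov"])
```

For a single-modality download, pass only the camera names that were requested (or an empty list for LiDAR-only clips, in which case the LiDAR entry is still expected).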

Installation

Install the ncore extra to pull in the NCore reader plus the HF downloader:

pip install 'py123d[ncore]'

Conversion

Local mode: use this when the clips are already downloaded to $NCORE_DATA_ROOT (see the Download section above):

export NCORE_DATA_ROOT=/path/to/ncore
py123d-conversion dataset=ncore

# Limit to a few clips during development:
py123d-conversion dataset=ncore dataset.parser.max_clips=2

Streaming mode: dataset=ncore-stream attaches an NCoreDownloader to each log parser. Each Ray worker downloads its assigned clip to a per-clip temp directory, runs the conversion, and deletes the temp directory before moving on. This is useful when disk space is tight or when you want to convert a one-off subset without committing ~2.4 TB to permanent storage:

# Authenticate once — NCore is a gated HF dataset
export HF_TOKEN=hf_...

# Stream the first 5 clips end-to-end:
py123d-conversion dataset=ncore-stream \
    dataset.parser.max_clips=5

# Stream specific clip UUIDs:
py123d-conversion dataset=ncore-stream \
    'dataset.parser.downloader.clip_ids=[000da9de-0ee5-465a-9a2d-e7e91d3016bb]'

# Stream only one modality to cut per-clip traffic:
py123d-conversion dataset=ncore-stream \
    dataset.parser.max_clips=5 \
    dataset.parser.downloader.modality=lidar

Because each worker downloads its own clip, clip-level parallelism also parallelizes downloads.
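The download, convert, delete cycle described for streaming mode can be illustrated in plain Python. This is only a sketch of the pattern, not py123d's implementation: a thread pool stands in for the Ray workers, and the download and convert callables are hypothetical placeholders for the NCoreDownloader and the log parser.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List, Sequence


def stream_convert_clip(
    clip_id: str,
    download: Callable[[str, Path], None],
    convert: Callable[[str, Path], str],
) -> str:
    """Download one clip to a temp directory, convert it, then clean up."""
    with tempfile.TemporaryDirectory(prefix=f"ncore_{clip_id}_") as tmp:
        tmp_dir = Path(tmp)
        download(clip_id, tmp_dir)
        # The temp directory is removed when the with-block exits,
        # even if the conversion raises.
        return convert(clip_id, tmp_dir)


def stream_convert(
    clip_ids: Sequence[str],
    download: Callable[[str, Path], None],
    convert: Callable[[str, Path], str],
    workers: int = 4,
) -> List[str]:
    """Process clips in parallel; each worker handles its own download."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: stream_convert_clip(c, download, convert), clip_ids))


# Stand-ins for the real downloader and converter.
seen_dirs: List[Path] = []


def fake_download(clip_id: str, dest: Path) -> None:
    (dest / f"pai_{clip_id}.json").write_text("{}")
    seen_dirs.append(dest)


def fake_convert(clip_id: str, src: Path) -> str:
    return f"{clip_id}:{len(list(src.iterdir()))} file(s)"


results = stream_convert(["clip-a", "clip-b"], fake_download, fake_convert, workers=2)
```

Peak disk usage in this pattern is roughly one clip per worker, which is the trade-off against keeping a persistent local copy.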

Warning

Each NCore clip is roughly 2 GB (1 x ~1.0 GB LiDAR archive + 7 x ~150 MB camera archives). Even small values of max_clips imply multiple GB of download traffic.
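For a rough idea of how max_clips and the modality option affect traffic, the approximate per-archive sizes from the per-clip layout in the Download section (~1.0 GB for the LiDAR archive, ~150 MB per camera archive) can be multiplied out. The sketch below is a back-of-the-envelope estimate only: the function name is hypothetical, real archive sizes vary per clip, and the manifest and base archive are ignored because their sizes are not stated.

```python
from typing import Optional

LIDAR_GB = 1.0     # approximate size of one LiDAR archive
CAMERA_GB = 0.15   # approximate size of one camera archive
NUM_CAMERAS = 7


def estimate_download_gb(
    num_clips: int,
    modality: Optional[str] = None,
    num_cameras: int = NUM_CAMERAS,
) -> float:
    """Rough download size in GB; modality=None means LiDAR plus cameras."""
    if modality not in (None, "lidar", "cameras"):
        raise ValueError(f"unknown modality: {modality!r}")
    per_clip = 0.0
    if modality in (None, "lidar"):
        per_clip += LIDAR_GB
    if modality in (None, "cameras"):
        per_clip += CAMERA_GB * num_cameras
    return num_clips * per_clip


full_5 = estimate_download_gb(5)                                  # LiDAR + 7 cameras
lidar_20 = estimate_download_gb(20, modality="lidar")             # LiDAR only
one_cam_20 = estimate_download_gb(20, "cameras", num_cameras=1)   # a single camera
```

Restricting the download to one camera brings the per-clip cost down by more than an order of magnitude in this estimate, which is why the modality and cameras options matter for quick experiments.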

Note

The default conversion stores camera frames as JPEG-binary Arrow columns (NCore already stores JPEG in each frame, so no re-encoding happens) and LiDAR as Draco. Override via the ncore.yaml converter config if needed.

To pre-stage data outside the conversion pipeline (e.g. when you want a persistent local copy shared across multiple conversion runs), use py123d-download — see the Download section above for invocations.

Dataset Issues

  • Auto-labeled detections: Cuboids are auto-generated, so they can be noisier than human-annotated ground truth.

  • No ego dynamics in source: NCore carries rig-to-world poses only. Velocity/acceleration are reconstructed by py123d via finite differences when infer_ego_dynamics is enabled (the default).

  • FTheta 6-coefficient polynomials: NCore’s FTheta camera model uses 6 polynomial coefficients. py123d’s FThetaIntrinsics has been extended to carry 6 coefficients; the Physical AI AV parser pads its native 5-coefficient polynomial with a trailing zero.

  • Anisotropic linear_cde absorbed into polynomials: NCore’s FTheta model adds a sensor→image affine term linear_cde = [c, d, e] that py123d’s isotropic FTheta model does not carry. In the typical Hyperion 8 case (d = e = 0, c within a few percent of 1), the conversion folds c into the polynomials using a geometric-mean (sqrt(c)) approximation and logs a warning. Non-trivial shear (d or e far from zero) raises an error, because silently accepting it would produce incorrect projections.

  • No HD map: This dataset does not include map information.
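The two FTheta items above (zero-padding a 5-coefficient polynomial and absorbing linear_cde) can be sketched as follows. This is an illustration under stated assumptions, not py123d's code: the function names and the shear tolerance are hypothetical, and the way the isotropic scale s = sqrt(c) is applied (multiplying the angle-to-pixel polynomial by s, dividing the k-th pixel-to-angle coefficient by s^k) is one plausible reading of the geometric-mean approximation.

```python
import math
import warnings
from typing import List, Sequence, Tuple


def pad_coefficients(coeffs: Sequence[float], size: int = 6) -> List[float]:
    """Pad a shorter polynomial with trailing zeros (e.g. 5 -> 6 coefficients)."""
    if len(coeffs) > size:
        raise ValueError(f"expected at most {size} coefficients, got {len(coeffs)}")
    return list(coeffs) + [0.0] * (size - len(coeffs))


def absorb_linear_cde(
    fw_poly: Sequence[float],      # angle -> pixel radius
    bw_poly: Sequence[float],      # pixel radius -> angle
    linear_cde: Tuple[float, float, float],
    shear_tol: float = 1e-6,       # hypothetical threshold
) -> Tuple[List[float], List[float]]:
    """Fold an anisotropic scale c into both polynomials as s = sqrt(c)."""
    c, d, e = linear_cde
    if abs(d) > shear_tol or abs(e) > shear_tol:
        raise ValueError("non-trivial shear in linear_cde cannot be absorbed")
    if c <= 0.0:
        raise ValueError("linear_cde scale c must be positive")
    s = math.sqrt(c)  # geometric mean of the x scale (c) and y scale (1)
    if s != 1.0:
        warnings.warn(f"absorbing linear_cde c={c} as isotropic scale {s:.6f}")
    fw = [a * s for a in fw_poly]                    # r' = s * r(theta)
    bw = [b / s**k for k, b in enumerate(bw_poly)]   # theta(r'/s)
    return fw, bw


padded = pad_coefficients([0.0, 1.0, 2.0, 3.0, 4.0])

# Toy linear model: r = 500 * theta and theta = r / 500.
fw, bw = absorb_linear_cde([0.0, 500.0], [0.0, 0.002], (1.04, 0.0, 0.0))
theta = 0.3
round_trip = bw[1] * (fw[1] * theta)   # should recover theta

try:
    absorb_linear_cde([0.0, 500.0], [0.0, 0.002], (1.0, 0.1, 0.0))
    shear_rejected = False
except ValueError:
    shear_rejected = True
```

The round trip stays exact because both polynomials are rescaled consistently; what the approximation gives up is the difference between the x and y scales, which is small when c is close to 1.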

Citation

n/a