Camera

Camera Data

class py123d.datatypes.Camera

A camera observation: image, extrinsic pose, timestamp, and model-specific metadata.

Public Data Attributes:

`timestamp`	The `Timestamp` of the image capture.
`metadata`	The `BaseCameraMetadata` associated with the camera.
`image`	The image captured by the camera, as a numpy array.
`camera_to_global_se3`	The extrinsic `PoseSE3` of the camera in global coordinates.

Inherited from BaseModality

`timestamp`	Returns the timestamp associated with this modality data, if available.
`metadata`	Returns the metadata associated with this modality data.
`modality_type`	Convenience property to access the modality type directly from the modality data.
`modality_id`	Convenience property to access the modality id directly from the modality data.
`modality_key`	Convenience property to access the modality key directly from the modality data.

Public Methods:

project_points_global(points_global)

Project 3D points in global frame to image pixel coordinates.

property modality_id: str | SerialIntEnum | None

Convenience property to access the modality id directly from the modality data.

property modality_key: str

Convenience property to access the modality key directly from the modality data.

property modality_type: ModalityType

Convenience property to access the modality type directly from the modality data.

property timestamp: Timestamp

The Timestamp of the image capture.

property metadata: BaseCameraMetadata

The BaseCameraMetadata associated with the camera.

property image: ndarray[tuple[Any, ...], dtype[uint8]]

The image captured by the camera, as a numpy array.

property camera_to_global_se3: PoseSE3

The extrinsic PoseSE3 of the camera in global coordinates.

project_points_global(points_global)

Project 3D points in global frame to image pixel coordinates.

Convenience method that transforms points from global to camera frame, then delegates to BaseCameraMetadata.project_to_image().

Parameters: points_global (ndarray[tuple[Any, ...], dtype[float64]]) – (N, 3) array of 3D points in global coordinates.
Return type: Tuple[ndarray[tuple[Any, ...], dtype[float64]], ndarray[tuple[Any, ...], dtype[bool]], ndarray[tuple[Any, ...], dtype[float64]]]
Returns: A tuple of:
- pixel_coords: (N, 2) array of (u, v) pixel coordinates.
- in_fov_mask: (N,) boolean mask.
- depth: (N,) array of signed depths.
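
The global-to-camera step described above can be sketched with plain NumPy. This is a minimal illustration, not the py123d implementation: `global_to_camera` is a hypothetical helper, and a 4x4 homogeneous matrix is assumed as a stand-in for `camera_to_global_se3`, since the `PoseSE3` interface is not shown here.

```python
import numpy as np


def global_to_camera(points_global: np.ndarray, camera_to_global: np.ndarray) -> np.ndarray:
    """Transform (N, 3) global points into the camera frame.

    `camera_to_global` is a hypothetical 4x4 homogeneous matrix (an assumption,
    standing in for the PoseSE3 extrinsic).
    """
    rotation = camera_to_global[:3, :3]
    translation = camera_to_global[:3, 3]
    # Inverse rigid transform: p_cam = R^T (p_global - t).
    # For row-vector points this is (p_global - t) @ R.
    return (points_global - translation) @ rotation


# Camera placed at (10, 0, 0) in the global frame, with no rotation.
camera_to_global = np.eye(4)
camera_to_global[:3, 3] = [10.0, 0.0, 0.0]

points_global = np.array([[12.0, 1.0, 0.0], [10.0, 0.0, 3.0]])
points_cam = global_to_camera(points_global, camera_to_global)
```

The resulting camera-frame points are what a model-specific `project_to_image()` would then receive.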

Base Camera Metadata

class py123d.datatypes.BaseCameraMetadata

Base class for camera metadata. Provides the shared interface for all camera models.

Public Data Attributes:

`camera_model`	The projection model of the camera.
`camera_id`	The camera ID, unique within a sensor rig.
`camera_name`	The camera name, according to the dataset naming convention.
`camera_to_imu_se3`	The static extrinsic pose of the camera relative to the IMU frame.
`width`	The width of the camera image in pixels.
`height`	The height of the camera image in pixels.
`channel_type`	The channel type of the camera image.
`modality_type`	Returns the type of the modality that this metadata describes.
`modality_id`	Returns the camera ID as the modality ID.
`aspect_ratio`	The aspect ratio (width / height) of the camera.

Inherited from BaseModalityMetadata

`modality_type`	Returns the type of the modality that this metadata describes.
`modality_id`	Optional identifier for the modality, e.g. sensor ID for sensor modalities.
`modality_key`	Returns a unique key for this modality, combining type and id if applicable.

Public Methods:

project_to_image(points_cam)

Project 3D points in camera frame to image pixel coordinates.

Inherited from BaseMetadata

`to_dict`()	Serialize the metadata instance to a plain Python dictionary.
`from_dict`(data_dict)	Construct a metadata instance from a plain Python dictionary.

Private Methods:

_compute_in_fov_mask(pixel_coords, depth[, eps])

Compute a boolean mask for points in front of the camera and within image bounds.
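
A mask of this kind could look like the sketch below. It is an assumption-laden illustration rather than the library's code: the function name, the explicit `width` and `height` arguments, the default `eps`, and the half-open pixel bounds are all hypothetical choices.

```python
import numpy as np


def compute_in_fov_mask(
    pixel_coords: np.ndarray,
    depth: np.ndarray,
    width: int,
    height: int,
    eps: float = 1e-6,  # assumed default; the real default is not documented here
) -> np.ndarray:
    """Return an (N,) boolean mask for points in front of the camera and inside the image."""
    u, v = pixel_coords[:, 0], pixel_coords[:, 1]
    in_front = depth > eps  # positive depth means in front of the camera
    in_bounds = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return in_front & in_bounds


pixel_coords = np.array([[10.0, 10.0], [-1.0, 5.0], [640.0, 10.0], [10.0, 10.0]])
depth = np.array([1.0, 1.0, 1.0, -2.0])
mask = compute_in_fov_mask(pixel_coords, depth, width=640, height=480)
```

Only the first point passes: the second and third fall outside the image bounds, and the fourth lies behind the camera.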

abstract property camera_model: CameraModel

The projection model of the camera.

abstract property camera_id: CameraID

The camera ID, unique within a sensor rig.

abstract property camera_name: str

The camera name, according to the dataset naming convention.

abstractmethod classmethod from_dict(data_dict)

Construct a metadata instance from a plain Python dictionary.

Parameters: data_dict (Dict[str, Any]) – A dictionary containing the metadata fields.
Return type: BaseMetadata
Returns: A metadata instance.

property modality_key: str

Returns a unique key for this modality, combining type and id if applicable.

abstractmethod to_dict()

Serialize the metadata instance to a plain Python dictionary.

Return type: Dict[str, Any]
Returns: A dictionary representation using only default Python types.
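
The `to_dict()` / `from_dict()` contract can be illustrated with a round trip. The class below, `ExampleCameraMetadata`, is a hypothetical stand-in with a reduced set of fields; it does not subclass anything from py123d and only sketches the idea of serializing to default Python types.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ExampleCameraMetadata:
    """Hypothetical metadata holder, not a py123d class."""

    camera_name: str
    width: int
    height: int

    def to_dict(self) -> Dict[str, Any]:
        # Only default Python types, so the result is JSON-serializable.
        return {"camera_name": self.camera_name, "width": self.width, "height": self.height}

    @classmethod
    def from_dict(cls, data_dict: Dict[str, Any]) -> "ExampleCameraMetadata":
        return cls(
            camera_name=str(data_dict["camera_name"]),
            width=int(data_dict["width"]),
            height=int(data_dict["height"]),
        )


original = ExampleCameraMetadata(camera_name="front", width=1920, height=1080)
encoded = json.dumps(original.to_dict())  # plain dict survives JSON encoding
restored = ExampleCameraMetadata.from_dict(json.loads(encoded))
```

Restricting the dictionary to default Python types is what makes the intermediate JSON step possible without a custom encoder.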

abstract property camera_to_imu_se3: PoseSE3

The static extrinsic pose of the camera relative to the IMU frame.

abstract property width: int

The width of the camera image in pixels.

abstract property height: int

The height of the camera image in pixels.

property channel_type: CameraChannelType

The channel type of the camera image. Defaults to RGB.

property modality_type: ModalityType

Returns the type of the modality that this metadata describes.

property modality_id: str | SerialIntEnum | None

Returns the camera ID as the modality ID.

property aspect_ratio: float

The aspect ratio (width / height) of the camera.

abstractmethod project_to_image(points_cam)

Project 3D points in camera frame to image pixel coordinates.

Parameters: points_cam (ndarray[tuple[Any, ...], dtype[float64]]) – (N, 3) array of 3D points in the camera coordinate frame.
Return type: Tuple[ndarray[tuple[Any, ...], dtype[float64]], ndarray[tuple[Any, ...], dtype[bool]], ndarray[tuple[Any, ...], dtype[float64]]]
Returns: A tuple of (pixel_coords (N, 2), in_fov_mask (N,), depth (N,)).
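
As an illustration of what a concrete model might return, here is a minimal pinhole projection sketch. Everything in it is an assumption for the sake of the example: the function name `project_pinhole`, the intrinsics `fx`, `fy`, `cx`, `cy`, the `eps` guard, and a camera frame whose z axis points forward. The library's own pinhole metadata class and axis convention may differ.

```python
from typing import Tuple

import numpy as np


def project_pinhole(
    points_cam: np.ndarray,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    width: int,
    height: int,
    eps: float = 1e-6,  # assumed guard against division by zero
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Project (N, 3) camera-frame points; returns (pixel_coords, in_fov_mask, depth)."""
    depth = points_cam[:, 2]  # signed depth along the assumed forward (z) axis
    # Replace near-zero depths so the division below stays finite.
    safe_depth = np.where(np.abs(depth) > eps, depth, eps)
    u = fx * points_cam[:, 0] / safe_depth + cx
    v = fy * points_cam[:, 1] / safe_depth + cy
    pixel_coords = np.stack([u, v], axis=1)
    in_fov_mask = (depth > eps) & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return pixel_coords, in_fov_mask, depth


points_cam = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 2.0], [0.0, 0.0, -1.0]])
pixel_coords, in_fov_mask, depth = project_pinhole(
    points_cam, fx=100.0, fy=100.0, cx=320.0, cy=240.0, width=640, height=480
)
```

The third point projects to the image center but is masked out because its depth is negative, which is why callers should filter by `in_fov_mask` before using `pixel_coords`.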

Camera ID

class py123d.datatypes.CameraID

Enumeration of camera IDs. These are unique within a sensor rig and can be used as modality IDs for camera metadata.

PCAM_F0 = 0: Front pinhole camera.

PCAM_B0 = 1: Back pinhole camera.

PCAM_L0 = 2: Left pinhole camera, first from front to back.

PCAM_L1 = 3: Left pinhole camera, second from front to back.

PCAM_L2 = 4: Left pinhole camera, third from front to back.

PCAM_R0 = 5: Right pinhole camera, first from front to back.

PCAM_R1 = 6: Right pinhole camera, second from front to back.

PCAM_R2 = 7: Right pinhole camera, third from front to back.

PCAM_STEREO_L = 8: Left pinhole stereo camera.

PCAM_STEREO_R = 9: Right pinhole stereo camera.

FMCAM_L = 10: Left-facing fisheye MEI camera.

FMCAM_R = 11: Right-facing fisheye MEI camera.

FTCAM_F0 = 12: Front F-theta camera.

FTCAM_TELE_F0 = 13: Front telephoto F-theta camera.

FTCAM_TELE_B0 = 18: Back telephoto F-theta camera.

FTCAM_L0 = 14: Left F-theta camera, first from front to back.

FTCAM_L1 = 15: Left F-theta camera, second from front to back.

FTCAM_R0 = 16: Right F-theta camera, first from front to back.

FTCAM_R1 = 17: Right F-theta camera, second from front to back.

Camera Model

class py123d.datatypes.CameraModel

Enumeration of camera projection models.

PINHOLE = 0: Standard pinhole camera model.

FISHEYE_MEI = 1: Fisheye camera using the MEI (mirror) model.

FTHETA = 2: F-theta polynomial camera model.

Camera Channel Type

class py123d.datatypes.CameraChannelType

Enumeration of camera channel types.

RGB = 0

GRAYSCALE = 1