Python API
ome_arrow.meta
Meta-definition for OME-Arrow format.
ome_arrow.tensor
Tensor view utilities for OME-Arrow pixel data.
- class src.ome_arrow.tensor.LazyTensorView(*, loader: Callable[[], dict[str, Any] | StructScalar | StructArray | ChunkedArray], resolver: Callable[[dict[str, Any]], TensorView] | None = None, t: int | slice | Sequence[int] | None = None, z: int | slice | Sequence[int] | None = None, c: int | slice | Sequence[int] | None = None, roi: tuple[int, int, int, int] | None = None, roi3d: tuple[int, int, int, int, int, int] | None = None, roi_nd: tuple[int, ...] | None = None, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None = None, tile: tuple[int, int] | None = None, layout: str | None = None, dtype: dtype | None = None, chunk_policy: Literal['auto', 'combine', 'keep'] = 'auto', channel_policy: Literal['error', 'first'] = 'error')
Bases: object
Deferred TensorView plan with Polars-style collect semantics.
- collect() → TensorView
Materialize this lazy plan into a concrete TensorView.
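The deferred-plan idea can be illustrated with a small stand-in. The sketch below is not the library's implementation: ToyLazyPlan, its fields, and load_pixels are hypothetical names, chosen only to show how selections can accumulate without touching data until collect() is called.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional

import numpy as np


@dataclass(frozen=True)
class ToyLazyPlan:
    """Hypothetical stand-in for a deferred view; not the ome_arrow class."""

    loader: Callable[[], np.ndarray]  # called only inside collect()
    t: Optional[int] = None           # None means "all time points"

    def select(self, *, t: Optional[int]) -> "ToyLazyPlan":
        # Return a new plan; the original plan is left unchanged.
        return replace(self, t=t)

    def collect(self) -> np.ndarray:
        data = self.loader()          # the expensive read happens here
        return data if self.t is None else data[self.t]


calls = []


def load_pixels() -> np.ndarray:
    # Record each call so the deferral is observable.
    calls.append("load")
    return np.arange(24).reshape(2, 3, 4)  # toy (T, Y, X) stack


plan = ToyLazyPlan(loader=load_pixels).select(t=1)
assert calls == []                 # nothing is loaded while planning
frame = plan.collect()             # the loader runs exactly once, here
```

Building the plan costs nothing; only collect() invokes the loader, which is the property that makes chained selections cheap.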
- property device: str
Return the tensor storage device.
Note
For unresolved lazy plans, this returns "cpu" without calling collect().
- property dtype: dtype
Return the tensor dtype.
Note
Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.
- iter_dlpack(*, batch_size: int | None = None, tile_size: tuple[int, int] | None = None, tiles: tuple[int, int] | None = None, shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') → Iterator[Any]
Iterate DLPack outputs in batches or 2D tiles.
- Parameters:
batch_size – Number of time indices per batch.
tile_size – Optional tile size as (tile_h, tile_w).
tiles – Deprecated alias for tile_size.
shuffle – Whether to shuffle iteration order.
seed – Optional random seed for deterministic shuffling.
prefetch – Placeholder prefetch count.
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode ("arrow" or "numpy").
- Returns:
Iterator of DLPack-compatible objects.
- Return type:
Iterator[Any]
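How batch_size, shuffle, and seed might interact can be sketched with plain NumPy. The helper below (iter_time_batches, a hypothetical name) is an assumption about the iteration order, not the library's code: it splits the T axis into consecutive batches and optionally shuffles the batch order with a seeded generator, so repeated runs with the same seed visit batches in the same order.

```python
from typing import Iterator, Optional

import numpy as np


def iter_time_batches(
    n_t: int,
    batch_size: Optional[int] = None,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> Iterator[range]:
    """Yield ranges of T indices; hypothetical helper for illustration."""
    # Assumption: an omitted batch_size means one batch over the full range.
    size = max(n_t, 1) if batch_size is None else batch_size
    if size <= 0:
        raise ValueError("batch_size must be positive")
    batches = [range(start, min(start + size, n_t)) for start in range(0, n_t, size)]
    if shuffle:
        # A seeded generator makes the shuffled order reproducible.
        order = np.random.default_rng(seed).permutation(len(batches))
        batches = [batches[i] for i in order]
    yield from batches


plain = [list(b) for b in iter_time_batches(5, batch_size=2)]
first = [list(b) for b in iter_time_batches(5, batch_size=2, shuffle=True, seed=0)]
second = [list(b) for b in iter_time_batches(5, batch_size=2, shuffle=True, seed=0)]
```

Here plain is [[0, 1], [2, 3], [4]], while first and second contain the same batches in an identical shuffled order.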
- iter_tiles_3d(*, tile_size: tuple[int, int, int], shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'numpy') → Iterator[Any]
Iterate DLPack outputs in 3D tiles.
- Parameters:
tile_size – Tile shape as (tile_z, tile_h, tile_w).
shuffle – Whether to shuffle iteration order.
seed – Optional random seed for deterministic shuffling.
prefetch – Placeholder prefetch count.
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode (currently "numpy" only).
- Returns:
Iterator of DLPack-compatible objects.
- Return type:
Iterator[Any]
- property layout: str
Return the effective tensor layout.
Note
Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.
- select(*, t: int | slice | Sequence[int] | None | _Unset = _UNSET, z: int | slice | Sequence[int] | None | _Unset = _UNSET, c: int | slice | Sequence[int] | None | _Unset = _UNSET, roi: tuple[int, int, int, int] | None | _Unset = _UNSET, roi3d: tuple[int, int, int, int, int, int] | None | _Unset = _UNSET, roi_nd: tuple[int, ...] | None | _Unset = _UNSET, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None | _Unset = _UNSET, tile: tuple[int, int] | None | _Unset = _UNSET) → LazyTensorView
Return a new lazy plan with updated index/ROI selections.
- property shape: tuple[int, ...]
Return the tensor shape.
Note
Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.
- property strides: tuple[int, ...]
Return tensor strides in bytes.
Note
Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.
- to_dlpack(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') → Any
Export the planned view as a DLPack object.
- Parameters:
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode ("arrow" or "numpy").
- Returns:
DLPack-compatible object.
- Return type:
Any
- to_jax(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') → Any
Convert the planned view to a JAX array.
- Parameters:
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode ("arrow" or "numpy").
- Returns:
JAX array when JAX is installed.
- Return type:
Any
- to_numpy(*, contiguous: bool = False) → ndarray
Materialize as a NumPy array.
- Parameters:
contiguous – When True, return a contiguous array copy.
- Returns:
Materialized array.
- Return type:
np.ndarray
- to_torch(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') → Any
Convert the planned view to a torch tensor.
- Parameters:
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode ("arrow" or "numpy").
- Returns:
torch.Tensor when torch is installed.
- Return type:
Any
- with_layout(layout: str) → LazyTensorView
Return a new lazy view with an updated layout.
- class src.ome_arrow.tensor.TensorView(data: dict[str, Any] | StructScalar | StructArray | ChunkedArray, *, plane_loader: Callable[[int, int, int], ndarray] | None = None, t: int | slice | Sequence[int] | None = None, z: int | slice | Sequence[int] | None = None, c: int | slice | Sequence[int] | None = None, roi: tuple[int, int, int, int] | None = None, roi3d: tuple[int, int, int, int, int, int] | None = None, roi_nd: tuple[int, ...] | None = None, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None = None, tile: tuple[int, int] | None = None, layout: str | None = None, dtype: dtype | None = None, chunk_policy: Literal['auto', 'combine', 'keep'] = 'auto', channel_policy: Literal['error', 'first'] = 'error')
Bases: object
View OME-Arrow pixel data as a tensor-like object.
- Parameters:
data – OME-Arrow dict, StructScalar, or 1-row StructArray/ChunkedArray.
t – Time index selection (int, slice, or sequence). Default: all.
z – Z index selection (int, slice, or sequence). Default: all.
c – Channel index selection (int, slice, or sequence). Default: all.
roi – Spatial crop (x, y, w, h) in pixels. Default: full frame.
roi3d – Spatial + depth crop (x, y, z, w, h, d). This is a convenience alias for roi=(x, y, w, h) and z=slice(z, z + d).
roi_nd – General ROI tuple with min/max bounds, interpreted by roi_type.
roi_type – ROI interpretation mode for roi_nd. Supported values: "2d" = (ymin, xmin, ymax, xmax); "2d_timelapse" = (tmin, ymin, xmin, tmax, ymax, xmax); "3d" = (zmin, ymin, xmin, zmax, ymax, xmax); "4d" = (tmin, zmin, ymin, xmin, tmax, zmax, ymax, xmax).
tile – Tile index (tile_y, tile_x) based on chunk grid.
layout – Desired layout string using TZCYX letters where T=time, Z=depth, C=channel, Y=row axis, X=column axis. TZCHW aliases are also accepted for compatibility.
dtype – Output dtype override. Defaults to pixels_meta.type when valid.
chunk_policy – Handling for pyarrow.ChunkedArray inputs. “auto” keeps multi-chunk arrays and unwraps single-chunk arrays. “combine” always combines multi-chunk arrays eagerly. “keep” always keeps chunked storage.
channel_policy – Behavior when dropping C from layout while multiple channels are selected. “error” raises (default). “first” keeps the first channel.
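The ROI conventions above can be made concrete with a small NumPy sketch. The helper names are hypothetical, and the (T, Z, C, Y, X) axis order of the toy array is an assumption. Treating max bounds as exclusive is also an assumption, since the parameter description does not say whether they are inclusive.

```python
import numpy as np


def roi3d_to_selection(roi3d):
    """Expand (x, y, z, w, h, d) into roi=(x, y, w, h) and a z slice."""
    x, y, z, w, h, d = roi3d
    return (x, y, w, h), slice(z, z + d)


def roi_nd_to_slices(roi_nd, roi_type):
    """Map min/max bounds to per-axis slices keyed by axis letter."""
    axes = {"2d": "yx", "2d_timelapse": "tyx", "3d": "zyx", "4d": "tzyx"}[roi_type]
    n = len(axes)
    if len(roi_nd) != 2 * n:
        raise ValueError(f"roi_type={roi_type!r} expects {2 * n} values")
    mins, maxs = roi_nd[:n], roi_nd[n:]
    # Assumption: max bounds are exclusive, like Python slices.
    return {ax: slice(lo, hi) for ax, lo, hi in zip(axes, mins, maxs)}


pixels = np.zeros((2, 4, 1, 16, 16))  # toy array, assumed (T, Z, C, Y, X)

# roi3d form: x=2, y=3, z=1 with width 8, height 5, depth 2.
roi, z_sel = roi3d_to_selection((2, 3, 1, 8, 5, 2))
x, y, w, h = roi
crop_a = pixels[:, z_sel, :, y : y + h, x : x + w]

# The same region expressed as roi_nd with roi_type="3d".
s = roi_nd_to_slices((1, 3, 2, 3, 8, 10), "3d")
crop_b = pixels[:, s["z"], :, s["y"], s["x"]]
```

Under these assumptions both forms select the same region, so crop_a and crop_b have identical shapes.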
- property device: str
Return the storage device for the view (currently always “cpu”).
- property dtype: dtype
Return the tensor dtype.
- iter_dlpack(*, batch_size: int | None = None, tile_size: tuple[int, int] | None = None, tiles: tuple[int, int] | None = None, shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') → Iterator[Any]
Iterate over DLPack capsules in batches or tiles.
- Parameters:
batch_size – Number of T indices per batch. Defaults to full range.
tile_size – Tile size (tile_h, tile_w) in pixels for spatial tiling.
tiles – Deprecated alias for tile_size.
shuffle – Whether to shuffle the iteration order.
seed – Seed for deterministic shuffling.
prefetch – Placeholder for future asynchronous prefetch support. Currently validated but does not change synchronous iteration.
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize contiguous buffers if needed.
mode – Export mode. “arrow” returns 1D values buffers.
- Yields:
DLPack object per batch or tile.
- iter_tiles_3d(*, tile_size: tuple[int, int, int], shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'numpy') → Iterator[Any]
Iterate over 3D tiles (z, y, x) as DLPack capsules.
- Parameters:
tile_size – Tile size as (tile_z, tile_h, tile_w).
shuffle – Whether to shuffle the tile order.
seed – Seed for deterministic shuffling.
prefetch – Placeholder for future asynchronous prefetch support.
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize contiguous buffers if needed.
mode – Export mode. Must be "numpy" for tiled 3D iteration.
- Yields:
DLPack object per 3D tile.
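A (tile_z, tile_h, tile_w) tiling of a volume can be sketched as follows. This is an illustration under stated assumptions rather than the library's algorithm: iter_tile_slices_3d is a hypothetical helper, the volume is assumed to be ordered (Z, Y, X), and tiles at the volume edge are assumed to be clipped rather than padded.

```python
from itertools import product
from typing import Iterator, Optional, Tuple

import numpy as np


def iter_tile_slices_3d(
    shape_zyx: Tuple[int, int, int],
    tile_size: Tuple[int, int, int],
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> Iterator[Tuple[slice, slice, slice]]:
    """Yield (z, y, x) slices covering a volume; hypothetical helper."""
    if any(t <= 0 for t in tile_size):
        raise ValueError("tile sizes must be positive")
    # Tile origins along each axis, then every (z, y, x) combination.
    starts = [range(0, dim, tile) for dim, tile in zip(shape_zyx, tile_size)]
    origins = list(product(*starts))
    if shuffle:
        order = np.random.default_rng(seed).permutation(len(origins))
        origins = [origins[i] for i in order]
    for origin in origins:
        # Assumption: tiles at the volume edge are clipped, not padded.
        yield tuple(
            slice(o, min(o + tile, dim))
            for o, tile, dim in zip(origin, tile_size, shape_zyx)
        )


volume = np.arange(4 * 6 * 6).reshape(4, 6, 6)  # toy (Z, Y, X) volume
tiles = [volume[sl] for sl in iter_tile_slices_3d(volume.shape, (2, 4, 4))]
```

With a 4 x 6 x 6 volume and (2, 4, 4) tiles this yields eight tiles that together cover every voxel exactly once; the edge tiles are smaller than the requested tile size.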
- property layout: str
Return the effective layout for this view.
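The effective layout interacts with the channel_policy constructor parameter when the requested layout omits C. A minimal sketch of the two documented behaviours, using a hypothetical drop_channel_axis helper on a plain NumPy array (the library's actual handling may differ):

```python
import numpy as np


def drop_channel_axis(
    array: np.ndarray, c_axis: int, channel_policy: str = "error"
) -> np.ndarray:
    """Remove the channel axis; hypothetical helper for illustration."""
    if channel_policy not in ("error", "first"):
        raise ValueError(f"unknown channel_policy: {channel_policy!r}")
    if array.shape[c_axis] > 1 and channel_policy == "error":
        raise ValueError("multiple channels selected but layout drops C")
    # A single channel, or policy "first": keep channel 0 and drop the axis.
    return np.take(array, 0, axis=c_axis)


stack = np.arange(24).reshape(2, 3, 4)  # toy (C, Y, X) stack with two channels
first = drop_channel_axis(stack, c_axis=0, channel_policy="first")

try:
    drop_channel_axis(stack, c_axis=0)  # default policy is "error"
    raised = False
except ValueError:
    raised = True
```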
- property shape: tuple[int, ...]
Return the tensor shape for the current layout.
- property strides: tuple[int, ...]
Return the tensor strides in bytes for the current layout.
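How a layout change affects shape and byte strides can be seen with plain NumPy. The relayout helper below is hypothetical, and the toy array is assumed to be stored in TZCYX order; the sketch shows only the general relationship between axis order, strides, and contiguity, not how TensorView computes these values.

```python
import numpy as np


def relayout(array: np.ndarray, src: str, dst: str) -> np.ndarray:
    """Reorder axes from layout `src` to layout `dst` (same letters)."""
    if sorted(src) != sorted(dst):
        raise ValueError("layouts must contain the same axis letters")
    return array.transpose([src.index(letter) for letter in dst])


data = np.zeros((2, 3, 4, 5, 6), dtype=np.uint16)  # toy array, assumed TZCYX
view = relayout(data, "TZCYX", "CTZYX")  # axis permutation, no data copied
copy = np.ascontiguousarray(view)        # contiguous copy in the new layout
```

The transposed view shares memory with data and reports permuted strides, while the contiguous copy has the row-major strides of its new shape.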
- to_dlpack(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') → Any
Export the view as a DLPack capsule.
- Parameters:
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize a contiguous buffer if needed.
mode – Export mode. “arrow” returns a capsule for the Arrow values buffer (1D). “numpy” materializes a tensor-shaped NumPy view. Zero-copy Arrow mode requires Arrow-backed inputs (typically Parquet/Vortex ingestion with canonical schema); StructScalar and dict inputs are normalized through Python objects.
- Returns:
DLPack object compatible with torch/jax import utilities. The returned object is single-use per DLPack ownership semantics: after a consumer imports it, the capsule must not be reused.
- Raises:
ValueError – If an unsupported device is requested.
RuntimeError – If required optional dependencies are missing.
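Because “arrow” mode is described as exporting a 1D values buffer, a consumer has to restore the tensor shape itself. The sketch below uses NumPy as both producer and consumer so that it needs no optional dependencies. The NumPy array only stands in for the Arrow buffer, the toy shape is an assumption, and this is not the library's export path.

```python
import numpy as np

shape = (2, 3, 4)                      # toy (T, Y, X) shape, an assumption
flat = np.arange(24, dtype=np.uint16)  # stands in for a 1D values buffer

# np.from_dlpack imports any object exposing __dlpack__ without copying.
imported = np.from_dlpack(flat)
tensor = imported.reshape(shape)       # the consumer restores the N-D shape
```

The imported array shares memory with the producer, so the reshape is a view over the original buffer rather than a copy.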
- to_jax(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') → Any
Convert the view into a JAX array using DLPack.
- Parameters:
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize a contiguous buffer if needed.
mode – Export mode. “arrow” returns a 1D values buffer.
- Returns:
Array backed by the DLPack capsule.
- Return type:
jax.Array
- to_numpy(*, contiguous: bool = False) → ndarray
Materialize the view as a NumPy array.
- Parameters:
contiguous – When True, return a contiguous array copy.
- Returns:
Array in the requested layout.
- Return type:
np.ndarray
- to_torch(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') → Any
Convert the view into a torch.Tensor using DLPack.
- Parameters:
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize a contiguous buffer if needed.
mode – Export mode. “arrow” returns a 1D values buffer.
- Returns:
Tensor backed by the DLPack capsule.
- Return type:
torch.Tensor
- with_layout(layout: str) → TensorView
Return a new TensorView with a layout override.
- Parameters:
layout – Desired layout string using TZCYX letters where T=time, Z=depth, C=channel, Y=row axis, X=column axis. TZCHW aliases are also accepted for compatibility.
- Returns:
New view with the requested layout.
- Return type: