Python API

ome_arrow.meta

Meta-definition for OME-Arrow format.

ome_arrow.tensor

Tensor view utilities for OME-Arrow pixel data.

class src.ome_arrow.tensor.LazyTensorView(*, loader: Callable[[], dict[str, Any] | StructScalar | StructArray | ChunkedArray], resolver: Callable[[dict[str, Any]], TensorView] | None = None, t: int | slice | Sequence[int] | None = None, z: int | slice | Sequence[int] | None = None, c: int | slice | Sequence[int] | None = None, roi: tuple[int, int, int, int] | None = None, roi3d: tuple[int, int, int, int, int, int] | None = None, roi_nd: tuple[int, ...] | None = None, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None = None, tile: tuple[int, int] | None = None, layout: str | None = None, dtype: dtype | None = None, chunk_policy: Literal['auto', 'combine', 'keep'] = 'auto', channel_policy: Literal['error', 'first'] = 'error')

Bases: object

Deferred TensorView plan with Polars-style collect semantics.

collect() -> TensorView

Materialize this lazy plan into a concrete TensorView.
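
The deferred-then-collect behavior described above can be illustrated with a minimal, standard-library-only sketch. LazyPlan and load_record are hypothetical names used for illustration and are not part of ome_arrow; caching the collected result is an assumption, not documented behavior.

```python
from typing import Any, Callable


class LazyPlan:
    """Hypothetical stand-in for a deferred view (not the ome_arrow class)."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader  # stored only; nothing is loaded yet
        self._result: Any = None
        self._resolved = False

    def collect(self) -> Any:
        # Run the loader on first use; later calls reuse the result
        # (caching is an assumption made for this sketch).
        if not self._resolved:
            self._result = self._loader()
            self._resolved = True
        return self._result


calls: list[str] = []


def load_record() -> dict[str, Any]:
    # Stands in for an expensive read from a source file.
    calls.append("load")
    return {"pixels": [1, 2, 3]}


plan = LazyPlan(load_record)
calls_before = len(calls)  # building the plan does not call the loader
result = plan.collect()
plan.collect()
calls_after = len(calls)  # the loader ran once across both collect() calls
```

The point of the pattern is that building or refining a plan stays cheap, and the cost of reading pixel data is paid only when a concrete result is requested.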

property device: str

Return the tensor storage device.

Note

For unresolved lazy plans, this returns "cpu" without calling collect().

property dtype: dtype

Return the tensor dtype.

Note

Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.

iter_dlpack(*, batch_size: int | None = None, tile_size: tuple[int, int] | None = None, tiles: tuple[int, int] | None = None, shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') -> Iterator[Any]

Iterate DLPack outputs in batches or 2D tiles.

Parameters:
  • batch_size – Number of time indices per batch.

  • tile_size – Optional tile size as (tile_h, tile_w).

  • tiles – Deprecated alias for tile_size.

  • shuffle – Whether to shuffle iteration order.

  • seed – Optional random seed for deterministic shuffling.

  • prefetch – Placeholder prefetch count, reserved for future asynchronous prefetch support.

  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize contiguous data when needed.

  • mode – Export mode ("arrow" or "numpy").

Returns:

Iterator of DLPack-compatible objects.

Return type:

Iterator[Any]
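
As a rough illustration of how batch_size could partition the time axis, the sketch below groups T indices into consecutive batches. batch_indices is a hypothetical helper, not an ome_arrow function. Treating a missing batch_size as "one batch covering the full range" follows the TensorView.iter_dlpack description, while letting the last batch be shorter is an assumption.

```python
from typing import Optional


def batch_indices(num_t: int, batch_size: Optional[int] = None) -> list[range]:
    """Group time indices 0..num_t-1 into consecutive batches."""
    if num_t == 0:
        return []
    size = num_t if batch_size is None else batch_size
    if size <= 0:
        raise ValueError("batch_size must be a positive integer")
    # The final batch may be shorter when num_t is not a multiple of size.
    return [range(start, min(start + size, num_t)) for start in range(0, num_t, size)]


batches = [list(b) for b in batch_indices(5, 2)]
full = [list(b) for b in batch_indices(3)]
```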

iter_tiles_3d(*, tile_size: tuple[int, int, int], shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'numpy') -> Iterator[Any]

Iterate DLPack outputs in 3D tiles.

Parameters:
  • tile_size – Tile shape as (tile_z, tile_h, tile_w).

  • shuffle – Whether to shuffle iteration order.

  • seed – Optional random seed for deterministic shuffling.

  • prefetch – Placeholder prefetch count, reserved for future asynchronous prefetch support.

  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize contiguous data when needed.

  • mode – Export mode (currently "numpy" only).

Returns:

Iterator of DLPack-compatible objects.

Return type:

Iterator[Any]

property layout: str

Return the effective tensor layout.

Note

Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.

select(*, t: int | slice | Sequence[int] | None | _Unset = _UNSET, z: int | slice | Sequence[int] | None | _Unset = _UNSET, c: int | slice | Sequence[int] | None | _Unset = _UNSET, roi: tuple[int, int, int, int] | None | _Unset = _UNSET, roi3d: tuple[int, int, int, int, int, int] | None | _Unset = _UNSET, roi_nd: tuple[int, ...] | None | _Unset = _UNSET, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None | _Unset = _UNSET, tile: tuple[int, int] | None | _Unset = _UNSET) -> LazyTensorView

Return a new lazy plan with updated index/ROI selections.
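
The _UNSET defaults in the signature suggest a sentinel that separates "argument not passed" from an explicit None. The sketch below shows that general pattern with a hypothetical Plan dataclass. It is an assumption about intent, not the library's implementation, and the _Unset class defined here is a local stand-in.

```python
from dataclasses import dataclass, replace
from typing import Any


class _Unset:
    """Local sentinel type: marks an argument the caller did not pass."""

    def __repr__(self) -> str:
        return "_UNSET"


_UNSET = _Unset()


@dataclass(frozen=True)
class Plan:
    """Hypothetical immutable selection plan with two axes."""

    t: Any = None
    z: Any = None

    def select(self, *, t: Any = _UNSET, z: Any = _UNSET) -> "Plan":
        # Only arguments that were actually passed override the current plan.
        updates = {}
        if t is not _UNSET:
            updates["t"] = t
        if z is not _UNSET:
            updates["z"] = z
        return replace(self, **updates)


base = Plan(t=0, z=slice(0, 4))
kept = base.select(z=2)        # t is untouched and stays 0
cleared = base.select(t=None)  # explicit None resets t to "all"
```

With a plain None default, the second call could not be told apart from "leave t alone", which is why a dedicated sentinel is useful here.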

property shape: tuple[int, ...]

Return the tensor shape.

Note

Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.

property strides: tuple[int, ...]

Return tensor strides in bytes.

Note

Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.

to_dlpack(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') -> Any

Export the planned view as a DLPack object.

Parameters:
  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize contiguous data when needed.

  • mode – Export mode ("arrow" or "numpy").

Returns:

DLPack-compatible object.

Return type:

Any

to_jax(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') -> Any

Convert the planned view to a JAX array.

Parameters:
  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize contiguous data when needed.

  • mode – Export mode ("arrow" or "numpy").

Returns:

JAX array when JAX is installed.

Return type:

Any

to_numpy(*, contiguous: bool = False) -> ndarray

Materialize as a NumPy array.

Parameters:

contiguous – When True, return a contiguous array copy.

Returns:

Materialized array.

Return type:

np.ndarray

to_torch(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') -> Any

Convert the planned view to a torch tensor.

Parameters:
  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize contiguous data when needed.

  • mode – Export mode ("arrow" or "numpy").

Returns:

torch.Tensor when torch is installed.

Return type:

Any

with_layout(layout: str) -> LazyTensorView

Return a new lazy view with an updated layout.

class src.ome_arrow.tensor.TensorView(data: dict[str, Any] | StructScalar | StructArray | ChunkedArray, *, plane_loader: Callable[[int, int, int], ndarray] | None = None, t: int | slice | Sequence[int] | None = None, z: int | slice | Sequence[int] | None = None, c: int | slice | Sequence[int] | None = None, roi: tuple[int, int, int, int] | None = None, roi3d: tuple[int, int, int, int, int, int] | None = None, roi_nd: tuple[int, ...] | None = None, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None = None, tile: tuple[int, int] | None = None, layout: str | None = None, dtype: dtype | None = None, chunk_policy: Literal['auto', 'combine', 'keep'] = 'auto', channel_policy: Literal['error', 'first'] = 'error')

Bases: object

View OME-Arrow pixel data as a tensor-like object.

Parameters:
  • data – OME-Arrow dict, StructScalar, or 1-row StructArray/ChunkedArray.

  • t – Time index selection (int, slice, or sequence). Default: all.

  • z – Z index selection (int, slice, or sequence). Default: all.

  • c – Channel index selection (int, slice, or sequence). Default: all.

  • roi – Spatial crop (x, y, w, h) in pixels. Default: full frame.

  • roi3d – Spatial + depth crop (x, y, z, w, h, d). This is a convenience alias for roi=(x, y, w, h) and z=slice(z, z + d).

  • roi_nd – General ROI tuple with min/max bounds, interpreted by roi_type.

  • roi_type – ROI interpretation mode for roi_nd. Supported values: "2d" = (ymin, xmin, ymax, xmax); "2d_timelapse" = (tmin, ymin, xmin, tmax, ymax, xmax); "3d" = (zmin, ymin, xmin, zmax, ymax, xmax); "4d" = (tmin, zmin, ymin, xmin, tmax, zmax, ymax, xmax).

  • tile – Tile index (tile_y, tile_x) based on chunk grid.

  • layout – Desired layout string using TZCYX letters where T=time, Z=depth, C=channel, Y=row axis, X=column axis. TZCHW aliases are also accepted for compatibility.

  • dtype – Output dtype override. Defaults to pixels_meta.type when valid.

  • chunk_policy – Handling for pyarrow.ChunkedArray inputs. "auto" keeps multi-chunk arrays and unwraps single-chunk arrays. "combine" always combines multi-chunk arrays eagerly. "keep" always keeps chunked storage.

  • channel_policy – Behavior when dropping C from layout while multiple channels are selected. "error" raises (default). "first" keeps the first channel.
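
The ROI conventions above can be summarized in a short sketch. expand_roi3d and label_roi_nd are hypothetical helpers written for illustration, not ome_arrow functions. They only restate the documented tuple orderings and make no assumption about whether the max bounds of roi_nd are inclusive or exclusive.

```python
# Axis order of the min half and the max half of roi_nd, per roi_type.
ROI_ND_AXES = {
    "2d": ("y", "x"),
    "2d_timelapse": ("t", "y", "x"),
    "3d": ("z", "y", "x"),
    "4d": ("t", "z", "y", "x"),
}


def expand_roi3d(roi3d):
    """Split (x, y, z, w, h, d) into roi=(x, y, w, h) and z=slice(z, z + d)."""
    x, y, z, w, h, d = roi3d
    return (x, y, w, h), slice(z, z + d)


def label_roi_nd(roi_nd, roi_type):
    """Map each axis name to its (min, max) pair for the given roi_type."""
    axes = ROI_ND_AXES[roi_type]
    if len(roi_nd) != 2 * len(axes):
        raise ValueError(f"roi_type {roi_type!r} expects {2 * len(axes)} values")
    mins, maxs = roi_nd[: len(axes)], roi_nd[len(axes):]
    return {axis: (lo, hi) for axis, lo, hi in zip(axes, mins, maxs)}


roi, z_sel = expand_roi3d((10, 20, 2, 64, 32, 5))
bounds = label_roi_nd((0, 5, 10, 20, 4, 15, 110, 220), "4d")
```

Note the two orderings differ: roi and roi3d list x before y and use sizes, while roi_nd lists y before x and uses min/max bounds.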

property device: str

Return the storage device for the view (currently always "cpu").

property dtype: dtype

Return the tensor dtype.

iter_dlpack(*, batch_size: int | None = None, tile_size: tuple[int, int] | None = None, tiles: tuple[int, int] | None = None, shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') -> Iterator[Any]

Iterate over DLPack capsules in batches or tiles.

Parameters:
  • batch_size – Number of T indices per batch. Defaults to full range.

  • tile_size – Tile size (tile_h, tile_w) in pixels for spatial tiling.

  • tiles – Deprecated alias for tile_size.

  • shuffle – Whether to shuffle the iteration order.

  • seed – Seed for deterministic shuffling.

  • prefetch – Placeholder for future asynchronous prefetch support. Currently validated but does not change synchronous iteration.

  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize contiguous buffers if needed.

  • mode – Export mode. "arrow" returns 1D values buffers.

Yields:

DLPack object per batch or tile.

iter_tiles_3d(*, tile_size: tuple[int, int, int], shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'numpy') -> Iterator[Any]

Iterate over 3D tiles (z, y, x) as DLPack capsules.

Parameters:
  • tile_size – Tile size as (tile_z, tile_h, tile_w).

  • shuffle – Whether to shuffle the tile order.

  • seed – Seed for deterministic shuffling.

  • prefetch – Placeholder for future asynchronous prefetch support.

  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize contiguous buffers if needed.

  • mode – Export mode. Must be "numpy" for tiled 3D iteration.

Yields:

DLPack object per 3D tile.
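
To make the tile_size, shuffle, and seed parameters concrete, the sketch below enumerates tile origins over a (Z, Y, X) volume and optionally shuffles them reproducibly. tile_origins_3d is a hypothetical helper, and the traversal order and the use of random.Random are assumptions, not the library's documented behavior.

```python
import itertools
import random


def tile_origins_3d(shape_zyx, tile_size, *, shuffle=False, seed=None):
    """List (z, y, x) tile origins covering a (Z, Y, X) volume."""
    # One range of start offsets per axis, stepping by the tile extent.
    ranges = [range(0, extent, step) for extent, step in zip(shape_zyx, tile_size)]
    origins = list(itertools.product(*ranges))
    if shuffle:
        # A seeded, private generator keeps the order reproducible
        # without touching the global random state.
        random.Random(seed).shuffle(origins)
    return origins


ordered = tile_origins_3d((4, 4, 6), (2, 4, 3))
first = tile_origins_3d((4, 4, 6), (2, 4, 3), shuffle=True, seed=0)
second = tile_origins_3d((4, 4, 6), (2, 4, 3), shuffle=True, seed=0)
```

Passing the same seed twice yields the same tile order, which is the property a deterministic training or evaluation loop would rely on.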

property layout: str

Return the effective layout for this view.

property shape: tuple[int, ...]

Return the tensor shape for the current layout.

property strides: tuple[int, ...]

Return the tensor strides in bytes for the current layout.
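
For readers unfamiliar with byte strides, the following sketch computes them for a C-contiguous (row-major) buffer. c_contiguous_strides is a hypothetical helper. An actual view may report different strides, for example after a layout change or ROI crop, so this is only the contiguous baseline.

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides of a row-major array with the given shape and item size."""
    strides = []
    step = itemsize  # the last axis advances by one element
    for dim in reversed(shape):
        strides.append(step)
        step *= dim  # each earlier axis skips a full block of later axes
    return tuple(reversed(strides))


# A (T, Z, Y, X) = (2, 3, 4, 5) volume of 2-byte pixels (e.g. uint16).
strides = c_contiguous_strides((2, 3, 4, 5), 2)
```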

to_dlpack(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') -> Any

Export the view as a DLPack capsule.

Parameters:
  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize a contiguous buffer if needed.

  • mode – Export mode. "arrow" returns a capsule for the Arrow values buffer (1D). "numpy" materializes a tensor-shaped NumPy view. Zero-copy Arrow mode requires Arrow-backed inputs (typically Parquet/Vortex ingestion with canonical schema); StructScalar and dict inputs are normalized through Python objects.

Returns:

DLPack object compatible with torch/jax import utilities. The returned object is single-use per DLPack ownership semantics: after a consumer imports it, the capsule must not be reused.

Raises:
  • ValueError – If an unsupported device is requested.

  • RuntimeError – If required optional dependencies are missing.

to_jax(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') -> Any

Convert the view into a JAX array using DLPack.

Parameters:
  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize a contiguous buffer if needed.

  • mode – Export mode. "arrow" returns a 1D values buffer.

Returns:

Array backed by the DLPack capsule.

Return type:

jax.Array

to_numpy(*, contiguous: bool = False) -> ndarray

Materialize the view as a NumPy array.

Parameters:

contiguous – When True, return a contiguous array copy.

Returns:

Array in the requested layout.

Return type:

np.ndarray

to_torch(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') -> Any

Convert the view into a torch.Tensor using DLPack.

Parameters:
  • device – Target device ("cpu" or "cuda").

  • contiguous – When True, materialize a contiguous buffer if needed.

  • mode – Export mode. "arrow" returns a 1D values buffer.

Returns:

Tensor backed by the DLPack capsule.

Return type:

torch.Tensor

with_layout(layout: str) -> TensorView

Return a new TensorView with a layout override.

Parameters:

layout – Desired layout string using TZCYX letters where T=time, Z=depth, C=channel, Y=row axis, X=column axis. TZCHW aliases are also accepted for compatibility.

Returns:

New view with the requested layout.

Return type:

TensorView
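
The layout string convention can be sketched as a small normalization step: map the H/W aliases onto Y/X, validate the letters, and derive each requested axis's position in the canonical TZCYX order. normalize_layout and axis_order are hypothetical helpers, and the validation rules shown (uppercase only, no repeated axes) are assumptions rather than documented behavior.

```python
CANONICAL = "TZCYX"
ALIASES = {"H": "Y", "W": "X"}  # TZCHW spelling accepted for compatibility


def normalize_layout(layout: str) -> str:
    """Rewrite H/W aliases to Y/X and reject unknown or repeated axes."""
    letters = "".join(ALIASES.get(ch, ch) for ch in layout)
    unknown = set(letters) - set(CANONICAL)
    if unknown:
        raise ValueError(f"unknown axis letters: {sorted(unknown)}")
    if len(set(letters)) != len(letters):
        raise ValueError(f"repeated axis in layout {layout!r}")
    return letters


def axis_order(layout: str) -> tuple[int, ...]:
    """Position of each requested axis within the canonical TZCYX order."""
    return tuple(CANONICAL.index(ch) for ch in normalize_layout(layout))


chw = normalize_layout("CHW")
chw_order = axis_order("CHW")
channels_last = axis_order("ZYXC")
```

A layout such as "ZYXC" keeps four of the five canonical axes and moves channels last. How dropped axes like T are handled depends on the current selection and, for C, on channel_policy.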