Python API#
ome_arrow#
Init file for ome_arrow package.
ome_arrow.core#
Core of the ome_arrow package, providing its primary classes.
- class ome_arrow.core.OMEArrow(data: str | dict | pa.StructScalar | 'np.ndarray', tcz: Tuple[int, int, int] = (0, 0, 0), *, dim_order: str | None = None, column_name: str = 'ome_arrow', row_index: int = 0, image_type: str | None = None, lazy: bool = False)[source]#
Bases: object
Small convenience toolkit for working with OME-Arrow data.
If input is a TIFF path, this loads it via tiff_to_ome_arrow. If input is a dict, it will be converted using to_struct_scalar. If input is already a pa.StructScalar, it is used as-is.
In Jupyter, evaluating the instance will render the first plane using matplotlib (via _repr_html_). Call view_matplotlib() to select a specific (z, t, c) plane.
- Parameters:
data – TIFF path, nested dict, pa.StructScalar, or NumPy array.
tcz – Default (t, c, z) indices used for view helpers.
- collect() OMEArrow[source]#
Materialize deferred source data and return self.
- Returns:
The same instance after materialization.
- Return type:
OMEArrow
- property data: StructScalar#
Return the materialized OME-Arrow StructScalar.
- Returns:
Materialized OME-Arrow record.
- Return type:
pa.StructScalar
- Raises:
RuntimeError – If the record could not be initialized.
- export(how: str = 'numpy', dtype: np.dtype = np.uint16, strict: bool = True, clamp: bool = False, *, out: str | None = None, dim_order: str = 'TCZYX', compression: str | None = 'zlib', compression_level: int = 6, tile: tuple[int, int] | None = None, chunks: tuple[int, int, int, int, int] | None = None, zarr_compressor: str | None = 'zstd', zarr_level: int = 7, use_channel_colors: bool = False, parquet_column_name: str = 'ome_arrow', parquet_compression: str | None = 'zstd', parquet_metadata: dict[str, str] | None = None, vortex_column_name: str = 'ome_arrow', vortex_metadata: dict[str, str] | None = None) np.array | dict | pa.StructScalar | str[source]#
Export the OME-Arrow content in a chosen representation.
- Parameters:
how – Output representation:
"numpy" → TCZYX np.ndarray;
"dict" → plain Python dict;
"scalar" → pa.StructScalar (as-is);
"ome-tiff" → write OME-TIFF via BioIO;
"ome-zarr" → write OME-Zarr (OME-NGFF) via BioIO;
"parquet" → write a single-row Parquet file with one struct column;
"vortex" → write a single-row Vortex file with one struct column.
dtype – Target dtype for “numpy”/writers (default: np.uint16).
strict – For “numpy”: raise if a plane has wrong pixel length.
clamp – For “numpy”/writers: clamp values into dtype range before cast.
Keyword-only (writer specific):
out – Output path (required for ‘ome-tiff’, ‘ome-zarr’, and ‘parquet’).
dim_order – Axes string for BioIO writers; default “TCZYX”.
compression / compression_level / tile – OME-TIFF options (passed through to tifffile via BioIO).
chunks / zarr_compressor / zarr_level – OME-Zarr options (chunk shape, compressor hint, level). If chunks is None, a TCZYX default is chosen (1, 1, <=4, <=512, <=512).
use_channel_colors – Try to embed per-channel display colors when safe; otherwise omitted.
parquet_* – Options for Parquet export (column name, compression, file metadata).
vortex_* – Options for Vortex export (column name, file metadata).
- Returns:
"numpy": np.ndarray (T, C, Z, Y, X)
"dict": dict
"scalar": pa.StructScalar
"ome-tiff": output path (str)
"ome-zarr": output path (str)
"parquet": output path (str)
"vortex": output path (str)
- Return type:
Any
- Raises:
ValueError – Unknown 'how' or missing required parameters.
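The clamp flag above saturates out-of-range values into the target dtype's range before casting, rather than letting them wrap around. A minimal sketch of that behavior in plain NumPy (the helper name `clamp_cast` is hypothetical, not part of the package API):

```python
import numpy as np

def clamp_cast(arr: np.ndarray, dtype=np.uint16) -> np.ndarray:
    """Clamp values into the target dtype's range before casting."""
    info = np.iinfo(dtype)
    return np.clip(arr, info.min, info.max).astype(dtype)

# Out-of-range values saturate instead of wrapping around.
plane = np.array([-5, 1000, 70000])
print(clamp_cast(plane).tolist())  # → [0, 1000, 65535]
```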
- info() Dict[str, Any][source]#
Describe the OME-Arrow data structure.
- Returns:
dict with keys:
shape: (T, C, Z, Y, X)
type: classification string
summary: human-readable text
- Return type:
Dict[str, Any]
- property is_lazy: bool#
Return whether this instance still has deferred work.
- classmethod scan(data: str, *, tcz: Tuple[int, int, int] = (0, 0, 0), column_name: str = 'ome_arrow', row_index: int = 0, image_type: str | None = None) OMEArrow[source]#
Create a lazily-loaded OMEArrow, similar to Polars scan semantics.
- Parameters:
data – Input source path/URL.
tcz – Default (t, c, z) indices used for view helpers.
column_name – OME-Arrow column name for tabular sources.
row_index – Row index for tabular sources.
image_type – Optional image type override.
- Returns:
Lazily planned OMEArrow instance.
- Return type:
OMEArrow
- slice(x_min: int, x_max: int, y_min: int, y_max: int, t_indices: Iterable[int] | None = None, c_indices: Iterable[int] | None = None, z_indices: Iterable[int] | None = None, fill_missing: bool = True) OMEArrow[source]#
Create a cropped copy of an OME-Arrow record.
Crops spatially to [y_min:y_max, x_min:x_max] (half-open) and, if provided, filters/reindexes T/C/Z to the given index sets.
- Parameters:
x_min (int) – Inclusive minimum X index for the crop (0-based).
x_max (int) – Exclusive maximum X index for the crop.
y_min (int) – Inclusive minimum Y index for the crop (0-based).
y_max (int) – Exclusive maximum Y index for the crop.
t_indices (Iterable[int] | None) – Optional explicit T indices to keep. If None, keep all. Selected indices are reindexed to 0..len-1 in the output.
c_indices (Iterable[int] | None) – Optional explicit C indices to keep. If None, keep all. Selected indices are reindexed to 0..len-1 in the output.
z_indices (Iterable[int] | None) – Optional explicit Z indices to keep. If None, keep all. Selected indices are reindexed to 0..len-1 in the output.
fill_missing (bool) – If True, any missing (t,c,z) planes in the selection are zero-filled.
- Returns:
New OME-Arrow record with updated sizes and planes.
- Return type:
OMEArrow object
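The half-open spatial bounds and T/C/Z reindexing described above behave like plain NumPy fancy indexing. A minimal sketch on a toy TCZYX array (not using the package itself):

```python
import numpy as np

# Toy TCZYX volume: 2 timepoints, 3 channels, 4 z-planes, 8x8 pixels.
vol = np.arange(2 * 3 * 4 * 8 * 8).reshape(2, 3, 4, 8, 8)

# Keep t=[0], c=[0, 2], all z; crop y to [2, 6) and x to [1, 5).
# Selected T/C indices are reindexed to 0..len-1 in the result.
sub = vol[np.ix_([0], [0, 2], range(4))][..., 2:6, 1:5]
print(sub.shape)  # → (1, 2, 4, 4, 4)
```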
- slice_lazy(x_min: int, x_max: int, y_min: int, y_max: int, t_indices: Iterable[int] | None = None, c_indices: Iterable[int] | None = None, z_indices: Iterable[int] | None = None, fill_missing: bool = True) OMEArrow[source]#
Return a lazily planned slice, collected on first execution.
For lazy sources created with OMEArrow.scan(...), this queues a deferred slice operation and returns a new lazy OMEArrow plan produced from OMEArrow.scan(...). For already materialized sources, this falls back to eager slice(). This method does not mutate self.
Notes
slice_lazy always returns a new plan object. Internally, the returned plan gets a fresh _lazy_slices list ([*self._lazy_slices, new_slice]), so chained plans do not share mutable slice state with the original OMEArrow. A common footgun is: oa.slice_lazy(...).collect() followed by oa.tensor_view(...). Those calls can load/materialize the same source twice because oa remains the original plan. For a single-load workflow, keep working from the value returned by slice_lazy / collect.
- Parameters:
x_min – Inclusive minimum X index for the crop.
x_max – Exclusive maximum X index for the crop.
y_min – Inclusive minimum Y index for the crop.
y_max – Exclusive maximum Y index for the crop.
t_indices – Optional time indices to retain.
c_indices – Optional channel indices to retain.
z_indices – Optional depth indices to retain.
fill_missing – Whether to zero-fill missing (t, c, z) planes.
- Returns:
Lazy plan when the source is lazy; eager slice result otherwise.
- Return type:
OMEArrow
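The "new plan with a fresh slice list" semantics can be sketched with a minimal stand-in class (illustrative only; `Plan` is not the real implementation):

```python
class Plan:
    """Minimal stand-in for a lazy OMEArrow plan (illustrative only)."""

    def __init__(self, lazy_slices=()):
        self._lazy_slices = list(lazy_slices)

    def slice_lazy(self, op):
        # Returns a NEW plan with a fresh slice list; self is not mutated.
        return Plan([*self._lazy_slices, op])

oa = Plan()
plan = oa.slice_lazy("crop")
print(oa._lazy_slices, plan._lazy_slices)  # → [] ['crop']
```

This is why continuing to call methods on `oa` after `plan.collect()` can load the source twice: `oa` never learned about the queued slice.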
- tensor_view(*, scene: int | None = None, t: int | slice | Sequence[int] | None = None, z: int | slice | Sequence[int] | None = None, c: int | slice | Sequence[int] | None = None, roi: tuple[int, int, int, int] | None = None, roi3d: tuple[int, int, int, int, int, int] | None = None, roi_nd: tuple[int, ...] | None = None, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None = None, tile: tuple[int, int] | None = None, layout: str | None = None, dtype: dtype | None = None, chunk_policy: Literal['auto', 'combine', 'keep'] = 'auto', channel_policy: Literal['error', 'first'] = 'error') TensorView | LazyTensorView[source]#
Create a TensorView of the pixel data.
- Parameters:
scene – Scene index (only 0 is supported for single-image records).
t – Time index selection (int, slice, or sequence). Default: all.
z – Z index selection (int, slice, or sequence). Default: all.
c – Channel index selection (int, slice, or sequence). Default: all.
roi – Spatial crop (x, y, w, h) in pixels.
roi3d – Spatial + depth crop (x, y, z, w, h, d) in pixels/planes. This is a convenience alias for roi=(x, y, w, h) and z=slice(z, z + d).
roi_nd – General ROI tuple with min/max bounds.
roi_type – ROI interpretation mode for roi_nd. Supported values: "2d", "2d_timelapse", "3d", and "4d".
tile – Tile index (tile_y, tile_x) based on the chunk grid.
layout – Desired layout string using TZCYX letters where T=time, Z=depth, C=channel, Y=row axis, X=column axis. TZCHW aliases are also accepted for compatibility.
dtype – Output dtype override.
chunk_policy – Handling for pyarrow.ChunkedArray inputs.
channel_policy – Behavior when dropping C from the layout while multiple channels are selected. "error" raises (default); "first" keeps the first channel.
- Returns:
Tensor view over the selected pixels. In lazy mode, this returns a deferred LazyTensorView that resolves on its first execution call (for example to_numpy()) without forcing self to materialize unless deferred slice_lazy operations are queued.
- Return type:
TensorView | LazyTensorView
- Raises:
ValueError – If an unsupported scene is requested.
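The roi3d alias documented above (roi3d=(x, y, z, w, h, d) being equivalent to roi=(x, y, w, h) plus z=slice(z, z + d)) can be checked with ordinary NumPy slicing on a toy TCZYX array:

```python
import numpy as np

vol = np.random.default_rng(0).integers(0, 100, size=(1, 1, 6, 32, 32))  # TCZYX

x, y, z, w, h, d = 4, 8, 1, 10, 12, 3
# roi3d=(x, y, z, w, h, d) …
via_roi3d = vol[:, :, z:z + d, y:y + h, x:x + w]
# … equals roi=(x, y, w, h) combined with z=slice(z, z + d).
via_roi_and_z = vol[:, :, slice(z, z + d)][:, :, :, y:y + h, x:x + w]
print(np.array_equal(via_roi3d, via_roi_and_z))  # → True
```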
- view(how: str = 'matplotlib', tcz: tuple[int, int, int] = (0, 0, 0), autoscale: bool = True, vmin: int | None = None, vmax: int | None = None, cmap: str = 'gray', show: bool = True, c: int | None = None, downsample: int = 1, opacity: str | float = 'sigmoid', clim: tuple[float, float] | None = None, show_axes: bool = True, scaling_values: tuple[float, float, float] | None = None) tuple[matplotlib.figure.Figure, Any, Any] | 'pyvista.Plotter'[source]#
Render an OME-Arrow record using Matplotlib or PyVista.
This convenience method supports two rendering backends:
how="matplotlib" renders a single (t, c, z) plane as a 2D image.
how="pyvista" creates an interactive 3D PyVista visualization.
- Parameters:
how – Rendering backend. One of "matplotlib" or "pyvista".
tcz – (t, c, z) indices used for plane display.
autoscale – Infer Matplotlib display limits from the image range when vmin/vmax are not provided.
vmin – Lower display limit for Matplotlib intensity scaling.
vmax – Upper display limit for Matplotlib intensity scaling.
cmap – Matplotlib colormap name for single-channel display.
show – Whether to display the plot immediately.
c – Channel index override for PyVista. If None, uses tcz[1].
downsample – Integer downsampling factor for PyVista views. Higher values render faster for large volumes but reduce spatial resolution.
opacity – Opacity for PyVista. Either a float in [0, 1] or "sigmoid".
clim – Contrast limits (low, high) for PyVista rendering.
show_axes – Whether to display axes in the PyVista scene.
scaling_values – Physical scale multipliers (x, y, z) used by PyVista. If None, uses OME metadata-derived scaling.
- Returns:
tuple[matplotlib.figure.Figure, matplotlib.axes.Axes, matplotlib.image.AxesImage] | pyvista.Plotter – For how="matplotlib", returns the tuple emitted by ome_arrow.view.view_matplotlib() as (figure, axes, image). For how="pyvista", returns a pyvista.Plotter.
- Raises:
ValueError – If a requested plane is not found or the render mode is unsupported.
TypeError – If parameter types are invalid.
Notes
The how="pyvista" mode normally outputs an interactive visualization, but attempts to embed a static PNG snapshot for non-interactive renderers (for example, static docs builds, nbconvert HTML/PDF exports, rendered/read-only notebook views such as GitHub notebook previews, and CI log viewers).
When show=False and how="pyvista", the returned pyvista.Plotter can be shown later.
ome_arrow.ingest#
Converting to and from OME-Arrow formats.
- ome_arrow.ingest.from_jax_array(arr: Any, *, dim_order: str | None = None, image_id: str | None = None, name: str | None = None, image_type: str | None = None, channel_names: Sequence[str] | None = None, acquisition_datetime: datetime | None = None, clamp_to_uint16: bool = True, chunk_shape: Tuple[int, int, int] | None=(1, 512, 512), chunk_order: str = "ZYX", build_chunks: bool = True, physical_size_x: float = 1.0, physical_size_y: float = 1.0, physical_size_z: float = 1.0, physical_size_unit: str = "µm", dtype_meta: str | None = None) StructScalar[source]#
Build an OME-Arrow StructScalar from a JAX array.
This is useful when your pipeline already works with jax.Array objects and you want a direct path into the canonical OME-Arrow struct without manual conversion boilerplate in user code.
- Parameters:
arr – jax.Array image data.
dim_order – Axis labels for arr. If None, infer from rank: 2D->"YX", 3D->"ZYX", 4D->"TCYX", 5D->"TCZYX".
image_id – Optional stable image identifier.
name – Optional human label.
image_type – Open-ended image kind (e.g., “image”, “label”).
channel_names – Optional channel names. Defaults to None. When None (or the length does not match the channel count), names are auto-generated as C0..C{n-1} (for example, 3 channels become C0, C1, C2).
acquisition_datetime – Defaults to now (UTC) if None.
clamp_to_uint16 – If True, clamp/cast planes to uint16 before serialization.
chunk_shape – Chunk shape as (Z, Y, X). Defaults to (1, 512, 512).
chunk_order – Flattening order for chunk pixels (default “ZYX”).
build_chunks – If True, build chunked pixels from planes.
physical_size_x – Spatial pixel size (µm) for X.
physical_size_y – Spatial pixel size (µm) for Y.
physical_size_z – Spatial pixel size (µm) for Z when present.
physical_size_unit – Unit string for spatial axes (default “µm”).
dtype_meta – Pixel dtype string to place in metadata.
- Returns:
Typed OME-Arrow record.
- Return type:
pa.StructScalar
- ome_arrow.ingest.from_numpy(arr: ndarray, *, dim_order: str = "TCZYX", image_id: str | None = None, name: str | None = None, image_type: str | None = None, channel_names: Sequence[str] | None = None, acquisition_datetime: datetime | None = None, clamp_to_uint16: bool = True, chunk_shape: Tuple[int, int, int] | None=(1, 512, 512), chunk_order: str = "ZYX", build_chunks: bool = True, physical_size_x: float = 1.0, physical_size_y: float = 1.0, physical_size_z: float = 1.0, physical_size_unit: str = "µm", dtype_meta: str | None = None) StructScalar[source]#
Build an OME-Arrow StructScalar from a NumPy array.
- Parameters:
arr – Image data with axes described by dim_order.
dim_order – Axis labels for arr. Must include “Y” and “X”. Supported examples: “YX”, “ZYX”, “CYX”, “CZYX”, “TYX”, “TCYX”, “TCZYX”.
image_id – Optional stable image identifier.
name – Optional human label.
image_type – Open-ended image kind (e.g., “image”, “label”).
channel_names – Optional channel names. Defaults to None. When None (or the length does not match the channel count), names are auto-generated as C0..C{n-1} (for example, 3 channels become C0, C1, C2).
acquisition_datetime – Defaults to now (UTC) if None.
clamp_to_uint16 – If True, clamp/cast planes to uint16 before serialization.
chunk_shape – Chunk shape as (Z, Y, X). Defaults to (1, 512, 512).
chunk_order – Flattening order for chunk pixels (default “ZYX”).
build_chunks – If True, build chunked pixels from planes.
physical_size_x – Spatial pixel size (µm) for X.
physical_size_y – Spatial pixel size (µm) for Y.
physical_size_z – Spatial pixel size (µm) for Z when present.
physical_size_unit – Unit string for spatial axes (default “µm”).
dtype_meta – Pixel dtype string to place in metadata; if None, inferred from the (possibly cast) array’s dtype.
- Returns:
Typed OME-Arrow record (schema = OME_ARROW_STRUCT).
- Return type:
pa.StructScalar
- Raises:
TypeError – If arr is not a NumPy ndarray.
ValueError – If dim_order is invalid or dimensions are non-positive.
Notes
If Z is not in dim_order, size_z will be 1 and the meta dimension_order becomes “XYCT”; otherwise “XYZCT”.
If T/C are absent in dim_order, they default to size 1.
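The defaulting rules in the notes above (missing T/C/Z axes become size 1) amount to mapping dim_order labels onto the array shape and padding the rest. A minimal sketch of that mapping (the helper `expand_to_tczyx` is hypothetical, not the package's internal function):

```python
def expand_to_tczyx(shape, dim_order):
    """Map dim_order labels to sizes; missing T/C/Z default to 1."""
    if "Y" not in dim_order or "X" not in dim_order:
        raise ValueError("dim_order must include Y and X")
    sizes = dict(zip(dim_order, shape))
    return tuple(sizes.get(ax, 1) for ax in "TCZYX")

print(expand_to_tczyx((5, 64, 64), "ZYX"))      # → (1, 1, 5, 64, 64)
print(expand_to_tczyx((2, 3, 64, 64), "TCYX"))  # → (2, 3, 1, 64, 64)
```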
- ome_arrow.ingest.from_ome_parquet(parquet_path: str | Path, *, column_name: str | None = 'ome_arrow', row_index: int = 0, strict_schema: bool = False, return_array: bool = False) StructScalar | tuple[StructScalar, StructArray][source]#
Read an OME-Arrow record from a Parquet file.
- Parameters:
parquet_path – Path to the Parquet file.
column_name – Column to read; auto-detected when None or invalid.
row_index – Row index to extract.
strict_schema – Require the exact OME-Arrow schema if True.
return_array – When True, also return a 1-row StructArray.
- Returns:
A typed OME-Arrow StructScalar, or (StructScalar, StructArray) when return_array=True.
- Raises:
FileNotFoundError – If the Parquet path does not exist.
ValueError – If the row index is out of range or no suitable column exists.
Notes
This reader targets the row group containing row_index and requests only column_name when provided, avoiding eager full-table reads.
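Targeting the row group that contains a given row is simple offset arithmetic over the per-group row counts. A minimal sketch of that lookup, independent of any Parquet library (the helper name `locate_row_group` is hypothetical):

```python
def locate_row_group(row_group_sizes, row_index):
    """Return (group_index, local_row) for the group containing row_index."""
    offset = 0
    for i, n in enumerate(row_group_sizes):
        if row_index < offset + n:
            return i, row_index - offset
        offset += n
    raise ValueError("row index out of range")

print(locate_row_group([100, 100, 50], 150))  # → (1, 50)
```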
- ome_arrow.ingest.from_ome_vortex(vortex_path: str | Path, *, column_name: str | None = 'ome_arrow', row_index: int = 0, strict_schema: bool = False, return_array: bool = False) StructScalar | tuple[StructScalar, StructArray][source]#
Read an OME-Arrow record from a Vortex file.
- Parameters:
vortex_path – Path to the Vortex file.
column_name – Column to read; auto-detected when None or invalid.
row_index – Row index to extract.
strict_schema – Require the exact OME-Arrow schema if True.
return_array – When True, also return a 1-row StructArray.
- Returns:
A typed OME-Arrow StructScalar, or (StructScalar, StructArray) when return_array=True.
- Raises:
FileNotFoundError – If the Vortex path does not exist.
ImportError – If the optional vortex-data dependency is missing.
ValueError – If the row index is out of range or no suitable column exists.
- ome_arrow.ingest.from_ome_zarr(zarr_path: str | Path, image_id: str | None = None, name: str | None = None, image_type: str | None = None, channel_names: Sequence[str] | None = None, acquisition_datetime: datetime | None = None, clamp_to_uint16: bool = True) StructScalar[source]#
Read an OME-Zarr directory and return a typed OME-Arrow StructScalar.
Uses BioIO with the OMEZarrReader backend to read TCZYX (or XY) data, flattens each YX plane into OME-Arrow planes, and builds a validated StructScalar via to_ome_arrow.
- Parameters:
zarr_path – Path to the OME-Zarr directory (e.g., “image.ome.zarr”).
image_id – Optional stable image identifier (defaults to directory stem).
name – Optional display name (defaults to directory name).
image_type – Optional image kind (e.g., “image”, “label”).
channel_names – Optional list of channel names. Defaults to C0, C1, …
acquisition_datetime – Optional datetime (defaults to UTC now).
clamp_to_uint16 – If True, cast pixels to uint16.
- Returns:
Validated OME-Arrow struct for this image.
- Return type:
pa.StructScalar
- ome_arrow.ingest.from_stack_pattern_path(pattern_path: str | Path, default_dim_for_unspecified: str = 'C', map_series_to: str | None = 'T', clamp_to_uint16: bool = True, channel_names: List[str] | None = None, image_id: str | None = None, name: str | None = None, image_type: str | None = None) StructScalar[source]#
Build an OME-Arrow record from a filename pattern describing a stack.
- Parameters:
pattern_path – Path or pattern string describing the stack layout.
default_dim_for_unspecified – Dimension to use when tokens lack a dim.
map_series_to – Dimension to map series tokens to (e.g., “T”), or None.
clamp_to_uint16 – Whether to clamp pixel values to uint16.
channel_names – Optional list of channel names to apply.
image_id – Optional image identifier override.
name – Optional display name override.
image_type – Optional image kind (e.g., “image”, “label”).
- Returns:
A validated OME-Arrow StructScalar describing the stack.
- ome_arrow.ingest.from_tiff(tiff_path: str | Path, image_id: str | None = None, name: str | None = None, image_type: str | None = None, channel_names: Sequence[str] | None = None, acquisition_datetime: datetime | None = None, clamp_to_uint16: bool = True) StructScalar[source]#
Read a TIFF and return a typed OME-Arrow StructScalar.
Uses bioio to read TCZYX (or XY) data, flattens each YX plane, and delegates struct creation to to_struct_scalar.
- Parameters:
tiff_path – Path to a TIFF readable by bioio.
image_id – Optional stable image identifier (defaults to stem).
name – Optional human label (defaults to file name).
image_type – Optional image kind (e.g., “image”, “label”).
channel_names – Optional channel names; defaults to C0..C{n-1}.
acquisition_datetime – Optional acquisition time (UTC now if None).
clamp_to_uint16 – If True, clamp/cast planes to uint16.
- Returns:
pa.StructScalar validated against struct.
- ome_arrow.ingest.from_torch_array(arr: Any, *, dim_order: str | None = None, image_id: str | None = None, name: str | None = None, image_type: str | None = None, channel_names: Sequence[str] | None = None, acquisition_datetime: datetime | None = None, clamp_to_uint16: bool = True, chunk_shape: Tuple[int, int, int] | None=(1, 512, 512), chunk_order: str = "ZYX", build_chunks: bool = True, physical_size_x: float = 1.0, physical_size_y: float = 1.0, physical_size_z: float = 1.0, physical_size_unit: str = "µm", dtype_meta: str | None = None) StructScalar[source]#
Build an OME-Arrow StructScalar from a torch tensor.
This is useful when your pipeline already works with torch.Tensor objects (for example model inputs/outputs) and you want a direct path into the canonical OME-Arrow struct without manually converting and reshaping in user code.
- Parameters:
arr – torch.Tensor image data.
dim_order – Axis labels for arr. If None, infer from rank: 2D->"YX", 3D->"ZYX", 4D->"TCYX", 5D->"TCZYX".
image_id – Optional stable image identifier.
name – Optional human label.
image_type – Open-ended image kind (e.g., “image”, “label”).
channel_names – Optional channel names. Defaults to None. When None (or the length does not match the channel count), names are auto-generated as C0..C{n-1} (for example, 3 channels become C0, C1, C2).
acquisition_datetime – Defaults to now (UTC) if None.
clamp_to_uint16 – If True, clamp/cast planes to uint16 before serialization.
chunk_shape – Chunk shape as (Z, Y, X). Defaults to (1, 512, 512).
chunk_order – Flattening order for chunk pixels (default “ZYX”).
build_chunks – If True, build chunked pixels from planes.
physical_size_x – Spatial pixel size (µm) for X.
physical_size_y – Spatial pixel size (µm) for Y.
physical_size_z – Spatial pixel size (µm) for Z when present.
physical_size_unit – Unit string for spatial axes (default “µm”).
dtype_meta – Pixel dtype string to place in metadata.
- Returns:
Typed OME-Arrow record.
- Return type:
pa.StructScalar
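The channel-name fallback documented for the from_* helpers (auto-generating C0..C{n-1} when names are absent or mismatched) can be sketched in a few lines (the helper `resolve_channel_names` is hypothetical, not the package's internal function):

```python
from typing import Optional, Sequence

def resolve_channel_names(names: Optional[Sequence[str]], n_channels: int):
    """Fall back to C0..C{n-1} when names are absent or mismatched."""
    if names is None or len(names) != n_channels:
        return [f"C{i}" for i in range(n_channels)]
    return list(names)

print(resolve_channel_names(None, 3))             # → ['C0', 'C1', 'C2']
print(resolve_channel_names(["DAPI", "GFP"], 2))  # → ['DAPI', 'GFP']
```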
- ome_arrow.ingest.open_lazy_plane_source(source: str) tuple[dict[str, Any], Callable[[int, int, int], ndarray]] | None[source]#
Open a source-backed per-plane loader for lazy tensor execution.
- Parameters:
source – Input path/URL string for TIFF or OME-Zarr sources.
- Returns:
A tuple of
(pixels_meta, plane_loader)when source-backed lazy plane loading is supported forsource; otherwiseNone.
- ome_arrow.ingest.to_ome_arrow(type_: str = OME_ARROW_TAG_TYPE, version: str = OME_ARROW_TAG_VERSION, image_id: str = "unnamed", name: str = "unknown", image_type: str | None = "image", acquisition_datetime: datetime | None = None, dimension_order: str = "XYZCT", dtype: str = "uint16", size_x: int = 1, size_y: int = 1, size_z: int = 1, size_c: int = 1, size_t: int = 1, physical_size_x: float = 1.0, physical_size_y: float = 1.0, physical_size_z: float = 1.0, physical_size_unit: str = "µm", channels: List[Dict[str, Any]] | None = None, planes: List[Dict[str, Any]] | None = None, chunks: List[Dict[str, Any]] | None = None, chunk_shape: Tuple[int, int, int] | None = (1, 512, 512), chunk_order: str = "ZYX", build_chunks: bool = True, masks: Any = None) StructScalar[source]#
Create a typed OME-Arrow StructScalar with sensible defaults.
This builds and validates a nested dict that conforms to the given StructType (e.g., OME_ARROW_STRUCT). You can override any field explicitly; others use safe defaults.
- Parameters:
type_ – Top-level type string ("ome.arrow" by default).
version – Specification version string.
image_id – Unique image identifier.
name – Human-friendly name.
image_type – Open-ended image kind (e.g., “image”, “label”). Note that from_* helpers pass image_type=None by default to preserve “unspecified” vs explicitly set (“image”).
acquisition_datetime – Datetime of acquisition (defaults to now).
dimension_order – Dimension order (“XYZCT” or “XYCT”).
dtype – Pixel data type string (e.g., “uint16”).
size_x – Size of the X axis.
size_y – Size of the Y axis.
size_z – Size of the Z axis.
size_c – Size of the C (channel) axis.
size_t – Size of the T (time) axis.
physical_size_x/y/z – Physical scaling in µm.
physical_size_unit – Unit string, default “µm”.
channels – List of channel dicts. Autogenerates one if None.
planes – List of plane dicts. Empty if None.
chunks – Optional list of chunk dicts. If None and build_chunks is True, chunks are derived from planes using chunk_shape.
chunk_shape – Chunk shape as (Z, Y, X). Defaults to (1, 512, 512).
chunk_order – Flattening order for chunk pixels (default “ZYX”).
build_chunks – If True, build chunked pixels from planes when chunks is None.
masks – Optional placeholder for future annotations.
- Returns:
A validated StructScalar for the schema.
- Return type:
pa.StructScalar
Example
>>> s = to_struct_scalar(OME_ARROW_STRUCT, image_id="img001")
>>> s.type == OME_ARROW_STRUCT
True
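When chunks are derived from planes using chunk_shape (Z, Y, X), the number of chunks per axis follows from a ceiling division. A minimal sketch of that grid computation, assuming the default (1, 512, 512) chunk shape (the helper `chunk_grid` is hypothetical):

```python
import math

def chunk_grid(size_z, size_y, size_x, chunk_shape=(1, 512, 512)):
    """Number of chunks per axis for a (Z, Y, X) chunk shape."""
    cz, cy, cx = chunk_shape
    return (
        math.ceil(size_z / cz),
        math.ceil(size_y / cy),
        math.ceil(size_x / cx),
    )

# A 4-plane 1024x600 image yields 4 * 2 * 2 chunks at the default shape.
print(chunk_grid(4, 1024, 600))  # → (4, 2, 2)
```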
ome_arrow.export#
Module for exporting OME-Arrow data to other formats.
- ome_arrow.export.plane_from_chunks(data: Dict[str, Any] | StructScalar, *, t: int, c: int, z: int, dtype: dtype = np.uint16, strict: bool = True, clamp: bool = False) ndarray[source]#
Extract a single (t, c, z) plane using chunked pixels when available.
- Parameters:
data – OME-Arrow data as a Python dict or a pa.StructScalar.
t – Time index for the plane.
c – Channel index for the plane.
z – Z index for the plane.
dtype – Output dtype (default: np.uint16).
strict – When True, raise if chunk pixels are malformed.
clamp – If True, clamp values to the valid range of the target dtype.
- Returns:
2D array with shape (Y, X).
- Return type:
np.ndarray
- Raises:
KeyError – If required OME-Arrow fields are missing.
ValueError – If indices are out of range or pixels are malformed.
- ome_arrow.export.to_numpy(data: Dict[str, Any] | StructScalar, dtype: dtype = np.uint16, strict: bool = True, clamp: bool = False) ndarray[source]#
Convert an OME-Arrow record into a NumPy array shaped (T,C,Z,Y,X).
The OME-Arrow “planes” are flattened YX slices indexed by (z, t, c). When chunks are present, this function reconstitutes the dense TCZYX array from chunked pixels instead of planes.
- Parameters:
data – OME-Arrow data as a Python dict or a pa.StructScalar.
dtype – Output dtype (default: np.uint16). If different from plane values, a cast (and optional clamp) is applied.
strict – When True, raise if a plane has wrong pixel length. When False, truncate/pad that plane to the expected length.
clamp – If True, clamp values to the valid range of the target dtype before casting.
- Returns:
Dense array with shape (T, C, Z, Y, X).
- Return type:
np.ndarray
- Raises:
KeyError – If required OME-Arrow fields are missing.
ValueError – If dimensions are invalid or planes are malformed.
Examples
>>> arr = ome_arrow_to_tczyx(my_row)  # (T, C, Z, Y, X)
>>> arr.shape
(1, 2, 1, 512, 512)
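Reconstituting the dense array from flattened YX planes amounts to reshaping each plane's pixel list into (Y, X) and placing it at its (t, c, z) slot. A minimal sketch of that reassembly (the helper `planes_to_tczyx` and its dict-of-planes input are illustrative, not the package's internal representation):

```python
import numpy as np

def planes_to_tczyx(planes, shape, dtype=np.uint16):
    """Rebuild a dense (T, C, Z, Y, X) array from flattened YX planes."""
    T, C, Z, Y, X = shape
    out = np.zeros(shape, dtype=dtype)
    for (t, c, z), pixels in planes.items():
        out[t, c, z] = np.asarray(pixels, dtype=dtype).reshape(Y, X)
    return out

planes = {(0, 0, 0): [1, 2, 3, 4]}  # one flat 2x2 plane
arr = planes_to_tczyx(planes, (1, 1, 1, 2, 2))
print(arr.shape, arr[0, 0, 0].tolist())  # → (1, 1, 1, 2, 2) [[1, 2], [3, 4]]
```

Missing (t, c, z) slots simply stay zero-filled, matching the fill_missing behavior described for slicing.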
- ome_arrow.export.to_ome_parquet(data: Dict[str, Any] | StructScalar, out_path: str, column_name: str = 'image', file_metadata: Dict[str, str] | None = None, compression: str | None = 'zstd', row_group_size: int | None = None) None[source]#
Export an OME-Arrow record to a Parquet file as a single-row, single-column table. The single column holds a struct with the OME-Arrow schema.
- ome_arrow.export.to_ome_tiff(data: Dict[str, Any] | StructScalar, out_path: str, *, dtype: dtype = np.uint16, clamp: bool = False, dim_order: str = 'TCZYX', compression: str | None = 'zlib', compression_level: int = 6, tile: Tuple[int, int] | None = None, use_channel_colors: bool = False) None[source]#
Export an OME-Arrow record to OME-TIFF using BioIO’s OmeTiffWriter.
Notes
No ‘bigtiff’ kwarg is passed (invalid for tifffile.TiffWriter.write()). BigTIFF selection is automatic based on file size.
- ome_arrow.export.to_ome_vortex(data: Dict[str, Any] | StructScalar, out_path: str, column_name: str = 'image', file_metadata: Dict[str, str] | None = None) None[source]#
Export an OME-Arrow record to a Vortex file.
The file is written as a single-row, single-column Arrow table where the column holds a struct with the OME-Arrow schema.
- Parameters:
data – OME-Arrow dict or StructScalar.
out_path – Output path for the Vortex file.
column_name – Column name to store the struct.
file_metadata – Optional file-level metadata to attach.
- Raises:
ImportError – If the optional vortex-data dependency is missing.
- ome_arrow.export.to_ome_zarr(data: Dict[str, Any] | StructScalar, out_path: str, *, dtype: dtype = np.uint16, clamp: bool = False, dim_order: str = 'TCZYX', multiscale_levels: int = 1, downscale_spatial_by: int = 2, zarr_format: int = 3, chunks: Tuple[int, int, int, int, int] | None = None, shards: Tuple[int, int, int, int, int] | None = None, compressor: str | None = 'zstd', compressor_level: int = 3, image_name: str | None = None) None[source]#
Write OME-Zarr using the OMEZarrWriter (instance API). This function:
- Builds arr as a (T, C, Z, Y, X) array using to_numpy.
- Creates level shapes for a multiscale pyramid (if multiscale_levels > 1).
- Chooses a Blosc codec compatible with zarr_format (v2 vs v3).
- Populates axis names/types/units and physical pixel sizes from pixels_meta.
- Uses default TCZYX chunks if none are provided.
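Multiscale level shapes downscale only the spatial Y/X axes by downscale_spatial_by per level, leaving T/C/Z untouched. A minimal sketch of that shape computation (the helper `level_shapes` is hypothetical, not the writer's internal function):

```python
def level_shapes(shape, levels=3, factor=2):
    """Spatially downscale Y/X per pyramid level; T/C/Z unchanged."""
    t, c, z, y, x = shape
    return [
        (t, c, z, max(1, y // factor**i), max(1, x // factor**i))
        for i in range(levels)
    ]

print(level_shapes((1, 2, 4, 512, 512)))
# → [(1, 2, 4, 512, 512), (1, 2, 4, 256, 256), (1, 2, 4, 128, 128)]
```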
ome_arrow.meta#
Meta-definition for OME-Arrow format.
ome_arrow.tensor#
Tensor view utilities for OME-Arrow pixel data.
- class ome_arrow.tensor.LazyTensorView(*, loader: Callable[[], dict[str, Any] | StructScalar | StructArray | ChunkedArray], resolver: Callable[[dict[str, Any]], TensorView] | None = None, t: int | slice | Sequence[int] | None = None, z: int | slice | Sequence[int] | None = None, c: int | slice | Sequence[int] | None = None, roi: tuple[int, int, int, int] | None = None, roi3d: tuple[int, int, int, int, int, int] | None = None, roi_nd: tuple[int, ...] | None = None, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None = None, tile: tuple[int, int] | None = None, layout: str | None = None, dtype: dtype | None = None, chunk_policy: Literal['auto', 'combine', 'keep'] = 'auto', channel_policy: Literal['error', 'first'] = 'error')[source]#
Bases: object
Deferred TensorView plan with Polars-style collect semantics.
- collect() TensorView[source]#
Materialize this lazy plan into a concrete TensorView.
- property device: str#
Return the tensor storage device.
Note
For unresolved lazy plans, this returns "cpu" without calling collect().
- property dtype: dtype#
Return the tensor dtype.
Note
Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.
- iter_dlpack(*, batch_size: int | None = None, tile_size: tuple[int, int] | None = None, tiles: tuple[int, int] | None = None, shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') Iterator[Any][source]#
Iterate DLPack outputs in batches or 2D tiles.
- Parameters:
batch_size – Number of time indices per batch.
tile_size – Optional tile size as (tile_h, tile_w).
tiles – Deprecated alias for tile_size.
shuffle – Whether to shuffle iteration order.
seed – Optional random seed for deterministic shuffling.
prefetch – Placeholder prefetch count.
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode ("arrow" or "numpy").
- Returns:
Iterator of DLPack-compatible objects.
- Return type:
Iterator[Any]
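Tiled iteration over the YX plane walks a grid of tile origins, with edge tiles possibly smaller than tile_size. A minimal sketch of that traversal order (the generator `iter_tile_origins` is hypothetical, not the class's internal method):

```python
def iter_tile_origins(height, width, tile_h, tile_w):
    """Yield (y, x) origins of a 2D tile grid; edge tiles may be partial."""
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            yield y, x

# Row-major traversal of a 4x4 image in 2x2 tiles.
print(list(iter_tile_origins(4, 4, 2, 2)))  # → [(0, 0), (0, 2), (2, 0), (2, 2)]
```

With shuffle=True, the documented iterator would visit these same tiles in a seeded random order instead of row-major order.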
- iter_tiles_3d(*, tile_size: tuple[int, int, int], shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'numpy') Iterator[Any][source]#
Iterate DLPack outputs in 3D tiles.
- Parameters:
tile_size – Tile shape as (tile_z, tile_h, tile_w).
shuffle – Whether to shuffle iteration order.
seed – Optional random seed for deterministic shuffling.
prefetch – Placeholder prefetch count.
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode (currently "numpy" only).
- Returns:
Iterator of DLPack-compatible objects.
- Return type:
Iterator[Any]
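Enumerating 3D tiles over a (Z, Y, X) volume amounts to three nested strided ranges. The sketch below shows how a `(tile_z, tile_h, tile_w)` grid covers a volume; it is a hypothetical illustration of the tiling scheme, not the library's implementation, and the real iterator yields DLPack objects rather than array views.

```python
import numpy as np

def iter_tiles_3d(volume: np.ndarray, tile_size: tuple[int, int, int]):
    """Yield (z, y, x) tiles covering the volume; edge tiles may be smaller."""
    tz, th, tw = tile_size
    Z, H, W = volume.shape
    for z0 in range(0, Z, tz):
        for y0 in range(0, H, th):
            for x0 in range(0, W, tw):
                yield volume[z0:z0 + tz, y0:y0 + th, x0:x0 + tw]

vol = np.arange(4 * 6 * 6).reshape(4, 6, 6)
tiles = list(iter_tiles_3d(vol, (2, 3, 3)))  # 2 * 2 * 2 = 8 tiles
```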
- property layout: str#
Return the effective tensor layout.
Note
Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.
- select(*, t: int | slice | Sequence[int] | None | _Unset = _UNSET, z: int | slice | Sequence[int] | None | _Unset = _UNSET, c: int | slice | Sequence[int] | None | _Unset = _UNSET, roi: tuple[int, int, int, int] | None | _Unset = _UNSET, roi3d: tuple[int, int, int, int, int, int] | None | _Unset = _UNSET, roi_nd: tuple[int, ...] | None | _Unset = _UNSET, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None | _Unset = _UNSET, tile: tuple[int, int] | None | _Unset = _UNSET) LazyTensorView[source]#
Return a new lazy plan with updated index/ROI selections.
- property shape: tuple[int, ...]#
Return the tensor shape.
Note
Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.
- property strides: tuple[int, ...]#
Return tensor strides in bytes.
Note
Accessing this property calls collect() and may materialize data from source files (for example Parquet/TIFF), which can be expensive.
- to_dlpack(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') Any[source]#
Export the planned view as a DLPack object.
- Parameters:
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode ("arrow" or "numpy").
- Returns:
DLPack-compatible object.
- Return type:
Any
- to_jax(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') Any[source]#
Convert the planned view to a JAX array.
- Parameters:
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode ("arrow" or "numpy").
- Returns:
JAX array when JAX is installed.
- Return type:
Any
- to_numpy(*, contiguous: bool = False) ndarray[source]#
Materialize as a NumPy array.
- Parameters:
contiguous – When True, return a contiguous array copy.
- Returns:
Materialized array.
- Return type:
np.ndarray
- to_torch(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') Any[source]#
Convert the planned view to a torch tensor.
- Parameters:
device – Target device ("cpu" or "cuda").
contiguous – When True, materialize contiguous data when needed.
mode – Export mode ("arrow" or "numpy").
- Returns:
torch.Tensor when torch is installed.
- Return type:
Any
- with_layout(layout: str) LazyTensorView[source]#
Return a new lazy view with an updated layout.
- class ome_arrow.tensor.TensorView(data: dict[str, Any] | StructScalar | StructArray | ChunkedArray, *, plane_loader: Callable[[int, int, int], ndarray] | None = None, t: int | slice | Sequence[int] | None = None, z: int | slice | Sequence[int] | None = None, c: int | slice | Sequence[int] | None = None, roi: tuple[int, int, int, int] | None = None, roi3d: tuple[int, int, int, int, int, int] | None = None, roi_nd: tuple[int, ...] | None = None, roi_type: Literal['2d', '2d_timelapse', '3d', '4d'] | None = None, tile: tuple[int, int] | None = None, layout: str | None = None, dtype: dtype | None = None, chunk_policy: Literal['auto', 'combine', 'keep'] = 'auto', channel_policy: Literal['error', 'first'] = 'error')[source]#
Bases:
object
View OME-Arrow pixel data as a tensor-like object.
- Parameters:
data – OME-Arrow dict, StructScalar, or 1-row StructArray/ChunkedArray.
t – Time index selection (int, slice, or sequence). Default: all.
z – Z index selection (int, slice, or sequence). Default: all.
c – Channel index selection (int, slice, or sequence). Default: all.
roi – Spatial crop (x, y, w, h) in pixels. Default: full frame.
roi3d – Spatial + depth crop (x, y, z, w, h, d). This is a convenience alias for roi=(x, y, w, h) and z=slice(z, z + d).
roi_nd – General ROI tuple with min/max bounds, interpreted by roi_type.
roi_type – ROI interpretation mode for roi_nd. Supported values: "2d" = (ymin, xmin, ymax, xmax); "2d_timelapse" = (tmin, ymin, xmin, tmax, ymax, xmax); "3d" = (zmin, ymin, xmin, zmax, ymax, xmax); "4d" = (tmin, zmin, ymin, xmin, tmax, zmax, ymax, xmax).
tile – Tile index (tile_y, tile_x) based on chunk grid.
layout – Desired layout string using TZCYX letters where T=time, Z=depth, C=channel, Y=row axis, X=column axis. TZCHW aliases are also accepted for compatibility.
dtype – Output dtype override. Defaults to pixels_meta.type when valid.
chunk_policy – Handling for pyarrow.ChunkedArray inputs. "auto" keeps multi-chunk arrays and unwraps single-chunk arrays. "combine" always combines multi-chunk arrays eagerly. "keep" always keeps chunked storage.
channel_policy – Behavior when dropping C from layout while multiple channels are selected. "error" raises (default). "first" keeps the first channel.
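The selection parameters compose in a predictable way: integer/slice/sequence indexing on T, C, and Z, then a pixel-space crop. The numpy sketch below mimics those semantics on a stand-in TCZYX array; `select` is a hypothetical helper written for illustration, not a TensorView method.

```python
import numpy as np

# Hypothetical 5D TCZYX pixel block standing in for decoded OME-Arrow planes.
pixels = np.arange(2 * 3 * 4 * 8 * 8).reshape(2, 3, 4, 8, 8)  # (T, C, Z, Y, X)

def select(arr, t=None, c=None, z=None, roi=None):
    """Apply TensorView-style selections: axis indices first, then an (x, y, w, h) crop."""
    t = slice(None) if t is None else t
    c = slice(None) if c is None else c
    z = slice(None) if z is None else z
    out = arr[t, c, z]
    if roi is not None:
        x, y, w, h = roi  # ROI is given in pixels, x/y first
        out = out[..., y:y + h, x:x + w]
    return out

view = select(pixels, t=0, c=slice(0, 2), roi=(1, 2, 4, 3))
# t=0 drops the T axis; C is sliced to 2 channels; Z is kept whole;
# the ROI keeps rows 2:5 (h=3) and columns 1:5 (w=4)
```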
- property device: str#
Return the storage device for the view (currently always “cpu”).
- property dtype: dtype#
Return the tensor dtype.
- iter_dlpack(*, batch_size: int | None = None, tile_size: tuple[int, int] | None = None, tiles: tuple[int, int] | None = None, shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') Iterator[Any][source]#
Iterate over DLPack capsules in batches or tiles.
- Parameters:
batch_size – Number of T indices per batch. Defaults to full range.
tile_size – Tile size (tile_h, tile_w) in pixels for spatial tiling.
tiles – Deprecated alias for tile_size.
shuffle – Whether to shuffle the iteration order.
seed – Seed for deterministic shuffling.
prefetch – Placeholder for future asynchronous prefetch support. Currently validated but does not change synchronous iteration.
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize contiguous buffers if needed.
mode – Export mode. “arrow” returns 1D values buffers.
- Yields:
DLPack object per batch or tile.
- iter_tiles_3d(*, tile_size: tuple[int, int, int], shuffle: bool = False, seed: int | None = None, prefetch: int = 0, device: str = 'cpu', contiguous: bool = True, mode: str = 'numpy') Iterator[Any][source]#
Iterate over 3D tiles (z, y, x) as DLPack capsules.
- Parameters:
tile_size – Tile size as (tile_z, tile_h, tile_w).
shuffle – Whether to shuffle the tile order.
seed – Seed for deterministic shuffling.
prefetch – Placeholder for future asynchronous prefetch support.
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize contiguous buffers if needed.
mode – Export mode. Must be "numpy" for tiled 3D iteration.
- Yields:
DLPack object per 3D tile.
- property layout: str#
Return the effective layout for this view.
- property shape: tuple[int, ...]#
Return the tensor shape for the current layout.
- property strides: tuple[int, ...]#
Return the tensor strides in bytes for the current layout.
- to_dlpack(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') Any[source]#
Export the view as a DLPack capsule.
- Parameters:
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize a contiguous buffer if needed.
mode – Export mode. “arrow” returns a capsule for the Arrow values buffer (1D). “numpy” materializes a tensor-shaped NumPy view. Zero-copy Arrow mode requires Arrow-backed inputs (typically Parquet/Vortex ingestion with canonical schema); StructScalar and dict inputs are normalized through Python objects.
- Returns:
DLPack object compatible with torch/jax import utilities. The returned object is single-use per DLPack ownership semantics: after a consumer imports it, the capsule must not be reused.
- Raises:
ValueError – If an unsupported device is requested.
RuntimeError – If required optional dependencies are missing.
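The zero-copy hand-off described above can be demonstrated with NumPy alone, since NumPy (1.22+) both exports and imports via the DLPack protocol and so can stand in for the torch/jax consumers. This is an illustration of DLPack sharing semantics, not of ome_arrow internals; note that a raw capsule from `__dlpack__()` is single-use, while `np.from_dlpack` manages that hand-off for you.

```python
import numpy as np

src = np.arange(12, dtype=np.uint16).reshape(3, 4)
shared = np.from_dlpack(src)  # zero-copy import via the DLPack protocol

src[0, 0] = 999  # the mutation is visible through the imported array
```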
- to_jax(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') Any[source]#
Convert the view into a JAX array using DLPack.
- Parameters:
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize a contiguous buffer if needed.
mode – Export mode. “arrow” returns a 1D values buffer.
- Returns:
Array backed by the DLPack capsule.
- Return type:
jax.Array
- to_numpy(*, contiguous: bool = False) ndarray[source]#
Materialize the view as a NumPy array.
- Parameters:
contiguous – When True, return a contiguous array copy.
- Returns:
Array in the requested layout.
- Return type:
np.ndarray
- to_torch(*, device: str = 'cpu', contiguous: bool = True, mode: str = 'arrow') Any[source]#
Convert the view into a torch.Tensor using DLPack.
- Parameters:
device – Target device (“cpu” or “cuda”).
contiguous – When True, materialize a contiguous buffer if needed.
mode – Export mode. “arrow” returns a 1D values buffer.
- Returns:
Tensor backed by the DLPack capsule.
- Return type:
torch.Tensor
- with_layout(layout: str) TensorView[source]#
Return a new TensorView with a layout override.
- Parameters:
layout – Desired layout string using TZCYX letters where T=time, Z=depth, C=channel, Y=row axis, X=column axis. TZCHW aliases are also accepted for compatibility.
- Returns:
New view with the requested layout.
- Return type:
TensorView
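A layout override is essentially an axis permutation of the source order. The sketch below shows the idea with a plain numpy transpose; `with_layout` here is a hypothetical stand-in that assumes a TCZYX source order, whereas the real method returns a new TensorView and also accepts TZCHW aliases.

```python
import numpy as np

LAYOUT = "TCZYX"  # source axis order used throughout these docs

def with_layout(arr: np.ndarray, layout: str) -> np.ndarray:
    """Permute a TCZYX array into the requested layout string."""
    axes = [LAYOUT.index(ch) for ch in layout]
    return arr.transpose(axes)

a = np.zeros((2, 3, 4, 5, 6))   # (T, C, Z, Y, X)
b = with_layout(a, "TZCYX")     # swap the C and Z axes
```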
ome_arrow.transform#
Module for transforming OME-Arrow data (e.g., slices, projections, or other changes).
- ome_arrow.transform.slice_ome_arrow(data: Dict[str, Any] | StructScalar, x_min: int, x_max: int, y_min: int, y_max: int, t_indices: Iterable[int] | None = None, c_indices: Iterable[int] | None = None, z_indices: Iterable[int] | None = None, fill_missing: bool = True) StructScalar[source]#
Create a cropped copy of an OME-Arrow record.
Crops spatially to [y_min:y_max, x_min:x_max] (half-open) and, if provided, filters/reindexes T/C/Z to the given index sets.
- Parameters:
data (dict | pa.StructScalar) – OME-Arrow record.
x_min (int) – Inclusive lower X bound of the crop in pixels (0-based).
x_max (int) – Exclusive upper X bound of the crop in pixels.
y_min (int) – Inclusive lower Y bound of the crop in pixels (0-based).
y_max (int) – Exclusive upper Y bound of the crop in pixels.
t_indices (Iterable[int] | None) – Optional explicit T indices to keep. If None, keep all. Selected indices are reindexed to 0..len-1 in the output.
c_indices (Iterable[int] | None) – Optional explicit C indices to keep. If None, keep all. Selected indices are reindexed to 0..len-1 in the output.
z_indices (Iterable[int] | None) – Optional explicit Z indices to keep. If None, keep all. Selected indices are reindexed to 0..len-1 in the output.
fill_missing (bool) – If True, any missing (t,c,z) planes in the selection are zero-filled.
- Returns:
New OME-Arrow record with updated sizes and planes.
- Return type:
pa.StructScalar
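The half-open crop plus reindexing behavior can be mimicked on a stand-in array. The `slice_planes` helper below is hypothetical and works on a plain TCZYX numpy array rather than an OME-Arrow record, but the bound and reindexing semantics match the parameters documented above.

```python
import numpy as np

# Stand-in TCZYX array for an OME-Arrow record's planes.
planes = np.arange(3 * 2 * 4 * 10 * 10).reshape(3, 2, 4, 10, 10)

def slice_planes(arr, x_min, x_max, y_min, y_max, t_indices=None):
    """Half-open spatial crop plus optional T filtering with reindexing."""
    out = arr[..., y_min:y_max, x_min:x_max]  # keeps [y_min, y_max) x [x_min, x_max)
    if t_indices is not None:
        out = out[list(t_indices)]  # kept T planes are renumbered 0..len-1
    return out

cropped = slice_planes(planes, 2, 7, 1, 4, t_indices=[0, 2])
# T filtered to 2 entries; Y spans 3 rows (1:4), X spans 5 columns (2:7)
```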
ome_arrow.utils#
Utility functions for ome-arrow.
- ome_arrow.utils.describe_ome_arrow(data: StructScalar | dict) Dict[str, Any][source]#
Describe the structure of an OME-Arrow image record.
Reads pixels_meta from the OME-Arrow struct to report TCZYX dimensions and classify whether it’s a 2D image, 3D z-stack, movie/timelapse, or 4D timelapse-volume. Also flags whether it is multi-channel (C > 1) or single-channel.
- Parameters:
data – OME-Arrow row as a pa.StructScalar or plain dict.
- Returns:
A dict with keys:
shape – (T, C, Z, Y, X) dimensions.
type – classification string.
summary – human-readable text.
- Return type:
dict
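The classification rule suggested by the description above can be sketched as a function of the (T, C, Z, Y, X) shape. This `classify` helper is inferred from the docstring, not copied from the implementation, so treat the exact labels as illustrative.

```python
def classify(shape: tuple[int, int, int, int, int]) -> tuple[str, str]:
    """Classify a (T, C, Z, Y, X) shape roughly as describe_ome_arrow reports it."""
    t, c, z, y, x = shape
    if t > 1 and z > 1:
        kind = "4D timelapse-volume"
    elif t > 1:
        kind = "movie/timelapse"
    elif z > 1:
        kind = "3D z-stack"
    else:
        kind = "2D image"
    channels = "multi-channel" if c > 1 else "single-channel"
    return kind, channels
```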
- ome_arrow.utils.verify_ome_arrow(data: Any, struct: StructType) bool[source]#
Return True if data conforms to the given Arrow StructType.
This tries to convert data into a pyarrow scalar using struct as the declared type. If conversion fails, the data does not match.
- Parameters:
data – A nested Python dict/list structure to test.
struct – The expected pyarrow.StructType schema.
- Returns:
True if conversion succeeds, False otherwise.
- Return type:
bool
ome_arrow.view#
Viewing utilities for OME-Arrow data.
- ome_arrow.view.view_matplotlib(data: dict[str, object] | StructScalar, tcz: tuple[int, int, int] = (0, 0, 0), autoscale: bool = True, vmin: int | None = None, vmax: int | None = None, cmap: str = 'gray', show: bool = True) tuple[Figure, Axes, AxesImage][source]#
Render a single (t, c, z) plane with Matplotlib.
- Parameters:
data – OME-Arrow row or dict containing pixels_meta and planes.
tcz – (t, c, z) indices of the plane to render.
autoscale – If True, infer vmin/vmax from the image data.
vmin – Explicit lower display limit for intensity scaling.
vmax – Explicit upper display limit for intensity scaling.
cmap – Matplotlib colormap name.
show – Whether to display the plot immediately.
- Returns:
A tuple of (figure, axes, image) from Matplotlib.
- Raises:
ValueError – If the requested plane is missing or pixel sizes mismatch.
- ome_arrow.view.view_pyvista(data: dict | pa.StructScalar, c: int = 0, downsample: int = 1, scaling_values: tuple[float, float, float] | None = None, opacity: str | float = 'sigmoid', clim: tuple[float, float] | None = None, show_axes: bool = True, backend: str = 'auto', interpolation: str = 'nearest', background: str = 'black', percentile_clim: tuple[float, float] = (1.0, 99.9), sampling_scale: float = 0.5, show: bool = True) pyvista.Plotter[source]#
Jupyter-inline interactive volume view using PyVista backends. When backend='auto', the function tries the 'trame', 'html', and 'static' backends in that order.
sampling_scale controls the ray-casting step size via the volume mapper after add_volume.
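The backend='auto' fallback chain amounts to trying each candidate in preference order and using the first one that works. The sketch below captures that resolution logic in isolation; `pick_backend` and the `available` set are hypothetical illustrations, not part of the ome_arrow or PyVista APIs.

```python
def pick_backend(available: set[str], preference=("trame", "html", "static")) -> str:
    """Resolve backend='auto' by trying each candidate in order."""
    for name in preference:
        if name in available:
            return name
    raise RuntimeError("no usable PyVista jupyter backend found")

backend = pick_backend({"html", "static"})  # trame unavailable, so "html" wins
```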