# Exporting OME-Arrow pixel data via DLPack

OME-Arrow exposes a small tensor view API for pixel data. The returned
`TensorView` can export DLPack capsules for zero-copy interoperability on CPU
and (optionally) GPU.
Key defaults:

- OME-Arrow tensor layouts always include channels (`C`) as a tensor axis.
- The default layout is `CHW` (equivalent to `CYX`) when both `T` and `Z` are singleton in the source.
- Otherwise, the default layout is `TZCHW` (equivalent to `TZCYX`, with singleton `T`/`Z` retained unless you override the layout).
- You can override with any valid `TZCHW`/`TZCYX` permutation or subset, for example `YXC`, `ZCYX`, or `CYX`.
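The default-layout rule above can be restated as a tiny sketch. This is illustrative only: `default_layout` is a hypothetical helper, not part of the ome_arrow API.

```python
# Hypothetical restatement of the documented default-layout rule:
# CYX (aka CHW) when both T and Z are singleton, otherwise TZCYX (aka TZCHW).
def default_layout(t_size: int, z_size: int) -> str:
    return "CYX" if t_size == 1 and z_size == 1 else "TZCYX"

assert default_layout(1, 1) == "CYX"
assert default_layout(1, 16) == "TZCYX"
```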
Layout nomenclature:

- `T`: time index
- `Z`: z/depth index
- `C`: channel index
- `Y`: image row axis (height)
- `X`: image column axis (width)

(`H`/`W` aliases are also accepted for compatibility.)
Practical mapping:

- 2D image content (`YX`) is typically exposed as `CYX`.
- 3D z-stack content (`ZYX`) is typically exposed as `ZCYX` or `TZCYX` (with `T=1`).
- Time-lapse and volumetric content use `TZCYX`/`TZCHW` by default.
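As a NumPy-only illustration of the mapping above (no ome_arrow calls; the shapes are arbitrary), inserting singleton axes turns `YX` and `ZYX` content into the channel-bearing layouts:

```python
import numpy as np

yx = np.zeros((480, 640), dtype=np.uint16)       # 2D plane, (Y, X)
cyx = yx[np.newaxis, ...]                        # (C, Y, X) with C=1
assert cyx.shape == (1, 480, 640)

zyx = np.zeros((16, 480, 640), dtype=np.uint16)  # z-stack, (Z, Y, X)
zcyx = zyx[:, np.newaxis, ...]                   # (Z, C, Y, X) with C=1
tzcyx = zcyx[np.newaxis, ...]                    # (T, Z, C, Y, X) with T=1
assert tzcyx.shape == (1, 16, 1, 480, 640)
```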
## PyTorch
```python
from ome_arrow import OMEArrow

obj = OMEArrow("example.ome.parquet")
view = obj.tensor_view(t=0, z=0, c=0)

# DLPack capsule -> torch.Tensor
import torch

capsule = view.to_dlpack(mode="arrow", device="cpu")
flat = torch.utils.dlpack.from_dlpack(capsule)
tensor = flat.reshape(view.shape)
```
## Lazy scan-style slicing
```python
from ome_arrow import OMEArrow

obj = OMEArrow.scan("example.ome.parquet")

# Prioritize lazy slice planning first.
lazy_crop = obj.slice_lazy(0, 512, 0, 512).slice_lazy(64, 256, 64, 256)
cropped = lazy_crop.collect()

# Then execute tensor selections on the sliced result.
tensor_view = cropped.tensor_view(t=0, z=slice(0, 8), roi=(64, 64, 128, 128))
arr = tensor_view.to_numpy()

# Note: executing a LazyTensorView from OMEArrow.scan(...) does not
# materialize the original OMEArrow object itself.
# Call obj.collect() explicitly if you need to materialize `obj`.
```
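A sketch of why chained `slice_lazy` calls can stay cheap: relative crop windows compose into one absolute window before any pixels are read. The `(y0, y1, x0, x1)` argument order and the composition rule are assumptions for illustration, not the library's documented semantics.

```python
def compose(outer, inner):
    """Map an inner (y0, y1, x0, x1) window, given relative to the outer
    window, into absolute coordinates, clipping to the outer bounds."""
    oy0, oy1, ox0, ox1 = outer
    iy0, iy1, ix0, ix1 = inner
    return (oy0 + iy0, min(oy1, oy0 + iy1), ox0 + ix0, min(ox1, ox0 + ix1))

# .slice_lazy(0, 512, 0, 512).slice_lazy(64, 256, 64, 256) collapses to:
window = compose((0, 512, 0, 512), (64, 256, 64, 256))
assert window == (64, 256, 64, 256)
```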
## JAX
```python
from ome_arrow import OMEArrow

obj = OMEArrow("example.ome.parquet")
view = obj.tensor_view(t=0, z=0, c=0, layout="CYX")

import jax.numpy as jnp

capsule = view.to_dlpack(mode="arrow", device="cpu")
flat = jnp.from_dlpack(capsule)
arr = flat.reshape(view.shape)
```
## Iteration examples
```python
from ome_arrow import OMEArrow
import numpy as np

obj = OMEArrow("example.ome.parquet")
view = obj.tensor_view()

# Batch over the time (T) dimension.
for cap in view.iter_dlpack(batch_size=2, shuffle=False, mode="numpy"):
    batch = np.from_dlpack(cap)
    # batch shape: (batch, Z, C, Y, X) in TZCYX layout
```
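The batch grouping this implies over the `T` axis can be sketched in plain Python. Whether a trailing partial batch is yielded or dropped is an assumption in this sketch:

```python
def t_batches(n_t, batch_size):
    # Group timepoint indices [0, n_t) into consecutive batches;
    # the last batch may be partial in this sketch.
    return [list(range(i, min(i + batch_size, n_t)))
            for i in range(0, n_t, batch_size)]

# e.g. 5 timepoints with batch_size=2
assert t_batches(5, 2) == [[0, 1], [2, 3], [4]]
```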
```python
from ome_arrow import OMEArrow
import numpy as np

obj = OMEArrow("example.ome.parquet")
view = obj.tensor_view(t=0, z=0)

# Tile over the spatial region.
for cap in view.iter_dlpack(
    tile_size=(256, 256), shuffle=True, seed=123, mode="numpy"
):
    tile = np.from_dlpack(cap)
    # tile shape: (C, Y, X) in CYX layout
```
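The tile grid that `tile_size=(256, 256)` suggests can be sketched without ome_arrow. Clipping border tiles and the seeded shuffle mirroring `shuffle=True, seed=123` are assumptions here, not documented behavior:

```python
import random

def tile_origins(height, width, tile_h, tile_w, shuffle=False, seed=None):
    # (y, x, h, w) per tile, row-major, clipped at the image border.
    tiles = [(y, x, min(tile_h, height - y), min(tile_w, width - x))
             for y in range(0, height, tile_h)
             for x in range(0, width, tile_w)]
    if shuffle:
        random.Random(seed).shuffle(tiles)
    return tiles

grid = tile_origins(512, 640, 256, 256)
assert len(grid) == 6                 # a 2 x 3 grid over 512 x 640
assert grid[2] == (0, 512, 256, 128)  # last column clipped to 128 wide
```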
## Ownership and lifetime
`TensorView.to_dlpack()` returns a DLPack-capable object (with `__dlpack__`)
that references the underlying Arrow values buffer in `mode="arrow"`, or a
NumPy buffer in `mode="numpy"`. Keep the `TensorView` (or any NumPy array
returned by `to_numpy`) alive until the consumer finishes using the DLPack
object.
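NumPy itself speaks the DLPack protocol (NumPy >= 1.23), so the shared-buffer behavior described above can be demonstrated without ome_arrow:

```python
import numpy as np

producer = np.arange(12, dtype=np.uint16).reshape(3, 4)
consumer = np.from_dlpack(producer)  # zero-copy view over the same buffer

producer[0, 0] = 999
assert consumer[0, 0] == 999  # mutation is visible through the DLPack view
```

Because the consumer aliases the producer's memory, dropping the producer while the consumer is still in use is exactly the lifetime hazard the paragraph above warns about.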
`mode="arrow"` currently requires a single `(t, z, c)` selection and a full-frame
ROI. Use `mode="numpy"` for batches, crops, or layout reshaping beyond a simple
reshape.
Zero-copy guarantees depend on the source: Arrow-backed inputs preserve buffers,
while records built from Python lists or NumPy arrays will materialize once into
Arrow buffers. The same applies to `StructScalar` inputs, which are normalized
through Python objects before Arrow-mode export.

For Parquet/Vortex sources, zero-copy also requires the on-disk struct schema
to match `OME_ARROW_STRUCT`; non-strict schema normalization materializes via
Python objects.
## Optional dependencies
CPU DLPack export uses Arrow buffers by default. For framework helpers and GPU paths, install only what you need:

```shell
pip install "ome-arrow[dlpack-torch]"  # torch only
pip install "ome-arrow[dlpack-jax]"    # jax only
pip install "ome-arrow[dlpack]"        # both
```
## Benchmarking lazy reads
To quickly compare lazy tensor read paths (TIFF source-backed execution, Parquet planes, Parquet chunks), run:

```shell
uv run python benchmarks/benchmark_lazy_tensor.py --repeats 5 --warmup 1
```
This is a lightweight local benchmark intended for directional performance checks during development.
In CI, the tests workflow runs a `benchmark_canary` job that executes the
same script and uploads a JSON report artifact.
## Recalibrating ci-baseline.json

When performance changes are intentional (or runner behavior shifts), update
`benchmarks/ci-baseline.json` as follows:
1. Check out the latest `main`.
2. Run the benchmark multiple times:

   ```shell
   uv run python benchmarks/benchmark_lazy_tensor.py --repeats 7 --warmup 2 --json-out benchmark-results.json
   ```

3. Record `median_ms` per case across runs.
4. Set each baseline value to a stable, slightly conservative median.
5. Open a PR that updates baseline values only, with benchmark evidence.
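The recalibration arithmetic above amounts to taking per-case medians and padding them slightly. A sketch follows; the case names and the 10% margin are illustrative choices, not project conventions:

```python
import statistics

# Hypothetical per-case timings (ms) collected across repeated runs.
runs_ms = {"tiff_source": [41.0, 39.5, 40.2],
           "parquet_planes": [12.1, 12.4, 11.9]}

# Median per case, padded by a conservative 10% margin.
baseline = {case: round(statistics.median(times) * 1.10, 1)
            for case, times in runs_ms.items()}
assert baseline == {"tiff_source": 44.2, "parquet_planes": 13.3}
```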
Expected variability:

- Small fluctuations are normal on GitHub-hosted runners.
- The relative ordering of cases is usually stable.
- Typical drift should be modest, but occasional jumps can happen due to runner image or dependency changes.