Learning to fly with OME-Arrow

This notebook is a quick demonstration of what you can do with OME-Arrow.

# we import a single class, OMEArrow
# which handles all data I/O and manipulation
from ome_arrow import OMEArrow
# read a TIFF file and convert it to OME-Arrow
oa_image = OMEArrow(
    data="../../../tests/data/examplehuman/AS_09125_050116030001_D03f00d0.tif"
)
# by default, the image and metadata are shown
oa_image
2D image, single-channel - shape (T=1, C=1, Z=1, Y=512, X=512)
../_images/931ab929984f8c12c7ce3d1a2c4952356feda4f9e4fe3bfb19f31950257e035d.png
# we can also get a summary of the OME-Arrow object
oa_image.info()
{'shape': (1, 1, 1, 512, 512),
 'type': '2D image',
 'channels': 1,
 'is_multichannel': False,
 'summary': '2D image, single-channel - shape (T=1, C=1, Z=1, Y=512, X=512)'}
# we can export the data into a number
# of different formats, e.g. numpy
oa_image.export(how="numpy")
array([[[[[ 8,  8,  8, ..., 63, 78, 75],
          [ 8,  8,  7, ..., 67, 71, 71],
          [ 9,  8,  8, ..., 53, 64, 66],
          ...,
          [ 8,  9,  8, ..., 17, 24, 59],
          [ 8,  8,  8, ..., 17, 22, 55],
          [ 8,  8,  8, ..., 16, 18, 38]]]]],
      shape=(1, 1, 1, 512, 512), dtype=uint16)
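The exported array keeps all five dimensions, in the (T, C, Z, Y, X) order shown in the summary above. As a minimal sketch, using a synthetic stand-in array rather than the real image, a single 2D plane can be pulled out with plain NumPy indexing:

```python
import numpy as np

# synthetic stand-in for the exported array; the real one would come from
# oa_image.export(how="numpy") and has the same (T, C, Z, Y, X) layout
arr = np.zeros((1, 1, 1, 512, 512), dtype=np.uint16)

# pick timepoint 0, channel 0, z-slice 0 to get a 2D (Y, X) plane
plane = arr[0, 0, 0]
print(plane.shape, plane.dtype)
```

Integer indexing like this returns a view, so no pixel data is copied.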
# We can also read in TIFF stacks following OME bfconvert API conventions
stack = OMEArrow(
    data="../../../tests/data/nviz-artificial-4d-dataset/E99_C<111,222>_ZS<000-021>.tif",
    # this optional argument sets which
    # timepoint, channel, and z-slice to show by default
    tcz=(0, 0, 20),
)
stack
3D image (z-stack), multi-channel (2 channels) - shape (T=1, C=2, Z=22, Y=128, X=128)
../_images/1bb93096d8e2fb6c74d1dede39e2a25201e5cdf06e956c5e9fc1a0d81520692a.png
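The angle-bracket blocks in the filename above act as a file pattern: `<111,222>` reads as a list of values and `<000-021>` as a zero-padded range. The sketch below shows how such a pattern might expand into individual filenames. `expand_pattern` is a hypothetical helper written for illustration, not part of ome_arrow, and it only handles the two block forms seen here.

```python
import itertools
import re


def expand_pattern(pattern: str) -> list[str]:
    """Expand <a,b> lists and <start-end> ranges (hypothetical helper)."""
    # splitting on a capture group alternates literal text and block contents
    parts = re.split(r"<([^<>]*)>", pattern)
    literals, blocks = parts[0::2], parts[1::2]

    choices = []
    for block in blocks:
        if "-" in block and "," not in block:
            start, end = block.split("-")
            # keep the zero padding of the start value, e.g. 000 -> width 3
            width = len(start)
            choices.append(
                [str(i).zfill(width) for i in range(int(start), int(end) + 1)]
            )
        else:
            choices.append(block.split(","))

    names = []
    for combo in itertools.product(*choices):
        # interleave literal text with the chosen block values
        name = literals[0]
        for value, literal in zip(combo, literals[1:]):
            name += value + literal
        names.append(name)
    return names


files = expand_pattern("E99_C<111,222>_ZS<000-021>.tif")
print(len(files), files[0], files[-1])
```

Under these assumptions the pattern expands to 44 names (2 channel values x 22 z-slices), which lines up with the C=2, Z=22 shape reported above.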
# we can visualize the stack using pyvista for 3D rendering
# note: we use manually specified scaling values here
# and can also default to what the image metadata provides
# with `scaling_values=None` (the default).
stack.view(how="pyvista", scaling_values=(1, 0.1, 0.1))
Static snapshot (for non-interactive view)
<pyvista.plotting.plotter.Plotter at 0x14dcc3f10>
# here we demonstrate that the data can be exported again
# into numpy format and re-imported
# into a new OME-Arrow object (from numpy data).
stack_np = stack.export(how="numpy")
OMEArrow(data=stack_np, tcz=(0, 0, 20))
3D image (z-stack), multi-channel (2 channels) - shape (T=1, C=2, Z=22, Y=128, X=128)
../_images/1bb93096d8e2fb6c74d1dede39e2a25201e5cdf06e956c5e9fc1a0d81520692a.png
# here we demonstrate that the data can be exported again
# into OME-TIFF format and re-imported
# into a new OME-Arrow object (from OME-TIFF data).
stack.export(how="ome-tiff", out="example.ome.tiff")
OMEArrow(data="example.ome.tiff", tcz=(0, 0, 20))
3D image (z-stack), multi-channel (2 channels) - shape (T=1, C=2, Z=22, Y=128, X=128)
../_images/1bb93096d8e2fb6c74d1dede39e2a25201e5cdf06e956c5e9fc1a0d81520692a.png
# here we demonstrate that the data can be exported again
# into OME-ZARR format and re-imported
# into a new OME-Arrow object (from OME-ZARR data).
stack.export(how="ome-zarr", out="example.ome.zarr")
OMEArrow(data="example.ome.zarr", tcz=(0, 0, 20))
3D image (z-stack), multi-channel (2 channels) - shape (T=1, C=2, Z=22, Y=128, X=128)
../_images/1bb93096d8e2fb6c74d1dede39e2a25201e5cdf06e956c5e9fc1a0d81520692a.png
# here we demonstrate that the data can be exported again
# into OME-Parquet format and re-imported
# into a new OME-Arrow object (from OME-Parquet data).
stack.export(how="ome-parquet", out="example.ome.parquet")
OMEArrow(data="example.ome.parquet", tcz=(0, 0, 20))
3D image (z-stack), multi-channel (2 channels) - shape (T=1, C=2, Z=22, Y=128, X=128)
../_images/1bb93096d8e2fb6c74d1dede39e2a25201e5cdf06e956c5e9fc1a0d81520692a.png
# we can also slice the data to get a smaller region of interest
stack.slice(
    x_min=40,
    y_min=80,
    x_max=70,
    y_max=110,
    t_indices=[0],
    c_indices=[0],
    z_indices=[20],
)
2D image, single-channel - shape (T=1, C=1, Z=1, Y=30, X=30)
../_images/3193080becacb4589aee5482abf7038b3c0276ad8ba23542b01318093a708c89.png
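For comparison, the same region can be cut from an exported NumPy array with ordinary indexing. This sketch uses a synthetic array with the stack's shape and assumes the `*_max` bounds are exclusive, which is consistent with the 30 x 30 output above.

```python
import numpy as np

# synthetic stand-in with the stack's (T, C, Z, Y, X) shape
arr = np.zeros((1, 2, 22, 128, 128), dtype=np.uint16)

# index lists keep the T, C, and Z axes (length 1 each);
# Y and X use half-open ranges [min, max)
roi = arr[[0]][:, [0]][:, :, [20]][..., 80:110, 40:70]
print(roi.shape)
```

Using lists for T, C, and Z keeps the result five-dimensional, mirroring the (T=1, C=1, Z=1, Y=30, X=30) shape shown above.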
# read one image from a multi-image OME-Parquet file as OME-Arrow
# note: the Parquet file was created using the CytoDataFrame project,
# which helps convert CellProfiler and image data into OME-Parquet format.
# see here for more details:
# https://github.com/cytomining/CytoDataFrame/blob/main/docs/src/examples/cytodataframe_at_a_glance.ipynb
oa_image = OMEArrow(
    data="../../../tests/data/JUMP-BR00117006/BR00117006.ome.parquet",
    # we can specify which column and row to read
    # (or rely on OMEArrow to find a suitable default)
    column_name="Image_FileName_OrigDNA_OMEArrow_LABL",
    row_index=2,
)
# by default, the image and metadata are shown
oa_image
2D image, single-channel - shape (T=1, C=1, Z=1, Y=73, X=97)
../_images/8ac656d424fd162a4e545e74720dde8101e6b12a4feb800b0425dd2b477502dd.png
# read a 3D Zarr image from IDR
oa_image = OMEArrow(data="../../../tests/data/idr0062A/6001240_labels.zarr")
# show the image using pyvista
oa_image.view(how="pyvista")
Static snapshot (for non-interactive view)
<pyvista.plotting.plotter.Plotter at 0x37dba4550>

DLPack tensor export (advanced)

This section is optional and requires torch: pip install "ome-arrow[dlpack-torch]". The last example also needs jax.

# examples of exporting OME-Arrow data via DLPack for zero-copy tensor exchange
import jax.numpy as jnp
import torch

oa = OMEArrow("example.ome.parquet")
%%time
# DLPack Arrow mode: zero-copy 1D values buffer + reshape
view = oa.tensor_view(t=0, z=0, c=0)
cap = view.to_dlpack(mode="arrow")
flat = torch.utils.dlpack.from_dlpack(cap)
tensor = flat.reshape(view.shape)
tensor.shape
CPU times: user 139 ms, sys: 2.58 ms, total: 141 ms
Wall time: 144 ms
torch.Size([1, 128, 128])
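The flat-buffer-then-reshape pattern is not specific to torch. As a rough illustration of what zero-copy means here, the sketch below uses only NumPy's own DLPack support (`np.from_dlpack`, available in recent NumPy releases) on a synthetic buffer. It does not call ome_arrow, and the buffer contents are made up.

```python
import numpy as np

# synthetic 1D buffer standing in for the flat values of one 128 x 128 plane
flat_src = np.arange(128 * 128, dtype=np.uint16)

# import through the DLPack protocol; the pixel data is shared, not copied
flat = np.from_dlpack(flat_src)

# reshaping a contiguous buffer returns a view, so this step is copy-free too
plane = flat.reshape(1, 128, 128)

print(plane.shape, np.shares_memory(flat_src, plane))
```

Because the consumer only receives a pointer plus shape and dtype metadata, the cost of the hand-off does not grow with image size.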
%%time
# DLPack NumPy mode: shaped tensor directly (still zero-copy when possible)
# Layout quick reference:
# - `C` = channels
# - `H` = image height (Y axis)
# - `W` = image width (X axis)
view_chw = oa.tensor_view(t=0, z=0, layout="CHW")
cap_chw = view_chw.to_dlpack(mode="numpy", contiguous=True)
tensor_chw = torch.utils.dlpack.from_dlpack(cap_chw)
tensor_chw.shape
CPU times: user 142 ms, sys: 2.31 ms, total: 144 ms
Wall time: 143 ms
torch.Size([2, 128, 128])
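Whether a contiguous export can stay zero-copy depends on how the data is already laid out in memory: reordering axes usually gives a strided view, and packing that view into a contiguous block takes one copy. The NumPy-only sketch below illustrates the idea on a synthetic array stored as (Y, X, C). That layout is an assumption for illustration and says nothing about how ome_arrow stores pixels internally.

```python
import numpy as np

# synthetic image stored height x width x channels (HWC)
hwc = np.zeros((128, 128, 2), dtype=np.uint16)

# moving channels first gives a CHW *view*: same memory, new strides
chw_view = np.transpose(hwc, (2, 0, 1))
print(chw_view.shape, chw_view.flags["C_CONTIGUOUS"])

# packing the view into C order requires one copy
chw = np.ascontiguousarray(chw_view)
print(chw.flags["C_CONTIGUOUS"], np.shares_memory(hwc, chw))
```

If the source is already contiguous in the requested order, `np.ascontiguousarray` returns it without copying, which is the "zero-copy when possible" case.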
%%time
# same DLPack Arrow-mode export as above, this time consumed by JAX
view = oa.tensor_view(t=0, z=0, c=0)
caps = view.to_dlpack(mode="arrow")
flat = jnp.from_dlpack(caps)
arr = flat.reshape(view.shape)
arr.shape
CPU times: user 162 ms, sys: 23.7 ms, total: 186 ms
Wall time: 253 ms
(1, 128, 128)