{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "intro",
   "metadata": {},
   "source": [
    "# Metadata Workflow Example\n",
    "\n",
    "This notebook demonstrates the stable, user-facing API without requiring a\n",
    "live catalog. It covers three common tasks:\n",
    "\n",
    "- build a canonical `ScanResult`\n",
    "- summarize it into a user-facing `DatasetSummary`\n",
    "- join it to profile data in Arrow\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "imports",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pyarrow as pa\n",
    "\n",
    "from iceberg_bioimage import (\n",
    "    ImageAsset,\n",
    "    ScanResult,\n",
    "    join_profiles_with_scan_result,\n",
    "    summarize_scan_result,\n",
    ")\n",
    "from iceberg_bioimage.publishing.chunk_index import scan_result_to_chunk_rows\n",
    "from iceberg_bioimage.publishing.image_assets import scan_result_to_rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "scan-result",
   "metadata": {},
   "outputs": [],
   "source": [
    "scan_result = ScanResult(\n",
    "    source_uri=\"data/example.ome.zarr\",\n",
    "    format_family=\"zarr\",\n",
    "    image_assets=[\n",
    "        ImageAsset(\n",
    "            uri=\"data/example.ome.zarr\",\n",
    "            array_path=\"0\",\n",
    "            shape=[1, 1, 256, 256],\n",
    "            dtype=\"uint16\",\n",
    "            chunk_shape=[1, 1, 128, 128],\n",
    "            metadata={\n",
    "                \"axes\": \"czyx\",\n",
    "                \"channel_count\": 1,\n",
    "                \"storage_variant\": \"zarr-v2\",\n",
    "            },\n",
    "            image_id=\"example:0\",\n",
    "        )\n",
    "    ],\n",
    ")\n",
    "\n",
    "scan_result.to_dict()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "summary",
   "metadata": {},
   "outputs": [],
   "source": [
    "summary = summarize_scan_result(scan_result)\n",
    "summary.to_dict()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "join-profiles",
   "metadata": {},
   "outputs": [],
   "source": [
    "profiles = pa.table(\n",
    "    {\n",
    "        \"dataset_id\": [\"example\"],\n",
    "        \"image_id\": [\"example:0\"],\n",
    "        \"cell_count\": [42],\n",
    "    }\n",
    ")\n",
    "\n",
    "joined = join_profiles_with_scan_result(scan_result, profiles, include_chunks=True)\n",
    "joined.to_pydict()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "canonical-rows",
   "metadata": {},
   "outputs": [],
   "source": [
    "{\n",
    "    \"image_assets\": scan_result_to_rows(scan_result),\n",
    "    \"chunk_index_count\": len(scan_result_to_chunk_rows(scan_result)),\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ome-arrow-note",
   "metadata": {},
   "source": [
    "## Optional OME-Arrow path\n",
    "\n",
    "If you install the optional `ome-arrow` extra, the package also exposes a\n",
    "small bridge for Arrow-native image payload workflows:\n",
    "\n",
    "```python\n",
    "from iceberg_bioimage import create_ome_arrow, scan_ome_arrow\n",
    "\n",
    "oa = create_ome_arrow(\"image.ome.tiff\")\n",
    "lazy_oa = scan_ome_arrow(\"image.ome.parquet\")\n",
    "```\n",
    "\n",
    "That keeps metadata registration and Arrow-native image handling adjacent,\n",
    "without pulling OME-Arrow into the required dependency set.\n"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "formats": "ipynb,py:light"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}