{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "intro",
   "metadata": {},
   "source": [
    "# Metadata Workflow Example\n",
    "\n",
    "This notebook demonstrates the stable, user-facing API without requiring a\n",
    "live catalog. It covers three common tasks:\n",
    "\n",
    "- build a canonical `ScanResult`\n",
    "- summarize it into a user-facing `DatasetSummary`\n",
    "- join it to profile data in Arrow\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "imports",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pyarrow as pa\n",
    "\n",
    "from iceberg_bioimage import (\n",
    "    ImageAsset,\n",
    "    ScanResult,\n",
    "    join_profiles_with_scan_result,\n",
    "    summarize_scan_result,\n",
    ")\n",
    "from iceberg_bioimage.publishing.chunk_index import scan_result_to_chunk_rows\n",
    "from iceberg_bioimage.publishing.image_assets import scan_result_to_rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "scan-result",
   "metadata": {},
   "outputs": [],
   "source": [
    "scan_result = ScanResult(\n",
    "    source_uri=\"data/example.ome.zarr\",\n",
    "    format_family=\"zarr\",\n",
    "    image_assets=[\n",
    "        ImageAsset(\n",
    "            uri=\"data/example.ome.zarr\",\n",
    "            array_path=\"0\",\n",
    "            shape=[1, 1, 256, 256],\n",
    "            dtype=\"uint16\",\n",
    "            chunk_shape=[1, 1, 128, 128],\n",
    "            metadata={\n",
    "                \"axes\": \"czyx\",\n",
    "                \"channel_count\": 1,\n",
    "                \"storage_variant\": \"zarr-v2\",\n",
    "            },\n",
    "            image_id=\"example:0\",\n",
    "        )\n",
    "    ],\n",
    ")\n",
    "\n",
    "scan_result.to_dict()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "summary",
   "metadata": {},
   "outputs": [],
   "source": [
    "summary = summarize_scan_result(scan_result)\n",
    "summary.to_dict()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "join-profiles",
   "metadata": {},
   "outputs": [],
   "source": [
    "profiles = pa.table(\n",
    "    {\n",
    "        \"dataset_id\": [\"example\"],\n",
    "        \"image_id\": [\"example:0\"],\n",
    "        \"cell_count\": [42],\n",
    "    }\n",
    ")\n",
    "\n",
    "joined = join_profiles_with_scan_result(scan_result, profiles, include_chunks=True)\n",
    "joined.to_pydict()"
   ]
  },
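  {
   "cell_type": "markdown",
   "id": "arrow-join-sketch",
   "metadata": {},
   "source": [
    "To make the join step more concrete, here is a minimal sketch of a similar\n",
    "join written directly with `pyarrow`. It assumes the join key is `image_id`\n",
    "and uses a hand-written `asset_rows` table as a hypothetical stand-in for the\n",
    "image-asset rows. It illustrates the idea only and is not necessarily how\n",
    "`join_profiles_with_scan_result` is implemented:\n",
    "\n",
    "```python\n",
    "import pyarrow as pa\n",
    "\n",
    "# Hypothetical stand-in for image-asset rows; column names are assumptions.\n",
    "asset_rows = pa.table(\n",
    "    {\n",
    "        \"image_id\": [\"example:0\"],\n",
    "        \"dtype\": [\"uint16\"],\n",
    "    }\n",
    ")\n",
    "profile_rows = pa.table(\n",
    "    {\n",
    "        \"image_id\": [\"example:0\"],\n",
    "        \"cell_count\": [42],\n",
    "    }\n",
    ")\n",
    "\n",
    "# An inner join keeps only the image_id values present in both tables.\n",
    "sketch = profile_rows.join(asset_rows, keys=\"image_id\", join_type=\"inner\")\n",
    "sketch.to_pydict()\n",
    "```\n",
    "\n",
    "An inner join is used here so that profiles without a matching asset are\n",
    "dropped; a left outer join would keep them with null asset columns.\n"
   ]
  },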
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "canonical-rows",
   "metadata": {},
   "outputs": [],
   "source": [
    "{\n",
    "    \"image_assets\": scan_result_to_rows(scan_result),\n",
    "    \"chunk_index_count\": len(scan_result_to_chunk_rows(scan_result)),\n",
    "}"
   ]
  },
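  {
   "cell_type": "markdown",
   "id": "chunk-count-sketch",
   "metadata": {},
   "source": [
    "As a rough cross-check on the chunk-index count above, the size of a regular\n",
    "chunk grid can be derived from `shape` and `chunk_shape` alone. This is a\n",
    "minimal sketch in plain Python. `expected_chunk_count` is a hypothetical\n",
    "helper, not part of `iceberg_bioimage`, and it assumes one chunk row per grid\n",
    "cell, which the package may or may not do:\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "\n",
    "def expected_chunk_count(shape, chunk_shape):\n",
    "    # Hypothetical helper, not part of iceberg_bioimage.\n",
    "    # Partial edge chunks still count as chunks, hence the ceiling division.\n",
    "    if len(shape) != len(chunk_shape):\n",
    "        raise ValueError(\"shape and chunk_shape must have the same rank\")\n",
    "    return math.prod(math.ceil(s / c) for s, c in zip(shape, chunk_shape))\n",
    "\n",
    "\n",
    "# Values copied from the ImageAsset defined earlier in this notebook.\n",
    "expected_chunk_count([1, 1, 256, 256], [1, 1, 128, 128])\n",
    "```\n",
    "\n",
    "For the asset defined earlier this gives 4, a 2 x 2 grid over the `y` and `x`\n",
    "axes.\n"
   ]
  },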
  {
   "cell_type": "markdown",
   "id": "ome-arrow-note",
   "metadata": {},
   "source": [
    "## Optional OME-Arrow path\n",
    "\n",
    "If you install the optional `ome-arrow` extra, the package also exposes a\n",
    "small bridge for Arrow-native image payload workflows:\n",
    "\n",
    "```python\n",
    "from iceberg_bioimage import create_ome_arrow, scan_ome_arrow\n",
    "\n",
    "oa = create_ome_arrow(\"image.ome.tiff\")\n",
    "lazy_oa = scan_ome_arrow(\"image.ome.parquet\")\n",
    "```\n",
    "\n",
    "That keeps metadata registration and Arrow-native image handling adjacent,\n",
    "without pulling OME-Arrow into the required dependency set.\n"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "formats": "ipynb,py:light"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}