Python API#
cytodataframe.frame#
Defines a CytoDataFrame class.
- class src.cytodataframe.frame.CytoDataFrame(data: CytoDataFrame_type | DataFrame | str | Path, data_context_dir: str | None = None, data_bounding_box: DataFrame | None = None, data_mask_context_dir: str | None = None, data_outline_context_dir: str | None = None, segmentation_file_regex: Dict[str, str] | None = None, **kwargs: Dict[str, Any])[source]#
Bases:
DataFrame
A class designed to enhance single-cell data handling by wrapping pandas DataFrame capabilities, providing advanced methods for quality control, comprehensive analysis, and image-based data processing.
This class can initialize with either a pandas DataFrame or a file path (CSV, TSV, TXT, or Parquet). When initialized with a file path, it reads the data into a pandas DataFrame. It also includes capabilities to export data.
- _metadata#
A class-level list naming the custom attributes that pandas carries over to objects derived from this one.
- Type:
ClassVar[list[str]]
- _custom_attrs#
A dictionary to store custom attributes, such as data source, context directory, and bounding box information.
- Type:
dict
- _metadata: ClassVar = ['_custom_attrs']#
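As a rough illustration of how a `_metadata` entry behaves in a pandas subclass, the sketch below uses a hypothetical `TaggedFrame` class (not part of CytoDataFrame) with a hypothetical `custom_attrs` argument. It assumes only the documented pandas subclassing hooks, `_metadata` and `_constructor`.

```python
from typing import Any, ClassVar, Dict, List, Optional

import pandas as pd


class TaggedFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass used only for illustration."""

    # Attribute names listed here are copied onto derived objects by pandas.
    _metadata: ClassVar[List[str]] = ["_custom_attrs"]

    def __init__(
        self,
        data: Any = None,
        custom_attrs: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> None:
        super().__init__(data, **kwargs)
        self._custom_attrs = custom_attrs if custom_attrs is not None else {}

    @property
    def _constructor(self):
        # Tell pandas to build results with this subclass.
        return TaggedFrame


frame = TaggedFrame({"a": [3, 1, 2]}, custom_attrs={"data_source": "demo"})
subset = frame.head(2)  # still a TaggedFrame, with _custom_attrs carried over
```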
- _repr_html_(key: int | str | None = None) str [source]#
Returns an HTML representation of the underlying pandas DataFrame for use within Jupyter notebook environments and similar.
Referenced with modifications from: pandas-dev/pandas
Modifications added to help achieve image-based output for single-cell data within the context of CytoDataFrame and coSMicQC.
Mainly for Jupyter notebooks.
- Returns:
The HTML representation of the data in the underlying pandas DataFrame.
- Return type:
str
- _wrap_method(method: Callable, *args: List[Any], **kwargs: Dict[str, Any]) Any [source]#
Wraps a given method to ensure that the returned result is a CytoDataFrame if applicable.
- Parameters:
method (Callable) – The method to be called and wrapped.
*args (List[Any]) – Positional arguments to be passed to the method.
**kwargs (Dict[str, Any]) – Keyword arguments to be passed to the method.
- Returns:
The result of the method call. If the result is a pandas DataFrame, it is wrapped in a CytoDataFrame instance with additional context information (data context directory and data bounding box).
- Return type:
Any
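The wrapping pattern described above can be sketched as follows. `ContextFrame`, `call_wrapped`, and `context_dir` are hypothetical names chosen for this example; the real `_wrap_method` may differ in how it restores context.

```python
from typing import Any, Callable

import pandas as pd


class ContextFrame(pd.DataFrame):
    """Hypothetical subclass that remembers a context directory."""

    _metadata = ["context_dir"]

    def __init__(self, data: Any = None, context_dir: Any = None, **kwargs: Any) -> None:
        super().__init__(data, **kwargs)
        self.context_dir = context_dir

    def call_wrapped(self, method: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Call the method, then re-wrap DataFrame results so context survives.
        result = method(*args, **kwargs)
        if isinstance(result, pd.DataFrame):
            return ContextFrame(result, context_dir=self.context_dir)
        # Scalars, Series, and other results pass through unchanged.
        return result


cf = ContextFrame({"a": [3, 1, 2]}, context_dir="/images")
sorted_cf = cf.call_wrapped(cf.sort_values, "a")
total = cf.call_wrapped(cf["a"].sum)
```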
- static draw_outline_on_image_from_mask(actual_image_path: str, mask_image_path: str) PIL.Image.Image [source]#
Draws outlines on a TIFF image based on a mask image and returns the combined result.
This method takes the path to a TIFF image and a mask image, creates an outline from the mask, and overlays it on the TIFF image. The resulting image, which combines the TIFF image with the mask outline, is returned.
- Parameters:
actual_image_path (str) – Path to the TIFF image file.
mask_image_path (str) – Path to the mask image file.
- Returns:
A PIL Image object that is the result of combining the TIFF image with the mask outline.
- Return type:
PIL.Image.Image
- Raises:
FileNotFoundError – If the specified image or mask file does not exist.
ValueError – If the images are not in compatible formats or sizes.
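One way to derive an outline from a mask and overlay it is sketched below using Pillow only. This is an assumption-laden illustration rather than the library's implementation: `overlay_mask_outline` is a hypothetical helper, it takes already-opened images instead of paths, and it uses Pillow's `FIND_EDGES` filter as the edge detector.

```python
from typing import Tuple

from PIL import Image, ImageFilter


def overlay_mask_outline(
    image: Image.Image,
    mask: Image.Image,
    color: Tuple[int, int, int] = (0, 255, 0),
) -> Image.Image:
    """Draw the boundary of the non-zero mask region onto the image."""
    base = image.convert("RGB")
    if mask.size != base.size:
        raise ValueError("image and mask must have the same size")
    # Binarize the mask, then keep only pixels on the region boundary.
    binary = mask.convert("L").point(lambda v: 255 if v > 0 else 0)
    edges = binary.filter(ImageFilter.FIND_EDGES).point(lambda v: 255 if v > 0 else 0)
    # Paint the outline colour wherever an edge pixel was found.
    solid = Image.new("RGB", base.size, color)
    return Image.composite(solid, base, edges)


base = Image.new("RGB", (10, 10), (50, 50, 50))
mask = Image.new("L", (10, 10), 0)
mask.paste(255, (3, 3, 7, 7))  # filled square covering x, y in 3..6
outlined = overlay_mask_outline(base, mask)
```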
- static draw_outline_on_image_from_outline(actual_image_path: str, outline_image_path: str) PIL.Image.Image [source]#
Draws green outlines on a TIFF image based on an outline image and returns the combined result.
This method takes the path to a TIFF image and an outline image (where outlines are non-black and the background is black) and overlays the green outlines on the TIFF image. The resulting image, which combines the TIFF image with the green outline, is returned.
- Parameters:
actual_image_path (str) – Path to the TIFF image file.
outline_image_path (str) – Path to the outline image file.
- Returns:
A PIL Image object that is the result of combining the TIFF image with the green outline.
- Return type:
PIL.Image.Image
- Raises:
FileNotFoundError – If the specified image or outline file does not exist.
ValueError – If the images are not in compatible formats or sizes.
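The overlay step described above (non-black outline pixels become green on the target image) might look like the following Pillow sketch. `overlay_green_outline` is a hypothetical helper operating on opened images, not the method itself.

```python
from PIL import Image


def overlay_green_outline(image: Image.Image, outline: Image.Image) -> Image.Image:
    """Paint green wherever the outline image is non-black."""
    base = image.convert("RGB")
    if outline.size != base.size:
        raise ValueError("image and outline must have the same size")
    # Grayscale conversion; very dim colours may round down to black here.
    stencil = outline.convert("L").point(lambda v: 255 if v > 0 else 0)
    green = Image.new("RGB", base.size, (0, 255, 0))
    return Image.composite(green, base, stencil)


base = Image.new("RGB", (4, 4), (10, 20, 30))
outline = Image.new("RGB", (4, 4), (0, 0, 0))
outline.putpixel((1, 2), (255, 255, 255))
combined = overlay_green_outline(base, outline)
```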
- export(file_path: str, **kwargs: Dict[str, Any]) None [source]#
Exports the underlying pandas DataFrame to a file.
- Parameters:
file_path (str) – The path where the DataFrame should be saved.
**kwargs – Additional keyword arguments to pass to the pandas to_* methods.
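A minimal sketch of extension-based export follows, assuming the writer is chosen from the file suffix. The supported suffixes mirror the input formats listed for the class (CSV, TSV, TXT, Parquet), which is an assumption about `export`; `export_frame` is a hypothetical stand-in.

```python
from pathlib import Path
from typing import Any

import pandas as pd


def export_frame(df: pd.DataFrame, file_path: str, **kwargs: Any) -> None:
    """Write df using a pandas to_* method picked from the file suffix."""
    suffix = Path(file_path).suffix.lower()
    if suffix == ".csv":
        df.to_csv(file_path, **kwargs)
    elif suffix in (".tsv", ".txt"):
        df.to_csv(file_path, sep="\t", **kwargs)
    elif suffix == ".parquet":
        # Requires a Parquet engine such as pyarrow to be installed.
        df.to_parquet(file_path, **kwargs)
    else:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
```

For example, `export_frame(df, "cells.tsv", index=False)` would forward `index=False` to `DataFrame.to_csv`.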
- get_bounding_box_from_data() CytoDataFrame_type | None [source]#
Retrieves bounding box data from the DataFrame based on predefined column groups.
This method identifies specific groups of columns representing bounding box coordinates for different cellular components (cytoplasm, nuclei, cells) and checks for their presence in the DataFrame. If all required columns are present, it filters and returns a new CytoDataFrame instance containing only these columns.
- Returns:
A new instance of CytoDataFrame containing the bounding box columns if they exist in the DataFrame. Returns None if the required columns are not found.
- Return type:
Optional[CytoDataFrame_type]
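The column-group check can be sketched as below. The column names are illustrative, CellProfiler-style placeholders and are not confirmed by this page; returning the first complete group is also just one reading of the description.

```python
from typing import Dict, List, Optional

import pandas as pd


def _box_columns(prefix: str) -> List[str]:
    # Hypothetical naming scheme for the four bounding box coordinates.
    return [
        f"{prefix}_AreaShape_BoundingBox{edge}_{axis}"
        for edge in ("Minimum", "Maximum")
        for axis in ("X", "Y")
    ]


BOUNDING_BOX_GROUPS: Dict[str, List[str]] = {
    "cytoplasm": _box_columns("Cytoplasm"),
    "nuclei": _box_columns("Nuclei"),
    "cells": _box_columns("Cells"),
}


def select_bounding_box(df: pd.DataFrame) -> Optional[pd.DataFrame]:
    """Return the first complete group of bounding box columns, else None."""
    for columns in BOUNDING_BOX_GROUPS.values():
        if all(column in df.columns for column in columns):
            return df[columns]
    return None
```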
- static is_notebook_or_lab() bool [source]#
Determines whether the code is being executed in a Jupyter notebook (.ipynb), returning False if it is not.
This method attempts to detect the interactive shell environment using IPython’s get_ipython function. It checks the class name of the current IPython shell to distinguish between different execution environments.
- Returns:
True if the code is being executed in a Jupyter notebook (.ipynb); False otherwise (e.g., standard Python shell, terminal IPython shell, or scripts).
- Return type:
bool
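A common way to perform this kind of check is sketched below. It assumes that notebook kernels expose a shell class named `ZMQInteractiveShell`, which is the usual convention but may not be the exact test this method applies; `running_in_notebook` is a hypothetical name.

```python
def running_in_notebook() -> bool:
    """Best-effort check for a Jupyter kernel via IPython's get_ipython."""
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so this cannot be a notebook kernel.
        return False
    shell = get_ipython()
    if shell is None:
        # Plain Python interpreter or script.
        return False
    # Terminal IPython reports "TerminalInteractiveShell" instead.
    return shell.__class__.__name__ == "ZMQInteractiveShell"
```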
- process_image_data_as_html_display(data_value: Any, bounding_box: Tuple[int, int, int, int]) str [source]#
Process the image data based on the provided data value and bounding box, applying masks or outlines where applicable, and return an HTML representation of the cropped image for display.
- Parameters:
data_value (Any) – The value to search for in the file system or as the image data.
bounding_box (Tuple[int, int, int, int]) – The bounding box to crop the image.
- Returns:
The HTML image display string, or the unmodified data value if the image cannot be processed.
- Return type:
str
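The crop-and-embed step could be sketched as follows, assuming the bounding box is ordered (left, upper, right, lower) as Pillow's `crop` expects; the page does not state the ordering. `cropped_image_html` is a hypothetical helper and omits the mask and outline handling.

```python
import base64
from io import BytesIO
from typing import Tuple

from PIL import Image


def cropped_image_html(image: Image.Image, bounding_box: Tuple[int, int, int, int]) -> str:
    """Crop the image and return an <img> tag with the PNG embedded inline."""
    cropped = image.crop(bounding_box)
    buffer = BytesIO()
    cropped.save(buffer, format="PNG")
    encoded = base64.b64encode(buffer.getvalue()).decode("utf-8")
    return f'<img src="data:image/png;base64,{encoded}"/>'


html = cropped_image_html(Image.new("RGB", (20, 20), (0, 0, 0)), (2, 4, 12, 10))
```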
- search_for_mask_or_outline(data_value: str, pattern_map: dict, file_dir: str, candidate_path: Path, mask: bool = True) PIL.Image.Image [source]#
Search for a mask or outline image file based on the provided patterns and apply it to the target image.
This function attempts to find a mask or outline image for a given data value, either based on a pattern map or by searching the file directory directly. If a mask or outline is found, it is drawn on the target image. If no relevant file is found, the function returns None.
- Parameters:
data_value (str) – The value used to match patterns for locating mask or outline files.
pattern_map (dict) – A dictionary of file patterns and their corresponding original patterns for matching.
file_dir (str) – The directory where image files are stored.
candidate_path (pathlib.Path) – The path to the candidate image file to apply the mask or outline to.
mask (bool, optional) – Whether to search for a mask (True) or an outline (False). Default is True.
- Returns:
The target image with the applied mask or outline, or None if no relevant file is found.
- Return type:
Optional[PIL.Image.Image]
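The pattern-based lookup might be sketched like this, assuming `pattern_map` maps a regular expression for the mask or outline file name to a regular expression for the original image name. That key/value orientation, and the helper name `find_companion_file`, are assumptions made for the example.

```python
import re
from pathlib import Path
from typing import Dict, Optional


def find_companion_file(
    data_value: str, pattern_map: Dict[str, str], file_dir: str
) -> Optional[Path]:
    """Return the first file whose name matches a mapped pattern, else None."""
    for file_pattern, original_pattern in pattern_map.items():
        # Only use a file pattern when the data value matches its original pattern.
        if not re.search(original_pattern, data_value):
            continue
        for path in sorted(Path(file_dir).rglob("*")):
            if path.is_file() and re.search(file_pattern, path.name):
                return path
    return None
```

A caller would then open the returned path and draw it onto the candidate image, or fall back to the unmodified image when the result is None.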
- sort_values(*args: List[Any], **kwargs: Dict[str, Any]) CytoDataFrame_type [source]#
Sorts the DataFrame by the specified column(s) and returns a new CytoDataFrame instance.
Note: CytoDataFrame wraps this method to help ensure that it consistently returns a CytoDataFrame, because pd.Series objects are treated separately and receive specialized processing within sort_values.
- Parameters:
*args (List[Any]) – Positional arguments to be passed to the pandas DataFrame’s sort_values method.
**kwargs (Dict[str, Any]) – Keyword arguments to be passed to the pandas DataFrame’s sort_values method.
- Returns:
A new instance of CytoDataFrame sorted by the specified column(s).
- Return type:
CytoDataFrame_type
cytodataframe.image#
Helper functions for working with images in the context of CytoDataFrames.
- src.cytodataframe.image.adjust_image_brightness(image: PIL.Image.Image) PIL.Image.Image [source]#
Adjust the brightness of an image using histogram equalization.
- Parameters:
image (Image) – The input PIL Image.
- Returns:
The brightness-adjusted PIL Image.
- Return type:
Image
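Histogram equalization is available in Pillow as `ImageOps.equalize`, and the sketch below uses it to brighten a dark test image. Whether `adjust_image_brightness` relies on this exact call is an assumption; `equalize_brightness` is a hypothetical wrapper.

```python
from PIL import Image, ImageOps, ImageStat


def equalize_brightness(image: Image.Image) -> Image.Image:
    """Spread the image histogram across the full intensity range."""
    # ImageOps.equalize handles "L" and "RGB" images; other modes need converting first.
    return ImageOps.equalize(image)


# Dark grayscale test image whose pixel values only span 0..31.
dark = Image.new("L", (64, 64))
dark.putdata([i % 32 for i in range(64 * 64)])
brighter = equalize_brightness(dark)

mean_before = ImageStat.Stat(dark).mean[0]
mean_after = ImageStat.Stat(brighter).mean[0]
```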
- src.cytodataframe.image.is_image_too_dark(image: PIL.Image.Image, pixel_brightness_threshold: float = 10.0) bool [source]#
Check if the image is too dark based on the mean brightness. Here, “too dark” means the image is not easily visible to the human eye.
- Parameters:
image (Image) – The input PIL Image.
pixel_brightness_threshold (float) – The brightness threshold below which the image is considered too dark. Default is 10.0.
- Returns:
True if the image is too dark, False otherwise.
- Return type:
bool
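A mean-brightness check of this kind can be sketched with Pillow's `ImageStat`. Converting to grayscale before averaging is an assumption about how brightness is measured, and `looks_too_dark` is a hypothetical stand-in for the function above.

```python
from PIL import Image, ImageStat


def looks_too_dark(image: Image.Image, pixel_brightness_threshold: float = 10.0) -> bool:
    """Return True when the mean grayscale value falls below the threshold."""
    mean_brightness = ImageStat.Stat(image.convert("L")).mean[0]
    return mean_brightness < pixel_brightness_threshold


dim_result = looks_too_dark(Image.new("L", (4, 4), 5))
bright_result = looks_too_dark(Image.new("L", (4, 4), 200))
```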