detectda

Submodules

Classes

ImageSeries

Reads in an image series (video), either a single or multiple frames.

ImageSeriesPickle

Creates ImageSeries object from .pkl file. Designed for use with output of identify_polygon script.

ImageSeriesPlus

Reads in an image series (video), either a single or multiple frames.

VacuumSeries

Functionality to generate vacuum region videos for multiple hypothesis testing.

Package Contents

class detectda.ImageSeries(video, polygon=None, div=1, n_jobs=None)[source]

Bases: VidPol

Reads in an image series (video), either a single or multiple frames.

May optionally specify a polygonal region, held constant across frames, in which to select specific generators in persistent homology.

Parameters:
  • video (array_like) – Image series. Index on axis=0 represents the frame index, unless a single image (2d array) is provided.

  • polygon (shapely.Polygon, optional, default is None) – Polygonal region outside of which positive cells of 0th persistent homology will be excluded.

  • div (positive int/float, optional, default is 1.) – In the nanoparticle imaging process, pixel intensities are often registered as something close to a(div), so dividing by div and rounding to the nearest integer gives pixel intensities that conform more closely to common parametric assumptions.

  • n_jobs (int or None, optional, default is None) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
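The effect of div can be sketched as follows, assuming it amounts to an element-wise division followed by rounding to the nearest integer, and reading "close to a(div)" as "close to a multiple of div". The helper name rescale_intensities is hypothetical and not part of detectda; the library's own implementation may differ.

```python
def rescale_intensities(image, div=1):
    """Divide every pixel by div and round to the nearest integer.

    Hypothetical helper illustrating the div parameter; not detectda code.
    """
    if div <= 0:
        raise ValueError("div must be positive")
    return [[round(pixel / div) for pixel in row] for row in image]


# Raw intensities that sit close to multiples of div = 16.
raw = [[0, 15, 33], [47, 64, 81]]
scaled = rescale_intensities(raw, div=16)
print(scaled)
```

After rescaling, the pixel values are small integers, which is the form that count-based models such as the Poisson expect.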

alps_plot(frames)[source]

Returns the ALPS plots of up to 4 images in the video taken by ImageSeries.

Parameters:

frames (int or list) – Indices for frames in image series.

Raises:
  • TypeError – If type is not int nor list.

  • ValueError – If more than 4 frames are given.

Return type:

None.
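The documented argument checks for frames can be sketched as below. This is an illustration of the stated contract only (int or list, at most 4 frames); the helper normalize_frames is hypothetical and the actual checks inside alps_plot may be written differently.

```python
def normalize_frames(frames, max_frames=4):
    """Return frames as a list, raising the documented errors otherwise.

    Hypothetical helper; not part of detectda.
    """
    if isinstance(frames, int):
        return [frames]
    if not isinstance(frames, list):
        raise TypeError("frames must be an int or a list")
    if len(frames) > max_frames:
        raise ValueError(f"at most {max_frames} frames can be plotted")
    return list(frames)


print(normalize_frames(3))
print(normalize_frames([0, 5, 10]))
```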

fit(sigma=None, max_death_pixel_int=True, print_time=True)[source]

Fit method for ImageSeries object.

Optional Gaussian smoothing with sigma parameter.

The argument max_death_pixel_int controls whether the maximum death time is the largest pixel value (within an image) or the largest finite death time (within an image).
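One way to read max_death_pixel_int is as the choice of value substituted for an infinite death time in a persistence diagram. The sketch below illustrates that reading only; it is an assumption, and cap_infinite_deaths is a hypothetical helper rather than detectda code. A diagram is represented as a list of (birth, death) pairs.

```python
import math


def cap_infinite_deaths(diagram, image, max_death_pixel_int=True):
    """Replace infinite death times by a finite cap (assumed behaviour)."""
    if max_death_pixel_int:
        # Cap at the largest pixel value in the image.
        cap = max(max(row) for row in image)
    else:
        # Cap at the largest finite death time in the diagram.
        finite_deaths = [d for _, d in diagram if math.isfinite(d)]
        if not finite_deaths:
            raise ValueError("diagram has no finite death times")
        cap = max(finite_deaths)
    return [(b, d if math.isfinite(d) else cap) for b, d in diagram]


image = [[1, 5], [9, 3]]
diagram = [(1, math.inf), (3, 5)]
print(cap_infinite_deaths(diagram, image, max_death_pixel_int=True))
print(cap_infinite_deaths(diagram, image, max_death_pixel_int=False))
```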

get_alps()[source]

Get ALPS statistic of each image frame from fitted object.

get_degp_totp(p=1, inf=False)[source]

Get degree-p total persistence of each image frame from fitted object.
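For orientation, degree-p total persistence is commonly defined as the sum of the p-th powers of the lifetimes (death minus birth) in a persistence diagram. The sketch below uses that standard definition and simply skips infinite death times; how the inf argument of get_degp_totp treats them is not documented here, so that part is left out. The function name is hypothetical.

```python
import math


def degree_p_total_persistence(diagram, p=1):
    """Sum of (death - birth) ** p over points with finite death time."""
    return sum((d - b) ** p for b, d in diagram if math.isfinite(d))


diagram = [(0, 2), (1, 4), (0, math.inf)]
print(degree_p_total_persistence(diagram, p=1))
print(degree_p_total_persistence(diagram, p=2))
```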

get_pers_entr(neg=True)[source]

Get persistent entropy of each image frame from fitted object. For hypothesis testing purposes, the default is the negative of the entropy.
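Persistent entropy is usually defined as the Shannon entropy of the lifetimes after normalising them to sum to 1. A minimal sketch under that standard definition follows; the function name is hypothetical, points with infinite death time are skipped as a simplifying assumption, and the neg flag mirrors the documented default of returning the negative entropy.

```python
import math


def persistent_entropy(diagram, neg=True):
    """Shannon entropy of normalised lifetimes; negated when neg is True."""
    lifetimes = [d - b for b, d in diagram if math.isfinite(d) and d > b]
    total = sum(lifetimes)
    if total == 0:
        return 0.0
    entropy = -sum((l / total) * math.log(l / total) for l in lifetimes)
    return -entropy if neg else entropy


# Two features with equal lifetimes give entropy log(2).
diagram = [(0, 2), (1, 3), (0, math.inf)]
print(persistent_entropy(diagram, neg=False))
print(persistent_entropy(diagram))
```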

plot_im(frame, plot_poly=True, plot_pts=True, smooth=True, thr=None, **kwargs)[source]

Plot an individual frame in the video, with or without the polygonal region superimposed.

class detectda.ImageSeriesPickle(file_path, div=1, n_jobs=None)[source]

Bases: ImageSeries

Creates ImageSeries object from .pkl file. Designed for use with output of identify_polygon script.

class detectda.ImageSeriesPlus(video, polygon=None, div=1, n_jobs=None, im_list=False)[source]

Bases: VidPol

Reads in an image series (video), either a single or multiple frames.

May optionally specify polygonal region, held constant across frames, in which to select specific generators in persistent homology. Similar to ImageSeries, but with enhanced functionality for utilizing BOTH 0- and 1-dimensional persistent homology.

Parameters:
  • video (array_like) – Image series. Index on axis=0 represents the frame index, unless a single image (2d array) is provided.

  • polygon (shapely.Polygon, optional, default is None) – Polygonal region outside of which positive cells of 0th persistent homology will be excluded.

  • div (positive int/float, optional, default is 1.) – In the nanoparticle imaging process, pixel intensities are often registered as something close to a(div), so dividing by div and rounding to the nearest integer gives pixel intensities that conform more closely to common parametric assumptions.

  • n_jobs (int or None, optional, default is None) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.

  • im_list (bool, default is False) – Bool indicating whether or not a list of numpy arrays is given (True) rather than a 3d numpy array.

convert_to_df()[source]

Creates pandas DataFrames (self.dfs) from the persistence information calculated by the detectda algorithm.

Return type:

None.

fit(sigma=None, print_time=True, verbose=0)[source]

Fit method for ImageSeriesPlus object.

Optional Gaussian smoothing with sigma parameter.

get_lifetimes()[source]

Creates persistence lifetimes (self.lifetimes) for each persistence diagram in image series.

Return type:

None.

get_midlife_coords()[source]

Creates persistence midlife coordinates (self.midlife_coords) for each persistence diagram in image series.

Return type:

None.
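The lifetimes and midlife coordinates of a persistence diagram are commonly defined as death minus birth and the average of birth and death, respectively. The sketch below assumes those usual definitions; the helper names are hypothetical, and the exact layout of self.lifetimes and self.midlife_coords in detectda is not shown here.

```python
def lifetimes(diagram):
    """Lifetime (death - birth) of every point in the diagram."""
    return [d - b for b, d in diagram]


def midlives(diagram):
    """Midlife ((birth + death) / 2) of every point in the diagram."""
    return [(b + d) / 2 for b, d in diagram]


diagram = [(0, 2), (1, 5)]
print(lifetimes(diagram))
print(midlives(diagram))
```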

get_pers_im(bts, lts, dim, bandwidth=1)[source]

Create persistence images from cubical persistent homology for each image in the detectda object, for homology dimension dim. The resulting dimension of the persistence image vectorizations is bts x lts.

Parameters:
  • bts (int) – Birth-time resolution (higher = finer).

  • lts (int) – Lifetime resolution (higher = finer).

  • bandwidth (float) – Positive number corresponding to the Gaussian kernel bandwidth (i.e. variance).

  • dim (int) – Homology dimension, 0 or 1, for which persistence images are computed.

Return type:

None.
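To make the bts x lts vectorization concrete, here is a rough sketch of a persistence image in the usual sense: each diagram point is mapped to (birth, lifetime), weighted by its lifetime, and smoothed with an isotropic Gaussian whose variance is bandwidth, evaluated at the centres of a bts x lts grid. The weighting, the grid ranges (birth_range, life_range) and the function name are assumptions for illustration; detectda's own choices may differ.

```python
import math


def persistence_image(diagram, bts, lts, birth_range, life_range, bandwidth=1.0):
    """Return a bts x lts nested list of Gaussian-smoothed lifetime weights."""
    points = [(b, d - b) for b, d in diagram if math.isfinite(d)]
    b_lo, b_hi = birth_range
    l_lo, l_hi = life_range
    b_step = (b_hi - b_lo) / bts
    l_step = (l_hi - l_lo) / lts
    image = []
    for i in range(bts):
        b_centre = b_lo + (i + 0.5) * b_step
        row = []
        for j in range(lts):
            l_centre = l_lo + (j + 0.5) * l_step
            value = 0.0
            for b, life in points:
                sq_dist = (b_centre - b) ** 2 + (l_centre - life) ** 2
                # 2-D Gaussian density with covariance bandwidth * identity.
                density = math.exp(-sq_dist / (2 * bandwidth)) / (2 * math.pi * bandwidth)
                value += life * density  # weight each point by its lifetime
            row.append(value)
        image.append(row)
    return image


# One feature born at 0.5 and dying at 3.0, i.e. (birth, lifetime) = (0.5, 2.5).
pim = persistence_image([(0.5, 3.0)], bts=4, lts=3,
                        birth_range=(0, 4), life_range=(0, 3))
peak = max(value for row in pim for value in row)
```

Flattening the nested list row by row gives a vector of length bts * lts, one per image.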

get_pers_mag()[source]

Creates persistence magnitudes (self.pers_mag) for each persistence diagram in image series.

Return type:

None.

get_pers_stats()[source]

Get persistence statistics for each image, according to "Topological approaches to skin disease analysis", along with persistent entropy and ALPS statistics, constituting an embedding into 36-dimensional Euclidean space.

pd_threshold(minq=0.05, maxq=0.95, dim='both', num=50)[source]

Calculates binary images for all frames in video based on best persistence-preserving threshold as described in Chung and Day (2018).

Parameters:
  • minq (float) – Minimum pixel quantile threshold to consider.

  • maxq (float) – Maximum pixel quantile threshold to consider.

  • dim (str or int, optional) – An integer 0 or 1 restricts thresholding to dimension 0 or dimension 1 persistence features, respectively. The default is “both”, which uses both dimensions 0 and 1.

  • num (int, optional) – Number of quantile thresholds to choose.

Raises:

HomologyError – Raised when there is insufficient homology to perform PD thresholding.

Returns:

ims_t – Binary, thresholded images.

Return type:

list of ndarray.
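The roles of minq, maxq and num can be illustrated with a sketch that generates num candidate thresholds, evenly spaced in quantile level between minq and maxq, and binarizes an image at one of them. The selection of the best persistence-preserving threshold itself is not reproduced here. All function names are hypothetical, and the strict "greater than" comparison and linear-interpolation quantile are assumptions.

```python
import math


def quantile(sorted_values, q):
    """Quantile of pre-sorted data using linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def candidate_thresholds(image, minq=0.05, maxq=0.95, num=50):
    """Pixel values at num quantile levels evenly spaced in [minq, maxq]."""
    if num < 2:
        raise ValueError("num must be at least 2")
    pixels = sorted(p for row in image for p in row)
    levels = [minq + k * (maxq - minq) / (num - 1) for k in range(num)]
    return [quantile(pixels, q) for q in levels]


def binarize(image, threshold):
    """Binary image: 1 where the pixel exceeds the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


image = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
thresholds = candidate_thresholds(image, minq=0.0, maxq=1.0, num=3)
binary = binarize(image, thresholds[1])
```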

plot_im(frame, dim=0, plot_poly=True, plot_pts=True, smooth=True, thr=None, **kwargs)[source]

Plot an individual frame in the video, with or without the polygonal region superimposed.

plot_pi_sig(frame, betas_feat='pos', smooth=True, **kwargs)[source]
Parameters:
  • frame (int) – Frame which you would like to plot. Should lie in self.indices.

  • betas_feat (str, optional) – Whether to plot positive, negative betas, or both. The default is ‘pos’.

  • smooth (bool, optional) – Whether or not to smooth the image. The default is True.

  • **kwargs (dict) – Additional parameters for plotting.

Return type:

None.

proc_pers_im(betas, quantiles, indices)[source]
Parameters:
  • betas (array-like of shape (n_samples, n_features)) – Simulated values of beta, such as those contained in self.post_beta after running fit and transform methods in bclr class.

  • quantiles (array-like of shape (2,)) – Pair of quantile levels. The corresponding quantile of the simulated betas must be greater (resp. less) than 0 for a coefficient to count as positive (resp. negative).

  • indices (array-like of shape (n_indices,)) – Which frames to consider for processing.

Return type:

None.

class detectda.VacuumSeries(vacuum_video, observed_ImageSeries, parametric=True, div=1, n_jobs=None)[source]

Bases: detectda.imgs.ImageSeries

Functionality to generate vacuum region videos for multiple hypothesis testing.

adjust_alpha(alpha, conservative=False)[source]

Adjust p-values based on a different alpha value.

Parameters:
  • alpha (float) – Statistical significance level.

  • conservative (bool, optional) – Whether to use Benjamini-Yekutieli. The default is False.

Return type:

None.
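The conservative flag refers to the Benjamini-Yekutieli procedure. A minimal sketch of the two standard step-up procedures is given below, assuming the non-conservative default is Benjamini-Hochberg (the source does not name it). The function name fdr_rejections is hypothetical and this is not detectda's implementation.

```python
def fdr_rejections(p_values, alpha=0.05, conservative=False):
    """Step-up FDR control: Benjamini-Hochberg, or Benjamini-Yekutieli
    when conservative is True. Returns one bool per hypothesis."""
    m = len(p_values)
    if m == 0:
        return []
    # Benjamini-Yekutieli divides the thresholds by the harmonic number c(m).
    c_m = sum(1.0 / k for k in range(1, m + 1)) if conservative else 1.0
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            largest_rank = rank
    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


p_values = [0.001, 0.015, 0.039, 0.041, 0.6]
bh = fdr_rejections(p_values, alpha=0.05)
by = fdr_rejections(p_values, alpha=0.05, conservative=True)
```

With these p-values the Benjamini-Hochberg thresholds reject the two smallest, while the stricter Benjamini-Yekutieli thresholds reject only the smallest.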

fit(convert_to_int=False)[source]

Fits the Poisson MLE for the vacuum region if parametric==True.

Otherwise, fits the empirical probability mass function.
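Both fitting modes reduce to simple estimators: the maximum likelihood estimate of a Poisson rate is the sample mean, and the empirical probability mass function is the relative frequency of each observed value. The sketch below shows both on a flat list of vacuum pixel values; the function names are hypothetical and not part of detectda.

```python
from collections import Counter


def poisson_mle(pixels):
    """Maximum likelihood estimate of the Poisson rate: the sample mean."""
    return sum(pixels) / len(pixels)


def empirical_pmf(pixels):
    """Relative frequency of each observed pixel value."""
    n = len(pixels)
    return {value: count / n for value, count in sorted(Counter(pixels).items())}


vacuum_pixels = [0, 1, 1, 2, 2, 2, 3, 5]
rate = poisson_mle(vacuum_pixels)
pmf = empirical_pmf(vacuum_pixels)
```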

gen_images(n)[source]

Generate and return a random image according to the estimated null distribution.

kolm_dist()[source]

Check how far the empirical distribution of vacuum values is from a Poisson distribution with parameter equal to the MLE, in terms of the Kolmogorov distance.

Uses the DKW inequality with the tight constant = 2 for Poisson testing.
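As a rough illustration, the Kolmogorov distance is the largest absolute gap between the empirical CDF and the Poisson CDF, and the DKW inequality bounds the probability that this gap exceeds eps by 2 * exp(-2 * n * eps ** 2) for n samples. The sketch below computes both; the function names are hypothetical, and how kolm_dist combines them is not shown in this reference.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summing the pmf term by term."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def kolmogorov_distance(samples, lam):
    """Largest gap between the empirical CDF and the Poisson(lam) CDF.

    Both CDFs only jump at non-negative integers, so checking
    k = 0, ..., max(samples) is enough.
    """
    n = len(samples)
    distance = 0.0
    for k in range(max(samples) + 1):
        empirical = sum(1 for s in samples if s <= k) / n
        distance = max(distance, abs(empirical - poisson_cdf(k, lam)))
    return distance


def dkw_bound(n, eps):
    """DKW upper bound on P(Kolmogorov distance > eps), with constant 2."""
    return min(1.0, 2.0 * math.exp(-2.0 * n * eps ** 2))


samples = [0, 1, 1, 2, 2, 2, 3, 5]
lam = sum(samples) / len(samples)  # Poisson MLE
dist = kolmogorov_distance(samples, lam)
```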

plot_hypo()[source]

Plots hypothesis testing sequence.

transform(n, func='pers_entr', seed=0, alpha=0.05, conservative=False)[source]

Collects p-values and rejections based on n Monte Carlo simulations…
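A common way to turn n Monte Carlo simulations into a p-value is to count how many simulated statistics are at least as large as the observed one, adding 1 to numerator and denominator so the p-value is never 0. The sketch below shows that standard estimator only; whether transform uses exactly this formula, and which direction counts as extreme, are assumptions. The function name is hypothetical.

```python
def monte_carlo_p_value(observed, simulated):
    """(1 + #{simulated >= observed}) / (1 + n) for n simulated statistics."""
    exceed = sum(1 for s in simulated if s >= observed)
    return (1 + exceed) / (1 + len(simulated))


# Statistic observed on one frame versus four statistics simulated under the null.
p = monte_carlo_p_value(3.0, [1.0, 2.0, 3.0, 4.0])
print(p)
```

One such p-value per frame can then be passed to a multiple testing procedure at level alpha.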