FARSIGHT Toolkit

From FarsightWiki
Revision as of 13:05, 26 April 2009

The FARSIGHT Toolkit is not a single tightly integrated software package. It is an organized collection of software modules for image data handling, pre-processing, segmentation, post-processing, and secondary analysis, as described in the [FARSIGHT Framework]. All of the modules are written in accordance with software practices of the [Insight Toolkit] Community. Importantly, all modules are accessible through the Python scripting language. This allows users to create scripts to accomplish sophisticated associative image analysis tasks over multi-dimensional microscopy image data. This language works on most computing platforms, providing a high degree of platform independence to FARSIGHT. Another important design principle is the use of standardized XML file formats for data interchange between modules.

The following paragraphs provide an overview of how this toolkit is organized. Follow the links for further details about each module.


Bio-Formats Module: To work with microscopy data, FARSIGHT uses Bio-Formats, a Java library for reading and writing life sciences image file formats. Bio-Formats is capable of parsing both pixels and metadata for a large number of formats, as well as writing to several formats, and converts proprietary microscopy data into an open standard called the OME data model. See the Bio-Formats page for more information.


Interface to OME Image Database: Users of FARSIGHT have two main choices on how to manage their image data, and the accompanying image metadata. The simplest choice that works well when you have a modest number of datasets is to keep them in a file folder. This file folder can also store script files, data files (in XML format), and your notes. When your needs require an image database, we prefer the open source OME database, especially the most recent version named OMERO [click here for more information on OME & OMERO].


Image Pre-processing Library: The purpose of pre-processing is to improve the performance of image segmentation and measurement computation (feature extraction) algorithms. FARSIGHT includes a collection of routines for pre-processing images (e.g., smoothing, filtering, attenuation correction, and deconvolution). Most of the smoothing and multi-scale vessel enhancement routines are drawn from the Insight Toolkit (ITK). Spectral unmixing algorithms are drawn from the [CenSSIS Hyperspectral Image Processing Toolkit]. The image deconvolution routines are contributed by collaborators (Jose Conchello, Chrysanthe Preza).


Image Segmentation Library: Segmentation is the automatic delineation of objects in images. Its output is a set of objects, identifiable by unique object identifiers (IDs). It is the most challenging image analysis step, and its accuracy has a direct bearing on the accuracy of the image-based measurements of ultimate interest. The best-available segmentation algorithms are model based, i.e., they rely on a mathematical model describing the expected range of morphologies and intensity patterns of the objects to be delineated. Our strategy for overcoming the segmentation hurdle is based on the philosophy of “divide and conquer”. It is enabled by the fact that fluorescence microscopy combined with spectral unmixing can cleanly separate the emission spectra into a set of pure channels, i.e., channels that, most of the time, contain only one type of object belonging to a known morphological class. For the FARSIGHT system, we focus on seven morphological classes: blobs (B), tubes (T), shells (S), foci (F), plates (P), clouds (C), and man-made objects (M). The FARSIGHT toolkit contains a collection of specialized algorithms for delineating the common types of objects listed in our computational taxonomy of morphologies. For instance, there are routines for segmenting cell nuclei; the thin processes of neurons, astrocytes, and microglia; and vessels that are imaged by labeling the blood flow or the vessel laminae. The user is expected to specify the morphological class of the objects in a given image manually, and thereby choose the corresponding segmentation algorithm.
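
To make the idea of "a set of objects, identifiable by unique object IDs" concrete, here is a minimal sketch in Python. It uses plain thresholding followed by 4-connected component labeling on a small 2D array. This is a generic textbook technique chosen for illustration, not one of the model-based FARSIGHT algorithms, and the function name label_objects and the threshold value are assumptions.

```python
from collections import deque


def label_objects(image, threshold):
    """Give each 4-connected region of above-threshold pixels a unique ID.

    Returns (labels, count): labels has the same shape as image, with 0 for
    background and 1..count for objects. Illustrative only.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or labels[r][c] != 0:
                continue  # background pixel, or already part of an object
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of the new object
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] > threshold
                            and labels[ny][nx] == 0):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
    return labels, next_id


# Hypothetical single-channel image containing three bright regions.
image = [
    [9, 9, 0, 0, 0],
    [9, 0, 0, 8, 8],
    [0, 0, 0, 8, 0],
    [7, 0, 0, 0, 0],
]
labels, count = label_objects(image, threshold=5)
```

The label image is the kind of structure that downstream steps (measurement, editing, association) can index by object ID.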


Tools to Optimize Segmentation: The performance of segmentation algorithms depends upon the selection of appropriate parameter settings. In building the FARSIGHT system, we continue to emphasize algorithms that require as few parameters as possible. However, truly parameter-free algorithms remain a research goal. We are developing tools to make the process of parameter selection more intuitive and rapid under the FARSIGHT Rapid Prototyping System (RPS) module. This is currently a work in progress that builds upon our prior work [http://www.ecse.rpi.edu/~roysam/PDF/J55.pdf].


Temporal Associations: Cell Tracking and Lineage Analysis: FARSIGHT contains a collection of 2D and 3D time-lapse analysis modules. The most common operation is to track moving cells. Cells can also divide, so the tracking methods are designed to cope with these important events. The results of cell tracking are a set of temporal associations (assignments) representing object correspondences over time. For each such correspondence, it is possible to compute measurements such as displacement, path tortuosity, and speed.
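
As an illustration of the per-track measurements named above, the following Python sketch computes displacement, tortuosity, and mean speed from a list of centroid positions. The function name, the dictionary keys, and the definition of tortuosity as path length divided by net displacement are assumptions made for this example; the FARSIGHT modules may define these quantities differently.

```python
import math


def track_measurements(positions, frame_interval):
    """Summarize one track.

    positions: list of (x, y, z) centroids, one per frame (hypothetical input).
    frame_interval: time between consecutive frames.
    """
    steps = [math.dist(a, b) for a, b in zip(positions, positions[1:])]
    path_length = sum(steps)
    displacement = math.dist(positions[0], positions[-1])
    elapsed = frame_interval * (len(positions) - 1)
    return {
        "displacement": displacement,
        "path_length": path_length,
        # Assumed definition: 1.0 for a straight path, larger for winding paths.
        "tortuosity": path_length / displacement if displacement > 0 else math.inf,
        "mean_speed": path_length / elapsed if elapsed > 0 else 0.0,
    }


# A cell that moves 3 units along x, then 4 units along y, over two frames.
track = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
m = track_measurements(track, frame_interval=2.0)
```

For this track the path length is 7 and the net displacement is 5, so the tortuosity under the assumed definition is 1.4.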


Image Registration and Mosaicing: Registration is the process of spatially aligning two or more images. Mosaicing is the process of "stitching" these images together to form a synthetic image that covers a much greater extent of tissue than any one image. We prefer our generalized dual-bootstrap registration and joint mosaicing algorithms for most tasks in FARSIGHT. These algorithms are part of the [CenSSIS registration and mosaicing toolkit].


PACE: Pattern Analysis aided Cluster Editing: Even the best-available automated segmentation algorithms have a non-zero error rate. The FARSIGHT user is provided with a set of tools to inspect and edit any of the segmentation results to their satisfaction. The PACE module in FARSIGHT is a tool for performing this task efficiently. The core ideas behind this module are simple and intuitive: (i) visualization systems that highlight potential errors; (ii) computerized editing of common errors (this requires an editor for each segmentation module); (iii) automation of repetitious edits; and (iv) performance assessment from user edits. All of this "intelligence" is provided by pattern analysis algorithms that operate on object features (listed below). Integrated kernel-based pattern analysis algorithms help identify automated segmentation errors efficiently, allowing “group” or “cluster” editing of multiple errors simultaneously. This dramatically reduces the amount of manual effort required compared to unassisted edit-based methods. We term this methodology PACE (Pattern Assisted Cluster Editing). For more information see the Validation Methods page.


Intrinsic Measurements (object features) Library: FARSIGHT contains a rich and growing library of routines for making morphological measurements of segmented objects. The intrinsic measurements of each object are directly determined by its morphological class. For example, blobs representing cell nuclei are characterized by measurements of location, diameter, volume, shape factor, surface area, eccentricity, etc. Thin tube-like structures such as neurite processes are characterized by measurements of lengths and tortuosity of centerlines, surface locations, local diameter, and measurements of topological determinants such as branch points. In addition to measurements that are of interest to the biological user, FARSIGHT contains a set of diagnostic features that are designed to help diagnose the output of automated segmentation algorithms in PACE.
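
A small Python sketch of what intrinsic measurements indexed by object ID might look like for blob-like objects in a 2D label image. The function name blob_features, the pixel_size parameter, and the choice of area and centroid as the features are illustrative assumptions, not the FARSIGHT feature library.

```python
from collections import defaultdict


def blob_features(labels, pixel_size=1.0):
    """Compute area and centroid for every object in a 2D label image.

    labels: 2D list where 0 is background and positive integers are object IDs.
    pixel_size: physical edge length of one pixel (assumed isotropic).
    Returns {object_id: {"area": ..., "centroid": (row, col)}}.
    """
    pixels = defaultdict(list)
    for r, row in enumerate(labels):
        for c, obj_id in enumerate(row):
            if obj_id != 0:
                pixels[obj_id].append((r, c))

    features = {}
    for obj_id, pts in pixels.items():
        n = len(pts)
        features[obj_id] = {
            "area": n * pixel_size ** 2,
            "centroid": (
                sum(p[0] for p in pts) / n * pixel_size,
                sum(p[1] for p in pts) / n * pixel_size,
            ),
        }
    return features


# Hypothetical label image with two objects.
labels = [
    [1, 1, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 2],
]
features = blob_features(labels, pixel_size=0.5)
```

Keying the result by object ID keeps each record traceable to the segmented object it describes.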


XML Data Interchange: The FARSIGHT toolkit uses XML files as the common data interchange format. For example, the intrinsic measurements are stored in XML files in which the records are indexed by the object IDs. XML is a standard of the World Wide Web Consortium (W3C), and offers important advantages. It is both human and machine readable. Most database programs support this format for storage. Most spreadsheets can import and export XML data. The OME-TIFF format stores all image metadata using the XML representation. To learn more about XML, click here[[2]].
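
The sketch below shows how a script might read such a file with Python's standard xml.etree.ElementTree module. The element and attribute names (objects, object, id, volume, eccentricity) are a hypothetical schema invented for this example; the actual FARSIGHT file layout is not specified here.

```python
import xml.etree.ElementTree as ET

# Hypothetical feature file: one record per segmented object, indexed by ID.
xml_text = """<objects>
  <object id="1"><volume>120.5</volume><eccentricity>0.31</eccentricity></object>
  <object id="2"><volume>98.0</volume><eccentricity>0.47</eccentricity></object>
</objects>"""


def load_features(text):
    """Parse the assumed schema into {object_id: {feature_name: value}}."""
    root = ET.fromstring(text)
    table = {}
    for obj in root.findall("object"):
        obj_id = int(obj.get("id"))
        # Each child element is treated as one numeric measurement.
        table[obj_id] = {child.tag: float(child.text) for child in obj}
    return table


features = load_features(xml_text)
```

Because the records are keyed by object ID, a later module can join them with other per-object data without re-reading the image.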


Primary Associative Features Library: Associative measurements quantify associations (relationships) between objects. One simple set of associative measurements can be computed as soon as segmentation is completed; we call them primary associations. These associations are based primarily on spatial proximity. For instance, it is straightforward to quantify the spatial distribution of molecular markers (usually proteins) around segmented cell nuclei. FARSIGHT contains a rich and growing library of routines to compute primary associations. This includes the sub-cellular location feature library developed at Carnegie Mellon University. Primary associative features are stored in XML files and are indexed by the IDs of the objects around which they are computed.
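
As a rough illustration of a proximity-based primary association, the following Python sketch averages a marker channel within a fixed radius of each nucleus centroid. The function name, the circular neighborhood, and the use of the mean intensity are assumptions for this example only; they do not describe the routines shipped with FARSIGHT.

```python
import math


def marker_near_nuclei(nuclei, marker_image, radius):
    """Mean marker intensity within `radius` pixels of each nucleus centroid.

    nuclei: {object_id: (row, col)} centroids (hypothetical input).
    marker_image: 2D list of intensities from a second channel.
    Returns {object_id: mean_intensity}, indexed by nucleus ID.
    """
    result = {}
    for obj_id, (cr, cc) in nuclei.items():
        values = [
            value
            for r, row in enumerate(marker_image)
            for c, value in enumerate(row)
            if math.hypot(r - cr, c - cc) <= radius
        ]
        result[obj_id] = sum(values) / len(values) if values else 0.0
    return result


marker_image = [
    [0, 0, 0, 0, 0],
    [0, 8, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 2],
]
nuclei = {1: (1, 1), 2: (3, 4)}
assoc = marker_near_nuclei(nuclei, marker_image, radius=1.0)
```

The result is keyed by nucleus ID, matching the convention that associative features are indexed by the objects around which they are computed.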


Pattern Analysis Library: FARSIGHT contains Python-wrapped versions of pattern analysis software tools for unsupervised and supervised analysis. This allows the user to integrate sophisticated pattern analysis operations into a script. Perhaps the most important pattern analysis step is cell classification (cell-type identification) based on a combination of intrinsic and associative features.


TissueNets Graph Builder for Secondary Associations: The complex brain tissues of interest to us contain a dense web of interesting relationships, so we use a graph-theoretic interpretation of associative measurements. An association can be described in terms of a graph in which each object is a node with a set of attributes (the intrinsic measurements of the object), and each association is a link. Nodes and their attributes are uniquely identifiable by their object IDs. The attributes of each link are a list of measurements arising from associating the respective objects, and each link is uniquely identifiable by the IDs of the associated objects. FARSIGHT includes a flexible facility for building and analyzing association graphs.
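
The graph description above can be sketched in a few lines of Python. The class name AssociationGraph, its methods, and the distance-based linking rule are hypothetical and are not the TissueNets interface; the sketch only shows nodes keyed by object ID and links keyed by the pair of IDs they connect.

```python
import math


class AssociationGraph:
    """Minimal undirected association graph (illustrative, not TissueNets)."""

    def __init__(self):
        self.nodes = {}  # object ID -> dict of intrinsic measurements
        self.links = {}  # frozenset({id_a, id_b}) -> dict of link measurements

    def add_node(self, obj_id, **attributes):
        self.nodes[obj_id] = attributes

    def add_link(self, id_a, id_b, **measurements):
        # A frozenset key makes the link identifiable by its two object IDs,
        # regardless of the order in which they are given.
        self.links[frozenset((id_a, id_b))] = measurements

    def neighbors(self, obj_id):
        return sorted(
            other
            for pair in self.links if obj_id in pair
            for other in pair if other != obj_id
        )


def link_by_proximity(graph, max_distance):
    """Link every pair of nodes whose centroids lie within max_distance."""
    ids = sorted(graph.nodes)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            d = math.dist(graph.nodes[a]["centroid"], graph.nodes[b]["centroid"])
            if d <= max_distance:
                graph.add_link(a, b, distance=d)


graph = AssociationGraph()
graph.add_node(1, kind="neuron", centroid=(0.0, 0.0))
graph.add_node(2, kind="astrocyte", centroid=(3.0, 4.0))
graph.add_node(3, kind="vessel", centroid=(20.0, 20.0))
link_by_proximity(graph, max_distance=10.0)
```

Here only objects 1 and 2 end up linked, and the link carries its own measurement (the centroid distance) separately from the node attributes.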


Query-driven Analysis of Associations: FARSIGHT contains a growing library of tools and routines for associative data mining from graphs. The simplest such operation computes conditional histograms and empirical probability distributions. These distributions can be used for Bayesian inference. Other examples include automated summarization and critical event analysis.
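
A conditional histogram and the corresponding empirical probability distribution can be sketched as follows. The function name and the example categories (cell type as the condition, vessel proximity as the outcome) are hypothetical and serve only to illustrate the computation.

```python
from collections import Counter, defaultdict


def conditional_distribution(pairs):
    """Estimate P(outcome | condition) from observed (condition, outcome) pairs.

    Returns {condition: {outcome: relative_frequency}}.
    """
    histograms = defaultdict(Counter)
    for condition, outcome in pairs:
        histograms[condition][outcome] += 1  # conditional histogram

    distribution = {}
    for condition, hist in histograms.items():
        total = sum(hist.values())
        distribution[condition] = {o: n / total for o, n in hist.items()}
    return distribution


# Hypothetical query result: one (cell type, proximity class) pair per cell.
observations = [
    ("neuron", "near_vessel"),
    ("neuron", "far"),
    ("neuron", "far"),
    ("neuron", "far"),
    ("astrocyte", "near_vessel"),
    ("astrocyte", "near_vessel"),
]
dist = conditional_distribution(observations)
```

Relative frequencies like these are the empirical estimates that a Bayesian inference step could take as its likelihood terms.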
