Validation Methods


Validation is the process of establishing that an automated algorithm produces correct results. It is closely tied to performance assessment and quality control, and the three must be considered together. Validation and performance are of utmost importance: a carefully validated algorithm with known (and acceptable) performance can be deployed with confidence in biological studies. In the FARSIGHT project, we are interested in validation and performance assessment methods for detection, segmentation, classification, and change analysis algorithms. We are also interested in methods that are practical to use, scalable, and require minimal manpower compared to traditional methods.


Classical Validation Methods: The classical approach to validation is to compare automated results to “ground truth data” that is known in advance to be an accurate result. Three such cases are listed below:

a. Synthetic Data: One approach to ground truth generation is to create a man-made phantom image (or image sequence) with known parameters and perturbations. This is helpful for evaluating several aspects of image analysis algorithms in a tightly controlled manner. For example, one can verify that the algorithm produces the correct result in the absence of noise and other perturbations, and then study how performance degrades as noise and perturbations are added. This type of study also allows algorithms to be compared under identical conditions; a sketch of such a workflow appears below.
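As a concrete illustration, the following minimal sketch (Python with NumPy; the function names and the trivial thresholding "algorithm" are placeholders, not FARSIGHT code) generates a binary phantom of disk-shaped "cells" at known positions, adds Gaussian noise of increasing strength, and scores a segmentation against the known ground truth using the Dice overlap:

```python
import numpy as np

def make_phantom(shape=(256, 256), n_cells=20, radius=8, seed=0):
    """Generate a synthetic 'cell' phantom: binary disks at known positions."""
    rng = np.random.default_rng(seed)
    img = np.zeros(shape, dtype=float)
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    centers = rng.uniform(radius, np.array(shape) - radius, size=(n_cells, 2))
    for cy, cx in centers:
        img[(yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2] = 1.0
    return img, centers  # ground truth mask and the known cell centers

def add_noise(img, sigma):
    """Perturb the phantom with additive Gaussian noise of known strength."""
    rng = np.random.default_rng(1)
    return img + rng.normal(0.0, sigma, img.shape)

def dice(pred, truth):
    """Dice overlap between a segmentation and the known ground truth."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    return 2.0 * np.logical_and(pred, truth).sum() / (pred.sum() + truth.sum())

# Study performance degradation as the noise level increases.
truth, _ = make_phantom()
for sigma in [0.0, 0.1, 0.3, 0.5]:
    noisy = add_noise(truth, sigma)
    seg = noisy > 0.5  # stand-in for the algorithm under test
    print(f"sigma={sigma:.1f}  Dice={dice(seg, truth):.3f}")
```

Because the phantom is fully specified, the Dice score at zero noise confirms baseline correctness, and the same phantom can be reused to compare competing algorithms under identical conditions.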

b. Data from Physical Phantoms: At the next level of sophistication, one can place precisely fabricated man-made objects under the microscope (e.g., fluorescent microbeads or dye-filled micro-pipettes) and record images. This method is somewhat more realistic, though less tightly controlled, than the synthetic case, and is often helpful in developing a better understanding of algorithm strengths and weaknesses.
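With physical phantoms, the known fabrication parameters play the role of ground truth. A minimal sketch of the comparison step, assuming the beads' manufacturer-specified diameter and the algorithm's measurements are both in hand (all numbers below are hypothetical), is to summarize systematic bias and variability:

```python
import numpy as np

# Manufacturer-specified bead diameter (hypothetical value, micrometers).
true_diameter_um = 10.0

# Diameters measured by the algorithm on the bead images
# (hypothetical measurements, micrometers).
measured = np.array([9.8, 10.3, 10.1, 9.7, 10.4, 9.9])

bias = measured.mean() - true_diameter_um  # systematic error
spread = measured.std(ddof=1)              # measurement variability
print(f"bias = {bias:+.2f} um, std = {spread:.2f} um")
```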

c. Real Biological Data: A practical strategy for validating automated image analysis results on real data is to compare them against the best currently available “gold standard”. In this regard, the human observer continues to be the most widely accepted and versatile gold standard, since the human visual system remains unbeatable at visual tasks; image analysis systems are, after all, attempting to automate tasks that have traditionally been carried out by humans. The human visual system engages, by some estimates, roughly two-thirds of the brain's computing capacity, and is a formidable competitor. This is not to imply that the human observer is perfect; far from it, human observers have significant weaknesses as well.
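When the gold standard takes the form of human-marked annotations, agreement can be scored objectively. The sketch below is a minimal illustration rather than FARSIGHT's actual validation code: it assumes both the algorithm and the human observer mark cell positions as 2-D points (the coordinates, the distance tolerance, and the match_detections helper are all hypothetical), pairs them one-to-one by minimum total distance, and reports precision and recall:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_detections(auto_pts, human_pts, tol=5.0):
    """Match automated detections to human 'gold standard' marks
    by one-to-one assignment, counting pairs within a distance tolerance."""
    auto_pts = np.asarray(auto_pts, float)
    human_pts = np.asarray(human_pts, float)
    d = np.linalg.norm(auto_pts[:, None, :] - human_pts[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(d)   # minimum total-distance pairing
    tp = int((d[rows, cols] <= tol).sum())  # matches close enough to count
    fp = len(auto_pts) - tp                 # automated marks with no human counterpart
    fn = len(human_pts) - tp                # human marks the algorithm missed
    return tp, fp, fn

auto = [(10, 12), (40, 41), (80, 80)]    # algorithm output (hypothetical)
gold = [(11, 11), (42, 40), (120, 130)]  # human observer marks (hypothetical)
tp, fp, fn = match_detections(auto, gold)
print(f"precision={tp / (tp + fp):.2f}  recall={tp / (tp + fn):.2f}")
```

Because human observers are themselves imperfect, such scores are best read as agreement with the gold standard rather than absolute accuracy; marks from multiple observers can be pooled to estimate inter-observer variability as a reference point.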
