Microglia Tracer

From FarsightWiki
Jump to: navigation, search

(under construction)

The tracing procedure followed to segment microglia processes was originally developed by Amit Mukherjee for the purpose of segmenting neurons. This tracer requires as input microglia soma centroids, as well as an image of microglia somas. Both can be calculated by the application of the Nucleus Editor [1].

The main stages of the tracing algorithm can be summarized as follows:

1. Identify points of interest: a ridge detection algorithm is used to isolate the centerlines of long, narrow objects.

2. Generate tree structures from these points: each microglial cell is modeled by a separate tree.

3. Prune and interpolate these trees: traces are smoothed, erroneous branches are removed and the system resources needed to model these cells are reduced.


1. Ridge Detection

At this point in the tracing process our goal is to isolate pixels that lie on the centerline of the processes of the microglia in our input image. Since microglia processes are long, narrow objects, a ridge detection algorithm is used to identify these points. This is an appropriate strategy to pursue, since a ridge is a curve whose points are local maxima, as is the case with the points of interest. We use the Laplacian of Gaussian (LoG) for ridge detection because it produces a strong response when we are near an edge. One problem with ridge detection is that it tends to be scale dependent. For this reason, we use a multi-scale version of the Laplacian of Gaussian filter, with multiple "sigma" parameters. This filter computes a response for each of a set of logarithmic scales.

We use a 3x3x3 neighborhood to search for pixels that are on a ridge. The ”ridgeness” test works as follows: first we compute the maximum response from the LoG filter for each pair of diametrically opposing pixels within the neighborhood . We then take the average of these maximum responses. If the response of the center pixel exceeds a certain threshold (thresh1) and the difference of this responce from the average of the maximum responses exceeds another threshold (thesh2), then the center pixel is detected as a ridge pixel. This ridge detection process is repeated across the entire range of scales. The pixels that were detected at any scale are considered primary nodes.

Hessian matrix is considered in order to filter out nodes which cannot be associated with the tubular structure of neurons. Hessian provides second derivative information and contrary to the Laplacian, which only provides scalar values, it returns both mixed and unmixed derivatives, i.e. it is more informative on the directionality of intensity variations.

The three primary eigenvectors of the Hessian matrix lamda_1, lamda_2, lamda_3 are used for the filtering process (unfinished-Hessian test to be added).


2. Graph Construction

Our next task is to determine how the previously detected nodes are connected. In order to solve this problem, we construct a minimum spanning tree (MST) to model each microglia.A MST, like any graph, consists of nodes and edges. In our case, each node represents a pixel in our 3D input image of microglia. Each edge is the cost for considering that pixel as belonging to a cell’s process. Once tracing is complete we are left with a forest, where each tree models the morphology of a single microglial cell.

Any given edge in our MST is a quantitative measurement of how likely it is that a given node lies along the process of a microglial cell. The definition of our metric is given by Algorithm 4. This measurement is primarily based on image intensity. This makes sense intuitively because bright pixels are more likely to belong to regions of interest. Our metric also incorporates the costs of the neighboring nodes.

For the purposes of graph construction, we distinguish between two different types of nodes. We reserve the term primary nodes for those pixels that were selected by the ridge detection procedure. Only primary nodes will be included in the morphological models of the microglia processes. All other pixels will be referred to as secondary nodes. These points are still valuable to us, as they are used in calculating the costs associated with their neighbors.

In order to construct our MSTs, we use an adaptation of Prim’s algorithm. Prim’s algorithm constructs a MST by iteratively adding the least expensive nonmember node to the tree. This process is completed when all of the nodes have been incorporating into the tree. Our implementation differs from the traditional algorithm in a few ways: -we are constructing multiple trees simultaneously from a common pool of nodes. -the edge costs between nodes are not all defined when the algorithm begins. This is because our cost metric (described in Algorithm 4) depends on the costs of the neighboring nodes. As the trees expand from their roots, more nodes are considered and their costs are updated. -The final difference is that our algorithm incorporates a "cost threshold" parameter. Nodes that have been deemed more expensive than this threshold will not be added to a tree. This is a safety mechanism to prevent distant pixels from being erroneously incorporated into our models.

In order to implement our modified version of Prim’s algorithm we make use of a priority queue. Each element of the queue is a node in the original input image. Both primary and secondary nodes will be added to the queue. The priority of each element is given by its cost.

The queue is initialized to contain the centroids of the somata. These initial elements are all assigned a cost of zero, corresponding to the foremost priority. Each such initial element becomes the root of a separate tree, since it belongs to a separate microglial cell. Once our priority queue has been initialized we start using our modified version of Prim’s algorithm to construct models of the microglia. An overview of this procedure follows. A more detailed pseudocode representation of this algorithm is presented as Algorithm 1.

Each iteration of the graph construction algorithm consists of the following steps:

1. Remove node n with the lowest cost from the queue. Steps 2 - 4 are only executed if it’s a primary node that hasn’t already been assigned to a tree.

2. Find the closest tree t to node n. Here ”closest” is meant in the sense of minimizing our cost metric. We will refer to the node closest to n that already belongs to t as leaf. See Algorithm 2 for more details about this tree finding step.

3. Assign to tree t all the nodes along the path from n to leaf.

4. Update the cost of all the nodes along this path. Algorithm 5 explains how this is done.

5. Search node n’s neighborhood for other inexpensive nodes and add them to the priority queue. This neighborhood searching procedure is described by Algorithm 3.

We repeat this procedure until the queue is empty.


Algorithm 1- Graph Construction Overview

while queue is not empty do
 queryNode <- queue.pop()
 if queryNode.Cost>= threshold then
  continue
 end if
 if queryNode.TreeId = 0 then {This node does not belong to a tree yet} 
  [treeID, path] = GetTreeNearNode(queryNode)
  if treeID <> 0 then {If a tree was found near queryNode}
   queryNode.TreeID = treeID
   parent = path.pop back()
   while path is not empty do
    pathNode = path.pop back()
    UpdateCostFactor(pathNode) {See Algorithm 5}
    queue.push(pathNode) {This is done so this node's neighborhood will be searched}
    pathNode.T reeID = treeID
    parent.addChild(pathNode)
    parent = pathNode
   end while
  end if
 end if
 AnalyzeNeighbors(queryNode) {More nodes may be added to the queue during this step}
end while


Algorithm 2 Find a tree near a given node

currentNode <- queryNode
tree <- 0
prevDir <- {0, 0, 0} 
trail <- {}
while tree = 0 and numSteps < 500 do {500 iterations max}
 {Find the direction where the cost decreases most rapidly}
 currentDir <- grad(Cost)
 {Simulate momentum by incorporating our previous direction}
 currentDir <- (currentDir+prevDir)/abs(currentDir+prevDir)
 {Find a new node by taking a step in this direction}
 nextNode <- (currentNode + currentDir)
 {Add this new node to our trail}
 trail.push back(nextNode)
 if nextNode.T reeID > 0 then {If the new node belongs to a tree} 
  tree <- nextNode.TreeID
 else
  currentNode <- nextNode
  prevDir <- currentDir
 end if
end while
if tree <> 0 then
 return [tree, trail]
end if


Algorithm 3 Find low-cost neighbors near node n

for all neighbor in n's 3x3x3 neighborhood do
 oldCost <- neighbor.Cost
 newCost <- GetCost(neighbor) {See Algorithm 4}
 if newCost < oldCost then
  neighbor.Cost <- newCost
  queue.push(neighbor)
 end if
end for


Algorithm 4 Calculate the cost associated with adding pixel p to a tree

c1, c2, c3: costs of the least-expensive neighbors in each cardinal direction

Require: c1 < c2 < c3

pVal <- 1/I(p) {pVal is the inverse of the intensity value of the input pixel p} 
Cost <- 0
Δ <- (c1+c2+c3)^2-3(c_1^2+c_2^2+c_3^2-pVal^2)  
if Δ >=0
 Cost <- (c1+c2+c3+Δ^2)/3		 
end if
if Cost<=c3  then
 Δ<-(c1+c2)^2-2(c1^2+c2^2-pVal^2)
 if Δ >=0  then
  Cost <-(c1+c2+Δ^2)/2 
 end if
 if Cost<c2  then
  Cost<-c1+pVal	 
 end if
end if
return Cost


Algorithm 5 Update a node's cost based on its position relative to a nearby tree

node: the node whose score we're updating.

leaf: the end of the chain connecting node to its tree.

root: the root of the tree that node is being added to.

v1 <- node.Position - leaf.Position {v1: vector from leaf to node} 
v2 <- leaf.Position - root.Position {v2: vector from root to leaf} 
w <- v1 * v2
if w <= 0 then
 costFactor = 1.0
else if w > 0 and w <= 0.98 then
 costFactor = 1.0 - w
else
 costFactor = 0
end if
node.Cost <- node.Cost * costFactor


3. Interpolation

Once we’ve generated a MST for each of the microglia, our next step is to use interpolation to increase the accuracy of the nodes’ locations. During interpolation we iterate over all of the primary nodes that have been incorporated into our morphological models. The only such nodes that are changed during interpolation are the intermediate nodes: those that have both a parent and at least one child. The root and leaf nodes are not modified . Each intermediate node’s position is updated to become weighted average of its prior position, its parent position, and the position of its child(ren). These terms are weighted by the image intensity at each location, since brighter pixels are more likely to fall within regions of interest.


4. Pruning

We prune the MSTs to reduce their footprint without losing any important morphological details. During this step we also remove branches that are shorter than some minimum offshoot length. We examine each node in our MSTs to see if it can be safely removed. In order to do this, we mark each node as active or inactive before any pruning actually takes place. Once pruning is complete, only the active nodes will remain. By default, allnodes are considered inactive. We mark a node as active if any of the following conditions are met:

• The node is a branch point. This means that it has more than one child.

• The node is a leaf node. This means that it has no children.

• The node is a root node. This means that it has no parent.

• The node’s parent is inactive and its child is also inactive. This criterion keeps us from removing consecutive nodes, which would otherwise result in the loss of morphological detail.

We also explicitly mark a node as inactive if its a leaf node whose distance from the branch point is less than the minimum offshoot length. The minimum offshoot length is a tunable parameter that allows us to remove small branches. We remove small branches because it is more likely that they are artifacts of tracing than it is that they represent some actual, physical structure of a microglial cell. Once we’ve analyzed the nodes in each of the MSTs, we delete all nodes that were marked as inactive. Children of inactive parents are reassigned to be children of their closest active ancestor. Once pruning is complete, we interpolate the trees another time.


5. Parameter Values for Microglia Images

Curvelets preprocessing:curvelet_sigma=0.03

External multiple neuron tracer parameters:cost threshold=300

Internal multiple neuron tracer parameters:

thresh1=0.005, thresh2=0.0003 (Node detection thresholds)

sigma[] = { 2.0f, 2.8284f, 4.0f, 5.6569f, 8.0f, 11.31f } (LoG)

minimum offshoot length=6


6. Execution

>MultipleNeuronTracer.exe <InputFileName> <SomaCentroids.txt> <CostThreshold> <SomaImageFile>