Kevin Mader
12 March 2015
ETHZ: 227-0966-00L
With inconsistent or ever-changing illumination it may not be possible to apply the same threshold to every image.
Segmenting the cells is easy using a threshold and a size criterion (we know how big the cells should be)
Segmenting the small channels is much more difficult because, with radii on the same order as the pixel size, they are obscured by partial-volume effects and noise.
Given that applying a threshold is such a common and significant step, many tools have been developed to perform it automatically (unsupervised). This is a particularly important step in setups where images are rarely consistent, such as outdoor imaging with varying lighting (sun, clouds). The methods are based on several basic principles.
Just as we visually inspect a histogram, an algorithm can examine it and find local minima between two peaks, maximum / minimum entropy, and other factors
These look at the statistics of the thresholded image itself (like entropy) to estimate the threshold
These search for a threshold which delivers the desired results in the final objects. For example, if you know you have an image of cells and each cell is between 200 and 10000 pixels, the algorithm tries thresholds until the objects are of the desired size
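The object-based idea above can be sketched as a brute-force scan: try candidate thresholds and keep the one whose objects best match the expected size range. The function name and scoring (fraction of in-range objects) are assumptions for illustration, not a specific tool's API.

```python
import numpy as np
from scipy import ndimage

def size_guided_threshold(img, min_size=200, max_size=10000, n_steps=50):
    """Hypothetical helper: scan thresholds and keep the one whose
    resulting objects best match the expected size range."""
    best_t, best_frac = None, -1.0
    for t in np.linspace(img.min(), img.max(), n_steps):
        labels, n = ndimage.label(img > t)   # connected components above t
        if n == 0:
            continue
        sizes = np.bincount(labels.ravel())[1:]   # object areas in pixels
        frac = np.mean((sizes >= min_size) & (sizes <= max_size))
        if frac > best_frac:                 # best fraction of in-range objects
            best_t, best_frac = t, frac
    return best_t
```

Real implementations score candidates more cleverly, but the loop structure is the same.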
Taking a typical image of a bone slice, we can examine the variations in calcification density in the image
We can see in the histogram that there are two peaks, one at 0 (no absorption / air) and one at 0.5 (stronger absorption / bone)
Search for the threshold t that minimizes the intra-class (within-class) variance \sigma^2_w(t)=\omega_{bg}(t)\sigma^2_{bg}(t)+\omega_{fg}(t)\sigma^2_{fg}(t)
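The minimization can be written directly from the formula: for each candidate t, split the pixels into background and foreground, weight each class variance by its pixel fraction, and keep the t with the smallest sum. This exhaustive-search sketch trades speed for clarity (production Otsu implementations work on the histogram instead).

```python
import numpy as np

def otsu_threshold(img, n_steps=256):
    """Exhaustively search for the t minimizing the intra-class
    variance w_bg(t)*var_bg(t) + w_fg(t)*var_fg(t)."""
    vals = img.ravel().astype(float)
    best_t, best_var = vals.min(), np.inf
    for t in np.linspace(vals.min(), vals.max(), n_steps)[1:-1]:
        bg, fg = vals[vals < t], vals[vals >= t]
        if bg.size == 0 or fg.size == 0:
            continue
        w_bg, w_fg = bg.size / vals.size, fg.size / vals.size
        var_w = w_bg * bg.var() + w_fg * fg.var()   # \sigma^2_w(t)
        if var_w < best_var:
            best_t, best_var = t, var_w
    return best_t
```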
There are many methods, and they can be complicated to implement yourself. FIJI offers many of them as built-in functions so you can automatically try all of them on your image
While an incredibly useful tool, there are many potential pitfalls to these automated techniques.
These methods are very sensitive to the distribution of pixels in your image: they may work really well on images with equal amounts of each phase but fail badly on images dominated by one phase.
These methods are sensitive to noise and a large noise content in the image can change statistics like entropy significantly.
These methods are inherently biased by the expectations you have. If you want to find objects between 200 and 1000 pixels you will, they just might not be anything meaningful.
Imaging science rarely represents the ideal world and will never be 100% perfect. At some point we need to write our master's thesis, defend, or publish a paper. These are approaches for a more qualitative assessment; we will later cover how to do this more robustly with quantitative approaches
One approach is to simulate everything (including noise) as well as possible, apply these techniques to many realizations of the same image, and qualitatively keep track of how many of the results accurately identify your phase. Hint: >95% seems to convince most biologists
Apply the methods to each sample and keep track of which threshold was used for each one. Then go back, apply every threshold to every sample, and keep track of how many of the results are correct enough to be used for further study.
Come up with the worst-case scenario (noise, misalignment, etc.) and assess how unacceptable the results are. Then try to estimate the interquartile range (the middle 75% - 25% of images).
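The first approach (many realizations of the same image) can be sketched as a small Monte Carlo loop. The ground-truth phase, noise level, threshold, and the Jaccard-overlap success criterion are all assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical ground-truth phase: a disk (the "cells") on empty background
yy, xx = np.mgrid[:64, :64]
truth = (yy - 32) ** 2 + (xx - 32) ** 2 < 100

hits, n_trials = 0, 100
for _ in range(n_trials):
    noisy = truth.astype(float) + rng.normal(0, 0.1, truth.shape)
    seg = noisy > 0.5                                    # segmentation under test
    overlap = (seg & truth).sum() / (seg | truth).sum()  # Jaccard overlap
    hits += overlap > 0.9                                # "accurate enough"?
success_rate = hits / n_trials
```

With a realistic noise model, `success_rate` is exactly the fraction you would report (and compare against the >95% rule of thumb).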
For some images a single threshold does not work
Comparing the original image with the three phases
Now we apply two important steps. The first is to remove the objects which are not cells (too small) using an opening operation.
The second step is to keep only the pixels which are connected (by looking again at a neighborhood \mathcal{N} ) to the air voxels and to ignore the others. This goes back to our original supposition that the smaller structures are connected to the larger structures
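Both steps map directly onto standard morphology and labeling operations; here is a sketch with `scipy.ndimage` (the helper name and the binary air mask passed in are assumptions):

```python
import numpy as np
from scipy import ndimage

def clean_and_connect(binary, air_mask, open_size=3):
    """Step 1: opening removes objects smaller than the structuring
    element.  Step 2: label what remains and keep only the components
    connected to the air region, ignoring the rest."""
    opened = ndimage.binary_opening(binary,
                                    structure=np.ones((open_size, open_size)))
    labels, n = ndimage.label(opened)
    keep = np.unique(labels[air_mask & opened])   # component ids touching air
    keep = keep[keep > 0]                         # drop the background id
    return np.isin(labels, keep)
```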
As we briefly covered last time, many measurement techniques produce quite rich data.
A pairing between spatial information (position) and some other kind of information (value). \vec{x} \rightarrow \vec{f}
We are used to seeing images in a grid format where the position indicates the row and column in the grid and the intensity (absorption, reflection, tip deflection, etc) is shown as a different color
The alternative form for this image is as a list of positions and a corresponding value
\hat{I} = (\vec{x},\vec{f})
x | y | Intensity |
---|---|---|
1 | 1 | 14 |
2 | 1 | 42 |
3 | 1 | 83 |
4 | 1 | 42 |
5 | 1 | 78 |
1 | 2 | 57 |
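Converting between the grid form and the list-of-positions form is a one-liner with NumPy; this sketch produces rows of (x, y, intensity) matching the table above:

```python
import numpy as np

def image_to_table(img):
    """Flatten a 2-D grid image into the list-of-positions form
    (x, y, intensity), one row per pixel."""
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    # 1-based x (column) and y (row) to match the table above
    return np.column_stack([xx.ravel() + 1, yy.ravel() + 1, img.ravel()])
```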
This representation can be called the feature vector; in this case it only contains Intensity
If we use feature vectors to describe our image, we no longer need to worry about how the images will be displayed, and can approach the segmentation/thresholding problem from a classification rather than an image-processing standpoint.
So we have an image of a cell and we want to identify the membrane (the ring) from the nucleus (the point in the middle).
A simple threshold doesn't work because we identify the point in the middle as well. We could try to use morphological tricks to get rid of the point in the middle, or we could better tune our segmentation to the ring structure.
In this case we add a very simple feature to the image, the distance from the center of the image (distance).
x | y | Intensity | Distance |
---|---|---|---|
-10 | -10 | 0.9350683 | 14.14214 |
-10 | -9 | 0.7957197 | 13.45362 |
-10 | -8 | 0.6045178 | 12.80625 |
-10 | -7 | 0.3876575 | 12.20656 |
-10 | -6 | 0.1692429 | 11.66190 |
-10 | -5 | 0.0315481 | 11.18034 |
We now have a more complicated image, which we can't as easily visualize, but we can incorporate these two pieces of information together.
Now instead of trying to find the intensity for the ring alone, we can combine intensity and distance to identify it
iff (5<\textrm{Distance}<10 \& 0.5<\textrm{Intensity}<1.0)
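With both features as arrays, the combined rule is a single boolean expression. The 21x21 test image below (a Gaussian nucleus at the centre plus a Gaussian ring at radius ~7) is an assumed stand-in for the cell image:

```python
import numpy as np

# hypothetical cell image: bright nucleus at the centre plus a ring at
# radius ~7 (both Gaussian profiles are assumptions for illustration)
yy, xx = np.mgrid[-10:11, -10:11]
distance = np.sqrt(xx ** 2 + yy ** 2)
intensity = np.exp(-(distance - 7) ** 2 / 4) + np.exp(-distance ** 2 / 2)

# the combined rule: ring iff 5 < Distance < 10 and 0.5 < Intensity < 1.0
ring = (5 < distance) & (distance < 10) & (0.5 < intensity) & (intensity < 1.0)
```

The distance criterion is what excludes the nucleus, which a pure intensity threshold cannot do.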
The distance, while illustrative, is not a commonly used feature; more common are various filters applied to the image
x | y | Intensity | Sobel | Gaussian |
---|---|---|---|---|
1 | 1 | 0.94 | 0.32 | 0.53 |
1 | 10 | 0.48 | 0.50 | 0.45 |
1 | 11 | 0.50 | 0.50 | 0.46 |
1 | 12 | 0.48 | 0.64 | 0.46 |
1 | 13 | 0.43 | 0.78 | 0.45 |
1 | 14 | 0.33 | 0.94 | 0.42 |
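A feature table like the one above can be built with standard filters from `scipy.ndimage`; here Sobel is taken as the gradient magnitude (an assumption, since a directional Sobel would also work):

```python
import numpy as np
from scipy import ndimage

def filter_feature_table(img, sigma=1.0):
    """Per-pixel feature table (x, y, Intensity, Sobel, Gaussian);
    the Sobel column is the gradient magnitude."""
    sobel = np.hypot(ndimage.sobel(img, axis=0), ndimage.sobel(img, axis=1))
    gauss = ndimage.gaussian_filter(img, sigma)
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    return np.column_stack([xx.ravel(), yy.ravel(), img.ravel(),
                            sobel.ravel(), gauss.ravel()])
```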
The distributions of the features appear very different and can thus likely be used for identifying different parts of the images.
Combine this with our a priori information (called supervised analysis)
 | x | y | Intensity | Sobel | Gaussian |
---|---|---|---|---|---|
1 | 1 | 1 | 0.94 | 0.32 | 0.53 |
2 | 1 | 10 | 0.48 | 0.50 | 0.45 |
3 | 1 | 11 | 0.50 | 0.50 | 0.46 |
4 | 1 | 12 | 0.48 | 0.64 | 0.46 |
5 | 1 | 13 | 0.43 | 0.78 | 0.45 |
6 | 1 | 14 | 0.33 | 0.94 | 0.42 |
Distance metric D_{ij}=||\vec{v}_i-\vec{v}_j||
Group Count ( N=2 )
\downarrow
 | x | y | Intensity | Sobel | Gaussian |
---|---|---|---|---|---|
20 | 1 | 8 | 0.33 | 0.50 | 0.40 |
21 | 1 | 9 | 0.43 | 0.50 | 0.42 |
22 | 10 | 1 | 0.48 | 0.14 | 0.45 |
23 | 10 | 10 | 0.83 | 0.50 | 0.42 |
24 | 10 | 11 | 0.91 | 0.50 | 0.36 |
 | x | y | Intensity | Sobel | Gaussian |
---|---|---|---|---|---|
100 | 13 | 4 | 1.00 | 0.16 | 0.49 |
101 | 13 | 5 | 0.88 | 0.74 | 0.49 |
102 | 13 | 6 | 0.63 | 0.96 | 0.52 |
103 | 13 | 7 | 0.30 | 0.94 | 0.55 |
104 | 13 | 8 | 0.06 | 0.00 | 0.55 |
We give as an initial parameter the number of groups we want to find and possibly a criterion for removing groups that are too similar
What vector space do we have?
Continuing with our previous image and applying K-means to the Intensity, Sobel, and Gaussian channels looking for 2 groups, we find
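K-means itself reduces to two alternating steps under the distance metric D_{ij}=||v_i - v_j||: assign each feature vector to its nearest centre, then move each centre to its group mean. A minimal sketch (the evenly-spaced-row initialization is a simplification; real implementations use random or k-means++ starts):

```python
import numpy as np

def kmeans(features, k=2, n_iter=20):
    """Minimal K-means on a (n_pixels, n_features) array."""
    features = np.asarray(features, float)
    # simplistic deterministic init: evenly spaced rows as centres
    centers = features[np.linspace(0, len(features) - 1, k).astype(int)].copy()
    labels = np.zeros(len(features), int)
    for _ in range(n_iter):
        # D_ij = ||v_i - c_j|| for every point/centre pair
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels, centers
```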
Looking for 5 groups
Including the position in the features as well
Since the distance is currently calculated as ||\vec{v}_i-\vec{v}_j|| and the values for the position are much larger than the values for Intensity, Sobel, or Gaussian, they need to be rescaled so they all fit on the same scale \vec{v} = \left\{\frac{x}{10}, \frac{y}{10}, \textrm{Intensity},\textrm{Sobel},\textrm{Gaussian}\right\}
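The rescaling is just a per-channel division before stacking the feature vectors (the factor of 10 comes from the slide; in general you would divide by the position range):

```python
import numpy as np

def scaled_feature_vectors(xx, yy, intensity, sobel, gauss, pos_scale=10.0):
    """Divide the position channels by ~their range so all five
    features contribute comparably to ||v_i - v_j||."""
    return np.column_stack([np.ravel(xx) / pos_scale, np.ravel(yy) / pos_scale,
                            np.ravel(intensity), np.ravel(sobel),
                            np.ravel(gauss)])
```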
An approach for simplifying images by performing a clustering and forming super-pixels from groups of similar pixels.
Drastically reduced data size, serves as an initial segmentation showing spatially meaningful groups
Segment the superpixels and apply them to the whole image (only a fraction of the data and much smaller datasets)
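The superpixel idea can be sketched in a few lines as one assignment pass of a SLIC-like scheme (a deliberate simplification of real SLIC, which iterates and restricts the search window): centres start on a regular grid and every pixel joins the centre minimising a combined intensity + position distance.

```python
import numpy as np

def simple_superpixels(img, grid=4, compactness=0.1):
    """One assignment pass of a SLIC-like superpixel scheme."""
    h, w = img.shape
    cy = np.linspace(grid // 2, h - 1 - grid // 2, h // grid).astype(int)
    cx = np.linspace(grid // 2, w - 1 - grid // 2, w // grid).astype(int)
    yy, xx = np.mgrid[0:h, 0:w]
    best_d = np.full(img.shape, np.inf)
    labels = np.zeros(img.shape, int)
    for k, (y, x) in enumerate((y, x) for y in cy for x in cx):
        # combined intensity + (weighted) position distance to centre k
        d = (img - img[y, x]) ** 2 + compactness * ((yy - y) ** 2 + (xx - x) ** 2)
        closer = d < best_d
        labels[closer] = k
        best_d[closer] = d[closer]
    return labels
```

For real work, `skimage.segmentation.slic` implements the full algorithm.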
A more general approach is to use a probabilistic model for segmentation. We start with our image I(\vec{x}) \forall \vec{x}\in \mathbb{R}^N and classify it into two phases, \alpha and \beta
P(\{\vec{x} , I(\vec{x})\} | \alpha) \propto P(\alpha) + P(I(\vec{x}) | \alpha)+ P(\sum_{x^{\prime} \in \mathcal{N}} I(\vec{x^{\prime}}) | \alpha)
Expanding on the hole-filling issues examined before, a general problem in imaging is identifying regions of interest within an image.
For samples like brains it is done to identify different regions of the brain which are responsible for different functions.
In materials science it might be done to identify a portion of the sample being heated or under stress.
There are a number of approaches depending on the clarity of the data and the
The convex hull takes all of the points in a given slice or volume and finds the smallest convex 2D area or 3D volume (respectively) which encloses all of those points.
Depending on the type of sample the convex hull can make sense for filling in the gaps and defining the boundaries for a sample.
The critical shortcoming is that it is very sensitive to single outlier points.
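`scipy.spatial.ConvexHull` computes the hull directly from a point cloud, and a quick experiment shows the outlier sensitivity: one stray point far from the sample inflates the enclosed area by roughly an order of magnitude (the uniform point cloud is an assumed stand-in for real sample points).

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
pts = rng.uniform(0, 1, (200, 2))      # hypothetical sample points in the unit square
hull = ConvexHull(pts)                 # in 2D, hull.volume is the enclosed area

# a single distant outlier inflates the hull enormously
pts_out = np.vstack([pts, [[10.0, 10.0]]])
hull_out = ConvexHull(pts_out)
```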
The convex hull very closely matches the area we would define as 'bone' without requiring any parameter adjustment, resolution-specific adjustments, or extensive image processing; for such a sample a convex hull is usually sufficient.
Here is an example of the convex hull applied to a region of a cortical bone sample. The green shows the bone and the red shows the convex hull. Compared to a visual inspection, the convex hull overestimates the bone area as we probably would not associate the region where the bone curves to the right with 'bone area'
Useful for a variety of samples (they needn't be radially symmetric), and offers more flexibility in step size, smoothing function, etc. than the convex hull.
If we use quartiles or the average instead of the maximum value, we can make the method less sensitive to outlier pixels
Many forms of guided methods exist, the most popular is known simply as the Magnetic Lasso in Adobe Photoshop (video).
The basic principle behind many of these methods is to optimize a set of user-given points based on local edge-like information in the image. In the brain cortex example, this is the small gradients in the gray values which our eyes naturally separate out as an edge but which have many gaps and discontinuities.
Fuzzy classification is based on fuzzy logic and fuzzy set theory; it is a general category of multi-valued logic (instead of simply true and false) and can be used to build IF and THEN statements from our probabilistic models.
P(\{\vec{x} , I(\vec{x})\} | \alpha) \propto P(\alpha) + P(I(\vec{x}) | \alpha)+ P(\sum_{x^{\prime} \in \mathcal{N}} I(\vec{x^{\prime}}) | \alpha)
which encompass aspects of filtering, thresholding, and morphological operations