Quantitative Big Imaging

Kevin Mader, Maria Büchner
10 April 2014

ETHZ: 227-0966-00L

Statistics and Reproducibility

Exercises

Tools for the Exercise

  • Basic Exercises
    • FIJI (ImageJ with all the plugins already installed): Visualization, Preprocessing, Segmentation, Morphometrics [http://www.fiji.sc]
    • RStudio (Statistics and Plots)

Advanced

  • testit package in R to test functions
  • knitr for report generation

Objectives

  1. Create scripts to analyze a sample in exactly the same way
  2. Create scripts which perform parameter sweeps
  3. Process and summarize results from parameter sweeps
  4. Compare two different groups using the student t-test

Advanced

  1. Create a Unit-Test for Shape Analysis

Setting up the Computers

Once you have logged in

  1. Install FIJI (if it is not already installed)
    • FIJI can be downloaded from this link
  2. Start FIJI
  3. Start RStudio rstudio at command line

Setting up the Computers

FIJI: Recording Macros

Recording Marco's Movie

  • ImageJ provides a powerful macro language, API
  • macro-recording feature that allows you to record operations and apply the same steps in a folder full of images
  • not all features are supported so be careful
  1. You can run a Macro again by going to New \( \rightarrow \) _Script, pasting the code in
  2. Set the Language \( \rightarrow \) to ImageJ Macro
  3. Run

Macros to MIJI

For many commands, the results from macro recording can be directly used in Matlab/MIJI by making the following changes

  1. add MIJ. before the command
  2. Replace the character “ with '
run("Compute Curvatures", "compute sigma=0.5000 use order");

\[ \downarrow \textrm{Becomes} \]

MIJ.run('Compute Curvatures', 'compute sigma=0.5000 use order');

Creating a Scripted Analysis

Starting Macro Movie So we have a macro which runs once, how can we modify it to perform a Parameter Sweep Analysis?

  • Using the loops inside the ImageJ API.
  • Make sure you save each file with a different name
  • Make sure you select the image window before doing any additional processing
for (threshold=90; threshold<110; threshold++) {
      selectWindow("blobs.gif");
      setThreshold(threshold, 255);
      run("3D Objects Counter", "threshold="+threshold+" slice=0 min.=1 max.=65024 statistics");
      outFileName=getDirectory("home")+File.separator+"blob_analysis_th_"+threshold+".txt";
      print("Saving Results as:"+outFileName);
      saveAs("Results", outFileName);
}

Create a new Macro

Use these starting macros and scripts

  1. Plugins \( \rightarrow \) New \( \rightarrow \) Macro
  2. Language \( \rightarrow \) ImageJ Macro
  3. Type or paste code in and run

Looking at the results with R

Start RStudio, Install the plyr and ggplot2 packages

install.packages(c("plyr","ggplot2"))

Read in the data

library("plyr")
library("ggplot2")
# get the threshold from the file name (thresh.loc means 4th spot by _ in the name)
thresh.from.filename<-function(in.name,thresh.loc=4) {
  last.name<-sub("^([^.]*).*", "\\1", basename(in.name))
  as.numeric(strsplit(last.name,"_")[[1]][thresh.loc])
}
file.list<-Sys.glob(file.path(Sys.getenv("HOME"),"blob_analysis_th_*.txt"))
# function to read each file and add a column for threshold
read.fcn<-function(in.name) cbind(read.csv(in.name,sep="\t"),threshold=thresh.from.filename(in.name))
# read in all of the files
all.results<-ldply(file.list,read.fcn)

Show the Columns in the Data

head(all.results)
X Volume..pixel.3. Surface..pixel.2. Nb.of.obj..voxels Nb.of.surf..voxels IntDen Mean StdDev Median Min Max X.1 Y Z Mean.dist..to.surf…pixel. SD.dist..to.surf…pixel. Median.dist..to.surf…pixel. XM YM ZM BX BY BZ B.width B.height B.depth threshold
1 650 650 650 92 117714 181.1 35.70 193.0 101 222 19.97 14.09 0 14.439 3.110 14.226 19.95 13.833 0 8 0 0 30 32 0 100.txt
2 302 302 302 63 49549 164.1 28.79 176.5 100 194 62.96 5.53 0 9.737 2.483 9.661 63.11 5.238 0 50 0 0 27 14 0 100.txt
3 902 902 902 104 176116 195.3 40.80 210.0 100 244 108.32 13.89 0 16.551 1.351 16.420 108.45 13.502 0 92 0 0 33 31 0 100.txt
4 808 808 808 113 152712 189.0 49.95 196.0 100 248 150.45 13.52 0 15.800 3.965 15.024 151.47 12.466 0 130 0 0 40 30 0 100.txt
5 622 622 622 89 125541 201.8 40.13 220.0 100 241 245.36 14.80 0 13.768 2.252 14.291 246.01 14.062 0 234 0 0 22 32 0 100.txt
6 1239 1239 1239 151 222132 179.3 39.05 191.0 100 241 182.97 26.57 0 18.573 7.853 20.040 183.25 26.219 0 159 4 0 51 46 0 100.txt

Grammar of Graphics Plots

Making plots or graphics should be divided into separate independent components.

  • Setup is the \( ggplot \) command and the data
  • Mapping is in the \( aes \) command \[ ggplot(\textrm{input data frame},aes(x=\textrm{name of x column},y=\textrm{name of y column}))+ \]
  • Plot is the next command (geom_point, geom_smooth, geom_density, geom_histogram, geom_contour) \[ \textrm{geom_point}()+ \]
  • Coordinates can then be added to any type of plot (coord_equal, coord_polar, etc)
  • Scales can also be added (scale_x_log10, scale_y_sqrt, scale_color_gradientn)
  • Labels are added \[ labs(x="\textrm{x label}",y="\textrm{y label}",title="\textrm{title}") \]

Show the volume

# ggplot then data.frame name, then aes for the mapping of variables, then plus to add elements
ggplot(all.results,aes(x=threshold,y=Nb.of.obj..voxels))+
  geom_point()+geom_smooth()+ # add points and a smooth curve
  labs(y="Voxel Count")+theme_bw(20) # make the font size correct and make the background white

plot of chunk unnamed-chunk-4

Show the volume distribution for each threshold

# ggplot then data.frame name, then aes for the mapping of variables, then plus to add elements
ggplot(all.results,aes(color=as.factor(threshold),x=Nb.of.obj..voxels))+
  geom_density()+ # add points and a smooth curve
  labs(y="Voxel Count")+theme_bw(20) # make the font size correct and make the background white

plot of chunk unnamed-chunk-5

Create a summary table

summary.table<-ddply(all.results,.(threshold), # process each threshold separately
function(in.df) { # code to apply to each
  data.frame(obj.count=nrow(in.df), # number of rows in df
    mean.volume=mean(in.df$Nb.of.obj..voxels),
    mean.surface=mean(in.df$Nb.of.surf..voxels),
    mean.intensity=mean(in.df$Mean))
})
threshold obj.count mean.volume mean.surface mean.intensity
90 47 795.7 99.34 166.9
91 48 773.6 97.17 167.5
92 49 753.6 94.82 168.2
93 50 734.7 92.76 169.0
95 49 740.6 94.02 171.5
96 49 735.4 93.98 172.0

Show the object count vs threshold

# ggplot then data.frame name, then aes for the mapping of variables, then plus to add elements
ggplot(summary.table,aes(x=threshold,y=obj.count))+
  geom_point()+geom_smooth()+ # add points and a smooth curve
  labs(y="Object Count")+theme_bw(20) # make the font size correct and make the background white

plot of chunk unnamed-chunk-8

Show the intensity inside the object vs threshold

# ggplot then data.frame name, then aes for the mapping of variables, then plus to add elements
ggplot(summary.table,aes(x=threshold,y=mean.intensity))+
  geom_point()+geom_smooth()+ # add points and a smooth curve
  labs(y="Mean Intensity")+theme_bw(20) # make the font size correct and make the background white

plot of chunk unnamed-chunk-9

More Complicated Analysis

Changing filters and threshold, starting code

Feel free to use different filters or change sigma instead of changing the filter

Reading in the data

Read in the data

library("plyr")
library("ggplot2")
# get the parameters from the file name (parm.loc means 4th spot by _ in the name)
parm.from.filename<-function(in.name,parm.loc=4) {
  last.name<-sub("^([^.]*).*", "\\1", basename(in.name))
  as.numeric(strsplit(last.name,"_")[[1]][parm.loc])
}
file.list<-Sys.glob(file.path(Sys.getenv("HOME"),"bigblob_analysis_th_*.txt"))
# function to read each file and add a column for threshold
read.fcn<-function(in.name) cbind(read.csv(in.name,sep="\t"),
          threshold=parm.from.filename(in.name,parm.loc=4),
          filter=parm.from.filename(in.name,parm.loc=6))
# read in all of the files
all.results<-ldply(file.list,read.fcn)

Create a summary table

# add a column for filter name
all.results$filter.name<-
  as.factor(with(all.results,ifelse(filter==0,"Gaussian",ifelse(filter==1,"Median","Maximum"))))
smt.func<-function(in.df) { # code to apply to each
  data.frame(obj.count=nrow(in.df), # number of rows in df
    mean.volume=mean(in.df$Nb.of.obj..voxels),
    mean.surface=mean(in.df$Nb.of.surf..voxels),
    mean.intensity=mean(in.df$Mean))
}
summary.table<-ddply(all.results,.(threshold,filter.name),smt.func) # process each threshold and filter separately
threshold filter.name obj.count mean.volume mean.surface mean.intensity
80 Gaussian 60 501.6 71.72 152.0
80 Maximum 48 816.3 104.08 166.4
85 Gaussian 59 488.9 71.68 156.6
85 Maximum 53 705.4 91.62 175.1
90 Gaussian 62 447.8 66.76 158.4
90 Maximum 55 654.2 86.85 179.2

Show the object count vs threshold

# ggplot then data.frame name, then aes for the mapping of variables, then plus to add elements
ggplot(summary.table,aes(x=threshold,y=obj.count,color=filter.name))+
  geom_point()+geom_smooth()+ # add points and a smooth curve
  labs(y="Object Count")+theme_bw(20) # make the font size correct and make the background white

plot of chunk unnamed-chunk-13

Show the volume the object vs threshold

# ggplot then data.frame name, then aes for the mapping of variables, then plus to add elements
ggplot(summary.table,aes(x=threshold,y=mean.volume,color=filter.name))+
  geom_point()+geom_smooth()+ # add points and a smooth curve
  labs(y="Mean Volume")+theme_bw(20) # make the font size correct and make the background white

plot of chunk unnamed-chunk-14

Show the surface area to volume of the objects vs threshold

# ggplot then data.frame name, then aes for the mapping of variables, then plus to add elements
ggplot(summary.table,aes(x=threshold,y=mean.surface/mean.volume,color=filter.name))+
  geom_point()+geom_smooth()+ # add points and a smooth curve
  labs(y="Surface to Volume")+theme_bw(20) # make the font size correct and make the background white

plot of chunk unnamed-chunk-15

Show the intensity inside the object vs threshold

# ggplot then data.frame name, then aes for the mapping of variables, then plus to add elements
ggplot(summary.table,aes(x=threshold,y=mean.intensity,color=filter.name))+
  geom_point()+geom_smooth()+ # add points and a smooth curve
  labs(y="Mean Intensity")+theme_bw(20) # make the font size correct and make the background white

plot of chunk unnamed-chunk-16

Evaluate Thickness and Spacing vs Threshold

Using the same basic loop as before, thickness can be evaluated vs threshold and filter. Using this startup Imagej macro and R code. The threshold is run on the foreground and background of every image. The major difference is instead of making many files, only one file will be made with all the data saved in it (using the getHistogram command)

Loading Direct Histogram Data

library("plyr")
library("ggplot2")
thick.results<-read.csv(file.path(Sys.getenv("HOME"),"blobthk_analysis.txt"))
# add a column for filter name
thick.results$filter.name<-
  as.factor(with(thick.results,ifelse(Filter==0,"Gaussian",ifelse(Filter==1,"Median","Maximum"))))

Showing Direct Histogram Data

The histograms can be directly shown using points and lines, facet_grid can be used to make multiple plots (like subplot)

ggplot(subset(thick.results,Voxels>0),aes(x=Thickness,y=Voxels,color=as.factor(Threshold)))+
  geom_point()+geom_line()+
  facet_grid(Phase~filter.name)+scale_y_log10()+scale_x_log10()

plot of chunk unnamed-chunk-18

Show Results using Density

We can use the density feature of ggplot to smooth the results by taking the Voxel count as a weight instead of an absolute value

ggplot(subset(thick.results,Voxels>0),aes(x=Thickness,count=Voxels,color=as.factor(Threshold)))+
  geom_density()+
  facet_grid(Phase~filter.name)

plot of chunk unnamed-chunk-19

Create a Summary Table

library("plyr")
library("ggplot2")
smthk.func<-function(in.df) { # code to apply to each
    data.frame(vox.count=sum(in.df$Voxel),
               mean.thickness=sum(in.df$Thickness*in.df$Voxel)/sum(in.df$Voxel),
               max.thickness=max(in.df$Thickness)
               )
}
summary.table<-ddply(thick.results,.(Threshold,filter.name,Phase),smthk.func) # process each threshold and filter separately
Threshold filter.name Phase vox.count mean.thickness max.thickness
80 Gaussian bg 65024 11.453 33.87
80 Gaussian fg 65024 12.428 35.86
80 Maximum bg 65024 3.115 28.17
80 Maximum fg 65024 38.092 93.38
80 Median bg 65024 5.795 32.43
80 Median fg 65024 26.925 63.28

Thickness vs Threshold

# ggplot then data.frame name, then aes for the mapping of variables, then plus to add elements
ggplot(summary.table,aes(x=Threshold,y=mean.thickness,color=filter.name))+
  geom_point()+geom_smooth()+ # add points and a smooth curve
  facet_wrap(~Phase,scales="free_y")+
  scale_y_sqrt()+
  labs(y="Mean Thickness (vx)")+theme_bw(20) # make the font size correct and make the background white

plot of chunk unnamed-chunk-22

Batch Mode and Command Line Arguments

ImageJ

We can use the command setBatchMode(true) to run our code in batch mode from the command line (more efficient and fewer windows). We can also have the script read in command line parameters

name = getArgument;
if (name=="") exit ("No argument!");
path = getDirectory("home")+"images"+File.separator+name;
setBatchMode(true);
open(path);
print(getTitle+": "+getWidth+"x"+getHeight);

We can now run the script using java -jar ij.jar -macro TestMacro.ijm blobs.tif

Other Functions

There is a substantial library of Macro functions with some degree of documentation here, with macro's nearly all the functionality of imageJ is available and if you cannot get something to work google is usually the best way to figure out how to do it.

Comparing Groups

For this we take the best parameter settings as a starting point and compare different regions within the sample

best.table<-subset(all.results,threshold==100 & filter.name=="Median")
id X Volume..pixel.3. Surface..pixel.2. Nb.of.obj..voxels Nb.of.surf..voxels IntDen Mean StdDev Median Min Max X.1 Y Z Mean.dist..to.surf…pixel. SD.dist..to.surf…pixel. Median.dist..to.surf…pixel. XM YM ZM BX BY BZ B.width B.height B.depth threshold filter filter.name
59 1 482 482 482 81 87152 180.8 36.89 192 104 232 20.01 13.357 0 12.608 3.305 12.596 20.00 13.169 0 10 0 0 26 30 0 100 1 Median
60 2 211 211 211 52 35576 168.6 28.63 184 104 208 63.00 4.654 0 8.021 1.968 7.897 63.08 4.423 0 53 0 0 22 12 0 100 1 Median
61 3 704 704 704 91 139656 198.4 35.75 208 104 248 108.37 12.920 0 14.570 1.149 14.464 108.45 12.668 0 95 0 0 28 29 0 100 1 Median
62 4 475 475 475 73 98352 207.1 45.74 232 104 248 154.41 10.091 0 11.899 1.118 11.820 154.38 9.772 0 143 0 0 25 24 0 100 1 Median
63 5 501 501 501 81 103624 206.8 36.31 224 104 248 246.59 13.828 0 12.350 2.236 12.692 247.03 13.215 0 237 0 0 19 30 0 100 1 Median
64 6 315 315 315 55 60376 191.7 44.26 208 104 248 197.29 15.886 0 9.577 0.591 9.530 197.27 15.859 0 188 6 0 20 21 0 100 1 Median

Comparing Groups

The left half is normal and the right half has a treatment applied

best.table$treatment<-with(best.table,ifelse(X.1<128,"Normal","Treated"))
ggplot(best.table,aes(x=X.1,y=Y,color=treatment,size=Nb.of.obj..voxels))+geom_point()

plot of chunk unnamed-chunk-25

Comparing Groups: ICC

The left half is normal and the right half has a treatment applied and we want to know if there is a difference in intensity or volume. The first value we can calculate is the Interclass Correlation Coefficient as covered in the lecture. 1. First install the package ICC

install.packages("ICC")
library("ICC")
icVal<-ICCbare(treatment,Nb.of.obj..voxels,data=best.table)$ICC
ggplot(best.table,aes(x=treatment,y=Nb.of.obj..voxels))+
  geom_point()+
  labs(x="Treatment",y="Area",title=paste("ICC:",icVal))+
  theme_bw(20)

plot of chunk unnamed-chunk-27

ICC: For Intensity

Mean Intensity

icVal<-ICCbare(treatment,Mean,data=best.table)$ICC
ggplot(best.table,aes(x=treatment,y=Mean))+
  geom_point()+
  labs(x="Treatment",y="Mean Intensity",title=paste("ICC:",icVal))+
  theme_bw(20)

plot of chunk unnamed-chunk-28

X Position

icVal<-ICCbare(treatment,X.1,data=best.table)$ICC
ggplot(best.table,aes(x=treatment,y=X.1))+
  geom_point()+
  labs(x="Treatment",y="X Position",title=paste("ICC:",icVal))+
  theme_bw(20)

plot of chunk unnamed-chunk-29

Comparing Groups: Student's T-Test

require(compareGroups)
tt.group<-compareGroups(treatment~Nb.of.obj..voxels+Mean,data=best.table)
o.table<-createTable(tt.group)