mat.heatmap

plot heatmaps and averages of matrices

description

mat.heatmap generates heatmaps and averages of genomic data in matrix (.mat) files generated by mat.make. Heatmaps can be grouped and sorted based on a variety of criteria.

usage

mat.heatmap ( mats=NULL , matnames=NULL ,
sorton=1 , numgroups=3 , sort.methods="none" , sort.ranges = NA , sort.breaks = NA , gene.list = NULL ,
groupcolors=NULL , heatmap.lims="5%,95%" , heatmap.colors="black,white" , forcescore=TRUE ,
crossmat=NULL , matcolors=NULL ,
fragmats=NULL , fragmatnames=NULL , fragrange=c(0,200) , vplot.colors="black,blue,yellow,red" , vplot.lims="0,95%" ,
cores="max" )

arguments

Main options Description
mats a vector of matrix file names from which to draw heatmaps. default is NULL, which causes mat.heatmap to not produce score heatmaps.
matnames a character vector of names corresponding to ‘mats’ used to label heatmaps. default is NULL, causing the matrix filenames to be used to label heatmaps

Sort/group options Description
sorton a positive integer giving the index of the matrix in ‘mats’ used to determine grouping/order. default is 1.
numgroups a numeric vector defining how many groups or clusters rows are sequentially divided into, one value per sorting/clustering method in ‘sort.methods’. For kmeans, sets how many k-means clusters are created. For all other methods, rows are divided into that many equally-sized groups. default is 3.
sort.methods a character vector defining how to sequentially group and/or sort the data. Methods include kmeans, mean, median, min, max, minloc, maxloc, sd, and chrom. The portion of the matrix each method uses is given as left and right distances (in bp) from the center of the matrix in ‘sort.ranges’. Clustering/grouping methods include kmeans and chrom (chromosome). Defaults to “none” (no sorting/clustering).
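sort.ranges a character vector of ranges (in bp, relative to the matrix center) over which the corresponding method in ‘sort.methods’ is applied. Each string gives the left and right distances separated by a comma, with no spaces (e.g. “-100,100”). default is NA.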
sort.breaks numeric vector indicating break points used to group rows using sort.methods
gene.list a list of character vectors of gene names, one vector per group, defining the genes belonging to each group. Only used when sort.methods[1] is “genelist”. Does not currently work.
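
How a sort method, a range, and a group count combine can be sketched in base R. This is only an illustration, not mat.heatmap's own code: the toy matrix, the 10 bp column spacing, the helper name group_by_mean, and the choice to put the highest-scoring rows in group 1 are all assumptions made for the example.

```r
# Hypothetical sketch: split rows into equally-sized groups by mean score
# within a window around the matrix center. Not mat.heatmap's own code.
set.seed(1)

# Toy matrix: 6 rows (features) x 41 columns, assumed to span -200..200 bp
# in 10 bp steps around the matrix center.
positions <- seq(-200, 200, by = 10)
mat <- matrix(runif(6 * length(positions)), nrow = 6,
              dimnames = list(paste0("gene", 1:6), as.character(positions)))

group_by_mean <- function(mat, positions, range_string, numgroups) {
  # Parse a "left,right" string such as "-100,100" into two numbers.
  limits <- as.numeric(strsplit(range_string, ",", fixed = TRUE)[[1]])
  in_window <- positions >= limits[1] & positions <= limits[2]
  # Mean score per row inside the window, ignoring missing values.
  scores <- rowMeans(mat[, in_window, drop = FALSE], na.rm = TRUE)
  # Order rows from highest to lowest score (direction is an assumption),
  # then cut the ordered rows into numgroups groups of near-equal size.
  row_order <- order(scores, decreasing = TRUE)
  groups <- integer(nrow(mat))
  groups[row_order] <- ceiling(seq_along(row_order) * numgroups / nrow(mat))
  list(order = row_order, group = groups)
}

result <- group_by_mean(mat, positions, "-100,100", numgroups = 2)
table(result$group)
```

With 6 rows and numgroups = 2, each group receives 3 rows; swapping rowMeans for another per-row summary would give the analogous sketch for the other non-clustering methods.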

Heatmap options Description
groupcolors a character vector of colors for each primary group, the length of which should equal numgroups[1]. default is NULL.
heatmap.lims a character vector of values defining the lower and upper limits of the color gradient of the heatmaps. default is “5%,95%”.
heatmap.colors a character vector of strings defining colors used to create the color gradient to draw the heatmap (from low scores to high scores). Colors must be separated by commas. Defaults to “black,white”, where black is the lowest score and white is the highest score.
forcescore logical. When TRUE, before drawing the heatmap, NAs are converted to zeroes. This prevents regions with no scores from showing as white, which may hinder or distract visualization of other colors in heatmap. Defaults to TRUE.
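
The heatmap options above can be illustrated with a short base R sketch. It is not mat.heatmap's own code: the helper name parse_lims is hypothetical, and treating entries that end in “%” as percentiles of the data is an assumption based on the form of the default value.

```r
# Hypothetical sketch of limit parsing and color mapping; not mat.heatmap's code.

# Turn a limits string such as "5%,95%" or "0,95%" into two numeric limits.
# Entries ending in "%" are treated as percentiles of the data (an assumption);
# other entries are taken as literal values.
parse_lims <- function(lims, values) {
  parts <- strsplit(lims, ",", fixed = TRUE)[[1]]
  vapply(parts, function(p) {
    if (grepl("%$", p)) {
      prob <- as.numeric(sub("%$", "", p)) / 100
      unname(stats::quantile(values, prob, na.rm = TRUE))
    } else {
      as.numeric(p)
    }
  }, numeric(1), USE.NAMES = FALSE)
}

scores <- c(NA, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# forcescore = TRUE as described above: NAs are converted to zeroes.
scores[is.na(scores)] <- 0

lims <- parse_lims("5%,95%", scores)

# Build a 100-step gradient from a comma-separated color string (low to high).
colors <- strsplit("black,white", ",", fixed = TRUE)[[1]]
gradient <- grDevices::colorRampPalette(colors)(100)

# Clamp scores to the limits, then map each score to one of the 100 colors.
clamped <- pmin(pmax(scores, lims[1]), lims[2])
color_index <- 1 + round((clamped - lims[1]) / (lims[2] - lims[1]) * 99)
cell_colors <- gradient[color_index]
```

Scores at or below the lower limit take the first color in the string and scores at or above the upper limit take the last, which is why narrowing the limits increases contrast in the middle of the range.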

Vplot options Description
fragmats a vector of fragment-matrix file names from which to draw v-plots.
fragmatnames a character vector of names corresponding to fragmats to label vplots
fragrange range of fragment sizes to define y-axis in vplots.
vplot.colors a string defining colors used to create the color gradient to draw the v-plot (from low scores to high scores). Colors must be separated by commas. Defaults to “black,blue,yellow,red”, where black is the lowest score and red is the highest score.
vplot.lims values defining the range that corresponds to the edges of the v-plot color gradient defined in ‘vplot.colors’. Defaults to “0,95%”.

Misc options Description
cores a natural number defining the number of scorefiles to process simultaneously for each featurefile. Defaults to “max”, or all but one core.
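
One way a “max” setting could be resolved to “all but one core” is sketched below. The helper name resolve_cores is hypothetical and this is not taken from mat.heatmap's source; it only uses parallel::detectCores from base R.

```r
# Hypothetical sketch: resolve a cores argument where "max" means
# all but one of the available cores. Not mat.heatmap's own code.
resolve_cores <- function(cores = "max") {
  if (identical(cores, "max")) {
    available <- parallel::detectCores()
    # detectCores() can return NA on some platforms; fall back to one core.
    if (is.na(available)) available <- 1L
    # Never return fewer than one core, even on a single-core machine.
    return(max(1L, available - 1L))
  }
  as.integer(cores)
}

resolve_cores("max")
resolve_cores(4)
```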

output

mat.heatmap by default will produce one image per matrix defined in ‘mats’, each image containing one heatmap and one average plot. Images will be placed in a directory called ‘heatmaps’ in the current working directory. A .sort file will also be saved in this directory that (in the future) can be used to replot other matrices or vplots with the same groups/sorting without having to redefine how to sort the matrices.

examples

generate basic heatmaps of matrices without reordering rows (TSSs) or splitting rows into groups

make a list of matrix file names

> matrices <- c( "H3K36me3-signal_TSS.mat10" , "MNase-seq_TSS.mat10" )

generate heatmaps

> mat.heatmap ( matrices , numgroups = 1 )

split TSSs into 2 equally-sized groups based on the mean score +/- 100 bp from the TSS of the first matrix in ‘mats’

> mat.heatmap ( matrices , numgroups = 2 , sort.methods = "mean" , sort.ranges = "-100,100" )

in addition to the previous grouping, group each of the two groups further into 3 equally-sized groups based on the location of highest score +/- 50 bp from the TSS

> mat.heatmap ( matrices , numgroups = c(2,3) , sort.methods = c( "mean" , "maxloc" ) , sort.ranges = c( "-100,100" , "-50,50" ) )
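
The two-level grouping in this call can be pictured with a base R sketch on a toy matrix. It is not mat.heatmap's own code: the random data, the 10 bp bins, the helper name equal_groups, and the highest-first ordering are assumptions made for illustration.

```r
# Hypothetical sketch of two-level sequential grouping; not mat.heatmap's code.
set.seed(2)
positions <- seq(-200, 200, by = 10)   # assumed 10 bp bins around the TSS
mat <- matrix(runif(12 * length(positions)), nrow = 12)

# Split per-row scores into n near-equal groups by rank
# (group 1 holds the highest scores; the direction is an assumption).
equal_groups <- function(scores, n) {
  groups <- integer(length(scores))
  ranked <- order(scores, decreasing = TRUE)
  groups[ranked] <- ceiling(seq_along(scores) * n / length(scores))
  groups
}

# Level 1 ("mean", "-100,100"): mean score within +/- 100 bp, 2 groups.
win1 <- positions >= -100 & positions <= 100
primary <- equal_groups(rowMeans(mat[, win1]), 2)

# Level 2 ("maxloc", "-50,50"): position of the highest score within
# +/- 50 bp, 3 groups inside each primary group.
win2 <- which(positions >= -50 & positions <= 50)
maxloc <- positions[win2][apply(mat[, win2], 1, which.max)]
secondary <- ave(maxloc, primary, FUN = function(x) equal_groups(x, 3))

table(primary, secondary)
```

With 12 rows, each of the 2 x 3 = 6 combined groups ends up with 2 rows.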

sequentially k-means cluster genes twice into 2 groups based on the entire region surrounding the TSS in the first matrix in ‘mats’

> mat.heatmap ( matrices , numgroups = c(2,2) , sort.methods = c( "kmeans" , "kmeans" ) )
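
What two sequential rounds of k-means on matrix rows look like can be sketched with stats::kmeans on toy data. This is an illustration under assumed inputs (random rows, nstart = 10), not mat.heatmap's own code.

```r
# Hypothetical sketch of two rounds of k-means on matrix rows;
# not mat.heatmap's own code.
set.seed(3)

# Toy matrix: 4 low-scoring rows followed by 4 high-scoring rows, 20 columns.
mat <- rbind(matrix(rnorm(4 * 20, mean = 0), nrow = 4),
             matrix(rnorm(4 * 20, mean = 5), nrow = 4))

# Round 1: cluster all rows into 2 groups using every column.
primary <- kmeans(mat, centers = 2, nstart = 10)$cluster

# Round 2: within each primary cluster, cluster its rows into 2 groups again.
secondary <- ave(seq_len(nrow(mat)), primary, FUN = function(idx) {
  kmeans(mat[idx, , drop = FALSE], centers = 2, nstart = 10)$cluster
})

table(primary, secondary)
```

Cluster labels from k-means are arbitrary, so the numbering of groups can differ between runs even when the partition itself is the same.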