Argument | Description |
---|---|
object | A profileplyr object |
fun | The function used to summarize the ranges (e.g. rowMeans or rowMax) |
output | Must be either “matrix”, “long”, or “object”. |
keep_all_mcols | If output is ‘long’ and this is set to TRUE, then all metadata columns in the rowRanges will be included in the output. If FALSE (default value), then only the column indicated in the ‘rowGroupsInUse’ slot of the metadata will be included in the output dataframe. |
sampleData_columns_for_longPlot | If output is set to ‘long’, then this argument can be used to add information stored in sampleData(object) to the summarized data frame. This needs to be a character vector with elements matching column names in sampleData(object). |
If the ‘output’ argument is set to ‘matrix’, then only a matrix will be returned, with a single column for each sample containing the bins summarized as indicated by the ‘fun’ argument. The row names of this matrix are unique identifiers for each range, built from the chromosome, start, end, and group.
proplyr_object_subset_sumMat <- profileplyr::summarize(proplyr_object_subset,
                                                       fun = rowMeans,
                                                       output = "matrix")
proplyr_object_subset_sumMat[1:3, ]
## K27ac_esc K4me1_esc
## chr21_34696445_34697205_K27ac_top10_HUES64.bed 4.0326237 0.327227
## chr21_36207782_36209962_K27ac_top10_HUES64.bed 0.9740337 2.296254
## chr7_150734773_150737527_K27ac_top10_HUES64.bed 0.5168534 0.308683
## K4me3_esc K27ac_meso
## chr21_34696445_34697205_K27ac_top10_HUES64.bed 104.73912667 15.517333
## chr21_36207782_36209962_K27ac_top10_HUES64.bed 0.35388878 8.081078
## chr7_150734773_150737527_K27ac_top10_HUES64.bed 0.09848487 11.528448
## K4me1_meso K4me3_meso
## chr21_34696445_34697205_K27ac_top10_HUES64.bed 0.2341562 122.2847307
## chr21_36207782_36209962_K27ac_top10_HUES64.bed 3.6760741 0.9602948
## chr7_150734773_150737527_K27ac_top10_HUES64.bed 2.6178709 0.3257963
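To make the shape of this output concrete, here is a minimal base R sketch of the same idea on toy data. It is not the profileplyr implementation: `summarize_bins`, the range identifiers, and the values are all hypothetical, and it assumes each sample is stored as a ranges-by-bins matrix.

```r
# Toy data (made up): two samples, each a ranges-by-bins matrix
# with 3 ranges (rows) and 4 bins (columns).
range_ids <- c("chr1_100_200_groupA", "chr1_500_600_groupA", "chr2_50_150_groupB")
binned <- list(
  sample1 = matrix(1:12, nrow = 3, dimnames = list(range_ids, NULL)),
  sample2 = matrix(seq(2, 24, by = 2), nrow = 3, dimnames = list(range_ids, NULL))
)

# Hypothetical helper: apply a row-wise summary function to every sample.
# Each call collapses the bins of a range into a single value, so the
# result has one row per range and one column per sample.
summarize_bins <- function(binned_list, fun = rowMeans) {
  out <- vapply(binned_list, fun, numeric(nrow(binned_list[[1]])))
  rownames(out) <- rownames(binned_list[[1]])
  out
}

toy_sumMat <- summarize_bins(binned, fun = rowMeans)
toy_sumMat
```

Passing a different row-wise function as `fun` changes the summary without changing the shape of the result.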
This matrix can be used directly with other heatmap-generating tools, such as heatmap or pheatmap.
library(pheatmap)
pheatmap(proplyr_object_subset_sumMat,
         scale = "row",
         cluster_cols = FALSE,
         show_rownames = FALSE)
If the ‘output’ argument is set to ‘long’, then the output will be a long data frame that can be used for plotting with ggplot. The grouping column of the range metadata, as specified by ‘params(proplyrObject)$rowGroupsInUse’, will automatically be included in the data frame. To include the other range metadata columns as well, set the ‘keep_all_mcols’ argument to TRUE. Columns giving the range (‘combined_ranges’), the sample (‘Sample’), and the summarized signal for that range in that sample (‘Signal’) are included by default.
proplyr_object_subset_long <- profileplyr::summarize(proplyr_object_subset,
                                                     fun = rowMeans,
                                                     output = "long")
proplyr_object_subset_long[1:3, ]
## sgGroup combined_ranges Sample Signal
## 1 K27ac_top10_HUES64.bed chr21_34696445_34697205 K27ac_esc 4.0326237
## 2 K27ac_top10_HUES64.bed chr21_36207782_36209962 K27ac_esc 0.9740337
## 3 K27ac_top10_HUES64.bed chr7_150734773_150737527 K27ac_esc 0.5168534
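The relationship between the matrix and long outputs can be sketched in base R. This is only an illustration of the reshaping, not how profileplyr builds its data frame: the matrix `toy_mat` and its values are made up, and the grouping column is left out for brevity.

```r
# Hypothetical summarized matrix: rows are ranges, columns are samples.
toy_mat <- matrix(c(4.0, 1.0, 0.5, 0.3, 2.3, 0.3),
                  nrow = 3,
                  dimnames = list(c("chr21_1_100", "chr21_200_300", "chr7_1_100"),
                                  c("K27ac_esc", "K4me1_esc")))

# as.table() followed by as.data.frame() gives one row per
# (range, sample) pair, with the matrix value in the third column.
toy_long <- as.data.frame(as.table(toy_mat), stringsAsFactors = FALSE)
colnames(toy_long) <- c("combined_ranges", "Sample", "Signal")

head(toy_long, 3)
```

Each sample becomes a block of rows rather than a column, which is the layout ggplot expects when mapping `Sample` and `Signal` to aesthetics.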
Note: It is often helpful to log-transform the signal so that trends in the quantified signal are easier to see.
library(ggplot2)
ggplot(proplyr_object_subset_long, aes(x = Sample, y = log(Signal))) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
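One thing to watch for when log-transforming: a range with zero summarized signal gives `-Inf`, which a boxplot cannot draw. The sketch below shows one common workaround, adding a pseudocount before taking the log. The pseudocount of 1 and the toy values are assumptions made for illustration, not something this vignette prescribes.

```r
# Made-up summarized signal values, including a zero.
signal <- c(0, 0.5, 4, 100)

# A plain log maps the zero to -Inf.
raw_log <- log(signal)

# Adding a pseudocount of 1 (an arbitrary choice) keeps every value finite
# and maps a signal of 0 to 0.
shifted_log <- log2(signal + 1)

is.finite(raw_log)
is.finite(shifted_log)
```

In the ggplot call, the same idea would mean using `y = log2(Signal + 1)` in place of `y = log(Signal)`.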
Lastly, if the ‘output’ argument is set to ‘object’, then a profileplyr object containing the summarized matrix will be returned. This will allow for further grouping or manipulation of the summarized ranges with other profileplyr functions, as opposed to using the binned ranges that are often used in later examples.
proplyr_object_subset_summ <- profileplyr::summarize(proplyr_object_subset,
                                                     fun = rowMeans,
                                                     output = "object")
assays(proplyr_object_subset_summ)[[1]][1:3, ]
## K27ac_esc K4me1_esc K4me3_esc K27ac_meso K4me1_meso K4me3_meso
## giID1 4.0326237 0.327227 104.73912667 15.517333 0.2341562 122.2847307
## giID10 0.9740337 2.296254 0.35388878 8.081078 3.6760741 0.9602948
## giID100 0.5168534 0.308683 0.09848487 11.528448 2.6178709 0.3257963
rowRanges(proplyr_object_subset_summ)[1:3]
## GRanges object with 3 ranges and 5 metadata columns:
## seqnames ranges strand | name score
## <Rle> <IRanges> <Rle> | <character> <numeric>
## giID1 chr21 34696445-34697205 + | <NA> 0
## giID10 chr21 36207782-36209962 + | <NA> 0
## giID100 chr7 150734773-150737527 + | <NA> 0
## sgGroup giID names
## <factor> <factor> <factor>
## giID1 K27ac_top10_HUES64.bed giID1 <NA>
## giID10 K27ac_top10_HUES64.bed giID10 <NA>
## giID100 K27ac_top10_HUES64.bed giID100 <NA>
## -------
## seqinfo: 24 sequences from an unspecified genome; no seqlengths