object |
A profileplyr object |
fun |
The function used to summarize the ranges (e.g. rowMeans or rowMax) |
scaleRows |
If TRUE, the rows of the matrix containing the signal in each bin that is used as the input for clustering will be scaled (as specified by pheatmap) |
kmeans_k |
The number of kmeans groups used for clustering |
clustering_callback |
Clustering callback function to be passed to pheatmap |
clustering_distance_rows |
Distance measure used in clustering rows. Possible values are “correlation” for Pearson correlation and all the distances supported by dist, such as “euclidean”. If the value is none of these, it is assumed that a distance matrix is provided. |
cluster_method |
Clustering method used. Accepts the same values as hclust. |
cutree_rows |
The number of clusters the rows are divided into when hierarchical clustering is used (passed to pheatmap) |
silent |
Whether or not a heatmap (from pheatmap) is shown with the output. This does not change what the function returns, which is always a profileplyr object. If silent = FALSE, the heatmap is shown, which may help in quickly evaluating different numbers of clusters before proceeding with downstream analysis. The default is silent = TRUE, meaning no heatmap is shown. |
show_rownames |
For any heatmaps printed while running this function, set to TRUE if row names should be displayed. Default is FALSE. |
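The two clustering routes described by the arguments above (k-means via kmeans_k, or hierarchical clustering via clustering_distance_rows, cluster_method and cutree_rows) can be sketched in base R. This is only an illustration under stated assumptions, not the internals of clusterRanges: the matrix `signal` is simulated and stands in for the per-range, per-sample summary that `fun` would produce, and all object names are hypothetical.

```r
# Simulated stand-in for the summarized signal: 30 ranges (rows) x 4 samples (columns).
# Assumption: in a real analysis this matrix would come from applying `fun`
# (e.g. rowMeans) to the binned signal of each sample.
set.seed(0)
signal <- rbind(
  matrix(rnorm(40, mean = 2),  nrow = 10),
  matrix(rnorm(40, mean = 6),  nrow = 10),
  matrix(rnorm(40, mean = 10), nrow = 10)
)
rownames(signal) <- paste0("range", seq_len(nrow(signal)))
colnames(signal) <- paste0("sample", 1:4)

# k-means route, analogous to kmeans_k = 3
km <- kmeans(signal, centers = 3)
kmeans_cluster <- km$cluster

# Hierarchical route, analogous to clustering_distance_rows = "euclidean",
# cluster_method = "complete" and cutree_rows = 3
hc <- hclust(dist(signal, method = "euclidean"), method = "complete")
hclust_cluster <- cutree(hc, k = 3)

# Number of ranges assigned to each cluster by either route
table(kmeans_cluster)
table(hclust_cluster)
```

Both routes return one cluster label per range. Because k-means starts from random centers, a seed is set first, which is also why the example below calls set.seed(0).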
Example
set.seed(0)
proplyr_object_subset <- clusterRanges(proplyr_object_subset,
                                       fun = rowMeans,
                                       kmeans_k = 3)
## K means clustering used.
## A column has been added to the range metadata with the column name 'cluster', and the 'rowGroupsInUse' has been set to this column.
rowRanges(proplyr_object_subset)[1:3]
## GRanges object with 3 ranges and 6 metadata columns:
## seqnames ranges strand | name score
## <Rle> <IRanges> <Rle> | <character> <numeric>
## giID1 chr21 34696445-34697205 + | <NA> 0
## giID10 chr21 36207782-36209962 + | <NA> 0
## giID100 chr7 150734773-150737527 + | <NA> 0
## sgGroup giID names cluster
## <factor> <factor> <factor> <ordered>
## giID1 K27ac_top10_HUES64.bed giID1 <NA> 1
## giID10 K27ac_top10_HUES64.bed giID10 <NA> 3
## giID100 K27ac_top10_HUES64.bed giID100 <NA> 3
## -------
## seqinfo: 24 sequences from an unspecified genome; no seqlengths
Visualize signal in clusters as a ranged heatmap
generateEnrichedHeatmap(proplyr_object_subset,
                        include_group_annotation = TRUE,
                        color_by_sample_group = "chip",
                        ylim = "chip",
                        show_heatmap_legend = rep(FALSE, 6))

Visualize signal in clusters with ggplot using summarize function
library(ggplot2)
proplyr_object_subset_summ <- profileplyr::summarize(proplyr_object_subset,
                                                     fun = rowMeans,
                                                     output = "long")
ggplot(proplyr_object_subset_summ, aes(x = Sample, y = log(Signal))) +
  geom_boxplot() +
  facet_grid(~cluster) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
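To make the shape of the long-format input clearer, here is a small base R sketch with a mock table. The column names Sample, Signal and cluster are taken from the ggplot call above; the values are simulated, and `long_df` and `box_medians` are hypothetical names used only for illustration.

```r
# Mock long-format table: one row per range and sample.
# Assumption: the real output of summarize(..., output = "long") may carry
# additional columns; only the three used in the plot are mimicked here.
set.seed(0)
long_df <- data.frame(
  Sample  = rep(c("sampleA", "sampleB"), each = 6),
  cluster = factor(rep(1:3, times = 4)),
  Signal  = rexp(12, rate = 0.1)
)

# Median log signal per sample and cluster, i.e. the center line of each box
box_medians <- aggregate(log(Signal) ~ Sample + cluster,
                         data = long_df, FUN = median)
box_medians
```

Each row of the result corresponds to one box in the faceted plot: one sample within one cluster.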
