Basic edgeR results for Project SampleReport-microRNA

This report was rendered at 2019-08-05 18:31:20 according to instructions for analysis by the customer. Please read carefully. In case questions arise please contact our responsible project manager at TAmiRNA directly or via .

Introduction

This report is meant to help explore the results obtained form the analysis of a small RNA sequencing experiment. We have used the publicly available Bioconductor toolbox edgeR (Robinson, McCarthy, and Smyth, 2010; McCarthy, Chen, and Smyth, 2012; Chen, Lun, and Smyth, 2014) to prepare these results. While the report is rich, it is meant to help you to start the exploration of your results.

This report is complementary to the Quality Control (QC) report, which has been sent seperately to this report.

Statistical read count analysis

Read count statistics

microRNA diversity across the sample set

This bar chart summarizes the number of distinct microRNA species (sequences), which were expressed in each sample. Two expression cut-offs were implied to call a microRNA as “expressed”: RPM > 1 in at least 1 sample RPM > 10 in at least 1 samples

Exploratory data analysis: heatmaps and hierachical clustering

The following plots illustrate results from different types of clusterings, to inform about the similarity between samples using information from eihter all microRNA or only the 30 most variant microRNA based on their Euclidean distance. In order to find similar clusters the complete linkage method was applied.
The underlying data are the RPM count data. Only microRNAs which showed a RPM value of > 1.0 in each individual sample are considered for this analysis.
For reasons of easier readability and interpretation the data has been scaled using unit variance method (\(a = a / \sigma_{a}\)) and centered around the mean value.

All microRNAs

Values for 334 microRNAs are shown (rows). Columns are the samples of the experiment. This figure gives an overview of whether the known factors have a strong impact on microRNA expression. This can be expected in case microRNA expression in different cell types or tissues is compared, or in case the tested treatment has profound effects on cell phenotype.

30 most variant microRNAs

### preparing heatmap for top miRNAs based on coeff of var
coeffVar <- function(x) (sd(x)/mean(x))

coVar <- apply(rpmThresh1,1,coeffVar)
top30 <-names(tail(sort(coVar),30))

rpmThresh1Top30 <- rpmThresh1[rownames(rpmThresh1) %in% top30, ]

# removing 4 letters of prefix
# rownames(rpmThresh1Top30) <-  gsub("^....","",rownames(rpmThresh1Top30))

# using pcaMethods package for scaling
# transposing as done in clustvis
scaled <- prep(t(rpmThresh1Top30), scale = "uv", center = TRUE)
scaled <- t(scaled)

## Define colors to use for the heatmap if none were supplied
colVec <- colorRampPalette( rev(brewer.pal(11, "Spectral")) )(100)

# adopted form clustvis
digits <- 2
colMin <- round(min(scaled,na.rm = TRUE),digits) - 10 ** (-digits)
colMax <- round(max(scaled,na.rm = TRUE),digits) + 10 ** (-digits)

colBreaks <- seq(colMin, colMax, length.out = 101)

## Adding additional column annotations
rownames(colanno) <- colnames(rpmThresh1Top30)

# create dummy heatmap to get columns and rows clustering data
#H <- pheatmap(scaled, silent = T)

## Make the heatmap without row clustering shown
# if (outputIsHTML) {
#   d3heatmap(scaled[H$tree_row$order,H$tree_col$order], color = colVec, breaks = colBreaks,  annotation_col = colanno, annotation_colors = ann_colors, show_rownames = T)
# } else {
#   pheatmap(scaled[H$tree_row$order,H$tree_col$order], cluster_rows = F,
#            color = colVec, breaks = colBreaks,  annotation_col = colanno, 
#            annotation_colors = ann_colors, show_rownames = T, 
#            fontsize_col = 8)
# }

# version with rownames and row clustering shown
if (outputIsHTML) {
  d3heatmap(scaled, color = colVec, breaks = colBreaks,  annotation_col = colanno, annotation_colors = ann_colors, show_rownames = T)
} else {
  pheatmap(scaled, cluster_rows = T,
           color = colVec, breaks = colBreaks,  annotation_col = colanno, 
           annotation_colors = ann_colors, show_rownames = T, 
           fontsize_col = 7, fontsize_row = 7)
}
# pheatmap(scaled[H$tree_row$order,H$tree_col$order], cluster_rows = F, 
#         color = colVec, breaks = colBreaks,  annotation_col = colanno,
#         annotation_colors = ann_colors, show_rownames = TRUE, fontsize_row = 7,
#         fontsize_col = 8)

Values for the top 30 microRNA with the highest variation, measured as the coefficient of variation, are shown (rows). Columns are the samples of the experiment. This figure whether the variablity in microRNAs that have the highest relative variance (CV%), is determined by the experimental factors. If this is not the case, either the experimental factors do not have a strong influence on microRNA levels, or there are other factors (which might not have been controlled in this study), which have an even bigger effect on microRNA expression.

Exploratory data analysis: principal component analysis (PCA)

## Transform count data
rld <- tryCatch(rlog(dds), error = function(e) { rlog(dds, fitType = 'mean') })

## Get percent of variance explained
data_pca <- plotPCA(rld, intgroup = intgroup, returnData = TRUE)
percentVar <- round(100 * attr(data_pca, "percentVar"))

## Perform PCA analysis and make plot
if (outputIsHTML) {
  tooltips <- paste(data_pca$name)
  
  scatterD3(x=data_pca$PC1, y=data_pca$PC2, xlab=paste0("PC1: ",percentVar[1], "% variance"), 
            ylab=paste0("PC2: ",percentVar[2], "% variance"), tooltip_text = tooltips,
            col_var=data_pca$group, lab=data_pca$name)
  
} else {
  plotPCA(rld, intgroup = intgroup)
}
#ggsave(paste0(outdir,"/plots/PCA_rld_all",SpeciesId,".png"))

## Get percent of variance explained
#data_pca <- plotPCA(rld, intgroup = intgroup, returnData = TRUE)
#percentVar <- round(100 * attr(data_pca, "percentVar"))

The above plot shows the first two principal components that explain the variability in the data using the regularized log count data. This analysis is based on 395 microRNA. In this case, the first and second principal component explain 55 and 18 percent of the variance, respectively.

If you are unfamiliar with principal component analysis, you might want to check the Wikipedia entry or this interactive explanation.

## Obtain the sample euclidean distances

# Sample-to-sample distances based on rld

sampleDist <- dist(t(assay(rld)))
sampleDistMatrix <- as.matrix(sampleDist)
## Add names based on intgroup
colnames(sampleDistMatrix) <- apply(as.data.frame(colData(rld)[, intgroup]), 1,
    paste, collapse = ' : ')
rownames(sampleDistMatrix) <- NULL

## Define colors to use for the heatmap if none were supplied
if(is.null(colors)) {
    colors <- colorRampPalette( rev(brewer.pal(9, "Blues")) )(255)
}

## Adding additional column annotations
colnames(sampleDistMatrix) <- gsub("_NEW$","",colnames(sampleDistMatrix))
# aux <- strsplit(colnames(sampleDistMatrix),"_")
# p1 <- unique(unlist(aux)[2*(1:length(colnames(sampleDistMatrix)))])
# p2 <- unique(unlist(aux)[2*(1:length(colnames(sampleDistMatrix)))-1])
# colanno <- data.frame(age=as.factor(ifelse(grepl(p1[1],colnames(sampleDistMatrix)),p1[1],p1[2])),
#                         muscle_type=as.factor(ifelse(grepl(p2[1],colnames(sampleDistMatrix)),p2[1],p2[2])))
# 
# a1 <- unname(unlist(sapply(table(colnames(sampleDistMatrix)),function(x) seq(1:x))))
colnames(sampleDistMatrix) <-indexedSamples # paste0(colnames(sampleDistMatrix),"_",a1)
rownames(colanno) <- colnames(sampleDistMatrix)

## Make the heatmap
pheatmap(sampleDistMatrix, clustering_distance_rows = sampleDist,
    clustering_distance_cols = sampleDist, color = colors, annotation_col = colanno)


#This plot shows how samples are clustered based on their euclidean distance using the #regularized log transformed count data. This figure gives an overview of how the #samples are hierarchically clustered. It is a complementary figure to the PCA plot.

Differential Expression Analysis

Important for results interpretation: Note that according to the used convention throughout this chapter always the second given group serves as the reference. E.g. for a given comparison of \(A\) vs. \(B\) a \(log2FoldChange > 0\) indicates that the respective microRNA is upregulated in group \(A\).

GroupA vs GroupB

### EdgeR version
pvalue <- 0.05
alpha <- 0.1

write(contrasts[i], stderr())

## for edgeR have to swap groups !!!
sampleContrast<-rev(c(unlist(strsplit(contrasts[i],"#"))))
write(sampleContrast, stderr())
sampleContrast <- gsub("\\(all\\)","all",sampleContrast) # no brackets allowed 

##############
# > ?exactTest
# Note that the first group listed in the pair is the baseline 
# for the comparison—so if the pair is c("A","B") then the comparison
# is B - A, so genes with positive log-fold change are up-regulated
# in group B compared with group A
# => exactly vice versa to deseq2 !!!
##############

write("sampleContrast:", stderr())
write(sampleContrast, stderr())

et <- exactTest(dgeLocal, pair=sampleContrast)
tt <- topTags(et, n=Inf)

selected_samples <- (which(contrastGroups==sampleContrast[1] | contrastGroups==sampleContrast[2]))

z<-as.data.frame(dgeLocal$pseudo.counts)[,selected_samples]
t<-as.data.frame(tt)
data <- merge(z,t,by="row.names")
row.names(data)<-data$Row.names
data$Row.names<-NULL

# 0205: get all results, no FDR cutoff applied in xlsx outputs
#selected <- which(data$FDR<=pvalue)
#result <- data[selected,]

result <-data

## Exporting to xlsx
xlsfile = paste0(outdir,"/",project,"_",contrastLabelXls,".xlsx")
wb <- createWorkbook(type = "xlsx")

if (nrow(result) > 0){
  wb <- writeFormatedSheet(wb,paste0(SpeciesId,"_edgeR"),result,contrastGroups[selected_samples])
} else {
  cat("using a FDR threshold of", alpha)
  cat("\nno significant differential expressed ", SpeciesId, " found for contrast ", contrastLabel)
  cat("\nno excel outputs created !!!")
}

saveWorkbook(wb,xlsfile)
ddsLocal <- DEFormats::as.DESeqDataSet(dgeLocal)

## Fake having created results
mcols(ddsLocal) <- DataFrame(pvalue = et$table$PValue)
mcols(mcols(ddsLocal)) <- DataFrame(type = 'results',
    description = 'manual incomplete conversion from edgeR to DESeq2')

## Fix column names
colnames(et$table)[colnames(et$table) == 'PValue'] <- 'pvalue'
colnames(et$table)[colnames(et$table) == 'logFC'] <- 'log2FoldChange'
colnames(et$table)[colnames(et$table) == 'logCPM'] <- 'baseMean'

## Define res input
res2 <- DESeqResults(DataFrame(et$table))

## Save alpha value
metadata(res2)[["alpha"]] <- alpha

res2 <- DESeq2:::pvalueAdjustment(res2, 
            independentFiltering = FALSE, filter = filter,
theta = theta, alpha = alpha, pAdjustMethod = 'BH')
## For ggplot
res2.df <- as.data.frame(res2)

## Sort results by adjusted p-values
#ord <- order(res2.df$padj, decreasing = FALSE)
ord <- order(res2.df$pvalue, decreasing = FALSE) # 01092018
res2.df <- res2.df[ord, ]

rownames(res2.df) <- cutRNANames(SpeciesId, rownames(res2.df))

res2.df <- cbind(data.frame(Feature = rownames(res2.df)), res2.df)
rownames(res2.df) <- NULL

Volcano Plot for GroupA vs GroupB

results<-as.data.frame(res2[,c("log2FoldChange","pvalue","padj")])
results$miRNA <-  cutRNANames(SpeciesId, rownames(results))
results <- dplyr::mutate(results, sig=ifelse(results$padj<0.05, "FDR<0.05", "Not Sig"))

annoRNA <- 20 # number on xxRNA to annotate in plot
#if (sum(results$sig == "FDR<0.05",na.rm = T) < annoRNA) annoRNA <- sum(results$sig == "FDR<0.05",na.rm = T )
if (dim(results)[1] < annoRNA) annoRNA <- dim(results)[1]

The top 20 microRNA by p-value (unadjusted) are annotated in the plot.

p <- ggplot(results, aes(log2FoldChange, -log10(pvalue))) +
  geom_point(aes(col=sig)) +
  scale_color_manual(values=c("red", "black")) 

if (annoRNA > 0) {
  #data <- head(results[order(results$padj),],annoRNA)
  data <- head(results[order(results$pvalue),],annoRNA) # 01092018
  p <- p + geom_text_repel(data=data, aes(label=miRNA))
}

if(outputIsHTML) {
  p
  #ggplotly(p)
} else {
  p
}

Count Plot for Top 5 microRNAs by p-value for GroupA vs GroupB

Note that the following plots show RPM values and not the count data used internally in edgeR after library normalization (TMM values).

## order by logFC & get names of top 5 miRNA
# ordFC <- order(res2.df$log2FoldChange, decreasing = T)
# ordFC[1:5]
top5 <- subset(res2.df, log2FoldChange > 0.0)
top5 <- as.character(top5[1:5,c("Feature")])

down5 <-  subset(res2.df, log2FoldChange < 0.0)
down5 <- as.character(down5[1:5,c("Feature")])

colnames(rpmTotal) <- fileSpec$indexedSamples 

# remove technical replicates
repsRemoved <- !grepl('_REP',fileSpec$tec_rep)

############ top 5 upregulated ##########
## fetching RPM values 
topReads <- rpmTotal[rownames(rpmTotal) %in% top5, repsRemoved]

## rownames to column
topReads$miRNA <- rownames(topReads)
rownames(topReads) <- NULL

## convert to wide form
wideForm <- cbind(topReads$miRNA, stack(lapply(topReads[,names(topReads)!="miRNA"],as.character)))
wideForm$ind <- gsub("\\_\\d+$", "", wideForm$ind)

newDf <- data.frame(miRNA=wideForm$`topReads$miRNA`,RPM=as.numeric(wideForm$values), group=wideForm$ind)

## relevel groups accordingly
if (length(unique(groups)) > 2 ) {
  t3 <- tryCatch(guessOrder(sampleContrast,unique(newDf$group)),error=function(e) NULL)
  if (!is.null(t3)) {
    ## relevel to get current contrast at first
    newDf$group <- factor(newDf$group, levels = t3)
  }
}

# removing data with RPM == 0, so no log transform warning...
newDf <- subset(newDf, RPM > 0.0)

p <- ggplot(newDf,aes(x=miRNA,y=RPM,color=group, fill=group)) +
  geom_point(position = position_jitterdodge(dodge.width = 0.9), size = 0.1) +
  geom_boxplot(alpha=0.5, outlier.color = NA, position = position_dodge(width=0.9)) +
  coord_trans(y = "log10") +
  #scale_y_log10() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) + 
  labs(title=paste0("Top 5 upregulated ", SpeciesId, "s by p-value"), 
       x = SpeciesId, y ="log10 RPM Value") +
  theme(legend.position = 'bottom') +
  guides(colour=guide_legend(ncol=2))

if(outputIsHTML) {
  p
  #ggplotly(p)
} else {
  p
}

##### top 5 down regulated
## fetching RPM values 
topReads <- rpmTotal[rownames(rpmTotal) %in% down5, repsRemoved]

## rownames to column
topReads$miRNA <- rownames(topReads)
rownames(topReads) <- NULL

## convert to wide form
wideForm <- cbind(topReads$miRNA, stack(lapply(topReads[,names(topReads)!="miRNA"],as.character)))
wideForm$ind <- gsub("\\_\\d+$", "", wideForm$ind)

newDf <- data.frame(miRNA=wideForm$`topReads$miRNA`,RPM=as.numeric(wideForm$values), group=wideForm$ind)

if (length(unique(groups)) > 2 ) {
  t3 <- tryCatch(guessOrder(sampleContrast,unique(newDf$group)),error=function(e) NULL)
  if (!is.null(t3)) {
    ## relevel to get current contrast at first
    newDf$group <- factor(newDf$group, levels = t3)
  }
}

# removing data with RPM == 0, so no log transform warning...
newDf <- subset(newDf, RPM > 0.0)
p <- ggplot(newDf,aes(x=miRNA,y=RPM,color=group, fill=group)) +
  geom_point(position = position_jitterdodge(dodge.width = 0.9), size = 0.1) +
  geom_boxplot(alpha=0.5, outlier.color = NA, position = position_dodge(width=0.9)) +
  coord_trans(y = "log10") +
  #scale_y_log10() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) + 
  labs(title=paste0("Top 5 down regulated ", SpeciesId,"s by p-value"),
       x = SpeciesId, y ="log10 RPM Value") +
  theme(legend.position = 'bottom') +
  guides(colour=guide_legend(ncol=2))

if(outputIsHTML) {
  p
  #ggplotly(p)
} else {
  p
}

P-values distribution for GroupA vs GroupB

## P-value histogram plot
ggplot(res2.df[!is.na(res2.df$pvalue), ], aes(x = pvalue)) +
    geom_histogram(alpha=.5, position='identity', bins = 50) +
    labs(title='Histogram of unadjusted p-values') +
    xlab('Unadjusted p-values') +
    xlim(c(0, 1.0005))
## Warning: Removed 2 rows containing missing values (geom_bar).

#ggsave(paste0(outdir,"/plots/pVal_distr_",contrastLabelXls, "_", SpeciesId,".png"))

This histogram plot shows a histogram of the unadjusted p-values. It might be skewed right or left, or flat as shown in the Wikipedia examples. The shape depends on the percent of features that are differentially expressed. For further information on how to interpret a histogram of p-values you might be interested in David Robinson’s post on this topic.

## P-value distribution summary
summary(res2.df$pvalue)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.0000064 0.1729613 0.3934454 0.4298441 0.6768447 1.0000000

This is the numerical summary of the distribution of the p-values.

## Split features by different p-value cutoffs
pval_table <- lapply(c(1e-04, 0.001, 0.01, 0.025, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5,
    0.6, 0.7, 0.8, 0.9, 1), function(x) {
    data.frame('Cut' = x, 'Count' = sum(res2.df$pvalue <= x, na.rm = TRUE))
})
pval_table <- do.call(rbind, pval_table)

if(outputIsHTML) {
    kable(pval_table, format = 'markdown', align = c('c', 'c'))
} else {
    kable(pval_table, format ='latex', booktabs =T) %>%
      kable_styling(position = "center")
}
Cut Count
0.0001 5
0.0010 11
0.0100 24
0.0250 40
0.0500 50
0.1000 75
0.2000 110
0.3000 161
0.4000 204
0.5000 238
0.6000 266
0.7000 302
0.8000 333
0.9000 358
1.0000 395

This table shows the number of features with p-values (unadjusted) less or equal than commonly used cut-off values for significance levels.

## Add search url if appropriate
if(!is.null(searchURL) & outputIsHTML & (SpeciesId == "microRNA")) {
    res2.df$Feature <- paste0('<a href="', searchURL, res2.df$Feature, '">',
        res2.df$Feature, '</a>')
}

for(i in which(colnames(res2.df) %in% c('pvalue', 'padj'))) res2.df[, i] <- format(res2.df[, i], scientific = TRUE)

if(outputIsHTML) {
    datatable(head(res2.df, n = nBest), options = list(pagingType='full_numbers', pageLength=10, scrollX='100%'), escape = FALSE, rownames = FALSE) %>% formatRound(which(!colnames(res2.df) %in% c('pvalue', 'padj', 'Feature')), digits)
} else {
    res2.df_top <- head(res2.df, n = 20)
    for(i in which(!colnames(res2.df) %in% c('pvalue', 'padj', 'Feature'))) res2.df_top[, i] <- round(res2.df_top[, i], digits)
    kable(res2.df_top, format ='latex', booktabs =T) %>%
      kable_styling(position = "center")
}
### Count Plots Top Features
# plotCounts_gg <- function(i, ddsLocal, intgroup) {
#     group <- if (length(intgroup) == 1) {
#         colData(ddsLocal)[[intgroup]]
#     } else if (length(intgroup) == 2) {
#         lvls <- as.vector(t(outer(levels(colData(ddsLocal)[[intgroup[1]]]),
#             levels(colData(ddsLocal)[[intgroup[2]]]), function(x,
#                 y) paste(x, y, sep = " : "))))
#         droplevels(factor(apply(as.data.frame(colData(ddsLocal)[,
#             intgroup, drop = FALSE]), 1, paste, collapse = " : "),
#             levels = lvls))
#     } else {
#         factor(apply(as.data.frame(colData(ddsLocal)[, intgroup, drop = FALSE]),
#             1, paste, collapse = " : "))
#     }
#     data <- plotCounts(ddsLocal, gene=i, intgroup=intgroup, returnData = TRUE)
#     data <- cbind(data, data.frame('group' = group))
#     main <- rownames(ddsLocal)[i]
# 
#     ggplot(data, aes(x = group, y = count)) + geom_point() + ylab('Normalized count') + ggtitle(main) + coord_trans(y = "log10")
# }
# for(i in head(ord, nBestFeatures)) {
#     print(plotCounts_gg(i, dds = ddsLocal, intgroup = intgroup))
# }
## Make MDS plot for edgeR results
colnames(dgeLocal) <- unname(indexedSamples)
plotMDS(dgeLocal)

# `r ifelse(isEdgeR, 'This plot is a multidimensional scaling plot of distances between feature expression profiles. It shows the names of the samples in a two-dimensional scatterplot such that the distances are approximately the log2 fold changes between samples. Check the [edgeR vignette](http://www.bioconductor.org/packages/edgeR) for further details regarding this plot.', '')`

Summary of Differential Expression

This Bar Chart provides a summary of the up- and down-regulated microRNAs from every contrast, based on the following criteria: FDR < 10% Fold Change > 2-fold

vennPlotMine(list(vennsetdown,vennsetup),mymain = "", mysub = "",
             colmode = 2,ccol = c("blue","red"), ccex = c(0.8,0.8))

#olBarplot2(vennsetup,mincount=0)
#olBarplot2(vennsetdown,mincount=0)

Reproducibility

Software used in analysis pipeline

[1] "samtools 1.9"
[1] "cutadapt 2.3"
[1] "bowtie-align-s version 1.2.2"
[1] "sRNAbench version 1.5 - 6/2018"
[1] "mirBase library version used: mature_22.fa"
[1] "genome used: GRCh38_p10_mp"
other libraries used: 
[[1]]
[1] "libs=GRCh38_p10_ncRNA"   "GRCh38_p10_RNAcentral"  
[3] "GRCh38_p10_genomic_tRNA" "GRCh38_p10_cdna"        

The input for this report was generated with edgeR (Robinson, McCarthy, and Smyth, 2010; McCarthy, Chen, and Smyth, 2012; Chen, Lun, and Smyth, 2014) . This report was generated in path /home/ubuntu/Desktop/190805-p57_reanalysis/raw/HWFLVBGX9_20190319B_demux_1.

Date the report was generated.

[1] "2019-08-05 18:31:30 CEST"

Wallclock time spent generating the report.

Time difference of 10.214 secs

R session information.

R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                  LC_TIME=de_AT.UTF-8          
 [4] LC_COLLATE=en_US.UTF-8        LC_MONETARY=de_AT.UTF-8       LC_MESSAGES=en_US.UTF-8      
 [7] LC_PAPER=de_AT.UTF-8          LC_NAME=de_AT.UTF-8           LC_ADDRESS=de_AT.UTF-8       
[10] LC_TELEPHONE=de_AT.UTF-8      LC_MEASUREMENT=de_AT.UTF-8    LC_IDENTIFICATION=de_AT.UTF-8

attached base packages:
 [1] tools     parallel  stats4    stats     graphics  grDevices utils     datasets  methods  
[10] base     

other attached packages:
 [1] rmarkdown_1.13              knitcitations_1.0.8         scatterD3_0.9              
 [4] d3heatmap_0.6.1.2           plotly_4.9.0                knitr_1.23                 
 [7] RColorBrewer_1.1-2          pheatmap_1.0.12             DT_0.6                     
[10] myReport_1.0                xlsx_0.6.1                  systemPipeR_1.18.0         
[13] ShortRead_1.42.0            GenomicAlignments_1.20.0    Rsamtools_2.0.0            
[16] Biostrings_2.52.0           XVector_0.24.0              pcaMethods_1.76.0          
[19] ggrepel_0.8.1               ggplot2_3.1.1               dplyr_0.8.1                
[22] vsn_3.52.0                  DESeq2_1.24.0               SummarizedExperiment_1.14.0
[25] DelayedArray_0.10.0         BiocParallel_1.18.0         matrixStats_0.54.0         
[28] Biobase_2.44.0              GenomicRanges_1.36.0        GenomeInfoDb_1.20.0        
[31] IRanges_2.18.1              S4Vectors_0.22.0            BiocGenerics_0.30.0        
[34] edgeR_3.26.4                limma_3.40.2               

loaded via a namespace (and not attached):
  [1] backports_1.1.4          GOstats_2.50.0           Hmisc_4.2-0             
  [4] plyr_1.8.4               lazyeval_0.2.2           GSEABase_1.46.0         
  [7] splines_3.6.0            crosstalk_1.0.0          digest_0.6.19           
 [10] htmltools_0.3.6          GO.db_3.8.2              magrittr_1.5            
 [13] checkmate_1.9.3          memoise_1.1.0            BSgenome_1.52.0         
 [16] base64url_1.4            cluster_2.0.8            annotate_1.62.0         
 [19] prettyunits_1.0.2        colorspace_1.4-1         blob_1.1.1              
 [22] rappdirs_0.3.1           xfun_0.7                 crayon_1.3.4            
 [25] RCurl_1.95-4.12          jsonlite_1.6             graph_1.62.0            
 [28] genefilter_1.66.0        brew_1.0-6               survival_2.43-3         
 [31] VariantAnnotation_1.30.1 glue_1.3.1               gtable_0.3.0            
 [34] DEFormats_1.12.0         zlibbioc_1.30.0          Rgraphviz_2.28.0        
 [37] scales_1.0.0             DBI_1.0.0                bibtex_0.4.2            
 [40] Rcpp_1.0.1               viridisLite_0.3.0        xtable_1.8-4            
 [43] progress_1.2.2           htmlTable_1.13.1         foreign_0.8-70          
 [46] bit_1.1-14               preprocessCore_1.46.0    Formula_1.2-3           
 [49] AnnotationForge_1.26.0   htmlwidgets_1.3          httr_1.4.0              
 [52] acepack_1.4.1            pkgconfig_2.0.2          XML_3.98-1.19           
 [55] rJava_0.9-11             nnet_7.3-12              locfit_1.5-9.1          
 [58] labeling_0.3             later_0.8.0              tidyselect_0.2.5        
 [61] rlang_0.3.4              AnnotationDbi_1.46.0     munsell_0.5.0           
 [64] RSQLite_2.1.1            evaluate_0.14            stringr_1.4.0           
 [67] yaml_2.2.0               RefManageR_1.2.12        bit64_0.9-7             
 [70] purrr_0.3.2              RBGL_1.60.0              mime_0.6                
 [73] xml2_1.2.0               biomaRt_2.40.0           BiocStyle_2.12.0        
 [76] compiler_3.6.0           rstudioapi_0.10          curl_3.3                
 [79] png_0.1-7                affyio_1.54.0            tibble_2.1.2            
 [82] geneplotter_1.62.0       stringi_1.4.3            highr_0.8               
 [85] GenomicFeatures_1.36.1   lattice_0.20-38          Matrix_1.2-17           
 [88] markdown_0.9             pillar_1.4.1             BiocManager_1.30.4      
 [91] data.table_1.12.2        bitops_1.0-6             httpuv_1.5.1            
 [94] rtracklayer_1.44.0       R6_2.4.0                 latticeExtra_0.6-28     
 [97] affy_1.62.0              hwriter_1.3.2            promises_1.0.1          
[100] gridExtra_2.3            knitrBootstrap_1.0.2     assertthat_0.2.1        
[103] xlsxjars_0.6.1           Category_2.50.0          rjson_0.2.20            
[106] withr_2.1.2              batchtools_0.9.11        GenomeInfoDbData_1.2.1  
[109] hms_0.4.2                grid_3.6.0               rpart_4.1-15            
[112] tidyr_0.8.3              shiny_1.3.2              lubridate_1.7.4         
[115] base64enc_0.1-3          ellipse_0.4.1           

Pandoc version used: 2.7.2.

Bibliography

This report was created using rmarkdown while knitr (Xie, 2014) and DT (Xie, Cheng, and Tan, 2019) were running behind the scenes. pheatmap (Kolde, 2019) was used to create the sample distances heatmap. Several plots were made with ggplot2 (Wickham, 2016).

Citations made with knitcitations (Boettiger, 2017). The BibTeX file can be found here.

[1] C. Boettiger. knitcitations: Citations for ‘Knitr’ Markdown Files. R package version 1.0.8. 2017. URL: https://CRAN.R-project.org/package=knitcitations.

[1] Y. Chen, A. T. L. Lun, and G. K. Smyth. “Differential expression analysis of complex RNA-seq experiments using edgeR”. In: Statistical Analysis of Next Generation Sequencing Data. Ed. by S. Datta and D. Nettleton. New York: Springer, 2014, pp. 51-74.

[1] R. Kolde. pheatmap: Pretty Heatmaps. R package version 1.0.12. 2019. URL: https://CRAN.R-project.org/package=pheatmap.

## No encoding supplied: defaulting to UTF-8.
## No encoding supplied: defaulting to UTF-8.

[1] H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016. ISBN: 978-3-319-24277-4. URL: https://ggplot2.tidyverse.org.

[1] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. ISBN 978-1466561595. Chapman and Hall/CRC, 2014. URL: http://www.crcpress.com/product/isbn/9781466561595.

[1] Y. Xie, J. Cheng, and X. Tan. DT: A Wrapper of the JavaScript Library ‘DataTables’. R package version 0.6. 2019. URL: https://CRAN.R-project.org/package=DT.