Overview

Summarize the tasks for the tertiary analysis, or the points relevant to the summary report, at the top of the report.

  • Each project should have at least two R Markdown reports:
    • Analysis report for tertiary analysis: the entire analysis code, with outputs present in the R Markdown for proper review. It should remain internal/private until discussed with the investigator or until we agree on publication procedures.
    • Summary report: a summary to be shared with the PI/lab that highlights file path locations for outputs, any concerns noted with the data, etc.
    • Examples of both can be found in (summary report markdown in the summary_report subdirectory):

NOTE: the U_BDS_authorship_note.html, which is included here after the body of the report, is typically used in the summary report shared with the PI/lab (or in any other analysis that is shared). For internal code/analysis, it doesn’t need to be included.

Project structure

In addition to version control, etc., the project structure follows the basic concepts shown in the first couple of lessons of the R Carpentry material we teach at the core: https://swcarpentry.github.io/r-novice-gapminder/02-project-intro/index.html

At the very minimum, this includes:

  • Use of the RStudio project management functionality: every git repo linked to an analysis should have a .Rproj file.
  • A ./results and a ./data directory.
  • The data in both directories above are not committed to git repositories due to size (they are ignored in .gitignore), but the locations of data inputs and resulting files should be clearly listed in the README.md of the repository and in the R Markdown as well. This may be a link to the Wrike project where paths are found within tasks, and/or direct links to Box and/or the U-BDS Cheaha directory.
  • Commit and push often to avoid loss of information.
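
The checklist above can be sketched as a quick sanity check. This is only an illustration under stated assumptions: check_project() is a hypothetical helper (not part of any package), and the demo directory is a throwaway created under tempdir() rather than a real repository.

```r
# Sketch of a project-structure check; check_project() is a hypothetical
# helper written for this example, not an existing function.
check_project <- function(root = ".") {
    gitignore <- file.path(root, ".gitignore")
    ignored <- if (file.exists(gitignore)) readLines(gitignore, warn = FALSE) else character(0)
    # Normalise entries such as "./data/", "/data" or "data/" to "data"
    ignored <- gsub("^\\.?/|/$", "", trimws(ignored))

    c(
        rproj_file      = length(list.files(root, pattern = "\\.Rproj$")) > 0,
        data_dir        = dir.exists(file.path(root, "data")),
        results_dir     = dir.exists(file.path(root, "results")),
        readme          = file.exists(file.path(root, "README.md")),
        data_ignored    = "data" %in% ignored,
        results_ignored = "results" %in% ignored
    )
}

# Demonstrate on a throwaway directory that mimics a fresh project
demo_root <- file.path(tempdir(), "demo_project")
dir.create(file.path(demo_root, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(demo_root, "results"), showWarnings = FALSE)
invisible(file.create(file.path(demo_root, "demo_project.Rproj")))
writeLines(c("data/", "results/"), file.path(demo_root, ".gitignore"))

status <- check_project(demo_root)
print(status)
```

In this demo no README.md is created, so that entry is reported as FALSE; in a real repository, calling check_project() from the project root would flag whichever items are still missing.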

Packages loaded globally and custom functions

# Required packages and set seed
# EXAMPLE: change as needed....
set.seed(1234)
library(Seurat)
library(dplyr)
library(ggplot2)
library(AnnotationHub)

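# custom function(s)
# plot_output(): wraps saving of png/pdf in a single function and evaluates to be shown in report
# p: plot object (e.g. a ggplot); file_name: output path without file extension
# w_png/h_png are in pixels; w_pdf/h_pdf are in inches
plot_output <- function(p, file_name, w_png=700, h_png=600, w_pdf=12, h_pdf=8){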
    
    png(paste0(file_name,".png"), width = w_png, height = h_png)
    plot(eval(p))
    dev.off()
    
    pdf(paste0(file_name,".pdf"), width = w_pdf, height = h_pdf)
    plot(eval(p))
    dev.off()
    
    eval(p)
}

# make results directory in case it doesn't exist

dir.create("./results", showWarnings = FALSE)

Start analysis

Write any descriptions as you see fit (again, review the examples above when needed). Note that for single-cell analysis you may not want to re-run all code when compiling the report. If there are time-consuming tasks (e.g., peak calling with MACS2), it’s OK to set eval=FALSE, as long as the code saves the appropriate outputs, including the R object itself. Once the time-consuming steps have been run, load the saved object in the R Markdown and keep the remaining output-producing code evaluated where possible (again, you will see some of this in the examples above; when in doubt, ask Dr. Lara Ianov).
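
A minimal sketch of the save-then-load pattern described above, using base R's saveRDS() and readRDS(). The names slow_step() and rds_path are hypothetical stand-ins introduced for illustration; in a real report the first part would sit in a chunk set to eval=FALSE after its first run, and the file would live under ./results rather than tempdir().

```r
# Hypothetical stand-in for a time-consuming step (e.g. peak calling);
# replace with the real analysis code.
slow_step <- function() {
    Sys.sleep(0.1)
    data.frame(cluster = c("a", "b"), n_cells = c(120, 80))
}

# Hypothetical output path; a project would typically use ./results instead
rds_path <- file.path(tempdir(), "slow_step_result.rds")

# Part 1: run once, then place in a chunk with eval=FALSE.
# The code stays visible in the report but is not re-run on every knit.
result <- slow_step()
saveRDS(result, rds_path)

# Part 2: always evaluated when knitting.
# Load the saved object and continue with the lighter downstream steps.
result <- readRDS(rds_path)
head(result)
```

Keeping the saving code in the report, even when it is not evaluated, lets a reviewer see exactly how the loaded object was produced.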

print("Your analysis starts here")
## [1] "Your analysis starts here"

sessionInfo

This should always be present at the end of your analysis!

sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] AnnotationHub_3.2.0 BiocFileCache_2.2.0 dbplyr_2.1.1       
## [4] BiocGenerics_0.40.0 ggplot2_3.3.5       dplyr_1.0.7        
## [7] SeuratObject_4.0.4  Seurat_4.0.6       
## 
## loaded via a namespace (and not attached):
##   [1] plyr_1.8.6                    igraph_1.2.11                
##   [3] lazyeval_0.2.2                splines_4.1.2                
##   [5] listenv_0.8.0                 scattermore_0.7              
##   [7] GenomeInfoDb_1.30.0           digest_0.6.29                
##   [9] htmltools_0.5.2               fansi_1.0.0                  
##  [11] magrittr_2.0.1                memoise_2.0.1                
##  [13] tensor_1.5                    cluster_2.1.2                
##  [15] ROCR_1.0-11                   globals_0.14.0               
##  [17] Biostrings_2.62.0             matrixStats_0.61.0           
##  [19] spatstat.sparse_2.1-0         colorspace_2.0-2             
##  [21] blob_1.2.2                    rappdirs_0.3.3               
##  [23] ggrepel_0.9.1                 xfun_0.29                    
##  [25] RCurl_1.98-1.5                crayon_1.4.2                 
##  [27] jsonlite_1.7.2                spatstat.data_2.1-2          
##  [29] survival_3.2-13               zoo_1.8-9                    
##  [31] glue_1.6.0                    polyclip_1.10-0              
##  [33] gtable_0.3.0                  zlibbioc_1.40.0              
##  [35] XVector_0.34.0                leiden_0.3.9                 
##  [37] future.apply_1.8.1            abind_1.4-5                  
##  [39] scales_1.1.1                  DBI_1.1.2                    
##  [41] miniUI_0.1.1.1                Rcpp_1.0.7                   
##  [43] viridisLite_0.4.0             xtable_1.8-4                 
##  [45] reticulate_1.22               spatstat.core_2.3-2          
##  [47] bit_4.0.4                     stats4_4.1.2                 
##  [49] htmlwidgets_1.5.4             httr_1.4.2                   
##  [51] RColorBrewer_1.1-2            ellipsis_0.3.2               
##  [53] ica_1.0-2                     pkgconfig_2.0.3              
##  [55] sass_0.4.0                    uwot_0.1.11                  
##  [57] deldir_1.0-6                  utf8_1.2.2                   
##  [59] tidyselect_1.1.1              rlang_0.4.12                 
##  [61] reshape2_1.4.4                later_1.3.0                  
##  [63] AnnotationDbi_1.56.2          munsell_0.5.0                
##  [65] BiocVersion_3.14.0            tools_4.1.2                  
##  [67] cachem_1.0.6                  generics_0.1.1               
##  [69] RSQLite_2.2.9                 ggridges_0.5.3               
##  [71] evaluate_0.14                 stringr_1.4.0                
##  [73] fastmap_1.1.0                 yaml_2.2.1                   
##  [75] goftest_1.2-3                 knitr_1.37                   
##  [77] bit64_4.0.5                   fitdistrplus_1.1-6           
##  [79] purrr_0.3.4                   RANN_2.6.1                   
##  [81] KEGGREST_1.34.0               pbapply_1.5-0                
##  [83] future_1.23.0                 nlme_3.1-153                 
##  [85] mime_0.12                     compiler_4.1.2               
##  [87] plotly_4.10.0                 filelock_1.0.2               
##  [89] curl_4.3.2                    png_0.1-7                    
##  [91] interactiveDisplayBase_1.32.0 spatstat.utils_2.3-0         
##  [93] tibble_3.1.6                  bslib_0.3.1                  
##  [95] stringi_1.7.6                 lattice_0.20-45              
##  [97] Matrix_1.4-0                  vctrs_0.3.8                  
##  [99] pillar_1.6.4                  lifecycle_1.0.1              
## [101] BiocManager_1.30.16           spatstat.geom_2.3-1          
## [103] lmtest_0.9-39                 jquerylib_0.1.4              
## [105] RcppAnnoy_0.0.19              bitops_1.0-7                 
## [107] data.table_1.14.2             cowplot_1.1.1                
## [109] irlba_2.3.5                   httpuv_1.6.5                 
## [111] patchwork_1.1.1               R6_2.5.1                     
## [113] promises_1.2.0.1              KernSmooth_2.23-20           
## [115] gridExtra_2.3                 IRanges_2.28.0               
## [117] parallelly_1.30.0             codetools_0.2-18             
## [119] MASS_7.3-54                   assertthat_0.2.1             
## [121] withr_2.4.3                   sctransform_0.3.2            
## [123] GenomeInfoDbData_1.2.7        S4Vectors_0.32.3             
## [125] mgcv_1.8-38                   parallel_4.1.2               
## [127] grid_4.1.2                    rpart_4.1-15                 
## [129] tidyr_1.1.4                   rmarkdown_2.11               
## [131] Rtsne_0.15                    Biobase_2.54.0               
## [133] shiny_1.7.1