Syllabus
Coordinator
Name: Humberto Ortiz-Zuazaga
Office: NCL A159
Laboratory: NCL A158
Telephone: 787-764-0000 x7430
Email: humberto.ortiz@upr.edu
Web page: http://ccom.uprrp.edu/~humberto/
Office hours: Monday, Wednesday 8:00-9:30 AM; Tuesday 3:30-5:00 PM; or by appointment
Description
Through lectures, group discussions, and other active learning strategies, this course introduces students to statistical and computing methods for observational studies and clinical trials. Students will learn the basic concepts of data analysis, use biostatistics and bioinformatics tools, and apply them to biomedical and translational research in cancer and population sciences.
Prerequisites

Calculus I or equivalent

Basic course in statistics or biostatistics.
Objectives
At the end of the course, students will be able to:

Apply the basic methods of data analysis and the elementary concepts of Biostatistics and Bioinformatics in applications related to Cancer Research.

Use powerful and flexible software (such as R and Bioconductor packages) to put data-analytic techniques into practice in cancer research and to draw inferences from the results.

Develop case studies utilizing cancer data.
Course schedule
Class will meet Fridays from 1:00 to 3:30 PM at the Medical Sciences Campus, Nursing Professions Building, Room 318.
Tentative course calendar
Date  Topic  Book Chapter *  Additional Reading 

Jan 22  Introduction  STAT Ch 1-2, DATA Pg 4-17  
Jan 29  How to Generate Descriptive Statistics  STAT Ch 4, DATA Pg 92-137  Spriestersbach, A., Röhrig, B., du Prel, J.B., Gerhold-Ay, A., Blettner, M. (2009). Descriptive statistics: The specification of statistical measures and their presentation in tables and graphs. Part 7 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 106(36):578-83. 
Feb 5  How to Generate Inferential Statistics  DATA Pg 18-91  
Feb 12  Testing Hypotheses  STAT Ch 5; DATA Pg 18-91  du Prel, J.B., Röhrig, B., Hommel, G., Blettner, M. (2010). Choosing statistical tests: Part 12 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 107(19):343-8. 
Feb 19  Calculation of Sample Size and Power  STAT Ch 9; DATA Pg 62-74  Röhrig, B., du Prel, J.B., Wachtlin, D., Kwiecien, R., Blettner, M. (2010). Sample size calculation in clinical trials: Part 13 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 107(31-32):552-6. 
Feb 26  Linear Models  STAT Ch 6,11  Schneider, A., Hommel, G., Blettner, M. (2010). Linear regression analysis: Part 14 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 107(44):776-82. 
Mar 4  One Way Analysis of Variance  STAT Ch 7  Victor, A., Elsässer, A., Hommel, G., Blettner, M. (2010). Judging a plethora of p-values: how to contend with the problem of multiple testing. Part 10 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 107(4):50-6. 
Mar 11  Contingency Tables and Loglinear Models I  STAT Ch 13,15  
Mar 18  Contingency Tables and Loglinear Models II  STAT Ch 13,15  
Mar 25  No class  
Apr 1  Introduction to Bioinformatics  W. Huber, V.J. Carey, R. Gentleman, et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nature Methods, 2015:12, 115. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4509590/  
Apr 8  Single and Multiple Sequence Alignment  (1) Needleman, S. and Wunsch, C. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48, 443-453, 1970. http://www.cise.ufl.edu/class/cis4930sp09rab/00052.pdf (2) Erik S. Wright. The Art of Multiple Sequence Alignment in R. University of Wisconsin. Madison, WI. October 13, 2015. https://www.bioconductor.org/packages/3.3/bioc/vignettes/DECIPHER/inst/doc/ArtOfAlignmentInR.pdf  
Apr 15  Statistical Methods for Analysis of Microarray Data  Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1, Article 3. http://www.statsci.org/smyth/pubs/ebayes.pdf  
Apr 22  Gene Clustering Analysis  K. S. Pollard and M. J. van der Laan. "Cluster Analysis of Genomic Data" in Gentleman, R., Carey, V.J., et al. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer, 2005. http://cbcb.umd.edu/~hcorrada/CFG/readings/Solutions_ch13.pdf  
Apr 29  Next Generation Sequencing  Law, CW, Chen, Y, Shi, W, and Smyth, GK (2014). Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology 15, R29. http://www.genomebiology.com/2014/15/2/R29  
May 6  Bioinformatics Case Study 
* STAT: Book chapters from P. Dalgaard (2008) “Introductory Statistics with R” Springer.
* DATA: R. Irizarry and M. Love (2015) “Data Analysis for the Life Sciences” Leanpub. https://leanpub.com/dataanalysisforthelifesciences.
Instructional resources
The course will be hosted in Google Classroom. You need to sign in with a @upr.edu account.
Textbook
See the references and course calendar for course materials.
Software
We will use the R statistical software and Bioconductor. Both are open source, and you can install them on your own computer.
Evaluation
Students' work will be evaluated on a 100% basis with the standard curve.
 Homework, 20% of final grade
 Project in Biostatistics, 35% of final grade
 Project in Bioinformatics, 35% of final grade
 Attendance, 10% of final grade
Reasonable accommodations for students (Statement of PR Law 51)
Students with a health condition or situation that, according to the law, makes them eligible for reasonable accommodation have the right to submit a written application to the Dean, Associate Dean or Assistant Dean for Students Affairs of their Faculty, according to the procedures established in the document Submission Process for Reasonable Accommodation of the Medical Sciences Campus. This document may be obtained at the Deanship for Students Affairs of the MSC, at each Faculty, and the MSC web page. The application does not exempt students from complying with the academic requirements pertaining to the programs.
Academic integrity
The University of Puerto Rico promotes the highest standards of academic and scientific integrity. Article 6.2 of the UPR Student Bylaws (Certification JS 13 2009–2010) states that “academic dishonesty includes but is not limited to: fraudulent actions, obtaining grades or academic degrees using false or fraudulent simulations, copying totally or partially academic work from another person, plagiarizing totally or partially the work of another person, copying totally or partially responses from another person to examination questions, making another person to take any test, oral or written examination on his/hers behalf, as well as assisting or facilitating any person to incur in the aforementioned conduct”. Fraudulent conduct refers to “behavior with the intent to defraud, including but not limited to, malicious alteration or falsification of grades, records, identification cards or other official documents of the UPR or any other institution.” Any of these actions shall be subject to disciplinary sanctions in accordance with the disciplinary procedure, as stated in the existing UPR Student Bylaws.
DISCLAIMER: The above statement is an English translation, prepared at the Deanship of Academic Affairs of the Medical Sciences Campus, of certain parts of Article 6.2 of the UPR Student Bylaws “Reglamento General de Estudiantes de la Universidad de Puerto Rico” (Certificación JS 13 2009–2010). It is in no way intended to be a legal substitute for the original document, written in Spanish.
Grading system
The grading system for this course is as follows:
90 - 100 A
80 - 89 B
70 - 79 C
Less than 70 F
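As an illustration only, the following minimal sketch shows how the evaluation weights and the grading scale could combine into a final grade. It is written in Python rather than the course software. The function names, the component names, and the example scores are hypothetical. It also assumes each component is scored from 0 to 100 and that each cutoff is a lower bound with no rounding; the syllabus does not say how fractional scores or the curve are handled.

```python
# Weights taken from the Evaluation section (fractions of the final grade).
WEIGHTS = {
    "homework": 0.20,
    "biostatistics_project": 0.35,
    "bioinformatics_project": 0.35,
    "attendance": 0.10,
}


def final_score(scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Map a 0-100 score to a letter; cutoffs treated as lower bounds (an assumption)."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    return "F"


# Hypothetical component scores for one student.
example = {
    "homework": 85,
    "biostatistics_project": 90,
    "bioinformatics_project": 80,
    "attendance": 100,
}
total = final_score(example)
print(round(total, 2), letter_grade(total))
```

Because the two projects together carry 70% of the grade, they dominate the result in this sketch.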
Bibliography

Dalgaard, P. (2008). Introductory Statistics with R. (2nd ed.). Springer.

Gentleman, R., Carey, V., Huber, W., Irizarry, R., Dudoit, S. (Eds.). (2005). Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer.

Irizarry, R. and Love, M. (2015). Data Analysis for the Life Sciences. Leanpub.

Spriestersbach, A., Röhrig, B., du Prel, J.B., Gerhold-Ay, A., Blettner, M. (2009). Descriptive statistics: The specification of statistical measures and their presentation in tables and graphs. Part 7 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 106(36):578-83.

du Prel, J.B., Röhrig, B., Hommel, G., Blettner, M. (2010). Choosing statistical tests: Part 12 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 107(19):343-8.

Röhrig, B., du Prel, J.B., Wachtlin, D., Kwiecien, R., Blettner, M. (2010). Sample size calculation in clinical trials: Part 13 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 107(31-32):552-6.

Victor, A., Elsässer, A., Hommel, G., Blettner, M. (2010). Judging a plethora of p-values: how to contend with the problem of multiple testing. Part 10 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 107(4):50-6.

Schneider, A., Hommel, G., Blettner, M. (2010). Linear regression analysis: Part 14 of a series on evaluation of scientific publications. Dtsch Arztebl Int, 107(44):776-82.
Electronic resources

R software: Available from http://cran.r-project.org Reference: R Core Team (2015). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Vienna, Austria. URL http://www.R-project.org/.

Bioconductor: http://www.bioconductor.org/

RStudio Team (2012). RStudio: Integrated Development for R. RStudio, Inc., Boston, MA. URL http://www.rstudio.com/.

Irizarry and Love (2015) is available from https://leanpub.com/dataanalysisforthelifesciences.
For each chapter, the book provides links to the R code used.