PSTAT 10 Principles of Data Science with R
This is the website of an intro data science class I taught in the summer of 2022. I developed the material largely on my own. You are free to use it; just credit me if your use is significant.
Welcome to PSTAT 10
I will post all course material on this page, including slides, worksheets, homework, and solutions to the worksheets and homework.
Worksheets and homework are released at the beginning of the week and are due the following Tuesday evening. Worksheet solutions are released the day before the worksheets are due. Assignments should be submitted on Gradescope.
Sample assignment template
Office hours
- Robin: Friday 10am - 11am @ IV Starbucks, Monday 4pm - 5pm @ Zoom link on Gauchospace
- Jeff: Monday 1:30pm - 3:30pm @ Zoom link on Gauchospace
- Olivier: Tuesday 1:45pm - 3:45pm @ Building 434 Room 113
Week 1
Worksheet 1 | Solutions | Worksheet 2 | Solutions | Homework 1 | Solutions
Notes: Worksheet files | mtrush1.pgm | Rushmore example | Why use seq?
Week 2
Worksheet 3 | Solutions | Worksheet 4 | Solutions | Homework 2 | Solutions
Week 3
Worksheet 5 | Solutions | Worksheet 6 | Solutions | Homework 3 | Solutions
Week 4
Worksheet 7 | Solutions | Worksheet 8 | Solutions | Homework 4 | Solutions
Files: hibbs.dat | Chinook_Sqlite.sqlite
Week 5
Worksheet 9 | Solutions | Worksheet 10 | Solutions | Homework 5 | Solutions
Final Week
Practice exam | Solutions | tinyclothes DB
Final exam | Solutions
- No lectures this week. Office hours and sections are as usual.
- A practice final will be administered during Tuesday’s lecture. Use it to practice writing code with a pencil.
- We will go over the practice final during Wednesday sections.
- Thursday: Final exam during regular lecture time. One sheet (two sides) of handwritten notes is allowed. Scratch paper will be provided.
Install R and RStudio
R version 4.1.0 or later is required for this class. I insist that you install R 4.2.0. If you cannot install R or RStudio for whatever reason, RStudio Cloud is available.
Install R from CRAN
Install RStudio
RStudio Cloud
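To confirm that an existing installation meets the version requirement above, a quick check along these lines can be run in the R console. This is only a sketch: the variable names `required` and `installed` are illustrative, and the messages are made up for this example.

```r
# Minimum version required for the class (taken from the requirement above)
required <- "4.1.0"

# getRversion() returns the running R version as a comparable version object
installed <- getRversion()

# Version objects can be compared directly against a version string
if (installed >= required) {
  message("R ", as.character(installed), " meets the requirement (>= ", required, ").")
} else {
  message("R ", as.character(installed), " is too old; install R ", required, " or later.")
}
```

Comparing version objects rather than raw strings avoids mistakes such as "4.10.0" sorting before "4.2.0".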
Recommended resources
R stats community on Twitter
Cheatsheets (particularly recommend the dplyr, RMarkdown, and RStudio IDE cheatsheets)
Books (optional)
Davies - The Book of R (the “official” course text)
Matloff - The Art of R Programming (good descriptions of base R)
Matloff - Probability and Statistics for Data Science (for simulation)
Wickham - R for Data Science (for tidyverse/data science workflow)
Wickham - Advanced R (if you want to know how R really works)