3 Basics II: Data Manipulation & Exploratory Analysis
3.1 Goals
- Getting to know the basics of working with data: manipulating data, basic techniques of exploratory analysis
3.2 Software
- the same, with some new libraries:
3.3 Class
Practice worksheets:
NB: Original worksheets prepared by Lincoln Mullen, GMU (https://dh-r.lincolnmullen.com/worksheets.html)
3.3.1 Topics
- Selecting columns (select())
- Filtering rows (filter())
- Creating new columns (mutate())
- Sorting rows by column values (arrange())
- Split-apply-combine (group_by())
- Summarizing or aggregating data (summarize())
- Data joining with two-table verbs (left_join() et al.)
- Data reshaping (spread() and gather())
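The verbs listed above can be sketched on a toy example. This is only a minimal illustration, not part of the worksheets: the `books` and `authors` tables and all of their values are invented here, and the code assumes the dplyr and tidyr packages are installed. Note that in current versions of tidyr, `spread()` and `gather()` still work but are superseded by `pivot_wider()` and `pivot_longer()`.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, invented purely for illustration
books <- tibble(
  title     = c("A", "B", "C", "D"),
  author_id = c(1, 1, 2, 3),
  year      = c(1850, 1862, 1870, 1881),
  pages     = c(200, 350, 120, 410)
)
authors <- tibble(
  author_id = c(1, 2),
  author    = c("Smith", "Jones")
)

# select() keeps columns, filter() keeps rows, mutate() adds a column,
# arrange() sorts rows (here: longest book first)
long_books <- books %>%
  select(title, year, pages) %>%
  filter(pages > 150) %>%
  mutate(decade = floor(year / 10) * 10) %>%
  arrange(desc(pages))

# Split-apply-combine: one summary row per author_id
pages_by_author <- books %>%
  group_by(author_id) %>%
  summarize(n_books = n(), mean_pages = mean(pages))

# Two-table verb: keep every row of books, add author names where available;
# book "D" has no match in authors, so its author is NA
books_with_authors <- left_join(books, authors, by = "author_id")

# Reshaping: gather() stacks year and pages into key/value pairs (wide to long),
# spread() turns the key/value pairs back into columns (long to wide)
long <- books %>%
  select(title, year, pages) %>%
  gather(key = "variable", value = "value", year, pages)
wide <- long %>%
  spread(key = variable, value = value)
```

Printing each intermediate object (`long_books`, `pages_by_author`, and so on) in the console is a good way to see what each verb changed.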
3.4 Reference materials
Consult relevant chapters from:
- Healy, Kieran. Data Visualization: A Practical Introduction. Princeton University Press, 2018. ISBN: 978-0691181622. http://socviz.co/
- Wickham, Hadley, and Garrett Grolemund. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly, 2017. ISBN: 978-1491910399. https://r4ds.had.co.nz/
- Wickham, Hadley. Advanced R, Second Edition. Boca Raton: Chapman and Hall/CRC, 2019. http://adv-r.had.co.nz/
3.5 Homework
- Finish your worksheet and submit your HW as described below.
- Additional: if you’d like more practice, you can use the swirl library:
  - To install: install.packages("swirl")
  - To run: library(swirl)
  - Then: swirl(). It will offer you a set of interactive exercises similar to DataCamp.
3.6 Submitting homework
- Homework assignments must be submitted by the beginning of the next class.
- Email your homework to the instructor as attachments.
- In the subject line of your email, please use the following format: 57528-LXX-HW-YourLastName-YourMatriculationNumber, where LXX is the number of the lesson for which you are submitting homework, YourLastName is your last name, and YourMatriculationNumber is your matriculation number.