Data Wrangling in Python Study Group

This study group meets on multiple days. Plan to attend every session.

"Data Wrangling" is the process of preparing data for analysis, which includes importing, cleaning, recoding, restructuring, combining, and anything else data needs before it can be analyzed. Data wrangling is a critical skill for research. This course teaches wrangling skills, mostly using the data wrangling tools of the pandas package in Python. The pandas package is a collection of functions and methods for working with data, comparable to R's tidyverse.

This course will cover importing data, cleaning data, creating and transforming variables, merging data, and plotting. It is a hands-on class, with time devoted to practicing with these tools to get data ready for analysis. It is designed for people who have no experience with Python or pandas, but Python users who would like to learn pandas will also benefit from the class. Graduate students who will work in Python and pandas may choose to take this course at the beginning of their graduate career or wait until they are ready to start doing research.
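
As a rough taste of what these steps can look like in pandas, here is a minimal sketch. It is not taken from the course materials: the data, the column names (student_id, score, major), and the passing cutoff of 90 are all made up for illustration. It walks through importing, cleaning, creating a variable, and merging; plotting is left out to keep the example short.

```python
import io

import pandas as pd

# Hypothetical data, invented purely for illustration.
scores_csv = io.StringIO(
    "student_id,score\n"
    "1,88\n"
    "2,\n"
    "3,92\n"
)
majors = pd.DataFrame(
    {"student_id": [1, 2, 3], "major": ["Econ", "Soc", "Stat"]}
)

# Importing: read CSV text into a DataFrame (a file path works the same way).
scores = pd.read_csv(scores_csv)

# Cleaning: drop rows where the score is missing.
scores = scores.dropna(subset=["score"])

# Creating a variable: flag scores at or above an assumed cutoff of 90.
scores["passed"] = scores["score"] >= 90

# Merging: attach each student's major by matching on student_id.
combined = scores.merge(majors, on="student_id", how="left")

print(combined)
```

Each step returns an ordinary DataFrame, so the steps can be run and inspected one at a time while learning.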

For this study group, students will be expected to read the material in Data Wrangling Essentials on their own, then come to class ready to discuss it and do programming exercises together. (There will be no reading assignment for the first day.) The number of times the group meets will be decided by the group. How much you get out of this course will depend--more than usual--on how much you put into it.

Instructor: Dimond
Room: 155 Van Hise Hall
Dates: 1/24, 1/31, 2/7, 2/14, 2/21, 2/28
Time: 2:30 - 4:00
Semester: Spring 2020