In this workshop, you will get a sense of what is possible when working with humanities data and understand how humanities scholars approach “data”. We will introduce multiple scenarios with different datasets to help you develop strategies for organizing and cleaning data. Tools may include OpenRefine, Google Sheets, and R with RStudio. Attendees should plan on some prework, including installations and brief background readings.
Note that the workshop series will take place on four different days. A reminder will be sent out one day in advance of each session. You need only register for the first day; all registrations will be automatically transferred across the series.
Please see the full schedule below:
Day 1: March 7th, 9-10am, in Lamont B-30
Learn about approaches to organizing data in spreadsheets, as well as some of the most useful formulas and tools in Google Sheets. This session will help you avoid common organizational pitfalls and help you get set up to take advantage of time-saving automation with minimal effort.
Day 2: March 9th, 9-10am, in Lamont B-30
Learn about more advanced tools for manipulating and cleaning up datasets. When you get data that isn't well-formatted or consistent, regular expressions and OpenRefine can help you get that data into a usable state.
Day 3: March 28th, 8:30-10am, in Lamont B-30
Take data clean-up and normalization to the next level and learn how to combine data sources using open-source software.
Day 4: March 30th, 9-10am, in Lamont Forum Room
Application programming interfaces (APIs) power the modern web by allowing developers to combine functionality from different sources and researchers to access data programmatically. We will cover finding APIs for various tasks; reading API documentation; HTTP methods; and getting, transforming, and storing API data with R.