Web Scraping in R.

June 12, 1:00-5:00pm, 3121 Snedecor

Eric Hare, erichare@iastate.edu.
Andee Kaplan, ajkaplan@iastate.edu.
Carson Sievert, sievert@iastate.edu.

The Web Scraping in R workshop will teach you how to acquire data that lives on the Web and work it into a form suitable for data analysis.

The course will be a mix of instruction and interactive activities. It will be held in a computer lab, but you are encouraged to bring your own laptop with the software already installed. A list of required software will be posted on this site several days before the workshop.

Registration:

Sign-up is done through EventBrite.

Lectures and timetable

Time | Notes | Lecture and Resources
1:00 – 1:30 | Install stuff. This will make sure that everybody's system is up and running. You should try to install stuff ahead of time, so that this time can be used to address potential problems. | Install Chrome, SelectorGadget, phantomjs, and R packages
1:30 – 2:30 | Motivation and examples | Easily extract data from HTML pages using rvest and SelectorGadget
2:30 – 3:00 | Web APIs and Dynamic Sites | Talk to APIs with httr and scrape dynamic sites with rdom
3:00 – 4:45 | Working with non-HTML formats | Easily transform XML with XML2R and JSON with jsonlite/tidyjson
4:45 – 5:00 | Questions and Survey | We very much appreciate any feedback you can give us. You can find a form here: survey.
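
To give a rough idea of what the rvest and jsonlite sessions involve, here is a minimal sketch. It is not workshop material: the inline HTML page, the JSON string, and every variable name are made up for illustration, and the data are placeholders so the code runs without network access. It assumes the rvest and jsonlite packages are installed.

```r
library(rvest)     # HTML parsing and extraction
library(jsonlite)  # JSON <-> R conversion

# Hypothetical page, parsed from a string instead of a URL
page <- read_html('<html><body>
  <table id="scores">
    <tr><th>team</th><th>wins</th></tr>
    <tr><td>Cyclones</td><td>7</td></tr>
    <tr><td>Hawkeyes</td><td>5</td></tr>
  </table>
  <p class="note">Updated weekly</p>
</body></html>')

# CSS selectors like these are what SelectorGadget helps you find
note   <- html_text(html_node(page, "p.note"))
scores <- html_table(html_node(page, "#scores"))

# Hypothetical JSON payload, such as a Web API might return
payload <- '[{"pkg":"rvest","format":"HTML"},{"pkg":"jsonlite","format":"JSON"}]'

# An array of flat objects is simplified to a data frame, one row per object
records <- fromJSON(payload)

print(note)
print(scores)
print(records)
```

On a real site, `read_html()` would be given the page URL instead of a string, and the selectors would come from inspecting that page.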

Learning outcomes

By the end of this course, we expect you to be able to do the following:

Useful links

Recommended Reading: