June 12, 1:00-5:00pm, 3121 Snedecor
Eric Hare, erichare@iastate.edu.
Andee Kaplan, ajkaplan@iastate.edu.
Carson Sievert, sievert@iastate.edu.
The Web Scraping in R workshop will teach you how to acquire data that lives on the Web and work it into a form suitable for data analysis.
The course will be a mix of instruction and interactive activities. It will be held in a computer lab, but you are encouraged to bring your own laptop with the software already installed. A list of software will be available at this site several days prior to the workshop.

Time | Topic | Lecture and Resources |
---|---|---|
1:00 – 1:30 | Install software | This session makes sure that everybody's system is up and running. Please try to install everything ahead of time, so that this time can be used to address any problems. Install Chrome, SelectorGadget, PhantomJS, and the R packages. |
1:30 – 2:30 | Motivation and examples | Easily extract data from HTML pages using rvest and SelectorGadget |
2:30 – 3:00 | Web APIs and Dynamic Sites | Talk to APIs with httr and scrape dynamic sites with rdom. |
3:00 – 4:45 | Working with non-HTML formats | Easily transform XML with XML2R and JSON with jsonlite/tidyjson |
4:45 – 5:00 | Questions and Survey | We very much appreciate any feedback you can give us. You can find a form here: survey. |
By the end of this course, we expect you to be able to do the following:
Recommended Reading: