Web Scraping using R

Hosted by: Justin Chun-ting Ho, University of Edinburgh
Date and Time: 8th Jun 2018 14:00 - 8th Jun 2018 16:00
Location: Conference Room 2.15, CMB

Acquiring textual data is the first step of any text analysis project. Thanks to advances in information and communication technology, the online presence of news outlets, discussion forums, and other websites has provided researchers with a wealth of textual data. In this workshop we will look at using the R programming language to automatically extract text from online sources, with a special focus on online news and press releases.

The workshop will cover the basic workflow of web scraping, basic knowledge of HTML and CSS, and selector tools such as SelectorGadget. If time allows, it will also cover retrieving news content through an API.

You need no prior knowledge of R or HTML. Some experience with R would be beneficial, but those without any will still be able to participate effectively. You will be required to bring your own laptop, and it is advised that you download and install R and RStudio beforehand. Both downloads are free. Please ensure you have installed both correctly before arriving at the workshop, so as not to cause delays for the other participants.

The workshop will take place on 8 June, from 14:00 to 16:00, in Conference Room 2.15 of the Chrystal Macmillan Building.

Please sign up for the workshop here, so we know approximately how many people are interested.
