Presentation Description:
September 29, 2020, 2:00pm, presented by Nich Worby
Web archives, like the Internet Archive's Wayback Machine, have existed for almost as long as the World Wide Web. This presentation will introduce ways to tap into this rich but complex data source. Participants will get an overview of major web archive sources, the WARC file format, and methods for accessing archived web data, followed by a demonstration of tools for analytical tasks such as extracting network graphs of links, extracting images, and filtering web page text for further analysis.
Link to Slides
Link to Recording - 55:40
Please visit the Bits and Bytes webpage for more presentations on various tools and topics.