This tutorial introduces Gale's Digital Scholar Lab (DSL), a digital humanities tool. In this tutorial, you will learn how to:

Build a collection of texts, including uploading your own materials
Create collaborative workspaces
Clean texts
Run analytical tools on texts and visualize the results
Download the data, graphs, and other visualizations produced through this tool
Download the scanned texts in your collection, so that you can use them in other programs
Find additional training and resources

Note: Gale periodically updates the Digital Scholar Lab, so some features of this tutorial might not always match the latest interface. This tutorial was last updated in March 2023.

Access covers how to find and log into the DSL.
Collaboration and Notes, an optional guide, shows you how to create team workspaces.
Collections includes uploading your own texts and using advanced search options to locate primary sources from Gale.
Cleaning discusses how to prepare your texts for best results.
Analysis covers the DSL's six tools in detail.
Export shows you how to export data, graphs, and full texts.
Additional training includes resources from Gale, including sample projects and recorded webinars.

What is Gale Digital Scholar Lab?

The Digital Scholar Lab (DSL) is an online tool for analyzing texts, visualizing the results, and exporting data, graphs, and texts from the platform. You can access a variety of primary sources (newspaper articles and archival documents such as books, pamphlets, reports, and ephemera), as well as upload your own texts. It runs in your web browser and does not require any additional software. You do not need to know how to code to use this tool.

The DSL has six analysis tools:

Document Clustering
Named Entity Recognition
Ngrams
Parts of Speech
Sentiment Analysis
Topic Modeling
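
To give a sense of what one of these tools computes, the following is a minimal Python sketch of n-gram counting. It is an illustration only, not the DSL's implementation: the function name `count_ngrams`, the simple tokenization rule, and the sample sentence are all assumptions made for this example.

```python
from collections import Counter
import re


def count_ngrams(text, n=2):
    """Count word n-grams in a text (lowercased, punctuation ignored)."""
    # Split into lowercase word tokens. This simple rule is an assumption
    # for illustration, not the tokenizer the DSL uses.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n tokens across the token list.
    windows = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(window) for window in windows)


# Hypothetical sample text.
sample = "The cat sat on the mat. The cat slept."
bigrams = count_ngrams(sample, n=2)
print(bigrams.most_common(3))
```

In the DSL itself, you configure and run the Ngrams tool through the graphical interface; no code is needed.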

The DSL makes it easier to learn and understand how these tools work by providing user-friendly graphical user interfaces, documentation, and demonstration videos. External links to the code or programs behind each tool are also available, should you wish to run a tool on your own computer and use its more advanced features. Note that some are downloadable open source tools, whereas others require knowledge of Python.

What collections does it have?

When you use the DSL through your University of Toronto connection, you can use any of the Gale primary source collections that the University has licensed, including hundreds of thousands of documents in multiple languages with broad historical and geographical coverage. (Once you are logged in, see these instructions to view all accessible collections.) Extensive coverage, however, should not be confused with universal coverage; many perspectives are not represented in these text collections. For example, most of the colonial-era documents included in these collections were produced and collected by colonizing people, organizations, or institutions, rather than by colonized peoples. It is up to you as a critical scholar to decide which questions can and cannot be answered by these collections.

Digitization

The texts available in the DSL have gone through several steps: (1) various institutions, such as libraries and archives, collected the texts; (2) Gale scanned the texts; (3) a process called Optical Character Recognition (OCR) converted these scans, which are essentially photographs of texts, into readable, searchable text.

OCR uses image-recognition algorithms to identify characters and create a text file based on the image. OCR is powerful, but it is also prone to errors such as misidentifying characters (e.g. reading a zero as the letter 'O') or adding or removing spaces. There are additional challenges for scanning older English texts, such as those that use the long 's' ('ſ'), which resembles a lowercase 'f'. We discuss this process further in the section on Cleaning, but for now it is sufficient to know that this process can often leave errors in the text files produced through OCR.
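
The kinds of OCR errors described above can be illustrated with a short Python sketch. This is a hypothetical example of simple clean-up rules, not the DSL's own cleaning process: the function name `clean_ocr_text` and the three rules it applies are assumptions chosen to mirror the errors mentioned in the paragraph.

```python
import re


def clean_ocr_text(text):
    """Apply a few simple, illustrative corrections to OCR output."""
    # Replace the long s ('ſ') with a modern 's'.
    text = text.replace("ſ", "s")
    # A capital letter 'O' between two digits is probably a misread zero.
    text = re.sub(r"(?<=\d)O(?=\d)", "0", text)
    # Collapse runs of spaces or tabs that OCR may have inserted.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


# Hypothetical OCR output containing all three kinds of error.
raw = "Printed in  18O5: the  Pſalms"
print(clean_ocr_text(raw))
```

Rules like these are deliberately narrow. A broader rule, such as replacing every 'O' with a zero, would damage correct text, which is why cleaning choices should be checked against a sample of your documents.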

Where do I find more information and videos on it?

In addition to the in-depth tutorials above, we have a variety of pages and videos related to the DSL:

Getting started with the Digital Scholar Lab
General overview and Frequently Asked Questions (FAQ)
Short demo video
In-depth recorded workshop (with captions and slides)
Additional training from Gale (recorded and live webinars)
Text Analysis Tools Comparison Cheat Sheet (compares the Digital Scholar Lab, Constellate, TDM Studio, and the HathiTrust Research Center)

Who do I contact for more help?

If you would like help or want to take any of the DSL's tools further in your own analysis, you can always contact Digital Scholarship Services.

Note: if you are experiencing an HTTP 400 error when attempting to log in, please close your browser, reopen, and retry. You may have timed out, which can cause errors on some browsers.

Text Analysis Fundamentals with the Digital Scholar Lab
