MDL Tutorials
This tutorial will help you get up and running querying the Web of Science PostgreSQL database on a Mac computer. It will cover accessing the high performance computing environment, querying the database via SQL statements and from within a Python script, and downloading the results of the query.
You will need a Compute Canada account with the proper credentials to access this database. If you haven’t done so already, you should first follow the instructions to get your account set up.
Note: This tutorial is intended for Mac users. If you are using Windows, check out this tutorial instead.
The HathiTrust Research Center (HTRC) is the research arm of HathiTrust. It develops tools and resources that enable text or computational analysis of the HathiTrust corpus. This corpus or digital library includes over 10 million volumes (mostly books and journals), 3 million of which are in the public domain. It covers 400 languages and publication dates from 1500 to the present day, representing a broad variety of subjects.
Please visit this link for extensive help with Scholars GeoPortal.
This tutorial walks you through accessing Constellate using the University of Toronto's institutional privileges. Please see our general Constellate page for more information and see our tutorial on creating a dataset once you have logged in.
University of Toronto community members can access online GIS courses through ESRI Academy. Content is designed for beginners and experienced users.
Since access to the Web of Science PostgreSQL database is tied to access to supercomputers, it can only be granted after a multi-step process that ensures security and access rights:
This tutorial shows you how to build a dataset in Constellate. We recommend reading our general information on Constellate first. Please also see this access tutorial in order to log in with institutional privileges.
Georeferencing is the name given to the process of transforming a scanned map or aerial photograph so it appears “in place” in GIS. By associating features on the scanned image with real world x and y coordinates, the software can progressively warp the image so it fits to other spatial datasets. This tutorial will explain how to georeference a raster image in ArcGIS so it can then be used as an overlay or for digitizing purposes. In this example, a historic Toronto map will be georeferenced using a dataset of city streets so we can see what existed on the site of Robarts Library before it was built.
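The idea of tying image features to real-world coordinates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it fits a simple affine (first-order) transformation from exactly three control points, whereas GIS software typically fits many control points by least squares and offers higher-order options. The function names and all coordinates below are hypothetical.

```python
def solve_affine(pixel_pts, world_pts):
    """Fit x = a*col + b*row + c and y = d*col + e*row + f
    from exactly three (col, row) -> (x, y) control points."""
    (c1, r1), (c2, r2), (c3, r3) = pixel_pts
    # Determinant of the 3x3 system; zero means the points are collinear.
    det = c1 * (r2 - r3) - r1 * (c2 - c3) + (c2 * r3 - c3 * r2)
    if abs(det) < 1e-12:
        raise ValueError("control points must not lie on one line")

    def solve(v1, v2, v3):
        # Cramer's rule for one output coordinate.
        a = (v1 * (r2 - r3) - r1 * (v2 - v3) + (v2 * r3 - v3 * r2)) / det
        b = (c1 * (v2 - v3) - v1 * (c2 - c3) + (c2 * v3 - c3 * v2)) / det
        c = (c1 * (r2 * v3 - r3 * v2) - r1 * (c2 * v3 - c3 * v2)
             + v1 * (c2 * r3 - c3 * r2)) / det
        return a, b, c

    xs = [p[0] for p in world_pts]
    ys = [p[1] for p in world_pts]
    return solve(*xs), solve(*ys)


def pixel_to_world(coeffs, col, row):
    """Apply the fitted transformation to one pixel location."""
    (a, b, c), (d, e, f) = coeffs
    return a * col + b * row + c, d * col + e * row + f


# Hypothetical control points: pixel (col, row) -> map (x, y).
pixels = [(0, 0), (10, 0), (0, 10)]
world = [(100.0, 500.0), (120.0, 500.0), (100.0, 480.0)]
coeffs = solve_affine(pixels, world)
print(pixel_to_world(coeffs, 5, 5))
```

With more than three control points the system is overdetermined, which is why georeferencing tools report a residual error for each point.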
The PDF guide below illustrates how to convert an ASCII text file containing postal codes into a SAS dataset compatible with Statistics Canada's PCCF+ SAS program (version 6A1), and then how to run the PCCF+ program.
Link to CHASS list of how-to guides for SDA: http://sda.chass.utoronto.ca/sdaweb/sda.htm#how_to
Infographics are a type of data visualization. They usually mix text and images, but they are graphic-heavy, often visualize only a small amount of data, and have specific goals. Those goals are normally to inform or persuade through storytelling. They are often shared on websites and social media for marketing purposes, but they are becoming more popular in a variety of situations as a more distinctive and appealing way to convey your data and message to your audience.
This tutorial describes how to request a license for NVivo 12 Plus, download it, and license it.
This tutorial walks you through the steps of downloading, installing, and licensing ArcGIS Desktop 10.x, which includes ArcMap, using a Single Use License.
This is a guide to installing and running Tableau Desktop on your personal computer. Please note that all computers in the Map and Data Library (on the fifth floor of Robarts) and in the computer labs on the fourth and fifth floors of Robarts Library already have Tableau Desktop installed.
This tutorial provides an overview of the Online version of ArcGIS, one of ESRI's many mapping tools.
ArcGIS Online is a complete, scalable and secure software-as-a-service cloud-based mapping platform which can be used to make and share maps.
This tutorial is an introduction to Piktochart, a popular online tool used to create infographics. This exercise will illustrate some infographic design principles and specific features of Piktochart by creating an infographic that compares housing in Vancouver and Toronto.
This guide is suitable for new R-users or advanced level R-users looking for information on specific topics. The topics covered in this guide are importing, exploring, modifying and managing data.
Link to Datacamp's free intro course on R: https://www.datacamp.com/courses/free-introduction-to-r
This guide gives users an introduction to SAS. The topics covered are importing, exploring, modifying, and managing data. It has been created using SAS 9.4. The main dataset used is the flights dataset. It contains the US domestic flights in January 2020[1]. For additional support, fill out the support request form.
This online tutorial will provide an introduction to SimplyAnalytics and a few of its many possible uses. SimplyAnalytics is a web-based data visualization application. It can be used to create simple thematic maps and tables from census and other socio-demographic data, as well as business point data.
This guide gives users an introduction to Stata. The topics covered are importing, exploring, modifying and managing data.
An overview of the guide provided by Princeton with a link to the original guide. If you want to learn how to use Stata, you might find this guide by German Rodriguez at Princeton University useful: http://data.princeton.edu/stata/default.html
This tutorial will take you through two ways of logging in to your ESRI ArcGIS Online account for the first time using your UTORid.
This short guide provides instructions on how to log into the Gale Digital Scholar Lab via the University of Toronto institutional access.
This tutorial will cover how to download census data and census boundary files and match them together in ArcMap for further analysis. Census data will be downloaded using CHASS.
This guide shows you how to match census data to postal codes, and how to merge them in an SPSS file. We will select income variables from the 2006 Census and PCCF data from Toronto.
NVivo is a program used for qualitative data analysis. It supports researchers in managing, organizing, and analyzing qualitative data to produce new insight, infer relationships, and identify themes.
This document compiles online resources that help to build 3D terrain models with a variety of software options. Brief notes on the pros and cons of each option are provided.
In this tutorial, we will begin work on augmenting datasets.
Note: This is an advanced tutorial. If you are new to OpenRefine, please begin with OpenRefine tutorial 1. This tutorial has been developed for OpenRefine version 3.3.
This tutorial will teach you how to use OpenRefine's reconciliation service to connect data in your dataset with Wikidata.
Note 1: Complete Augmenting activity 1 first before attempting this activity.
Note 2: In order to complete this activity, you need to be running the latest version of OpenRefine.
This is a guide to installing and running OpenRefine on your personal computer. Please note that all computers in the Map and Data Library (on the fifth floor of Robarts) and in the computer labs on the fourth and fifth floors of Robarts Library already have OpenRefine installed.
This tutorial has been developed for OpenRefine version 3.3 but is compatible with OpenRefine 3.4.1.
Please note that we also have converted some of this tutorial into a self-paced course with videos. U of T students, staff, and faculty can enroll in our OpenRefine Quercus course.
This is the first activity in this tutorial series, and assumes no prior knowledge of OpenRefine. In this activity you will be importing a spreadsheet of data into OpenRefine and exploring it. The goal of this activity is to use a simple dataset to introduce you to the OpenRefine user interface and some of the basic types of tasks you can accomplish. This dataset isn’t particularly “messy,” but provides some of the core knowledge needed to work with messier datasets in later activities.
If you need a copy of OpenRefine on your personal computer, please follow these installation instructions.
Before you begin, please download the OpenRefine workshop sample datasets.
This tutorial has been developed for OpenRefine version 3.3. If you are using OpenRefine 3.2 or earlier, please skip steps 13 through 16, inclusive, as they rely on a new feature introduced in version 3.3.
We are going to work with a somewhat messier dataset for the next few tasks. This is a citizen science dataset captured using an app called iNaturalist. The data was captured for a city nature challenge and shared on data.world. This activity will showcase some more features in OpenRefine.
The goal of this activity is to create a new project with this citizen science dataset and work with the data. You will use clustering to improve the consistency of the dataset. You will also perform various manipulations, such as split and concatenate. Finally, you will learn various ways to remove columns and rows, and work with the Undo/Redo features in OpenRefine.
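To give a rough sense of what clustering does, the following Python sketch groups values that differ only in case, punctuation, spacing, or word order. It is a simplified illustration of a fingerprint-style key-collision approach, not OpenRefine's own implementation, which applies further normalization. The function names and the sample species names are hypothetical.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Build a normalized key: trim, lowercase, drop punctuation,
    then sort the unique whitespace-separated tokens."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a fingerprint; keep only groups that
    contain more than one distinct spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Hypothetical observations with inconsistent spellings.
names = [
    "Eastern Gray Squirrel",
    "eastern gray squirrel ",
    "Squirrel, Eastern Gray",
    "American Robin",
]
for group in cluster(names):
    print(group)
```

Each group returned is a candidate for merging into one consistent value; a person still decides whether the variants really mean the same thing.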
Before you begin, please download the OpenRefine workshop sample datasets, if you have not already.
Note: This assumes that you have learned the basics of OpenRefine already through the Survey of Household Spending activity.
This tutorial has been developed for OpenRefine version 3.3.
You were introduced to GREL in the previous activity, so you know that GREL is a powerful tool for cleaning/editing your data. You can make GREL even more powerful by learning how to use regular expressions (aka regex). A regular expression is a sequence of characters that define a search pattern – it is used to search for matches within text. In OpenRefine, you can use it in your GREL expressions to create sophisticated patterns describing what type of information you want to find within your dataset, then do something with the matching text (edit it, delete it, put it in a new column, etc.).
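As a small illustration of a search pattern, the sketch below uses Python's `re` module rather than GREL, so the surrounding syntax differs from what you would type in OpenRefine, although the pattern itself uses common regex notation. The sample text and the choice of pattern (four-digit years from 1500 to 2099) are hypothetical.

```python
import re

# Match a four-digit year from 1500 to 2099 as a whole word.
YEAR_PATTERN = re.compile(r"\b(?:1[5-9]\d{2}|20\d{2})\b")

text = "Map of Toronto, surveyed 1858 and reprinted 1902."

# Find every match in the text.
years = YEAR_PATTERN.findall(text)
print(years)

# Do something with the matching text: here, replace it with a placeholder.
masked = YEAR_PATTERN.sub("[year]", text)
print(masked)
```

The same pattern can drive different actions (extracting, replacing, or deleting the matched text), which is what makes regular expressions useful for cleaning data.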
This activity assumes you have already completed the Survey of Household Spending and Citizen Science activities, have a familiarity with OpenRefine and know how to create simple GREL expressions. Before you begin, please download the OpenRefine workshop sample datasets, if you have not already.
This tutorial has been developed for OpenRefine version 3.3.
Update: please note that as of March 18, 2020, Open Data Toronto has suspended service and so their service is not available for API calls. Until service resumes, please skip step 3, and during step 5, please choose Get Data From: This Computer and select the 311.json file in the packaged workshop files. This represents a snapshot of the data that will work with the exercises. Please feel free to email mdl@library.utoronto.ca if you run into difficulties.
Sometimes you don't have your data in a file. Instead you want to use an API call to pull data from elsewhere. OpenRefine can help you make these calls and parse the data you receive.
The goal of this activity is to create a new project by pulling in 311 call data from the City of Toronto into OpenRefine using an API call and then work with the data. You will construct an API call to download a subset of 311 call data in JSON format, and then use OpenRefine to parse that data and put it into a tabular format. You will then use GREL to further manipulate the data (especially working with date formats) and make some discoveries.
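The general shape of that workflow (receive JSON, flatten it into rows, then reformat dates) can be sketched in Python. This is only an illustration: no real API is called, and the embedded JSON, its field names, and the `to_rows` helper are all hypothetical rather than the actual structure of the 311 data.

```python
import json
from datetime import datetime

# Hypothetical response body standing in for what an API call might return.
SAMPLE_RESPONSE = """
{"records": [
  {"id": 1, "created": "2020-01-15T08:30:00", "type": "Pothole"},
  {"id": 2, "created": "2020-01-18T17:05:00", "type": "Graffiti"}
]}
"""


def to_rows(payload):
    """Parse a JSON string and return one flat dictionary per record."""
    data = json.loads(payload)
    rows = []
    for record in data["records"]:
        # Convert the timestamp text into a datetime so it can be reformatted.
        created = datetime.strptime(record["created"], "%Y-%m-%dT%H:%M:%S")
        rows.append({
            "id": record["id"],
            "type": record["type"],
            "date": created.strftime("%Y-%m-%d"),
            "hour": created.hour,
        })
    return rows


for row in to_rows(SAMPLE_RESPONSE):
    print(row)
```

Once the timestamp is parsed into a date value, questions such as which hours see the most requests become simple grouping operations.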
Note: This assumes that you have learned the basics of OpenRefine already through the Survey of Household Spending activity and the Citizen Science activity. This also assumes that you have a basic understanding of APIs and JSON. The 311 JSON dataset can be found in the sample data in case the API call does not work.
OpenRefine is a free, open-source program used for working with messy data. It allows you to clean, transform, and augment data in preparation for analysis and visualization.
In order for many GIS functions to work properly, your datasets need to be stored in a common projected coordinate system. This guide will assist you with the projection process in ArcGIS. (Unsure of what the appropriate projection is for your area of interest? Refer to this help document or ask a staff member for help determining it.)
ProQuest TDM Studio is a web platform for running text analyses on thousands of ProQuest materials, including but not limited to, ProQuest Dissertations & Theses and the New York Times.
R is a programming language and a free software environment for writing and running R code. RStudio is an Integrated Development Environment that provides free and open-source tools for R. While you can run your data analysis in the basic R environment, RStudio has a more intuitive interface and more tools to help you write your R code.
A comprehensive set of guides covering installing R, learning R fundamentals (including graphics), and advanced data analysis examples.
Link:
This guide helps users get started with writing a SAS program/code for RTRA purposes.
A comprehensive set of guides covering installing SAS, learning SAS fundamentals, and advanced data analysis examples.
Link:
This tutorial demonstrates how to scrape tweets for data analysis using Python and the Twitter API.
Link to Scholars Portal guide: http://guides.scholarsportal.info/odesi
Unsure of which projection to use in your GIS work? This tutorial will help you figure out your options.
A comprehensive set of guides covering obtaining SPSS, learning SPSS fundamentals, and advanced data analysis examples.
Introductory guide to the Stat/Transfer utility (version 10).
A comprehensive set of guides covering installing Stata, learning Stata fundamentals (including graphics), and advanced data analysis examples.
Link: