This tutorial has been developed for OpenRefine version 3.7.5
To learn more about reconciliation services and how you can use them to augment your data, check out the official OpenRefine guide to reconciling. Feel free to read just the introduction…
Cleaning data
Sometimes when you construct an API call and use the Add Column by Fetching URLs feature, it won’t work. In these cases, you can use Python to help. So far we’ve been writing GREL…
You may have noticed from Activity 2 that sometimes you searched for a property to add from Wikidata, but got an error. Other times you might want to augment your dataset with data that…
We are going to work with a different dataset for the next few activities. Before we can start augmenting the dataset, we need to do a bit of preparation work. This activity will showcase some new concepts and features for OpenRefine, as well as…
In this activity, you are going to:
Create a new project by making an API call to pull in data and parse the resulting JSON
Manipulate the data by using GREL date expressions and facet the data to make discoveries
Create a new project by…
In this activity, you are going to:
Open regex101.com and load some sample data
Practice some regex basics
Use regex in OpenRefine
Open regex101.com and load some sample data
1. Browse to the website regex101.com. The REGULAR EXPRESSION box at the…
In this activity, you are going to:
Create a new project from the citizen science dataset and use the clustering feature
Split and concatenate various columns in the dataset
Restructure the dataset by removing columns and rows, and then work with…
In this activity, you are going to:
Review the dataset and load it into OpenRefine
Perform some basic data cleanup to get familiar with the OpenRefine interface
Use OpenRefine to sort, filter and facet data
Transpose the data from wide format to…
This course has multiple modules, each consisting of a few videos. It will take approximately 2 hours to watch them all. You can watch a video by clicking on the topic title hyperlink under each module. If you need assistance, fill out…
The main dataset used is the flights dataset, which contains US domestic flights in January 2020 [1]. The other two datasets are fabricated and were created for the purpose of this guide.
The link to the recording of this workshop can…