In this activity, you are going to:
Create a new project from the citizen science dataset and use the clustering feature
Split and concatenate various columns in the dataset
Restructure the dataset by removing columns and rows, and then work with…
Cleaning data
In this activity, you are going to:
Review the dataset and load it into OpenRefine
Perform some basic data cleanup to get familiar with the OpenRefine interface
Use OpenRefine to sort, filter and facet data
Transpose the data from wide format to…
Table of Contents
Some useful tips before you get started
Creating a number of smaller subsets based on research criteria
Dropping observations
Dropping variables
Transforming variables
Dealing with outliers
Creating new variables…
TABLE OF CONTENTS
Set up environment
Software
Data analysis packages in Python
Cleaning data in Python
Download Dataset
Load dataset into Spyder
Subset
Drop data
Transform data
Create new variables
Rename variables
Merge two datasets
A few last…
The main dataset used is the flights dataset, which contains US domestic flights from January 2020 [1]. The other two datasets were fabricated for the purpose of this guide.
The link to the recording of this workshop can…