How to Build a Dataset in Constellate

This tutorial shows you how to build a dataset in Constellate. We recommend reading our general information on Constellate first. Please also see this access tutorial in order to log in with institutional privileges.

  1. Once you log in, you’ll be taken to your dashboard, where you can build a new dataset or access existing ones.
  2. Select Build New Dataset to create a new query.
    Constellate home page (logged in) with Build a New Dataset highlighted
  3. Fill in the filters with your search parameters.
    Search page with search bar and filters. Bar graph right: count of texts per decade.
  4. The visualization on the right updates automatically as you change the filters and provides suggestions to help you refine them.
    Build page showing results from 1900-1940 in English
  5. Click More Visualizations for further information, such as word frequencies.
    Build your custom dataset page with More Visualizations highlighted
  6. To save a visualization, click the three dots at its top right and save the chart as a JPEG file.
    Three dot icon for the first visualization

    You can also download the underlying data for any of the visualizations as a CSV file.
    dropdown menu with Save Chart and Download list (.csv) highlighted

  7. Once you are happy with your search parameters, click on the Build button on the top right.
    More visualizations page with Build highlighted
    Note: the current maximum dataset size is 50 000 items.
  8. Constellate will then compile your dataset. This may take a while. You will be alerted by email when it is complete, or you can check back on this page later.
  9. Once it is finished, click the Download button.
    Dataset ready page with Analyze and Download buttons

    You will have the option to download the metadata as a CSV file or metadata + ngrams as a JSON-L file.
    Download this dataset popup window with CSV and JSON-L options

  10. You can also click the Analyze button to analyze your dataset in Python. You will be presented with tutorials and Python scripts you can use to perform text analysis, or you can use your own code.
    Popup with table of options: rows are different Jupyter notebooks with Tutorial and Analyze options for most
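
If you would rather use your own code on the JSON-L download from step 9, the sketch below shows one possible starting point. It assumes that each line of the file is one JSON object describing one document, and that the file may be gzip-compressed. The field names `publicationYear` and `unigramCount` and the file name in the usage note are assumptions made for illustration; check them against your own download before relying on them.

```python
import gzip
import json
from collections import Counter
from pathlib import Path


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSON Lines file (.jsonl or .jsonl.gz)."""
    path = Path(path)
    # Use gzip only when the file name ends in .gz; otherwise read plain text.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def texts_per_decade(records, year_field="publicationYear"):
    """Count records per decade. The field name is an assumption;
    records without an integer year are skipped."""
    counts = Counter()
    for record in records:
        year = record.get(year_field)
        if isinstance(year, int):
            counts[year // 10 * 10] += 1
    return dict(sorted(counts.items()))


def top_words(records, ngram_field="unigramCount", n=10):
    """Sum per-document word counts and return the n most frequent words.
    Assumes the field maps each word to its count in that document."""
    totals = Counter()
    for record in records:
        totals.update(record.get(ngram_field) or {})
    return totals.most_common(n)
```

For example, with a hypothetical download named `my-dataset.jsonl.gz`, `records = list(read_jsonl("my-dataset.jsonl.gz"))` loads the documents into memory, after which `texts_per_decade(records)` gives a per-decade count similar to the bar graph on the build page and `top_words(records)` gives a simple word-frequency list. For datasets near the maximum size, you may prefer to process the generator one record at a time instead of building a list.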