Data Evaluation Checklist *

| Attribute | Element | Indicators or Measure |
|---|---|---|
| Availability | Accessibility | Data can be found and distributed easily. |
| | Timeliness | Data are regularly updated and/or made available in a timely fashion. The collection and release meet the requirements of the research. |
| | Confidence | Data creators have credentials and are reputable, and the data are verifiable. Relevant contact information for those responsible is available. |
| | Preservation | Statements are available on the state of preservation of the data, or the data are located in a protected and secure repository. |
| Usability | Credibility | Data are verified or audited for correctness by experts, specialized groups, organizations or government bodies. |
| Reliability | Accuracy | Values represent the true state of the source of information. They cause no ambiguity and are supported by evidence and sources. |
| | Consistency | Data are verifiable over time. After processing, concepts, values and formats still match the pre-processed data. |
| | Integrity | Data formats are clear, consistent and have structural and content integrity. Data are well described using consistent and standardized metadata. |
| | Completeness | The data have no deficiencies that would impact their use. If there are deficiencies, do they affect data accuracy or integrity? |
| | Uniqueness | Data are not derived or generated from extrapolation. |
| | Validity | Data conform to the formats, types and ranges of their stated purpose or definition. |
| | Purpose | The reasons for creating the data, the intended audience, and any special interests and/or biases of the producer are evident. |
| Relevance | Fit | Data fall (fully or partially) within the theme of users’ needs and are appropriate for the method and level of use. |
| Presentation | Readability & Usability | Data are readable, clear and comprehensible, with a proper level of precision. |
| Other | Timeframe | Data fit within the timeframe of the research. |
| | Flexibility | Data can be compared or are compatible with other data. Data have useful groupings and classifications and can be repurposed or manipulated for use. |
| | Value | The benefit of the data is high relative to their cost. Optimal use is possible. Using the data poses no danger to safety or privacy. |
| | Coverage | Where geography is being examined, the data cover the entire area being researched. |
| | Scale | The data are collected at an appropriate scale that allows proper analysis. |
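
A few of the indicators above (Completeness, Validity and Timeframe) can be checked mechanically once a dataset is loaded. The following is a minimal Python sketch, not a prescribed method: the sample records, field names, value range and study period are all hypothetical and would need to be replaced with those documented for the dataset being evaluated.

```python
from datetime import date

# Hypothetical records; field names and values are illustrative only.
records = [
    {"id": 1, "region": "North", "value": 12.5, "collected": date(2019, 6, 1)},
    {"id": 2, "region": "South", "value": None, "collected": date(2019, 7, 15)},
    {"id": 3, "region": "East", "value": 140.0, "collected": date(2016, 3, 9)},
]

REQUIRED_FIELDS = ("id", "region", "value", "collected")
VALUE_RANGE = (0.0, 100.0)  # assumed documented range for "value"
STUDY_PERIOD = (date(2018, 1, 1), date(2020, 12, 31))  # assumed research timeframe


def completeness(rows):
    """Completeness: share of required fields that are present and not None."""
    total = len(rows) * len(REQUIRED_FIELDS)
    filled = sum(1 for r in rows for f in REQUIRED_FIELDS if r.get(f) is not None)
    return filled / total if total else 0.0


def invalid_values(rows):
    """Validity: ids of rows whose value is present but outside the stated range."""
    low, high = VALUE_RANGE
    return [
        r["id"]
        for r in rows
        if r.get("value") is not None and not low <= r["value"] <= high
    ]


def outside_timeframe(rows):
    """Timeframe: ids of rows collected outside the research period."""
    start, end = STUDY_PERIOD
    return [r["id"] for r in rows if not start <= r["collected"] <= end]


print(f"Completeness: {completeness(records):.0%}")
print("Out-of-range values (ids):", invalid_values(records))
print("Outside study period (ids):", outside_timeframe(records))
```

Checks like these only flag candidates for review. Whether a gap or an out-of-range value actually undermines the data for a given project remains a judgment call, as do indicators such as Credibility or Purpose that cannot be computed from the data alone.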

 

*The above grid was adapted with information from the following sources:

Albano, Jessica. “Library Guides: Savvy Info Consumers: Data & Statistics.” Accessed March 30, 2020. https://guides.lib.uw.edu/research/evaluate/data

https://creativecommons.org/licenses/by-nc/4.0/

Cai, Li, and Yangyong Zhu. “The Challenges of Data Quality and Data Quality Assessment in the Big Data Era.” Data Science Journal 14, no. 0 (May 22, 2015): 2. https://doi.org/10.5334/dsj-2015-002.

DAMA UK Working Group on “Data Quality Dimensions.” “The Six Primary Dimensions for Data Quality Assessment,” 2013, 17. https://www.whitepapers.em360tech.com/wp-content/files_mf/1407250286DAMAUKDQDimensionsWhitePaperR37.pdf

Levy Sarfin, Rachel. “Data Quality Dimensions: How Do You Measure Up?” Syncsort Blog (blog), August 7, 2019. https://blog.syncsort.com/2019/08/data-quality/data-quality-dimensions-measure/.

Satterfield, Ashley (CDC/ONDIEH/NCBDDD) (CTR). “The Six Dimensions of EHDI Data Quality Assessment,” n.d., 4. https://www.cdc.gov/ncbddd/hearingloss/documents/dataqualityworksheet.pdf