Data Evaluation Checklist *
| Attribute | Element | Indicators or Measure |
| --- | --- | --- |
| Availability | Accessibility | Data can be found and distributed easily. |
| | Timeliness | Data are regularly updated and/or made available in a timely fashion. The collection and release meet the requirements of the research. |
| | Confidence | Data creators have credentials and are reputable, and the data are verifiable. Relevant contact information for those responsible is available. |
| | Preservation | Statements are available on the state of preservation of the data, or the data are located in a protected and secure repository. |
| Usability | Credibility | Data are verified or audited for correctness by experts, specialized groups, organizations or government bodies. |
| Reliability | Accuracy | Values represent the true state of the source of information. They cause no ambiguity and are supported by evidence and sources. |
| | Consistency | Data are verifiable over time. After processing, concepts, values and formats still match the pre-processed data. |
| | Integrity | Data formats are clear, consistent and have structural and content integrity. Data are well described using consistent and standardized metadata. |
| | Completeness | There are no deficiencies that would impact the use of the data. If there are deficiencies, do they affect data accuracy or integrity? |
| | Uniqueness | Data are not derived or generated from extrapolation. |
| | Validity | Data conform to the formats, types and ranges of their stated purpose or definition. |
| | Purpose | The reasons for creating the data, the intended audience, and any special interests and/or biases of the producer are evident. |
| Relevance | Fit | Data fall (fully or partially) within the theme of users’ needs and are appropriate for the method and level of use. |
| Presentation | Readability & Usability | Data are readable, clear and comprehensible, with a proper level of precision. |
| Other | Timeframe | Data fit within the timeframe of the research. |
| | Flexibility | Data can be compared or are compatible with other data. Data have useful groupings and classifications and can be repurposed or manipulated for use. |
| | Value | The data offer a high benefit relative to their cost. Optimal use is possible. There is no danger to safety or privacy. |
| | Coverage | Where geography is being examined, the data cover the entire area being researched. |
| | Scale | The data are collected at an appropriate scale that allows proper analysis. |
*The above grid was adapted with information from the following sources:
Albano, Jessica. “Library Guides: Savvy Info Consumers: Data & Statistics.” Accessed March 30, 2020. https://guides.lib.uw.edu/research/evaluate/data.
https://creativecommons.org/licenses/by-nc/4.0/
Cai, Li, and Yangyong Zhu. “The Challenges of Data Quality and Data Quality Assessment in the Big Data Era.” Data Science Journal 14, no. 0 (May 22, 2015): 2. https://doi.org/10.5334/dsj-2015-002.
DAMA UK Working Group on “Data Quality Dimensions.” “The Six Primary Dimensions for Data Quality Assessment,” 2013, 17. https://www.whitepapers.em360tech.com/wp-content/files_mf/1407250286DAMAUKDQDimensionsWhitePaperR37.pdf
Levy Sarfin, Rachel. “Data Quality Dimensions: How Do You Measure Up?” Syncsort Blog (blog), August 7, 2019. https://blog.syncsort.com/2019/08/data-quality/data-quality-dimensions-measure/.
Satterfield, Ashley (CDC/ONDIEH/NCBDDD) (CTR). “The Six Dimensions of EHDI Data Quality Assessment,” n.d., 4. https://www.cdc.gov/ncbddd/hearingloss/documents/dataqualityworksheet.pdf
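
Two of the indicators in the checklist, Completeness and Validity, can be partly checked by a short script before a dataset is used. The following is a minimal sketch in Python, not a method taken from the sources above. The function names (`completeness`, `invalid_rows`), the sample rows, and the temperature range are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, Optional


def completeness(rows: List[Dict[str, Any]], column: str) -> float:
    """Return the share of rows (0.0 to 1.0) with a non-missing value in `column`."""
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(column) not in (None, ""))
    return present / len(rows)


def invalid_rows(
    rows: List[Dict[str, Any]],
    column: str,
    expected_type: type,
    minimum: Optional[float] = None,
    maximum: Optional[float] = None,
) -> List[int]:
    """Return indices of rows whose value is present but has the wrong type or range."""
    bad = []
    for index, row in enumerate(rows):
        value = row.get(column)
        if value in (None, ""):
            continue  # missing values are a completeness issue, not a validity one
        if not isinstance(value, expected_type):
            bad.append(index)
        elif minimum is not None and value < minimum:
            bad.append(index)
        elif maximum is not None and value > maximum:
            bad.append(index)
    return bad


# Hypothetical sample data: air temperature readings in degrees Celsius.
rows = [
    {"station": "A", "temp_c": 21.5},
    {"station": "B", "temp_c": None},    # missing value
    {"station": "C", "temp_c": 480.0},   # outside the assumed plausible range
    {"station": "D", "temp_c": "n/a"},   # wrong type
]

share_present = completeness(rows, "temp_c")
problem_rows = invalid_rows(rows, "temp_c", float, minimum=-90.0, maximum=60.0)
print(share_present, problem_rows)
```

A script like this only flags candidates for review. Deciding whether a gap or an out-of-range value actually affects accuracy or integrity still depends on the research question.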