Data Loss Horror Stories

Image: a gravestone surrounded by carved pumpkins
Last modified: Oct 31, 2025

On Halloween, it’s easy to think of haunted houses and ghost stories. What’s truly chilling for researchers, though, is the sudden vanishing of data: the silence of a backup that never ran, the crash you didn’t anticipate, the repository that disappears. These are the data horrors that lurk in every research project, and they’re happening now.

Out of all the spooky parts of the season, here at MDL we find any type of data loss to be the most terrifying thing of all! Working on research without data management is spooky indeed!

Here are real stories of research data loss or compromise — each one a warning for research teams and institutions.

1. Research Data Disappearing at an Alarming Rate (Canada)  

An analysis published by researchers at the University of British Columbia found that while datasets were available shortly after publication, availability plummeted by roughly 17 % each year once more than two years had passed since publication. In other words: the longer a dataset sits, the more likely it is to disappear, become inaccessible, or be lost altogether.

What this teaches us:  

  • Data preservation isn’t just a matter of immediate backup—it’s about planning for the long term.
  • Researchers and institutions must make sure data aren’t just created and published, but also curated, migrated, and kept accessible.
  • A Data Management Plan (DMP) must include lifecycle and archival strategies, not just “store it.”
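To get a feel for what "roughly 17 % each year" means over a project's lifetime, here is a small back-of-the-envelope sketch. It rests on assumptions that are ours, not the study's: that everything is still available for the first two years, and that the 17 % decline then compounds year over year. The function name and parameters are made up for illustration.

```python
def fraction_still_available(years_after_publication, annual_decline=0.17, grace_years=2):
    """Rough share of datasets still available after a given number of years.

    Assumption (not from the study): availability stays at 100 % during the
    first `grace_years`, then shrinks by `annual_decline` every year after.
    """
    years_declining = max(0, years_after_publication - grace_years)
    return (1 - annual_decline) ** years_declining


if __name__ == "__main__":
    for years in (2, 5, 10, 20):
        share = fraction_still_available(years)
        print(f"{years:>2} years after publication: about {share:.0%} still available")
```

Under these simplified assumptions, a bit more than half of datasets would still be reachable five years after publication and fewer than a quarter after ten. The exact numbers matter less than the shape of the curve: without active preservation, loss accumulates quietly.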

2. Infrastructure Threat: Data Repositories Vanishing

A recent study found that about 6.2 % of research data repositories in a sample had shut down entirely, with varying strategies for migrating or preserving the underlying data. These “infrastructure haunted houses” mean that even if data are deposited, if the repository doesn’t persist, the data can still vanish.

Lesson:  

  • Deposit your research data into repositories with strong sustainability policies.
  • Check what happens to your data if the host repository closes, merges, or is deprecated.
  • Include “repository risk” in your DMP: What happens if the repository disappears?

Check our guide on how to select a data repository: Data Repositories  

3. Canadian Lab Loses Days of Research Data through Ransomware (Canada)

According to a report from the Canadian Centre for Cyber Security, in 2021 a Canadian laboratory had its network compromised after a student downloaded pirated software, which enabled a ransomware actor to encrypt the lab’s systems. The result: a week of research data was lost and the lab’s systems had to be entirely rebuilt.

Key takeaways:  

  • Even when the attack isn’t targeted at your research specifically, the collateral damage can be severe.
  • Backups are only safe if they are accessible, untouched, and isolated enough that encryption cannot spread to them.
  • Your DMP should integrate with your institution’s security plan—research doesn’t happen in isolation.
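One way to check that a backup is still "untouched" is to record a checksum for every file and compare against that record later. The sketch below is a minimal illustration using only Python's standard library; the function names (`build_manifest`, `compare_to_manifest`) are hypothetical, and it assumes the manifest itself is kept somewhere an attacker or a faulty script cannot modify.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under `root` (as a relative POSIX-style path) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_to_manifest(root, manifest):
    """Report files that are missing, changed, or unexpected relative to a saved manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

A lab might call `build_manifest` right after a backup completes, store the result separately, and run `compare_to_manifest` on a schedule. Files encrypted by ransomware would show up under "changed" or "missing". This detects tampering; it does not prevent it, so it complements an isolated backup rather than replacing one.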

Check U of T’s Research Information Security Program

4. The “Lost Research Data” Problem – aging or retiring technologies

In a policy-committee report from 2008, stakeholders expressed concern that many valuable data collections (including geospatial, survey, social science, and humanities datasets) are being destroyed, overwritten, or lost because of obsolescence of storage media or lack of archiving. One concrete example cited is “the expensive data developed for Michael B. Katz’s book The People of Hamilton (Cambridge: Harvard University Press, 1975), which is held by the Institute for Social Research at York University, where the equipment can no longer read the magnetic tapes on which the data are stored.”

Lesson for researchers:

  • Don’t rely solely on “lab drive and folder” for long-term preservation.
  • Legacy formats, aging media, and lack of migration = research lost.
  • Build DMPs that specify: What happens after the storage medium becomes obsolete? How do we guarantee readability in 10 or 20 years?

5. A “Near Miss” at a Major International Lab

A recent study shows that of 150 000+ electron-microscopy images generated, only ~3 500 (~2 %) ever made it into publications and were accessible. The implication: roughly 98 % of the data generated were effectively “lost” to researchers.

Why this matters:  

  • “Lost” doesn’t always mean destroyed; sometimes it means never used, never accessible, and effectively gone.
  • Research storage isn’t just about holding the bits—it’s about making them findable, usable, and shareable. If they sit in a box never to be opened, the opportunity is lost.
  • DMPs should include discoverability, metadata, and reusability, not only “we saved the data.”

Learn more about data documentation.

“Haunted Research Data” Prevention Checklist

The monsters of data loss come in many forms—corrupted disks, careless overwrites, ransomware, and good old-fashioned forgetfulness. But they all share one weakness: good data management.

  • Create a Data Management Plan (DMP) early: outline how data will be collected, stored, backed up, secured, shared, and preserved long-term.
  • Back up regularly using the 3-2-1 rule: 3 copies of data, 2 different media/formats, 1 off-site (or cloud) copy.
  • Use version control and verify backups: don’t wait for a disaster to discover that you can’t restore.
  • Use trusted institutional or subject repositories: deposit datasets so they are findable and preserved beyond the lab lifecycle.
  • Secure sensitive or irreplaceable data: some data need extra protection because of privacy, patent potential, or uniqueness (such as field data that cannot be collected again).
  • Ensure infrastructure resilience: Change-management processes (updates, scripts) must include safeguards.
  • Plan for the end of media or format life: If you store on tape, optical media, or obsolete formats, you must migrate.
  • Train your team: Backup strategy, naming conventions, metadata, file formats, and storage policy must be known by everyone in the lab.
  • Audit and monitor: perform periodic reviews. Are backups valid? Can you restore? Is your repository still maintained?
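The 3-2-1 rule from the checklist is easy to audit once you write down where your copies actually live. The sketch below is one possible way to do that; the `DataCopy` record, its fields, and the example inventory are all hypothetical, so adapt them to however your lab tracks storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of a dataset (hypothetical record for illustration)."""
    label: str      # human-readable name, e.g. "lab NAS"
    medium: str     # kind of storage, e.g. "disk", "tape", "cloud"
    offsite: bool   # True if stored away from the primary location


def check_3_2_1(copies):
    """Return a list of 3-2-1 rule violations; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    if len({copy.medium for copy in copies}) < 2:
        problems.append("all copies are on the same kind of medium, need at least 2 kinds")
    if not any(copy.offsite for copy in copies):
        problems.append("no off-site (or cloud) copy")
    return problems


if __name__ == "__main__":
    inventory = [
        DataCopy("workstation", "disk", offsite=False),
        DataCopy("external drive in the same office", "disk", offsite=False),
    ]
    for problem in check_3_2_1(inventory):
        print("3-2-1 check failed:", problem)
```

The example inventory fails all three checks, which is a common situation for a lab that relies on a workstation and one external drive. Passing the check only means copies exist in the right places; whether they can be restored still needs a periodic test.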

Don’t know where to start? Visit our Research Data Management guide or contact the Research Data Management Services team.

Don’t wait for the knock on your digital door. The creepiest thing about data loss isn’t the sudden panic—it’s realizing how easily it could’ve been prevented.

Happy (and Safe) Data Management Halloween! 

 

*Image created with Canva AI.