Apr 02 2015
 

Abstract

This software’s purpose is to warn you when a file gets corrupted and to help you find an uncorrupted version of that file among your backups.

It is not a Diff tool; there are already excellent and free Diff tools, such as KDiff3.

Preamble

Over time, data corrupts itself: optical discs oxidize, SSDs wear out, 0s and 1s get flipped…
The expected lifetime of data varies depending on the media but is usually taken to be around 10 years, provided you still have the machine and software to read it.

Therefore, how can you keep important photos, videos and documents?
Your only protection against data loss is multiple backups. Keep at least two backups of your important data, preferably in two different places. Copy the data every now and then to refresh the bits.
But having a backup is not useful if you cannot tell whether it is corrupted!

Hence this tool, the “Data Rot Detector”.

Download and Installation

  1. Download the following ZIP archive: DataRotDetector_1.0.0.1.zip
  2. Extract all the files from the ZIP into a single folder on your drive (keep the folder structure of the files in the ZIP).
  3. Run DRD.exe (if you are unsure whether to run the 64-bit or the 32-bit version, try the 64-bit one first; if it does not work, use the 32-bit one).

Here is a suggested usage sequence

  1. Back up your files.
  2. After the backup, run the “Data Rot Detector” on the source drive, so that it can take a “thumbprint” of every file. This becomes the “original data set”.
  3. Immediately check the backup against the “original data set” (in case the backup failed silently).
  4. After some weeks (between backups), check your source drive against the “original data set”.
  5. If any file gets corrupted, check your backup to find an uncorrupted copy of it and restore that copy to your main drive.

How does it work?

The “Data Rot Detector” calculates the MD5 hash of every file and stores it along with the file’s last modification date. If a file’s last modification date has not changed but its MD5 hash has, the file is considered corrupted.
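
The rule above can be illustrated with a minimal Python sketch. This is not the program's actual source code (which is written with Qt); the function names `md5_of_file`, `take_snapshot` and `find_corrupted` are hypothetical, and the snapshot layout (a dictionary keyed by relative path) is an assumption made for the example.

```python
import hashlib
import os
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def take_snapshot(root):
    """Map each file's path (relative to root) to (mtime_ns, md5).

    Hypothetical layout: the real tool's storage format is not described.
    """
    root = Path(root)
    snapshot = {}
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        try:
            mtime_ns = path.stat().st_mtime_ns
            snapshot[path.relative_to(root).as_posix()] = (mtime_ns, md5_of_file(path))
        except OSError:
            # Unreadable files (locked, no permission) are skipped.
            continue
    return snapshot


def find_corrupted(old, new):
    """Return paths whose modification date is unchanged but whose MD5 differs."""
    return sorted(
        name
        for name, (mtime_ns, md5) in old.items()
        if name in new and new[name][0] == mtime_ns and new[name][1] != md5
    )
```

Taking one snapshot right after a backup and another some weeks later, then passing both to `find_corrupted`, would list the files to restore. A file whose modification date changed is treated as a legitimate edit and is not reported.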

Additional functions

Because of the way this program stores information, it can:

  • find moved or renamed files (between 2 scans)
  • find new and deleted files (between 2 scans)
  • find exact duplicates (even with different file names; sorted by size)
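
As an illustration of why stored hashes make these functions possible, here is a small Python sketch. It assumes each scan is a dictionary mapping a file path to a `(size, md5)` pair; that layout and the names `compare_scans` and `find_duplicates` are hypothetical, not taken from the program. A file counts as moved or renamed when its path disappears and a new path with the same size and hash appears. Duplicates are listed largest first, which is one possible reading of "sorted by size".

```python
from collections import defaultdict


def compare_scans(old, new):
    """Classify path changes between two scans.

    Each scan maps path -> (size_in_bytes, md5_hex).
    Returns (moved, added, deleted); moved holds (old_path, new_path) pairs.
    """
    gone = sorted(set(old) - set(new))
    appeared = sorted(set(new) - set(old))

    # Index newly appeared paths by content so vanished paths can be paired with them.
    by_content = defaultdict(list)
    for path in appeared:
        by_content[new[path]].append(path)

    moved, deleted = [], []
    for path in gone:
        candidates = by_content.get(old[path])
        if candidates:
            moved.append((path, candidates.pop(0)))
        else:
            deleted.append(path)

    # Whatever was not paired with a vanished path is genuinely new.
    added = sorted(p for paths in by_content.values() for p in paths)
    return moved, added, deleted


def find_duplicates(scan):
    """Return (size, [paths]) groups with identical size and MD5, largest first."""
    groups = defaultdict(list)
    for path, key in scan.items():
        groups[key].append(path)
    dupes = [(key[0], sorted(paths)) for key, paths in groups.items() if len(paths) > 1]
    return sorted(dupes, key=lambda item: item[0], reverse=True)
```

Matching on the hash alone would also work in most cases; including the size in the key is simply a cheap extra guard against accidental collisions.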

Requirements

  • At least Windows Vista and an NTFS drive
    (the “Data Rot Detector” could be recompiled for other systems since it uses Qt).
  • About 150 MiB of RAM for every 100k files.
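
To size a machine for a larger scan, the figure above can be extrapolated. The sketch below assumes memory use grows linearly with the number of files, which the figure suggests but does not guarantee; `estimate_ram_mib` is a hypothetical helper, not part of the program.

```python
def estimate_ram_mib(file_count, mib_per_100k_files=150):
    """Rough RAM estimate in MiB, assuming linear growth with file count."""
    return file_count * mib_per_100k_files / 100_000


# Example: a drive holding 250,000 files.
print(estimate_ram_mib(250_000))  # 375.0
```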

 Limitations / False positives

The “Data Rot Detector”:

  • cannot tell if new or changed files are corrupted.
  • cannot test files it does not have access to, such as:
    • other users’ files
    • currently opened files (with share = none)
    • some system files
  • does not test unused space for corruption (use S.M.A.R.T. for this)
  • does not test alternate data streams (ADS)
  • can be misled by programs that lie about their files’ last modified date (e.g. for copy protection). Excel is unfortunately one such program (.xlsx files cannot be checked).

Support

I wrote this software for myself in my spare time. I offer no professional support, but I may answer questions posted in the Comments section of this blog.

License

The “Data Rot Detector” is published under the LGPL v2 license. The source code is included with the executable. If you use the code, I request that my name (Jonathan Mérel) appear in your credits. I make no warranty whatsoever about this program.

 
