ACK! My apologies - I didn’t notice the incorrect link in my earlier post. The forum software included the “).” as part of the URL. (Paul got it right!)
Data integrity is highly important to me. I really REALLY wish that NTFS included checksum information for each file in the file system entries. Just a simple CRC32 would be fine. I would love to be able to use something like ZFS in Windows since it includes checksum information, but until that happens I’m stuck with external checksumming tools.
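For the curious, this is really all an external tool has to do per file - a minimal Python sketch of a streaming CRC32, reading in chunks so big media files don't have to fit in memory (the path is just an example):

```python
import zlib

def crc32_of_file(path, chunk_size=1 << 20):
    """Compute a file's CRC32 incrementally, one chunk at a time."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return format(crc & 0xFFFFFFFF, "08x")  # zero-padded hex string

print(crc32_of_file(r"D:\Media\Pictures\example.jpg"))  # example path
```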
I downloaded and looked at ExactFile, but the way it works is a bit restrictive for me. I like the Explorer integration of Checksum - it’s nice to be able to right-click on a folder or drive letter in Explorer and choose “Create checksums”. If I want to make checksums I’m invariably already working within Explorer. With ExactFile you have to launch the program separately, and at that point you can either drill down to the folder you want to start checksumming from, or drag the folder from Explorer onto the input line in ExactFile (which is how I’d do it).
However, there are a few big things that I don’t like about almost all checksumming programs, including ExactFile.
Let’s say you want to create checksums for your D: drive. With ExactFile (as with most other checksum programs) all the checksums for the drive get put into a single file in the root of D:. Now say a year later you want to burn the contents of D:\Media\Pictures to a DVD. There isn’t a convenient way to first make sure the pictures in that folder are still good via checksums, and to have the checksums follow the pictures to the DVD. Since the checksums calculated a year before were for the entire D: drive, you’d have to verify the whole drive just to verify the contents of that one folder. I’m assuming you could cancel the verify after it reaches that folder, but that could take a long time. You could work around that by originally creating a separate checksum digest for each folder off the root of D:, which is how I used to do it with earlier checksumming tools, but if I have D:\Media\Movies then I could be verifying for hours before I even get to D:\Media\Pictures. The upshot is that unless you happened to have used ExactFile to create a checksum digest of just that folder, you’re going to verify more files than needed - and you’ll still have to create a new digest for that folder in order to move it to DVD. It’s all extra work and extra time.
Then there’s the problem with changes to the file system structure. If you make a checksum file of your D: drive and move folders around, that checksum file is now going to need to be totally redone, or edited by hand, to reflect those changes. I move folders a lot, so that would hurt me a lot with ExactFile’s method.
Then there’s the issue of file exclusions. There are some files that I know I’ll be changing fairly often. Because of that, a checksum that I make of them today won’t be any good a year, or even a week, from now. I couldn’t see any way to tell ExactFile to skip particular filenames or extensions.
All the above things are handled by the Checksum program from corz:
- When Checksum creates hashes (checksums) for all the files on the D: drive, it stores the results in separate .hash files, one in each folder. Each hash file contains the hashes of every file in that folder, and only that folder, and the entry for each file is just the file name; no absolute or relative path is included. So if you want to ensure that the pictures in D:\Media\Pictures are OK, you double-click on the file named Pictures.hash inside that folder, and Checksum verifies that the checksums in the hash file match the associated files. If you have subfolders you can right-click on the folder, choose “Verify checksums”, and Checksum will verify the checksums in all the hash files down that part of the folder structure. Once you’ve verified the files against the checksums in that folder (or folder structure) you can burn that folder or folder structure to DVD, and then check the hashes on the DVD to see if it was burned properly.
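To make the layout concrete, here’s a minimal Python sketch of the same per-folder idea. I’m assuming MD5 and the common “hash *filename” line format - I haven’t confirmed that’s byte-for-byte what Checksum writes, so treat it as an illustration of the layout, not a drop-in replacement:

```python
import hashlib
import os

def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file incrementally so large files don't need to fit in RAM."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()

def write_folder_hashes(folder):
    """Write <FolderName>.hash in `folder`, with one 'hash *filename' line per
    file in that folder only - no paths, so the folder stays relocatable."""
    folder = os.path.abspath(folder)
    hash_path = os.path.join(folder, os.path.basename(folder) + ".hash")
    with open(hash_path, "w", encoding="utf-8") as out:
        for name in sorted(os.listdir(folder)):
            full = os.path.join(folder, name)
            if os.path.isfile(full) and not name.endswith(".hash"):
                out.write(f"{md5_of_file(full)} *{name}\n")

def write_tree_hashes(root):
    """Give every folder under `root` its own .hash file."""
    for dirpath, _dirs, _files in os.walk(root):
        write_folder_hashes(dirpath)
```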
But… what if new files were added to that folder after the .hash file was created? If you tell Checksum to create checksums on a folder where it has already been run, it will find the existing Pictures.hash. It will not recalculate the checksums for the entries that are already in the .hash file; instead it will hash just the new files and append those entries to the end of the existing .hash file. You won’t know whether the new pictures were corrupted between the time they were copied and the time you make the checksums, but at least you’ll know from then on, and you won’t lose the original checksums from the year before.
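An append-only update along those lines could look like this - again a sketch building on the layout above, with the same assumptions about the line format:

```python
import hashlib
import os

def update_folder_hashes(folder):
    """Hash only files not already listed in the folder's .hash file and
    append them, leaving the original (possibly year-old) entries intact."""
    folder = os.path.abspath(folder)
    hash_path = os.path.join(folder, os.path.basename(folder) + ".hash")
    known = set()
    if os.path.exists(hash_path):
        with open(hash_path, encoding="utf-8") as f:
            for line in f:
                if " *" in line:  # 'hash *filename' format assumed
                    known.add(line.rstrip("\n").split(" *", 1)[1])
    with open(hash_path, "a", encoding="utf-8") as out:
        for name in sorted(os.listdir(folder)):
            full = os.path.join(folder, name)
            if name in known or name.endswith(".hash") or not os.path.isfile(full):
                continue  # already hashed, a digest file, or not a file
            md5 = hashlib.md5()
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    md5.update(chunk)
            out.write(f"{md5.hexdigest()} *{name}\n")
```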
If you move the Pictures folder to a different location then there isn’t any problem since the hash file doesn’t care about relative or absolute paths. Pictures.hash doesn’t care if it’s in D:\Media\Pictures, or Z:\folderA\folderB\folderC\Pictures.
By editing its .ini file I can tell it to skip particular filenames, particular extensions, or files in particular folders. So I can avoid a lot of false negatives by telling it to not hash files that I don’t care about (e.g. .ini, .txt, .log, .lnk).
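In the sketches above, that kind of exclusion boils down to a simple filename filter. The patterns below are just my examples, not Checksum’s actual .ini syntax:

```python
import fnmatch

# Illustrative exclusion list - the real patterns would live in Checksum's .ini.
EXCLUDE_PATTERNS = ["*.ini", "*.txt", "*.log", "*.lnk"]

def is_excluded(name):
    """True if a filename matches any exclusion pattern (case-insensitive)."""
    return any(fnmatch.fnmatch(name.lower(), pat) for pat in EXCLUDE_PATTERNS)

# In write_folder_hashes()/update_folder_hashes(), skip any name where
# is_excluded(name) is True before hashing it.
```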
A few gotchas:
- If you rename a folder (e.g. Pictures to Photos) you need to remember to rename the .hash file too (Pictures.hash to Photos.hash). When verifying, Checksum will look for every .hash file in a folder and verify its checksums against the files in that folder, so it’ll find Pictures.hash and verify the files. But if you then tell Checksum to create checksums, it’ll create a new file called Photos.hash and calculate checksums on all the files again, giving you 2 different .hash files in that folder. If you then verify checksums on that folder it’ll do it for both .hash files, so it’ll take twice as long.
- If you delete one or more files from a folder, then when you verify checksums Checksum will report them as missing. So if I delete some files from a folder that contains a lot of large files that would take a while to checksum, I’ll save time by editing the .hash file and manually removing the entries (the sketch below automates exactly that pruning). If they’re in a folder with files that’ll checksum quickly, then I’ll just delete the hash file (after I’ve verified that the other data is OK) and recreate the checksums.
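Here’s what that manual pruning could look like if you scripted it - same caveat as before about the assumed “hash *filename” line format:

```python
import os

def prune_missing_entries(hash_path):
    """Remove .hash lines whose files no longer exist in the folder, so a
    verify run won't flag deliberately deleted files as missing."""
    folder = os.path.dirname(os.path.abspath(hash_path))
    kept = []
    with open(hash_path, encoding="utf-8") as f:
        for line in f:
            if " *" in line:
                name = line.rstrip("\n").split(" *", 1)[1]
                if not os.path.exists(os.path.join(folder, name)):
                    continue  # file was deleted; drop its stale entry
            kept.append(line)
    with open(hash_path, "w", encoding="utf-8") as out:
        out.writelines(kept)
```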
But remember that most such gotchas are true for all checksumming programs. Until checksumming gets built into the Windows file system, Checksum is about as close to perfect as I’ll probably find.
Please note that I did find a bug in Checksum a couple of years ago. It doesn’t properly handle files/folders with characters that aren’t plain 7-bit ASCII, and if I recall correctly it would recreate the entry for such a file in the hash file each time it was run. So a file named Noël.mp3 would get hashed multiple times due to the special character ë. I reported it and the guy came up with a fix that he sent me, but for some reason he hasn’t updated the version on his site with it. Hopefully he’ll release that soon. In the meantime I wouldn’t mind sharing the version I have if you’d like it.