Newbie Question: Name/Date vs. Hash vs. Binary

For technical support visit http://support.2brightsparks.com/
ChrisW828
Newbie
Posts: 4
Joined: Wed Mar 29, 2017 2:50 am

Newbie Question: Name/Date vs. Hash vs. Binary

Post by ChrisW828 » Wed Mar 29, 2017 3:05 am

I have been using SyncBack "forever" and am trying the Pro version now. I fear I may have been deleting files I want to keep for years.

Does SyncBack only compare file names and dates, or does it compare more in depth? I have been Googling and reading the forums for hours and it seems like either binary comparison or hash comparison is what I am asking about, but I'm not sure which one. Every thread I find via search discusses pieces, but none tells me which is the default, how to find and turn on a "deeper" comparison, etc.

If possible, I'd like to synchronize in such a way that files are compared beyond name/date for differences and files ARE removed if they 1. don't share name/date, but 2. are actually the same exact file regardless, and ARE NOT removed if they 1. share name/date and 2. are actually different files regardless.

I hope I am making sense. Thank you in advance.

ChrisW828
Newbie
Posts: 4
Joined: Wed Mar 29, 2017 2:50 am

Re: Newbie Question: Name/Date vs. Hash vs. Binary

Post by ChrisW828 » Wed Mar 29, 2017 10:00 pm

I continued searching and think I may have found my answer. Posting in case anyone else finds this thread via a similar search. If this is incorrect, or there is more to it, I'd appreciate a heads-up. :)

https://www.2brightsparks.com/syncback/ ... ttings.htm
Use slower but more reliable method of file change detection: By default, SyncBackPro will not compute the hash value of a file. The reason is that it can dramatically increase the time taken for a profile to run. However, if you want to be absolutely certain that SyncBackPro detects if a file has changed, so that it is copied, then you can enable this option. The only reason to enable this option is if you do not trust the last file modification date & time of the files, and the file size may not change. For example, by default, TrueCrypt drive container files never change size or last modification date & time (note that you can configure TrueCrypt to change the container's last modification date & time via the Settings->Preferences main menu).
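
To make the difference concrete, here is a minimal Python sketch of the two kinds of check. This is not SyncBackPro's actual implementation: the function names are made up for illustration, and SHA-256 is only an assumed choice of hash.

```python
import hashlib
import os


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_by_metadata(left, right):
    """Fast check: equal size and equal modification time (whole seconds)."""
    a, b = os.stat(left), os.stat(right)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def same_by_hash(left, right):
    """Slow check: equal content, regardless of file name or date."""
    return file_hash(left) == file_hash(right)
```

Two files can pass `same_by_metadata` and still fail `same_by_hash` (same size and date, different content), and the reverse is also possible (identical content, different dates). The hash check has to read every byte of both files, which is why it is so much slower.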

thorsten26
Newbie
Posts: 1
Joined: Wed May 09, 2018 6:41 am

Re: Newbie Question: Name/Date vs. Hash vs. Binary

Post by thorsten26 » Wed May 09, 2018 7:40 am

I was also worried about my data, especially videos and pictures, which are all stored on one hard disk, so I keep a mirror on a second hard disk.

However, there is something that is either a big myth or a fact: it's called bit rot.

If you search the internet you will find a lot of threads about this issue and it seems to be a never-ending story... :shock:

For those who don't know what that is: it seems that if a file sits on a hard disk for a long time, a single bit can flip from 0 to 1 or the other way around. You may have seen this already, typically in JPG pictures where you suddenly see only half the picture or colored lines. In streaming media like a video you may not notice it at all.

So I looked for a way to:

1. Mirror my data from one disk to a second
2. Verify that I have no bit rot

SyncBack is a good tool for this, but it is not perfect and could be enhanced a little so that it recognizes bit rot faster.

Now what exactly is happening?

Let's say you have your original files on the left side and you want to sync them to the right side with verify, secure copy and all the nice things.
As a result you should have an exact duplicate on the right side.

Now suppose one or more pictures among your original files has bit rot (because of a bad hard disk surface or whatever). How will you recognize this with SyncBack?

I guess that when a bit rots, the file attributes stay the same as they were: the last modified date/time, the creation date/time and so on are unchanged, and the file size is the same too. Only one bit has changed.

That means that if you sync without hashing and comparing, the defective file on the left side (source) will not be detected.

On the other hand, if you are using hashing, the defective file will be detected as changed and synced to the right side, overwriting the original, good file!

A good solution would be to use the integrity database, which is an option in SyncBack. But you can't do that today because there is no logic behind it for recognizing bit rot.

Sure, you can build hashes for the files on the left and right side and you can add hashes for new files, but the integrity check then gives you a lot of errors, because files change (when you modify them) and the already recorded hashes are not updated.
So the database should store the last modified date/time alongside the hash of each file. If you then run an integrity check only on files that have not changed (according to the last modified date/time), you will recognize the bit rot.

So my solution is to use two profiles: one profile for syncing and one profile for verifying. To be safe I also use versioning in a hidden top-level folder.

A verify is very time-consuming, because the hashes of source and destination must be calculated every time. With a separate profile you only have to run it from time to time.

The syncing profile is configured to use the fastest syncing possible with SyncBack: no binary compare, no secure copy, nothing. It simply uses file size and last modified date/time to do the sync.

So, if your hardware is not malfunctioning, you should have an exact copy on the right side after running the sync profile.

In the verify profile I use hashing on both sides for the comparison, and here too I use versioning in a hidden top-level folder.

Regarding bit rot:

In the first sync, which uses file size and last modified date/time, bit rot will not be detected, so the defective file will not overwrite the intact file in the mirror.

If you then run the verify profile, the result should be that there is nothing to do! If you get any differences here, then you have bit rot in a file or something went wrong while syncing.
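
Outside of SyncBack, the idea behind such a verify pass can be sketched in a few lines of Python. This is only an illustration under my own assumptions (the function names and the choice of SHA-256 are hypothetical, and it has nothing to do with how SyncBack works internally): it walks the left side and reports every file whose content differs from, or is missing in, the right side.

```python
import hashlib
import os


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_mirror(left_root, right_root):
    """Return (relative path, reason) pairs for files that do not match."""
    problems = []
    for folder, _dirs, files in os.walk(left_root):
        for name in files:
            left = os.path.join(folder, name)
            rel = os.path.relpath(left, left_root)
            right = os.path.join(right_root, rel)
            if not os.path.isfile(right):
                problems.append((rel, "missing on right"))
            elif file_hash(left) != file_hash(right):
                problems.append((rel, "content differs"))
    return problems
```

An empty list means the mirror matches. A "content differs" entry only says that the two copies disagree; it does not say which side is the damaged one.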

OK, fine... but it is very time-consuming because all hashes must be calculated again and again.

I would be happy if there were an option to use a hash database for the left and right side which uses the following logic:
For new files: add the hash and the last modified date/time
For old files: if the last modified date/time is different, update the hash
For old files: if the last modified date/time is the same but the hash is different = Error!
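
The three rules above can be sketched in Python roughly like this. It is a minimal sketch, not an existing SyncBack feature: the "database" is just an in-memory dict, and `check_file`, `file_hash` and the use of SHA-256 are assumptions for illustration.

```python
import hashlib
import os


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_file(db, path):
    """Apply the three rules to one file.

    db maps a path to (last modified time in ns, hash) and is updated in place.
    Returns "added", "updated", "ok" or "error".
    """
    mtime = os.stat(path).st_mtime_ns
    current = file_hash(path)
    if path not in db:
        # Rule 1: new file, record hash and last modified time.
        db[path] = (mtime, current)
        return "added"
    old_mtime, old_hash = db[path]
    if mtime != old_mtime:
        # Rule 2: file was modified on purpose, refresh the stored hash.
        db[path] = (mtime, current)
        return "updated"
    if current != old_hash:
        # Rule 3: same timestamp but different content, possible bit rot.
        return "error"
    return "ok"
```

Note that this sketch still hashes every file on each run, so it shows the detection logic only. A real tool would also have to store the database on disk between runs.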

For all my tests I used the tool SetFileDate 2.0, which lets you manipulate the modified, accessed and created date/time of a file.
