How could one do a compare by content?

For technical support visit http://support.2brightsparks.com/
Biff
Enthusiastic
Posts: 19
Joined: Sun May 22, 2016 10:55 pm

How could one do a compare by content?

Postby Biff » Sat Jun 17, 2017 11:06 am

How could one do a compare by content? And make SB show whether a file is equal by content or not?

cliffhanger
Expert
Posts: 765
Joined: Tue May 31, 2011 5:59 pm

Re: How could one do a compare by content?

Postby cliffhanger » Sat Jun 17, 2017 3:29 pm

Hi, it's possible with certain limitations and/or default assumptions/settings by SyncBackPro that you may not want (so, may need to switch off). First thing to understand is that - no matter what file properties (including 'content') you compare - you cannot arrange for SBP to show you a list of those files where all properties that are being compared match, only those files where one or more do not (because 'matching files' need no action, so are not proposed for any, thus are never listed in the appropriately-named Differences window)

Assuming 'discovery' (and a list) of files that are different will suffice for your needs, you need to take into account that by default SBP will decide if files on one side and/or the other are 'different' by comparing

  1. 'presence versus absence' of a same-name file per side (note: you cannot turn off detection of this 'difference', so files on one side only will always be included in the list of differences * )
  2. LastModified stamp of the files (assuming one each side) - compared by default but comparison can be disabled
  3. Size of the files (assuming one each side) - compared by default but comparison can be disabled
To this you can optionally add 'hashing' of the contents of the files (assuming there is a same-name file each side) to calculate MD5 hash values of each file, and flag a difference if the hash values differ. The important thing to know is that this takes time to calculate - hence the comment 'slower but more reliable method' - so if a difference in LastModified stamp or Size is detected, SBP will normally skip hashing in the interests of performance because a difference in LastModified or Size is normally an obvious-enough difference to justify overwriting one side with the other (or whatever action rule you set for that situation)
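The order of checks described above can be sketched roughly as follows. This is only an illustration of the described behaviour, not SBP's actual code; the function names, parameters and return strings are all made up for the example.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(left, right, compare_time=True, compare_size=True, use_hash=False):
    """Hypothetical sketch: presence first, then stamp, size, and (optionally) hash."""
    left_exists, right_exists = os.path.isfile(left), os.path.isfile(right)
    if not left_exists and not right_exists:
        return "missing on both sides"
    if left_exists != right_exists:
        return "one side only"  # this check cannot be switched off
    left_stat, right_stat = os.stat(left), os.stat(right)
    if compare_time and int(left_stat.st_mtime) != int(right_stat.st_mtime):
        return "different (LastModified)"  # hashing is skipped
    if compare_size and left_stat.st_size != right_stat.st_size:
        return "different (size)"  # hashing is skipped
    if use_hash and md5_of(left) != md5_of(right):
        return "different (hash)"
    return "match"
```

With both `compare_time` and `compare_size` switched off and `use_hash` switched on, only presence and content hash decide the result, which is the closest equivalent to a "content only" comparison in this sketch.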

So, if you have a situation where you think the LastModified and/or Size may differ but the contents each side might be the same or the reverse (LastModified & Size may match but contents may differ) you need to decide what to compare (or not) to get SBP to work how you want it to work. The settings are all on

  • Modify profile > Expert mode > Compare Options > 'Use slower but more reliable method of file change detection' (hashing - you may want to turn this on)
  • Modify profile > Expert mode > Compare Options > Date & Time sub-page (you may want to switch off LastModified comparison - see above)
  • Modify profile > Expert mode > Compare Options > File Size sub-page (you may want to switch off Size comparison - see above)
Note: there is an option on the main Compare Options settings page re 'always use slower...method' which you may feel you should use, but you probably do not need it unless your profile is a type (Intelligent Synchronization or Fast Backup) which stores state-databases and where you might want to store a 'hash value right now' for future comparison on a future run - even in a situation where a file only exists on one side...so you may slow the profile unnecessarily if you use it

If you are still uncertain what to do, try supplying more background what you are trying to achieve (and why) and which other file properties are (or are not) important to you in terms of decisions/actions such as inclusion in any list. I cannot guarantee I can answer personally but more background will probably improve your chances of someone who knows the logic used in SBP being able to apply that knowledge to your situation :)

BTW, in the absence of any details, i have assumed you are running the latest release of Pro V8


* you can however use the two options on the Filter menu of the Differences window to toggle whether one-side-only files are listed on-screen

Biff
Enthusiastic
Posts: 19
Joined: Sun May 22, 2016 10:55 pm

Re: How could one do a compare by content?

Postby Biff » Sat Jun 17, 2017 6:30 pm

Hello cliffhanger,

Many thanks.

you cannot arrange for SBP to show you a list of those files where all properties that are being compared match, only those files where one or more do not (because 'matching files' need no action, so are not proposed for any, thus are never listed in the appropriately-named Differences window)

Sorry, I do not quite understand. I want SB to show the results of a compare by file content (not by any other file property): files equal by content, unequal files, files only on the left side, and files only on the right. So if I want to update the right side, SB would show an arrow pointing to the right (e.g. with a "+" for a file that is new to the right side), and something like "nothing to do" for files that exist only on the right side (not on the left side).

Assuming 'discovery' (and a list) of files that are different will suffice for your needs, you need to take into account that by default SBP will decide if files on one side and/or the other are 'different' by comparing

Sorry, I do not understand the whole sentence, what does "discovery" mean? And what list? And the rest. Very sorry again.

'presence versus absence' of a same-name file per side (note: you cannot turn off detection of this 'difference', so files on one side only will always be included in the list of differences * )

Yes, that is what I want. Can SB detect and compare differently named files by content? And folders too?

LastModified stamp of the files (assuming one each side) - compared by default but comparison can be disabled
Size of the files (assuming one each side) - compared by default but comparison can be disabled

Yes, just compare by content, nothing else.

To this you can optionally add 'hashing' of the contents of the files (assuming there is a same-name file each side)...

Does hashing work with different-name files as well?

...to calculate MD5 hash values of each file, and flag a difference if the hash values differ.

What is more reliable, hashing or comparing by content?

The important thing to know is that this takes time to calculate - hence the comment 'slower but more reliable method' - so if a difference in LastModified stamp or Size is detected, SBP will normally skip hashing in the interests of performance because a difference in LastModified or Size is normally an obvious-enough difference to justify overwriting one side with the other (or whatever action rule you set for that situation)

This would be counterproductive and makes no sense for my purpose, because the files should be compared by content only. A corrupt file would not be discovered with those settings / that procedure.

So, if you have a situation where you think the LastModified and/or Size may differ but the contents each side might be the same or the reverse (LastModified & Size may match but contents may differ) you need to decide what to compare (or not) to get SBP to work how you want it to work.

Indeed, a compare by content only, nothing more, nothing less, absolutely a compare by content.

Note: there is an option on the main Compare Options settings page re 'always use slower...method' which you may feel you should use, but you probably do not need it unless your profile is a type (Intelligent Synchronization or Fast Backup) which stores state-databases and where you might want to store a 'hash value right now' for future comparison on a future run - even in a situation where a file only exists on one side...so you may slow the profile unnecessarily if you use it

Do I understand correctly: the hash of each file (if created) is stored in a database and can be used for hash comparisons of that file in future runs? That would mean that if a hash is stored for a file and the file later becomes corrupt, the next comparison uses the stored hash (compared with that of the other file), so one would not find out that the file is broken. Is that right?

If you are still uncertain what to do, try supplying more background what you are trying to achieve (and why) and which other file properties are (or are not) important to you in terms of decisions/actions such as inclusion in any list

No other properties, only content. I just want to check that the transfer was correct: is the destination file physically the same as the source file? That's all.

BTW, in the absence of any details, i have assumed you are running the latest release of Pro V8

Yes, that's good.

* you can however use the two options on the Filter menu of the Differences window to toggle whether one-side-only files are listed on-screen

Alright, thank you very much!

Biff
Enthusiastic
Posts: 19
Joined: Sun May 22, 2016 10:55 pm

Re: How could one do a compare by content?

Postby Biff » Sat Jun 17, 2017 7:21 pm

Whatever I do, I always get a display like this (not an equal / unequal display):

Image

How could I show (un)equal files / folders?

Biff
Enthusiastic
Posts: 19
Joined: Sun May 22, 2016 10:55 pm

Re: How could one do a compare by content?

Postby Biff » Sat Jun 17, 2017 7:23 pm

Settings:
Image

Image

Image

Image

cliffhanger
Expert
Posts: 765
Joined: Tue May 31, 2011 5:59 pm

Re: How could one do a compare by content?

Postby cliffhanger » Sun Jun 18, 2017 2:30 pm

SBP will not display (list) to you (in the Differences window, or anywhere else) files on both sides (with same path\name) that are found to be the same by comparison of [whatever properties you ask SBP to compare]. Unless none of the files match, you cannot get a list/display of *all* files in two file-sets, only of the files which do not match. Files which do match already (having compared whatever properties you specified) are not listed/displayed, and there is no way to force them to be displayed. This is because files (one per side) which match already = 'nothing for generic backup/sync software to do' - no matter what decision/action rules (that SB supports) are in force. So, SBP Ignores such files (and does not list them in a way you can see during the run).

To use your own phrasing, you cannot get an "equal / unequal display" - 'equal' files are never listed/displayed during a run, because SB is not designed to perform any action in respect of 'equal files'. It is designed to find 'unequal' files and propose actions to take in respect of such unequal files to make them 'equal' - generally, to either copy one file over the other - or, to delete or skip any orphans - though you can change the default and/or suggested actions from the displayed list. But, there are no actions to take in respect of 'equal-already' files, therefore they are not displayed.

To answer another question, no, you cannot set SBP to hash-compare files of different names (which by default would mean 'each path\file compared with every other path\file') because there is no reason why generic backup/sync software would want to do that to fulfill its advertised functionality. The very first thing it looks for (in respect of a file in a certain path on side-X) is a file of the same name in the same path on side-Y - and if a matching path\name is not found, the file it did find is treated as an orphan (and no properties are compared, because there is no other same-path\name file's properties to compare), and there is no functionality to search the rest of the profile's scope looking for a match somewhere else. TBH, it sounds like you maybe need dedicated de-duplication software.
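For anyone wondering what 'each path\file compared with every other path\file' by content looks like outside SBP, here is a minimal sketch of the de-duplication idea: group files by content hash and ignore their names. The function name is invented for the example, and each file is read into memory whole, so it is only suitable for small test folders.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Group files under root by the MD5 of their content, ignoring names."""
    by_hash = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                digest = hashlib.md5(handle.read()).hexdigest()
            by_hash[digest].append(path)
    # Only hashes shared by two or more files indicate likely duplicates.
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Each returned group lists paths whose contents hashed to the same value; a cautious tool would confirm with a byte-by-byte comparison before deleting anything.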

Storage of hash values (if computed) for future reference only happens for certain profile types (Intelligent Synchronizations or non-archival Fast Backups) that already store basic (date+time/size) info in a database to assist in comparison/decisions. Note that in any case you cannot disable LastModified stamp comparison in an Intelligent Synchronization except by using an add-in script to change the way the profile works. Yes, if present, a difference between any stored hash value from 'last run' and any calculated hash value from 'this run' will trigger a mismatch / listing of the file in the Differences window. However, by virtue of the way Fast Backup works (the Destination is not usually scanned, the database is scanned instead...), this would not normally detect changes in hash value on the Destination (unless you Rescan), only changes on the Source. This is fine to detect updates of 'live files' on your Source, but not so good if you are worried about possible corruption of backup copies on the Destination. If you are worried about that, use a normal profile so both sides are always scanned (but there is then no Fast Backup > no database > no stored hash value, so all you can compare the Destination copy's hash value with is the Source file's hash value).

For the question about hash-value versus content, see the Knowledge Base.
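As a rough illustration of the two ideas (not of how SBP implements either), the sketch below checks a copy against its source both ways. A byte-by-byte comparison is exact; equal MD5 values mean the contents are the same with very high probability, although collisions are theoretically possible. The function names are made up for the example.

```python
import filecmp
import hashlib


def same_bytes(left, right):
    """Byte-by-byte comparison; shallow=False forces the contents to be read."""
    return filecmp.cmp(left, right, shallow=False)


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_md5(left, right):
    """Hash comparison: equal digests are taken to mean equal contents."""
    return md5_of(left) == md5_of(right)
```

Either function answers the question "is the destination file the same as the source file?" without looking at LastModified stamps.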

Finally, please note it is usual protocol on this forum to post in English, including SB screenshots. It is quite easy to switch the UI to English via Preferences menu (main UI) > Languages (Einstellungen > Sprache) before you take your screenshots ;)

Biff
Enthusiastic
Posts: 19
Joined: Sun May 22, 2016 10:55 pm

Re: How could one do a compare by content?

Postby Biff » Sun Jun 18, 2017 5:47 pm

Thank you very much!

SBP will not display (list) to you (in the Differences window, or anywhere else) files on both sides (with same path\name) that are found to be the same by comparison of [whatever properties you ask SBP to compare]. Unless none of the files match, you cannot get a list/display of *all* files in two file-sets, only of the files which do not match. Files which do match already (having compared whatever properties you specified) are not listed/displayed, and there is no way to force them to be displayed. This is because files (one per side) which match already = 'nothing for generic backup/sync software to do' - no matter what decision/action rules (that SB supports) are in force. So, SBP Ignores such files (and does not list them in a way you can see during the run).


These files are identical by content, so why are they shown when doing a compare by content? In other words, why are they not considered equal and thus hidden? (I guess I missed a setting somewhere, but which one?)
Image

To use your own phrasing, you cannot get an "equal / unequal display" - 'equal' files are never listed/displayed during a run, because SB is not designed to perform any action in respect of 'equal files'.

If files are equal by content, I mostly want to remove one of them, so there is no way to do that with SB, if I understand correctly. That seems like a huge disadvantage; it is very strange that such an option is missing.

To answer another question, no, you cannot set SBP to hash-compare files of different names (which by default would mean 'each path\file compared with every other path\file') because there is no reason why generic backup/sync software would want to do that to fulfill its advertised functionality. The very first thing it looks for (in respect of a file in a certain path on side-X) is a file of the same name in the same path on side-Y - and if a matching path\name is not found, the file it did find is treated as an orphan (and no properties are compared, because there is no other same-path\name file's properties to compare), and there is no functionality to search the rest of the profile's scope looking for a match somewhere else.

Yes, so it compares like usual sync / backup programs do.

TBH, it sounds like you maybe need dedicated de-duplication software.

Yes, indeed (in addition). I have a lot of such programs, but there does not appear to be any program that compares folders (not files) by size and / or content.

Finally, please note it is usual protocol on this forum to post in English, including SB screenshots. It is quite easy to switch the UI to English via Preferences menu (main UI) > Languages (Einstellungen > Sprache) before you take your screenshots

Unfortunately it is not easy for me. I tried a few times, but the German language keeps being used:
Image

Very many thanks again.

cliffhanger
Expert
Posts: 765
Joined: Tue May 31, 2011 5:59 pm

Re: How could one do a compare by content?

Postby cliffhanger » Mon Jun 19, 2017 11:20 am

In your Differences window snippet screenshot, three of the visible files (1/2/4) exist on one side only. This represents the biggest 'difference' there can be (file exists on one side only, no corresponding file on other side). You cannot tell SB to ignore this difference, but as I already mentioned, you can use the Filter menu in the Differences window to prevent such one-side-only files being displayed. Note that suppressing their display does not mean the profile will not process them (suppression of display only helps 'visualising' the other rows). It depends what rules you set up for such a situation (in Decisions-Files & elsewhere) what action the profile will propose (if displayed) or silently take (if not displayed & you click Continue). The usual actions are to copy from [exists] to [empty], to delete from [exists], or to Skip, depending on your profile type, but you may have custom Actions specified for some profiles (or, you may want to do so, if you regularly plan to suppress their display)

The highlighted file in your screenshot (file #3) has a size of 6 bytes on the left side and 0 bytes on the right. Thus, the contents of these two files must be different (one contains 6 bytes more than the other), and therefore so must their hash values. If you highlight (select) such a row and check the attributes pane lower left in Differences window, the Hash boxes on one or both sides will be populated (if the profile is set to hash files, else they will be empty)
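A quick way to see this outside SBP, assuming MD5 as the hash: an empty (0-byte) file always hashes to the same well-known value, and any 6-byte content hashes to something else. The 6-byte string below is just a stand-in, not the actual content of the file in the screenshot.

```python
import hashlib

empty = hashlib.md5(b"").hexdigest()            # hash of a 0-byte file
six_bytes = hashlib.md5(b"abcdef").hexdigest()  # stand-in for a 6-byte file

print(empty)               # d41d8cd98f00b204e9800998ecf8427e
print(empty != six_bytes)  # True
```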

Re the language issue, I have no idea what is going on there. All I can suggest is maybe try closing & re-opening Pro with the English setting in place and see if it now 'latches' (though I certainly don't have to do that - languages change here in mid-flight, so to speak). If there is still an issue, maybe re-install the latest version of Pro 'over the top' (do not uninstall or you may lose your profiles & settings). If still a problem, you could raise a support ticket with 2BS using the support link on the outside of this forum section; there's nothing more I can suggest from here.

Biff
Enthusiastic
Posts: 19
Joined: Sun May 22, 2016 10:55 pm

Re: How could one do a compare by content?

Postby Biff » Mon Jun 19, 2017 6:31 pm

In your Differences window snippet screenshot, three of the visible files (1/2/4) exist on one side only. This represents the biggest 'difference' there can be (file exists on one side only, no corresponding file on other side). You cannot tell SB to ignore this difference, but as I already mentioned, you can use the Filter menu in the Differences window to prevent such one-side-only files being displayed. Note that suppressing their display does not mean the profile will not process them (suppression of display only helps 'visualising' the other rows). It depends what rules you set up for such a situation (in Decisions-Files & elsewhere) what action the profile will propose (if displayed) or silently take (if not displayed & you click Continue). The usual actions are to copy from [exists] to [empty], to delete from [exists], or to Skip, depending on your profile type, but you may have custom Actions specified for some profiles (or, you may want to do so, if you regularly plan to suppress their display)

Alright, I understand.

The highlighted file in your screenshot (file #3) has a size of 6 bytes on the left side and 0 bytes on the right. Thus, the contents of these two files must be different (one contains 6 bytes more than the other), and therefore so must their hash values. If you highlight (select) such a row and check the attributes pane lower left in Differences window, the Hash boxes on one or both sides will be populated (if the profile is set to hash files, else they will be empty)


Yes, indeed, they must be different (by size) but they aren't by content:
Image

Image

Ah, no, it is the same file that is compared, obviously the entry is wrong: %1 %2. How could I correct it?

Image

If there is still an issue, maybe re-install the latest version of Pro 'over the top' (do not uninstall or you may lose your profiles & settings)

I assume I could export / back them up before uninstalling without losing any settings, is that right?

If files are equal by content, I mostly want to remove one of them, so there is no way to do that with SB, if I understand correctly. That seems like a huge disadvantage; it is very strange that such an option is missing.

So, this is not possible with SB?

Many thanks

cliffhanger
Expert
Posts: 765
Joined: Tue May 31, 2011 5:59 pm

Re: How could one do a compare by content?

Postby cliffhanger » Tue Jun 20, 2017 7:34 am

Re the comparison program(s), I am not familiar with using such third-party software (nor its requirements), but it does state in the SB Help on Comparison Programs that you may need to add extra switches (I guess to specifically tell the comparison software what the next parameter/value being passed is/represents) and/or you may need to wrap %1 & %2 in double-quotes as shown in the screenshot in the SB Help. Check your documentation for WinMerge - it may require separators such as commas (,) rather than (or, in addition to) spaces. Sorry, I can't offer further help with this than these general comments.
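One common reason the double-quotes matter is that a path containing spaces gets split into several arguments. The sketch below illustrates the general idea only: the `expand` helper, the example paths and the `WinMergeU.exe` command template are assumptions made for the example, and Python's `shlex` follows POSIX splitting rules, which differ in detail from Windows command-line parsing.

```python
import shlex


def expand(template, left, right):
    """Substitute %1 and %2 in a command template (hypothetical helper)."""
    return template.replace("%1", left).replace("%2", right)


left = "C:/My Files/report.txt"
right = "D:/Backup Copy/report.txt"

unquoted = shlex.split(expand("WinMergeU.exe %1 %2", left, right))
quoted = shlex.split(expand('WinMergeU.exe "%1" "%2"', left, right))

print(len(unquoted))  # 5: each path was split at its space
print(quoted[1:])     # ['C:/My Files/report.txt', 'D:/Backup Copy/report.txt']
```

With the quoted template the comparison program receives exactly two path arguments, one per side.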


Uninstall/reinstall might be necessary, but I suggest trying it 'over the top' first, which will overwrite all 'factory' components with new ones but ignore all files that are not present in the installer (e.g. your profiles). Export/Uninstall/Re-install/Import has some hoops you may need to jump through in respect of grouped profiles (if you have any). This is not an exhaustive list of import issues - search the KB on 'import' for others.


SyncBackPro is not designed to look for copies and delete one or more of them, it is designed to make extra copies (backups) of only-one-copy/side files (and/or, to make already-one-copy-each-side-but-do-not-match files identical, by replacing one of them with a copy of the other). There is no automated action you can specify in respect of 'file exists on both sides and is identical' (that is, to 'correct' this situation), because 'both sides the same' is what SB is designed to achieve, and it will therefore skip/ignore such pairs of files (and will not display them). By definition, files listed in the Differences window are different, so although you can right-click a row and elect to delete the file on one side or the other (or, both), the row you are clicking on represents files which are not the same (do not match - or, exist on one side only) in the first place...

Biff
Enthusiastic
Posts: 19
Joined: Sun May 22, 2016 10:55 pm

Re: How could one do a compare by content?

Postby Biff » Tue Jun 20, 2017 7:47 am

Thank you very much!

Re the comparison program(s), I am not familiar with using such third-party software (nor its requirements), but it does state in the SB Help on Comparison Programs that you may need to add extra switches (I guess to specifically tell the comparison software what the next parameter/value being passed is/represents) and/or you may need to wrap %1 & %2 in double-quotes as shown in the screenshot in the SB Help. Check your documentation for WinMerge - it may require separators such as commas (,) rather than (or, in addition to) spaces. Sorry, I can't offer further help with this than these general comments

No, no, no reason to say sorry. And the quotes helped, now it works.

Uninstall/reinstall might be necessary, but I suggest trying it 'over the top' first, which will overwrite all 'factory' components with new ones but ignore all files that are not present in the installer (e.g. your profiles). Export/Uninstall/Re-install/Import has some hoops you may need to jump through in respect of grouped profiles (if you have any). This is not an exhaustive list of import issues - search the KB on 'import' for others.

Alright, I understand.

SyncBackPro is not designed to look for copies and delete one or more of them, it is designed to make extra copies (backups) of only-one-copy/side files (and/or, to make already-one-copy-each-side-but-do-not-match files identical, by replacing one of them with a copy of the other). There is no automated action you can specify in respect of 'file exists on both sides and is identical' (that is, to 'correct' this situation), because 'both sides the same' is what SB is designed to achieve, and it will therefore skip/ignore such pairs of files (and will not display them). By definition, files listed in the Differences window are different, so although you can right-click a row and elect to delete the file on one side or the other (or, both), the row you are clicking on represents files which are not the same (do not match - or, exist on one side only) in the first place...

Alright, I understand.

Many thanks again, great help!