What is Hashing? Benefits, Types & How It Protects Your Files

Author: Conrad Chung, 2BrightSparks Pte. Ltd.

If you are transferring a file from one computer to another, how do you ensure that the copied file is the same as the source? One method you could use is called hashing, which is essentially a process that translates information about the file into a code. Two hash values (of the original file and its copy) can be compared to ensure the files are equal.

Why Hashing Matters: Real-World Scenarios

Hashing isn't just a technical concept: it solves real-world problems that impact security, data protection, and system reliability. Here are practical examples of why hashing is essential:

Verifying Software Downloads

When you download software, especially installers like .exe or .zip files, it's critical to ensure the file hasn't been tampered with. Many reputable developers (like 2BrightSparks) publish a hash value (usually MD5 or SHA-256) alongside the download link. By calculating your file's hash and comparing it to the official value, you can confirm its authenticity and protect yourself from malware or corrupted files.
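
As a rough illustration of that check, the Python sketch below hashes a downloaded file with SHA-256 and compares the result with a published value. The function names and the chunk size are illustrative choices, not part of any 2BrightSparks tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 digest of a file as a lowercase hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large installer does not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare a file's SHA-256 with the value published next to the download."""
    # Published hashes are often upper case or padded with whitespace.
    expected = published_hex.strip().lower()
    return sha256_of_file(path) == expected
```

If the function returns False, the safest assumption is that the download is corrupted or is not the file the developer published, and it should be downloaded again from the official source.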

Ensuring Backup Integrity

Hashing is vital for verifying that a backup is a precise, uncorrupted copy of your data. In products like SyncBackPro or SyncBackSE, hashing ensures your backups match the source files exactly, providing peace of mind that your important data remains intact and usable after transfer or storage.

Detecting "Bit Rot"

Over time, hard drives and SSDs can suffer from silent data corruption, known as bit rot. These subtle errors often go unnoticed until files fail to open or show visible damage. Regularly hashing files and comparing the results helps detect these hidden problems early, preserving data integrity before it's too late.
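
One possible way to automate this, sketched in Python under the assumption that the files are not supposed to change between checks: record a hash for every file in a folder, then compare a later scan against that record. The names build_manifest and find_changes are hypothetical helpers written for this example.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of one file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changes(old_manifest, new_manifest):
    """Return the paths present in both manifests whose hashes differ."""
    return sorted(
        path
        for path in old_manifest.keys() & new_manifest.keys()
        if old_manifest[path] != new_manifest[path]
    )
```

A changed hash only shows that the content is different. Whether that is bit rot or a legitimate edit has to be judged separately, for example by checking whether the file's modification time also changed.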

Password Security (Briefly)

Hashing also protects your online accounts. Most websites store only a hashed version of your password, not the password itself. When you log in, the system hashes your input and compares it to the stored hash. Even if a website's database is compromised, attackers cannot easily reverse-engineer your original password thanks to secure hash functions.
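
The login flow described above can be sketched in Python as follows. This is a minimal illustration, not how any particular website works: it assumes a random per-user salt and the PBKDF2 key-derivation function from the standard library, and the iteration count is an arbitrary example value.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor only (an assumption of this sketch)


def hash_password(password, salt=None):
    """Return (salt, derived_hash); a fresh random salt is made if none is given."""
    salt = os.urandom(16) if salt is None else salt
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password, salt, stored_hash):
    """Hash the login attempt with the stored salt and compare the results."""
    _, candidate = hash_password(password, salt)
    # compare_digest takes the same time whether or not the values match.
    return hmac.compare_digest(candidate, stored_hash)
```

Only the salt and the derived hash would be stored. A deliberately slow function is used here because general-purpose hashes such as SHA-256 are fast to compute, and speed helps an attacker who is guessing passwords.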

What is Hashing?

Hashing uses an algorithm to calculate a fixed-size bit string from a file. A file is essentially a sequence of blocks of data, and hashing converts that data into a short, fixed-length code or key that represents the original content. The hash value can be considered the distilled summary of everything within that file.

A good hashing algorithm exhibits a property called the avalanche effect: the hash output changes significantly, or entirely, when even a single bit or byte of data within the file is changed. A hash function that does not do this has poor randomization, which makes it easier for attackers to break.
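
The avalanche effect is easy to observe. The Python sketch below flips a single bit of a short example input (chosen arbitrarily) and counts how many of the 256 output bits of SHA-256 change. For a well-behaved hash the count should be somewhere near half.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"The quick brown fox"
# Flip the lowest bit of the first byte: "T" (0x54) becomes "U" (0x55).
modified = bytes([original[0] ^ 0x01]) + original[1:]

hash_a = hashlib.sha256(original).digest()
hash_b = hashlib.sha256(modified).digest()

changed = differing_bits(hash_a, hash_b)
print(f"{changed} of {len(hash_a) * 8} output bits changed")
```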

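A hash is usually displayed as a hexadecimal string whose length depends on the algorithm (for example, 32 characters for MD5 and 64 for SHA-256). Hashing is also a one-way process: with a secure hash function, it is not practical to work backwards from the hash value to the original data.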

A good hash algorithm should make it extremely unlikely that two different inputs produce the same hash value. When that does happen, it is known as a hash collision. Because a hash has a fixed length while inputs can be of any size, collisions can never be ruled out entirely, so a hash algorithm can only be considered good and acceptable if the chance of a collision is very low.
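
To give a feel for what "very low" means, the Python sketch below uses the standard birthday-bound approximation to estimate the chance that at least two items in a set share a hash value. It assumes hash outputs are uniformly random, so it describes accidental collisions only and says nothing about deliberately constructed ones.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p = 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = num_items * (num_items - 1) / (2 ** (hash_bits + 1))
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-exponent)


# Chance of at least one accidental collision among one million files.
for name, bits in [("CRC32", 32), ("MD5", 128), ("SHA-256", 256)]:
    print(f"{name}: {collision_probability(1_000_000, bits):.3g}")
```

With a 32-bit hash, a collision becomes more likely than not after only about 77,000 items, which is one reason longer hashes are preferred when many files are compared.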

What are the benefits of Hashing?

One main use of hashing is to compare two files for equality. Without opening two document files to compare them word-for-word, the calculated hash values of these files will allow the owner to know immediately if they are different.

Hashing is also used to verify the integrity of a file after it has been transferred from one place to another, typically in a file backup program like SyncBack. To ensure the transferred file is not corrupted, a user can compare the hash values of the two files. If they are the same, then the transferred file is an identical copy.

In some situations, an encrypted file may be designed to never change the file size nor the last modification date and time (for example, virtual drive container files). In such cases, it would be impossible to tell at a glance if two similar files are different or not, but the hash values would easily tell these files apart if they are different.

Types of Hashing

There are many different hash algorithms, such as RipeMD, Tiger, xxhash and more, but the ones most commonly used for file integrity checks are MD5, SHA-2 and CRC32.

Hash Algorithm	Speed	Security Level	Best Use Case
CRC32	Very Fast	Very Low	Quick error checking in file transfers (e.g., Zip files). Not for security.
MD5	Fast	Low (Vulnerable to collisions)	Basic file integrity checks where speed is more important than security.
SHA-1	Moderate	Medium (Considered insecure)	Legacy applications. Should be avoided in favor of SHA-2.
SHA-256	Slower	High (Industry Standard)	Verifying software downloads, digital signatures, and any security-sensitive application.
SHA-512	Slower (can outperform SHA-256 on 64-bit systems)	Very High	Applications requiring the highest level of collision resistance.

MD5 – An MD5 hash function encodes a string of information into a 128-bit fingerprint. MD5 is often used as a checksum to verify data integrity. However, MD5 is an old algorithm that is known to suffer from extensive hash collision vulnerabilities, although it's still one of the most widely used algorithms in the world.

SHA-2 – SHA-2, developed by the National Security Agency (NSA), is a cryptographic hash function. SHA-2 includes significant changes from its predecessor, SHA-1. The SHA-2 family consists of six hash functions with digests (hash values) that are 224, 256, 384 or 512 bits: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256.

CRC32 – A cyclic redundancy check (CRC) is an error-detecting code often used for detection of accidental changes to data. Encoding the same data string using CRC32 will always result in the same hash output, thus CRC32 is sometimes used as a hash algorithm for file integrity checks. These days, CRC32 is rarely used outside of Zip files and FTP servers.
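
For comparison, the Python sketch below runs the same bytes through CRC32, MD5 and SHA-256 using the standard library and prints each result with its size in bits. The input is an arbitrary test string, not data from this article.

```python
import hashlib
import zlib

data = b"123456789"

crc32_hex = format(zlib.crc32(data), "08x")    # 32 bits -> 8 hex characters
md5_hex = hashlib.md5(data).hexdigest()        # 128 bits -> 32 hex characters
sha256_hex = hashlib.sha256(data).hexdigest()  # 256 bits -> 64 hex characters

for name, value in [("CRC32", crc32_hex), ("MD5", md5_hex), ("SHA-256", sha256_hex)]:
    print(f"{name:8} {len(value) * 4:4d} bits  {value}")
```

Running it twice gives identical output, which is the property that makes all three usable for integrity checks. The difference lies in how hard it is to find another input with the same value.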

How to Check a File's Hash: A Step-by-Step Guide

HashOnClick, a completely free utility from 2BrightSparks, can be used to calculate hash values for files. We have a tutorial that explains how to download HashOnClick and use it to hash files.

Download HashOnClick and install it.
Right-click on any file you want to check.
Select Calculate Hash Value from the context menu.
Choose the hash method (MD5, SHA-1, etc.).
Compare the generated hash to the original hash provided by the source.

Using Built-in Windows Tools (PowerShell)

If you prefer to use PowerShell, you can hash a file with the Get-FileHash cmdlet:

Get-FileHash C:\path\to\your\file.ext -Algorithm SHA256

How long does hashing take for large files?

Hashing time depends on several factors: the hash algorithm used, your hardware specifications, and file size. Here are typical performance ranges:

Hash Algorithm	Speed by Algorithm (approximate rates on modern hardware)
CRC32	1-4 GB/second - extremely fast for basic error checking
MD5	400-800 MB/second - good balance of speed and basic integrity
SHA-256	200-400 MB/second - industry standard, moderate speed
SHA-512	300-500 MB/second - surprisingly faster than SHA-256 on 64-bit systems

To help, here are some real-world examples:

Hash Algorithm	1 GB file	10 GB backup	100 GB archive
CRC32	~1 second	~10 seconds	~2 minutes
MD5	~2-3 seconds	~20-30 seconds	~3-4 minutes
SHA-256	~3-5 seconds	~45-60 seconds	~6-8 minutes
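
Estimates like these come from dividing the file size by the hashing rate. The Python sketch below does that arithmetic for a 100 GB archive, assuming 1 GB = 1000 MB, a constant rate, and the approximate throughput ranges quoted earlier in this section. Real timings also depend on how fast the storage can deliver the data.

```python
def estimated_seconds(size_gb: float, rate_mb_per_s: float) -> float:
    """Time to hash a file, assuming 1 GB = 1000 MB and a constant rate."""
    return size_gb * 1000 / rate_mb_per_s


# Approximate throughput ranges in MB/second, as quoted in this section.
RATES = {"CRC32": (1000, 4000), "MD5": (400, 800), "SHA-256": (200, 400)}

for name, (slow, fast) in RATES.items():
    best = estimated_seconds(100, fast)
    worst = estimated_seconds(100, slow)
    print(f"{name}: 100 GB in roughly {best:.0f}-{worst:.0f} seconds")
```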

Several factors affect hashing performance:

Factor	Impact
Storage type	SSDs are much faster than traditional hard drives
CPU	Modern processors with hardware SHA acceleration (such as the Intel SHA Extensions) significantly boost SHA performance
System load	Other running programs can slow hash calculations
File fragmentation	Highly fragmented files take longer to process (when using hard drives)

With SyncBackPro/SE, hash verification typically adds 10-30% to backup time, but this investment ensures your data integrity. You can adjust hash settings in the Compare Options to balance speed vs. thoroughness based on your specific needs. For most backup scenarios, the extra time is worthwhile insurance against data corruption.

Using Hashing in 2BrightSparks software

In the backup and synchronization software, SyncBackPro/SE/Free, hashing is mainly used for file integrity checks during or after a data transfer session. For example, a SyncBack user can turn on file verification (Modify profile > Copy/Delete) or use a slower but more reliable method (Modify profile > Compare Options) which will enable hashing to check for file differences. Different hash functions will be used depending on which option is used and where the backup files are located.

Other areas where hashing is used are resuming in FTP, data integrity checking, scripting and occasionally authentication in Cloud profiles (scripting and cloud backup are supported by SyncBackPro only).

2BrightSparks also has a utility program called HashOnClick that can be used to ensure files are identical. HashOnClick is part of OnClick Utilities, which is completely free. Several types of hashing algorithms are available in HashOnClick.

You can visit our Downloads page if you wish to check out any 2BrightSparks software.

Summary

In conclusion, hashing is a useful tool to verify files are copied correctly between two resources. It can also be used to check if files are identical without opening and comparing them. To find out more about hashing, please visit the Wikipedia page.

