Anyone who downloads or uploads files over the Internet, or sends or receives email attachments, has most likely encountered files in a compressed format. This article covers how compression works, its advantages and disadvantages, and the types of compression available.
Compression is the process of encoding data more efficiently to achieve a reduction in file size. One type of compression is referred to as lossless compression. This means the compressed file is restored exactly to its original state, with no loss of data, when it is decompressed. This is essential for general-purpose data compression, because a file such as a document or a program would be corrupted and unusable if any data were lost. The other category, “lossy” compression, is not covered in this article; it discards some data and is often used for multimedia files such as music and images.
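The lossless property described above can be checked directly. The following is a minimal sketch using Python's standard `zlib` module (a DEFLATE implementation); the sample text is an arbitrary assumption chosen only because it is repetitive.

```python
import zlib

# Repetitive sample data (an arbitrary choice for illustration).
original = b"The quick brown fox jumps over the lazy dog. " * 200

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print("original size:  ", len(original))
print("compressed size:", len(compressed))

# Lossless: the restored bytes are identical to the original bytes.
print("identical after round trip:", restored == original)
```

If even a single byte differed after decompression, the comparison on the last line would be false, which is exactly the situation lossless compression rules out.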
Lossless compression algorithms use statistical modelling techniques to reduce repetitive information in a file. Some of the methods may include removal of spacing characters, representing a string of repeated characters with a single character, or replacing recurring characters with smaller bit sequences.

Compression of files offers many advantages. When compressed, the quantity of bits used to store the information is reduced. Files that are smaller in size will result in shorter transmission times when they are transferred on the Internet. Compressed files also take up less storage space. File compression can also zip up several small files into a single file for more convenient email transmission.
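One of the lossless methods mentioned above, representing a string of repeated characters with a single character, is commonly known as run-length encoding. The sketch below is a simplified illustration that stores each run as a (character, count) pair; the function names are hypothetical and this is not how any particular product implements it.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Expand (char, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)

encoded = rle_encode("aaaabbbcca")
print(encoded)
print(rle_decode(encoded))
```

Run-length encoding only helps when the data actually contains long runs; on text with few repeated characters the list of pairs can end up larger than the input.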
Compression is a computationally intensive process, so it can be time-consuming, especially when a large number of files is involved. Some compression algorithms also offer varying levels of compression, with the higher levels achieving a smaller file size but taking even longer to compress. Compression also consumes valuable system resources, which can sometimes result in “Out of Memory” errors. With so many compression algorithm variants, a user downloading a compressed file may not have the program needed to decompress it.
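The trade-off between compression level, output size and time can be measured rather than guessed. The following sketch uses Python's `zlib` module, whose levels run from 0 (no compression) to 9 (maximum); the log-like sample data and the `measure` helper are assumptions made for illustration, and the timings will vary from machine to machine.

```python
import time
import zlib

def measure(data, level):
    """Return (compressed size in bytes, seconds taken) for one zlib level."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(packed), elapsed

# Synthetic log-like data (an assumption; real files will behave differently).
sample = b"".join(
    b"2024-01-01 12:00:%02d INFO request %d served\n" % (i % 60, i)
    for i in range(20000)
)

results = {level: measure(sample, level) for level in (0, 1, 6, 9)}

print("input size:", len(sample))
for level, (size, seconds) in results.items():
    print(f"level {level}: {size} bytes in {seconds:.4f} s")
```

Comparing the printed sizes and times for a representative sample of your own data is usually the most reliable way to decide whether a higher level is worth the extra processing time.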
Some transmission protocols include optional built-in compression (e.g. FTP has a MODE-Z compression option). Compressing data with a separate process before transmission may negate some of the advantages of such an option, because the data handed to the protocol is probably no longer compressible to any useful degree, and the protocol may waste time trying and failing to compress it further. It is quite possible that ‘external’ compression beforehand is more efficient these days, and that a compression option in the protocol is best left unused. However, it is also possible that the built-in compression achieves faster overall results at the cost of larger compressed files, or vice versa. Experimentation is the best way to find out which applies, and to decide which factor matters most to the user.

In 1949, Shannon-Fano coding was devised by Claude Shannon and Robert Fano to assign code words based on block probabilities. This technique produces fairly efficient variable-length codes, but not always optimal ones. In 1951, David Huffman found an optimally efficient method that improved on Shannon-Fano coding by using a frequency-sorted binary tree. Huffman coding is often used as a backend to other compression methods today.
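To illustrate the frequency-sorted binary tree behind Huffman coding, here is a minimal sketch in Python. It repeatedly merges the two least frequent entries, prefixing "0" to the codes on one side and "1" to the codes on the other, so frequent symbols end up with short codes. The function names are hypothetical, and real compressors store the codes in a more compact form than this dictionary.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a {symbol: bit string} table by merging the two rarest nodes."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tie-breaker, {symbol: code so far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(text, codes):
    """Concatenate the code of every symbol in the text."""
    return "".join(codes[ch] for ch in text)

def decode(bits, codes):
    """Read bits until they match a code; this works because no code is a prefix of another."""
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)

codes = huffman_codes("abracadabra")
bits = encode("abracadabra", codes)
print(codes)
print(len(bits), "bits instead of", 8 * len("abracadabra"))
```

In "abracadabra" the letter "a" appears five times out of eleven, so it receives the shortest code, while the rarer letters receive longer ones.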
In 1977 and 1978, Abraham Lempel and Jacob Ziv published the ground-breaking LZ77 and LZ78 algorithms, which rapidly gained popularity. Some commonly used algorithms today, such as DEFLATE, LZMA and LZX, are derived from LZ77. Due to patent issues with LZW, an LZ78 derivative published in 1984, UNIX developers began to adopt open source algorithms such as the DEFLATE-based gzip and the Burrows-Wheeler Transform-based bzip2 formats, which also managed to achieve significantly higher compression than those based on LZ78.
There are several types of compression available. In the following section, we shall review the types of compression offered by the backup and synchronization software SyncBackFree, SyncBackSE and SyncBackPro.
Different file types respond very differently to compression. Plain text files, log files, and source code typically compress very well, while files that are already compressed — such as JPEG images, MP3 audio, and video files — will barely reduce in size no matter which algorithm or level is used. Attempting to compress these files wastes CPU time and memory for little or no benefit.
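The difference between compressible and already-compressed data is easy to demonstrate. In the sketch below, repetitive log-like text stands in for files that compress well, and pseudo-random bytes stand in for already-compressed content such as JPEG or MP3 data (an assumption: well-compressed data looks statistically close to random). The `ratio` helper is hypothetical.

```python
import random
import zlib

def ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)

# Stand-in for a text or log file.
text_like = b"timestamp=2024-01-01 level=INFO message=backup completed\n" * 2000

# Stand-in for already-compressed content; seeded so runs are repeatable.
rng = random.Random(0)
random_like = bytes(rng.getrandbits(8) for _ in range(len(text_like)))

print(f"text-like data:   {ratio(text_like):.3f}")
print(f"random-like data: {ratio(random_like):.3f}")
```

The text-like data shrinks to a small fraction of its size, while the random-like data does not shrink at all, even at the highest level.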
SyncBack V12 introduces Intelligent Compression, which lets you assign different compression levels based on file type. You can configure file types that compress well (e.g. .TXT, .LOG, .CSV) to use high compression, while already-compressed file types (e.g. .JPG, .MP4, .ZIP) use low compression. This gives you the best trade-off between compression ratio and performance, ensuring that processing time is spent where it will have the greatest impact on reducing backup size. See our companion article on compression in SyncBack for more details.
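The general idea of choosing a compression level by file type can be sketched as a simple lookup. This is a hypothetical illustration only and not SyncBack's implementation or configuration format: the table, the default level and the function names are all assumptions, and `zlib` levels are used as a stand-in for "high" and "low" compression.

```python
import zlib
from pathlib import PurePath

HIGH, LOW, DEFAULT_LEVEL = 9, 1, 6  # assumed zlib levels, not SyncBack settings

# Hypothetical mapping based on the example extensions given above.
LEVEL_BY_EXTENSION = {
    ".txt": HIGH, ".log": HIGH, ".csv": HIGH,
    ".jpg": LOW, ".mp4": LOW, ".zip": LOW,
}

def level_for(filename):
    """Pick a compression level from the file extension (case-insensitive)."""
    extension = PurePath(filename).suffix.lower()
    return LEVEL_BY_EXTENSION.get(extension, DEFAULT_LEVEL)

def compress_named(filename, data):
    """Compress data using the level chosen for this file type."""
    return zlib.compress(data, level_for(filename))

for name in ("REPORT.TXT", "holiday.jpg", "notes.docx"):
    print(name, "-> level", level_for(name))
```

A lookup like this spends processing time on files where it pays off and avoids heavy work on files that will barely shrink.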
In conclusion, data compression is very important in the computing world and it is commonly used by many applications, including the suite of SyncBack programs. With techniques ranging from classic DEFLATE to modern ZStandard, and V12's Intelligent Compression for file-type-aware optimization, users have a wide range of options to balance compression ratio, speed, and resource usage for their needs.
© 2003-2026 2BrightSparks Pte. Ltd.