What is Unicode?

Author: Kostas Tsakiridis, 2BrightSparks Pte. Ltd.

Unicode is a computing standard that provides a common encoding and representation for the characters and symbols used in most of the world's written languages.

Before Unicode

Computers store and exchange only numbers. We see text on our screens, but inside the computer's circuits everything is encoded as numbers in binary form, with each letter or symbol represented by a number. The mapping of letters and symbols to numbers is defined by a character set: a predefined list of characters and their assigned numbers that the computer hardware and software recognize.
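As a small illustration of this mapping, the following Python sketch lists the number behind each character of a short string, in decimal and in binary. The function name char_numbers is hypothetical and used only for this example; ord and format are Python built-ins.

```python
# Illustrative sketch: show the number stored for each character of a string.
def char_numbers(text):
    """Return (character, decimal number, 8-digit binary form) per character."""
    return [(ch, ord(ch), format(ord(ch), "08b")) for ch in text]

for ch, number, bits in char_numbers("Hi!"):
    print(ch, number, bits)
# H 72 01001000
# i 105 01101001
# ! 33 00100001
```

The built-in chr performs the reverse lookup, so chr(72) gives back "H".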

One of the most widely adopted character sets, ASCII, uses the numbers 0 through 127 to represent the unaccented English letters, digits and punctuation, as well as special control characters. The European ISO character sets (the ISO 8859 family) keep the ASCII assignments and use numbers above 127 for the additional characters needed by European languages.
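The difference between ASCII and a European ISO character set can be sketched in Python, whose standard codecs include both. The helper name encode_or_none is hypothetical, and ISO 8859-1 (Western European) is chosen here only as one example of the ISO family.

```python
# Illustrative sketch: a character can only be stored if the character set
# assigns it a number.
def encode_or_none(text, charset):
    """Return the bytes for text in charset, or None if a character is missing."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None

print(encode_or_none("cafe", "ascii"))       # b'cafe': plain letters fit in ASCII
print(encode_or_none("café", "ascii"))       # None: 'é' has no number in ASCII
print(encode_or_none("café", "iso-8859-1"))  # b'caf\xe9': 'é' is number 233 here
```

Text that uses only ASCII characters produces the same bytes in both character sets, which is why the ISO sets stayed compatible with existing English text.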

Until recently, compatibility issues among computer systems using different character sets were very common. A typical example is an FTP server whose file names are written in a different character set from the one used by the user’s computer running an FTP client application. The server may be using an East Asian character set (e.g. Japanese) while the user’s computer uses a European character set (e.g. Central European). The server’s file listing will then be unreadable on the user’s screen, because the two character sets assign different letters and symbols to the same numbers, so each side interprets the same numbers as different characters.
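The FTP scenario can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the server is assumed to use Shift_JIS (one common Japanese character set) and the client ISO 8859-2 (a Central European one); the function name misread and the file name are made up for the example.

```python
# Illustrative sketch: the same bytes read with two different character sets.
def misread(text, server_charset, client_charset):
    """Encode text as the server would, then decode it as the client would."""
    raw = text.encode(server_charset)
    # errors="replace" substitutes a marker for any byte the client's
    # character set cannot interpret, instead of raising an exception.
    return raw.decode(client_charset, errors="replace")

name = "日本語.txt"  # hypothetical file name stored on the server
garbled = misread(name, "shift_jis", "iso8859_2")
print(repr(garbled))                            # unreadable characters, then '.txt'
print(misread(name, "shift_jis", "shift_jis"))  # readable when both sides agree
```

Only the ".txt" part survives, because both character sets assign the same numbers to the basic ASCII characters.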

Web pages had similar problems when they were written in a character set that the web browser did not detect automatically: users had to tell the browser which encoding to use before it could render the page properly.

Enter Unicode

These decades-long incompatibility problems led to the development and introduction of the Unicode Standard. It changed all that by defining a unique number (called a code point) for every character and symbol it covers, regardless of computer, operating system, application or language.
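To make the idea of one number per character concrete, this Python sketch prints the Unicode number and official name of a few characters from different scripts, using the standard unicodedata module. The function name describe is hypothetical; the "U+" hexadecimal notation is the conventional way of writing Unicode numbers.

```python
import unicodedata

# Illustrative sketch: every character has one Unicode number and one name.
def describe(ch):
    """Return the character's Unicode number (U+XXXX) and its official name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in "AΩ€日":
    print(ch, describe(ch))
# A U+0041 LATIN CAPITAL LETTER A
# Ω U+03A9 GREEK CAPITAL LETTER OMEGA
# € U+20AC EURO SIGN
# 日 U+65E5 CJK UNIFIED IDEOGRAPH-65E5
```

These numbers are the same on every system that follows the standard, which is what removes the ambiguity between character sets.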

Unicode has been adopted by all modern operating systems and software providers (including 2BrightSparks and its SyncBack applications). Data can now be transported across many different platforms, devices and applications without corruption and without the need for translation tables. In addition, a user interface can be displayed in multiple languages on the same device, and a document can be typed in a word processor in more than one script, since the device is now capable of displaying multiple languages.
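A rough way to see this in practice is to store text in several scripts and read it back. The sketch below assumes UTF-8 and UTF-16, two encodings defined by the Unicode Standard that are not discussed in this article; the function name round_trip and the sample string are made up for the example.

```python
# Illustrative sketch: text in several scripts survives a round trip through
# a Unicode encoding, with no per-language translation table involved.
def round_trip(text, encoding="utf-8"):
    """Encode text to bytes and decode it back with the same encoding."""
    return text.encode(encoding).decode(encoding)

mixed = "English, Ελληνικά, Русский, 日本語"
print(round_trip(mixed) == mixed)            # True
print(round_trip(mixed, "utf-16") == mixed)  # True
```

A single pre-Unicode character set would typically cover only one or two of these scripts, so the same string could not be stored in it at all.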

SyncBack applications support Unicode, which allows file names to be preserved correctly between source and destination during file transfers, provided that the source and destination locations also support Unicode. Unicode also allows SyncBack users to switch the application’s user interface between languages instantly on any modern Windows installation, without installing separate language packs.


© 2018 2BrightSparks Pte. Ltd.
