In our previous Java 9 series article we looked at the Garbage Collector in Java 9. This time we will be looking at Unicode support.
What is Unicode?
Java has always supported Unicode, so it can handle scripts with very large character sets, such as Korean, Japanese or Chinese, alongside languages whose alphabets fit comfortably within 256 characters (most European languages, for example). The Unicode Standard continues to be developed and extended to cover more and more languages, and each Java release moves to a newer version of it. Unicode 8.0 is the current release and there is now a draft specification for Unicode 9.0.
Why is it important?
Using Unicode allows you to write programs for an international audience. Let’s say, for example, you wanted to use Java to read a file written in the Multani script (historically used to write Saraiki, one of the languages spoken in Pakistan). Java 8 predates the addition of this script to Unicode, so it would not recognise the characters and the file would not be handled correctly. Unicode 8 (and therefore Java 9) supports this script.
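As a rough illustration of the difference, the sketch below asks the running JVM what it knows about a single Multani code point. The class name `ScriptProbe` and the helper `scriptNameOf` are made up for this example; the calls it relies on (`Character.isDefined` and `Character.UnicodeScript.of`) are standard JDK methods. The script is compared by name rather than through an enum constant so that the same source also compiles on Java 8.

```java
public class ScriptProbe {

    // Hypothetical helper: name of the Unicode script that the running JVM
    // assigns to the given code point ("UNKNOWN" if the JVM has no data for it).
    static String scriptNameOf(int codePoint) {
        return Character.UnicodeScript.of(codePoint).name();
    }

    public static void main(String[] args) {
        // U+11280 MULTANI LETTER A, a character added in Unicode 8.0.
        int multaniLetterA = 0x11280;

        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("Defined:      " + Character.isDefined(multaniLetterA));
        System.out.println("Script:       " + scriptNameOf(multaniLetterA));
    }
}
```

On a Java 9 or later runtime this should report the code point as defined with the script MULTANI, while a Java 8 runtime should treat it as undefined with the script UNKNOWN. Whether the character is actually drawn on screen is a separate matter that depends on the fonts installed.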
Using the newest version of Unicode allows Java to keep up to date with the internationalisation improvements the industry is seeing.
What changes will we see in Java 9?
Java 8 is based on Unicode 6.2. Unicode 7.0 added 2,834 characters and 23 scripts, and Unicode 8.0 added a further 7,716 characters and 6 scripts. JDK 9 therefore supports more than 10,000 additional characters and 29 additional scripts compared with JDK 8.
Unicode also covers many emoji and historical scripts. New scripts available in Unicode 8 include Sutton SignWriting (a writing system for sign languages), Hatran (used to write an Aramaic dialect in the ancient city of Hatra, in present-day Iraq) and Old Hungarian.
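One practical point when working with these newer scripts is that their characters lie outside the Basic Multilingual Plane, so each one takes two `char` values in a Java `String`. The sketch below, using an invented class name `CodePointDemo` and helper `codePointLength`, shows the difference between counting UTF-16 code units and counting code points. It relies only on standard JDK methods (`Character.toChars`, `String.codePointCount`, `String.codePoints`).

```java
public class CodePointDemo {

    // Hypothetical helper: number of Unicode code points in the string,
    // as opposed to the number of UTF-16 code units that length() returns.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+10C80 OLD HUNGARIAN CAPITAL LETTER A and U+108E0 HATRAN LETTER ALEPH,
        // both added in Unicode 8.0 and both outside the Basic Multilingual Plane.
        String text = new String(Character.toChars(0x10C80))
                + new String(Character.toChars(0x108E0));

        System.out.println("UTF-16 code units: " + text.length());
        System.out.println("Code points:       " + codePointLength(text));

        // Iterate by code point and print the script the running JVM reports.
        text.codePoints().forEach(cp ->
                System.out.printf("U+%05X %s%n", cp, Character.UnicodeScript.of(cp)));
    }
}
```

Iterating with `codePoints()` rather than `charAt` keeps each character intact. The script names printed depend on the Unicode version of the runtime, so an older JVM is expected to report UNKNOWN for both.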
Are there any characters or languages you would like to see supported in future versions of Java?