Wednesday, July 22, 2009

Sorting International Names

My current employer has data on companies from almost every country in the world: lots of lists of companies from the four corners of the planet. When listing these companies in a select box, it would be nice to have them sorted alphabetically. The problem is that the standard approach to sorting, relying on String's natural order through the Comparable interface (or a Comparator built on compareTo), returns something like this:

A, B, C, E, d
because String's compareTo compares the numeric char values, and capital E (69) is a smaller number than lower case d (100), so it comes first in a sort. It gets even more complicated when you add names with accented characters from places like Sweden, like å. The char value of å (229) is much higher than those of the standard A-Z and a-z, so it comes after both the upper and lower case letters.

Step up to the plate, java.text.Collator. This little class understands your pain. But more importantly, it understands language: it performs locale-sensitive string comparison. What that means is the sorted list would be

A, B, C, d, E
A, å, B, C, d, E
...if you want it to be. It can sort upper and lower case with the same strength, among other cool features.
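
Here is a minimal sketch of the difference, not a definitive implementation. The class and method names (CollatorSortDemo, sortNaturally, sortWithCollator) are made up for illustration, and Locale.ENGLISH is an assumption chosen so that å lands next to A as in the list above. The "if you want it to be" part matters: a Swedish locale would be expected to put å after z, since that is where it sits in the Swedish alphabet.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; the names here are illustrative only.
class CollatorSortDemo {

    // Natural String order: compares raw char values,
    // so all capitals sort before all lower case letters.
    static List<String> sortNaturally(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    // Locale-sensitive order: Collator is itself a Comparator,
    // so it can be handed straight to Collections.sort.
    static List<String> sortWithCollator(List<String> names, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        // "\u00e5" is å, written as an escape to avoid source-encoding surprises.
        List<String> names = Arrays.asList("d", "\u00e5", "E", "A", "C", "B");

        // Order: A, B, C, E, d, å
        System.out.println(sortNaturally(names));

        // Order: A, å, B, C, d, E (assuming an English collator)
        System.out.println(sortWithCollator(names, Locale.ENGLISH));

        // Lowering the strength to PRIMARY makes case (and accent)
        // differences compare as equal.
        Collator relaxed = Collator.getInstance(Locale.ENGLISH);
        relaxed.setStrength(Collator.PRIMARY);
        System.out.println(relaxed.compare("a", "A") == 0); // true
    }
}
```

The default strength keeps case and accent differences as tie-breakers, which is why d still sorts before E while A and å stay distinct. Which strength fits depends on whether your list should treat "Acme" and "acme" as the same entry.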
