This directory and its subdirectories contain .ump mapping files for
converting various character encodings to and from Unicode.

To generate .ump files, use a command like "makemapfile.pl -encoding
encodingname -mapfile textmapfile", where encodingname becomes the filename
of the two new .ump files and textmapfile is a plain text file containing a
tab-separated list of the form:
0x8167	      0x201C
where the first column is the hexadecimal value of the encoded character
and the second is the hexadecimal value of its Unicode equivalent.
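
As an illustration of the text map format described above, here is a minimal
Python sketch that reads such lines into a dictionary. It is not part of
makemapfile.pl; the function name parse_text_map is hypothetical, and the
sketch assumes every non-blank line holds exactly two hexadecimal columns
separated by whitespace.

```python
def parse_text_map(lines):
    """Parse lines such as '0x8167<TAB>0x201C' into {encoded: unicode}."""
    mapping = {}
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()  # splits on tabs and any padding spaces
        if len(fields) != 2:
            raise ValueError(
                f"line {number}: expected 2 columns, got {len(fields)}"
            )
        # int(..., 16) accepts the leading '0x' prefix
        encoded, unicode_value = (int(field, 16) for field in fields)
        mapping[encoded] = unicode_value
    return mapping


sample = ["0x8167\t      0x201C"]
mapping = parse_text_map(sample)
print(hex(mapping[0x8167]))
```

A check like this can be useful for catching malformed lines in a text map
file before passing it to makemapfile.pl.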



The following .ump files were generated from their corresponding Microsoft
codepages. These codepages do, in some cases, differ very slightly from the
standards they were based on, but we've used them anyway because they are so
widely used on the web.

* gbk.ump: Simplified Chinese - generated from Microsoft's codepage 936
* shiftjis.ump: Japanese - generated from Microsoft's codepage 932
* uhc.ump: UHC Korean - generated from Microsoft's codepage 949
* big5.ump: Traditional Chinese - generated from Microsoft's codepage 950
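
To spot-check an entry against one of the codepages listed above, a sketch
like the following can print a line in the text map format using Python's
built-in codecs. This is an assumption-laden illustration: the CODEPAGES
table and the text_map_line function are hypothetical, the sketch only
handles two-byte characters, and Python's codec tables are not guaranteed to
match the .ump files in every entry.

```python
# Hypothetical pairing of the .ump files with Python's codec names
# for the same Microsoft codepages.
CODEPAGES = {
    "gbk.ump": "cp936",
    "shiftjis.ump": "cp932",
    "uhc.ump": "cp949",
    "big5.ump": "cp950",
}


def text_map_line(codec, encoded):
    """Format one two-byte character as 'encoded<TAB>unicode' in hex."""
    raw = encoded.to_bytes(2, "big")  # e.g. 0x8167 -> b'\x81\x67'
    char = raw.decode(codec)          # raises UnicodeDecodeError if unmapped
    return f"0x{encoded:04X}\t0x{ord(char):04X}"


print(text_map_line(CODEPAGES["shiftjis.ump"], 0x8167))
```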