Cartographica has extensive support for Unicode and for importing and exporting files in other character encodings. However, the one problem that it can't solve is figuring out what character set the data is encoded in.
For some file formats, there are markers that indicate the character set. Unfortunately, these are often not used. For example, it is common to find ESRI Shapefiles that are marked as Local OEM (LDID 57 for those who are into the dbf file format) but whose data is actually encoded in just about any format.
Some formats, such as KML, are nicely self-describing, but most are legacy formats that don't do a good job of indicating how they are encoded.
The "cpg" file itself contains only the encoding specifier. No spaces before, no return afterwards, no nothing. You will need to put the characters, exactly as they are listed below, into the "cpg" file and save it out. Then, the next time that Cartographica loads your file, it will come in with the right character set. Note that I said the next time it loads. If you have already started modifying the layer's styles, there is no need to delete the layer and re-load it later. You can save your Cartographica document, add the "cpg" file to the directory that has your files in it, and then load the Cartographica Map Set again, and it should load with the right encoding.
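As a rough illustration of the "no spaces before, no return afterwards" rule, here is a minimal Python sketch that writes such a file. The function name `write_cpg` is made up for this example, and it assumes the "cpg" file sits next to the Shapefile and shares its base name; adjust if your files are laid out differently.

```python
from pathlib import Path


def write_cpg(shapefile_path, specifier):
    """Write a .cpg file beside the shapefile (hypothetical helper).

    Assumption: the .cpg file uses the same base name as the .shp file.
    """
    cpg_path = Path(shapefile_path).with_suffix(".cpg")
    # Strip stray whitespace from both ends, but keep the required
    # inner space (for example in "ISO 88591").
    text = specifier.strip()
    # Write raw bytes so that no newline or byte-order mark is added.
    cpg_path.write_bytes(text.encode("ascii"))
    return cpg_path
```

For example, `write_cpg("roads.shp", "ISO 88591")` (with `roads.shp` standing in for your own file) would leave a `roads.cpg` containing exactly those nine characters and nothing else.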
Here is the table of the supported character encodings:
| Character Encoding | Text to put in the cpg file | Notes |
|---|---|---|
| Unicode UTF-8 | UTF-8 | |
| ISO Latin 1 (8859-1) | ISO 88591† | |
| ISO Latin 2 (8859-2) | ISO 88592† | |
| ISO Latin 3 (8859-3) | ISO 88593† | |
| ISO Latin 4 (8859-4) | ISO 88594† | |
| ISO Cyrillic (8859-5) | ISO 88595† | |
| ISO Arabic (8859-6) | ISO 88596† | |
| ISO Greek (8859-7) | ISO 88597† | |
| ISO Hebrew (8859-8) | ISO 88598† | |
| ISO Turkish (8859-9) | ISO 88599† | |
| ISO Nordic (8859-10) | ISO 885910† | |
| ISO Latin 7 (8859-13) | ISO 885913† | |
| ISO Latin 9 (8859-15) | ISO 885915† | |
| ISO 2022JP Japanese | ISO EUC† | |
| Chinese Big 5 | ANSI Big5† | |
| Japanese SJIS | ANSI SJIS† | |
| ANSI numeric | ANSI #† | |
| Windows Codepage (OEM) | OEM #† | |
† The space is necessary in the cpg file: between ISO, ANSI, or OEM and whatever follows it.
\# The # character should be replaced with the numeric character set identifier.
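To make the two footnotes concrete, here is a small Python sketch that builds the "ANSI #" and "OEM #" entries from a number. The helper name `numeric_specifier` is invented for this example, and the code page numbers used with it are only illustrations, not a list of what Cartographica accepts.

```python
def numeric_specifier(family, code_page):
    """Build an "ANSI #" or "OEM #" entry (hypothetical helper)."""
    if family not in ("ANSI", "OEM"):
        raise ValueError("family must be 'ANSI' or 'OEM'")
    # Exactly one space between the family and the number,
    # and the # placeholder replaced by the number itself.
    return f"{family} {int(code_page)}"
```

Under these assumptions, `numeric_specifier("ANSI", 1252)` gives the text `ANSI 1252`, which is what would go into the cpg file.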
Internally, Cartographica always uses UTF-8. If you import data and then modify it (or otherwise end up changing the file from a reference to an included file), the data will be stored in UTF-8 inside the Cartographica Map Set document.