This article is from the Nordic countries FAQ, by Antti Lahelma and Johan Olofsson, with numerous contributions by others.
If you have been a reader of this group for a while, you may have
noticed that discussion about characters and their representations
occasionally accounts for quite a bit of bandwidth. It often does not
take more than a question about the issue from a new reader, or
someone posting an article with an IBM character set, to get a new
thread going on the issue. Some want to keep 7-bit ISO-646 (be aware
that they may call it "true ASCII", although strictly speaking, it is
not), since 7-bit codes will always get through with any setup; others
want ISO-Latin-1 since it is more universal; and yet others promote
digraphs as the greatest common denominator between the two.
Some pros and cons for each set:
Character set:       Advantages:               Disadvantages:
__________________________________________________________________

Digraphs             * Requires 7-bit only     * Ambiguous
                                                 ("oe" or "o-slash"?)
                                               * Non-optimal compromise

LaTeX                * Non-ambiguous 7-bit     * Made for typesetting;
                       representation.           somewhat cryptic for
                                                 regular text.
                                               * Non-optimal compromise

ISO-646-SE,          * Only 7-bit "true"       * Different standards
ISO-646-DK             representation.           for each language
<[\]{|}>             * No data loss even       * Getting harder to
                       with old hardware/        find font support
                       software/setup.           (Dying out).
                                               * Shadows the brace,
                                                 square bracket, pipe,
                                                 and backslash chars.

ISO Latin 1          * Utilizes all 8 bits     * Requires 8-bit clean
(ISO-8859-1)           in a byte; yet avoids     connection; older
<ÐÞÆØÅÄÖðþæøåäö..>     dangerous codes.          systems may cause
                     * Universal for all         data loss.
                       Western European        * May require some
                       languages.                setup.
                     * Supported by ISO and    * In case of stripping,
                       MIME; true subset of      becomes "FXEDVfxedv";
                       Unicode.                  difficult to read.

IBM CodePages        * Uses all 256 codes;     * Uses all 256 codes;
Macintosh set          more characters           incl. dangerous ones.
<Unacceptable>       * Often used in PC        * Incompatible with
                       environments such as      the "de-facto" 8-bit
                       BBS'es.                   standard ISO-8859-1
__________________________________________________________________
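
Two of the effects listed in the table can be reproduced in a few lines
of Python: the "stripping" of ISO-8859-1 text on a 7-bit connection, and
the way the national ISO-646 variants shadow the bracket, backslash,
pipe and brace positions. This is only a rough sketch. The function
names are made up for illustration, and the two mapping tables cover
just the six code positions shown in the table, not the complete
national standards.

```python
# Illustrative sketch only; names are hypothetical, not from the FAQ.

def strip_high_bit(text: str) -> str:
    """Simulate a 7-bit channel: encode as ISO-8859-1, clear bit 7 of
    every byte, and read the result back as plain ASCII."""
    raw = text.encode("iso-8859-1")
    stripped = bytes(b & 0x7F for b in raw)
    return stripped.decode("ascii")


# Assumption: only the six positions [ \ ] { | } are mapped here.
# The real national variants differ from ASCII in a few more places.
ISO_646_SE = str.maketrans("[\\]{|}", "ÄÖÅäöå")  # Swedish/Finnish variant
ISO_646_DK = str.maketrans("[\\]{|}", "ÆØÅæøå")  # Danish variant


def read_as_variant(seven_bit_text: str, table: dict) -> str:
    """Show how 7-bit text looks on a terminal set up for a national
    ISO-646 variant instead of plain ASCII."""
    return seven_bit_text.translate(table)


if __name__ == "__main__":
    print(strip_high_bit("ÆØÅÄÖæøåäö"))              # FXEDVfxedv
    print(read_as_variant("K|benhavn", ISO_646_DK))  # København
    print(read_as_variant("G|teborg", ISO_646_SE))   # Göteborg
    print(read_as_variant("a[i]", ISO_646_SE))       # aÄiÅ
```

The last call shows the "shadowing" disadvantage from the other side:
program source or anything else that really means brackets and braces
turns into national letters on such a setup.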
 