5.2.3 Coding Hungarian accents: Text formatting languages




This article is from the Hungarian FAQ, by Zoli Fekete fekete@bc.edu with numerous contributions by others.

The text formatting languages listed below, beyond their powerful text
formatting capabilities, also include the specification of [almost] all
the accented characters. These languages give an alternative way of
dealing with accents in 7-bit ASCII, especially if the software that
can display, print or convert these representations is available.
[Unlike notations in 5.2.1, the "raw" files of these languages are not
intended to be read by ordinary users.]

5.2.3.1 [La]TeX.

TeX (pronounced [tech]; the 'X' denotes the Greek letter 'chi'),
invented by D. E. Knuth, and LaTeX, the macro collection based on it,
are today's most popular text formatting languages for document
creation and DTP.

To continue with the same example,

\"{O}t h\H{u}t\H{o}h\'{a}zb\'{o}l k\'{e}rt\"{u}nk sz\'{\i}nh\'{u}st

\'{a}rv\'{\i}zt\H{u}r\H{o} t\"{u}k\"{o}rf\'{u}r\'{o}g\'{e}p

\"{O}t sz\'{e}p sz\H{u}zl\'{a}ny \H{o}r\"{u}lt \'{\i}r\'{o}t ny\'{u}z

This is meant to be printed with TeX or previewed as a dvi file. The
notation is wholly unambiguous and can be automatically converted
to/from several other formats (see 5.2.6). Also check the babel system
for LaTeX with its Hungarian-specific option, available from the FTP
sites kth.se or goya.dit.upm.es.
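
As an illustration of how such a conversion might be automated, here is
a minimal Python sketch that maps the Hungarian accented letters (and
their capitals) to the TeX notation used in the example. The names
TEX_ACCENTS and to_tex are made up for this illustration; they do not
come from any of the converters mentioned in 5.2.6.

```python
# Hypothetical helper: rewrite Hungarian accented letters in the
# 7-bit TeX accent notation shown in the example above.
TEX_ACCENTS = {
    "á": r"\'{a}", "é": r"\'{e}", "í": r"\'{\i}", "ó": r"\'{o}", "ú": r"\'{u}",
    "ö": r'\"{o}', "ü": r'\"{u}', "ő": r"\H{o}", "ű": r"\H{u}",
    "Á": r"\'{A}", "É": r"\'{E}", "Í": r"\'{I}", "Ó": r"\'{O}", "Ú": r"\'{U}",
    "Ö": r'\"{O}', "Ü": r'\"{U}', "Ő": r"\H{O}", "Ű": r"\H{U}",
}


def to_tex(text: str) -> str:
    """Replace each accented letter; leave every other character as is."""
    return "".join(TEX_ACCENTS.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(to_tex("árvíztűrő tükörfúrógép"))
```

The lowercase i is mapped to the dotless \i so that the acute accent
does not collide with the dot. Characters that have a special meaning
in TeX (such as % or &) are not escaped by this sketch.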

5.2.3.2 HTML (HyperText Markup Language)

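Unfortunately, the HTML 2.0 standard still does not contain notation
for Hungarumlaut (long umlaut, double acute). We use tilde or
circumflex instead. The preferred notation is o with tilde (&otilde;,
displayed as õ) and u with circumflex (&ucirc;, displayed as û). In
the example above,

&Ouml;t h&ucirc;t&otilde;h&aacute;zb&oacute;l k&eacute;rt&uuml;nk
sz&iacute;nh&uacute;st

&aacute;rv&iacute;zt&ucirc;r&otilde;
t&uuml;k&ouml;rf&uacute;r&oacute;g&eacute;p

&Ouml;t sz&eacute;p sz&ucirc;zl&aacute;ny &otilde;r&uuml;lt
&iacute;r&oacute;t ny&uacute;z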
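
The substitution described above can be sketched in Python as follows.
This is only an illustration under stated assumptions: the table
HTML_ENTITIES and the function to_html2 are hypothetical names, and the
double-acute letters are deliberately replaced by the tilde and
circumflex entities, as this section recommends.

```python
from html import escape

# Hypothetical table: Latin-1 entity names for Hungarian letters.
# The double-acute letters have no entity of their own here, so they
# are replaced by o-tilde and u-circumflex as described in the text.
HTML_ENTITIES = {
    "á": "&aacute;", "é": "&eacute;", "í": "&iacute;",
    "ó": "&oacute;", "ú": "&uacute;",
    "ö": "&ouml;", "ü": "&uuml;",
    "ő": "&otilde;", "ű": "&ucirc;",   # substitutes, not exact glyphs
    "Á": "&Aacute;", "É": "&Eacute;", "Í": "&Iacute;",
    "Ó": "&Oacute;", "Ú": "&Uacute;",
    "Ö": "&Ouml;", "Ü": "&Uuml;",
    "Ő": "&Otilde;", "Ű": "&Ucirc;",   # substitutes, not exact glyphs
}


def to_html2(text: str) -> str:
    """Escape &, < and > first, then replace the accented letters."""
    safe = escape(text, quote=False)
    return "".join(HTML_ENTITIES.get(ch, ch) for ch in safe)


if __name__ == "__main__":
    print(to_html2("Öt hűtőházból kértünk színhúst"))
```

The markup characters are escaped before the accented letters are
replaced, so the ampersands of the generated entities are not escaped a
second time. The mapping is lossy: after conversion, a genuine o with
tilde can no longer be told apart from a substituted long ö.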

5.2.3.3 RTF (Rich Text Format)

This standard is widespread among Microsoft word processors. For
non-ASCII characters it uses the following coding:

\'XX

where XX is the two-digit hexadecimal code of the given character in
ISO 8859-2 (or in PC-852 for Word for DOS).
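
A minimal Python sketch of this coding follows. The function name
to_rtf_escapes is hypothetical, and the default codec simply follows
the description above; a real RTF file declares its own code page in
its header, so the codec to use is an assumption that should be checked
against the file at hand.

```python
def to_rtf_escapes(text: str, codec: str = "iso8859_2") -> str:
    """Write non-ASCII characters as \\'XX, with XX the hex byte in `codec`.

    Assumes a single-byte codec such as "iso8859_2" or "cp852".
    Raises UnicodeEncodeError for characters the codec cannot represent.
    """
    out = []
    for ch in text:
        if ch in "\\{}":
            # These three characters are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            byte = ch.encode(codec)[0]
            out.append("\\'%02x" % byte)
    return "".join(out)


if __name__ == "__main__":
    print(to_rtf_escapes("hűtő"))            # ISO 8859-2 codes
    print(to_rtf_escapes("hűtő", "cp852"))   # PC-852 codes
```

The result is only the character run; it still has to be placed inside
a complete RTF document before a word processor will open it.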

5.2.3.4 Adobe PostScript

PostScript is a universal standard for describing any kind of graphics,
including fonts, but it is aimed at producing the final (typically
printed) copy of documents and not at word processing per se. For a
starter document see <http://www.adobe.com/PS/PS-QA.html> or
<ftp://wilma.cs.brown.edu/pub/comp.lang.postscript/FAQ.txt> or
<ftp://rtfm.mit.edu/pub/usenet-by-group/comp.answers/postscript/faq/part1-4>.
If one has the right accented font sets then, in theory, the output is
transferable between different machines, but in practice we often run
into hurdles.

 
