ASCII and JIS X 0208.
The latter is a 2-byte character set including Kanji, Hiragana, Katakana
and some other symbols and characters.
We use two Japanese encoding methods on this server.
One is EUC-JP (Extended Unix Code) and
the other is ISO-2022-JP.
EUC-JP is an ISO-2022 compliant 8-bit encoding
in which ASCII is initially designated to G0 and
JIS X 0208-1983 (or JIS X 0208-1990) to G1,
without explicit announcement.
G2 and G3 are never used.
A sample file encoded in EUC-JP is here.
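As a rough illustration of the G0/G1 split described above, the following Python sketch encodes a few characters as EUC-JP and prints their bytes. It assumes Python's standard "euc_jp" codec; the helper name describe_euc_jp is made up for this example. ASCII characters (G0) should come out as a single byte below 0x80, and JIS X 0208 characters (G1) as two bytes with the high bit set.

```python
def describe_euc_jp(text: str) -> list[tuple[str, str]]:
    """Return (character, hex of its EUC-JP bytes) for each character."""
    # "euc_jp" is the codec name in Python's standard library.
    return [(ch, ch.encode("euc_jp").hex()) for ch in text]


# One ASCII letter (G0) followed by two Kanji (G1).
for ch, hex_bytes in describe_euc_jp("A日本"):
    print(ch, hex_bytes)
```

The high bit is what lets a decoder tell the two sets apart without any escape sequences.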
ISO-2022-JP, which is registered as a MIME charset name,
is an encoding widely used in the Japanese Internet community
for electronic mail and network news messages.
It is an ISO-2022 compliant 7-bit encoding
that uses only the G0 code set.
ASCII is initially designated to G0.
To switch character sets, designate the desired set to G0
with an escape sequence, for example:
    ESC ( B    ASCII
    ESC ( J    JIS X 0201-1976 ("Roman" set)
    ESC $ @    JIS X 0208-1978
    ESC $ B    JIS X 0208-1983
A sample file is here.
For more details about ISO-2022-JP, see RFC 1468.
Although I think ISO-2022-JP is better than EUC-JP,
it causes some problems in HTML.
Shift-JIS
(also called MS-Kanji Code).
Unfortunately, Shift-JIS is widely used with MS-DOS, Windows
and Macintosh; but I think Shift-JIS is rubbish
and should not be used anymore!!!
We never use this encoding on this server
except for this example.
________________________________________________________________________
TAKADA Toshihiro