ASCII and JIS X 0208.
The latter is a 2-byte character set including Kanji, Hiragana, Katakana,
and some other symbols and characters.
We use two Japanese encoding methods on this server.
One is EUC-JP (Extended Unix Code) and the other is ISO-2022-JP.
EUC-JP is an ISO-2022 compliant 8-bit encoding in which ASCII is
initially designated to G0 and JIS X 0208-1983 (or JIS X 0208-1990)
to G1, without explicit announcement.
G2 and G3 are never used.
A sample file encoded in EUC-JP is here.
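As a rough illustration of the G0/G1 layout described above, the following Python sketch encodes a short string with the standard "euc_jp" codec. The sample string is an assumption chosen for illustration and is not the sample file from this server. It shows that the ASCII byte is left unchanged, while each byte of a JIS X 0208 character has its high bit set, so clearing that bit recovers the 7-bit JIS code.

```python
# Illustration only: the sample text is an assumption, not this server's sample file.
text = "A漢"

euc = text.encode("euc_jp")   # Python's built-in EUC-JP codec
ascii_part = euc[:1]          # G0 (ASCII): one byte, high bit clear
kanji_part = euc[1:]          # G1 (JIS X 0208): two bytes, high bit set

# Clearing the high bit of each G1 byte gives the 7-bit JIS X 0208 code.
jis_code = bytes(b & 0x7F for b in kanji_part)

print(euc.hex(" "))       # 41 b4 c1
print(jis_code.hex(" "))  # 34 41
```

Because every byte of a 2-byte character is 0x80 or above, it can never be mistaken for an ASCII byte in the same stream.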
ISO-2022-JP, which is registered as a MIME charset name,
is a widely used encoding in Japanese IP communities
for electronic mail and network news messages.
It is an ISO-2022 compliant 7-bit encoding that uses only the G0 code set.
ASCII is initially designated to G0.
To switch character sets, you designate the new set to G0
with an escape sequence, for example:

    ESC ( B    ASCII
    ESC ( J    JIS X 0201-1976 ("Roman" set)
    ESC $ @    JIS X 0208-1978
    ESC $ B    JIS X 0208-1983

A sample file is here. For more detail about ISO-2022-JP, see RFC-1468.
Although I think ISO-2022-JP is better than EUC-JP,
ISO-2022-JP causes some problems in HTML.
Shift-JIS (also called MS-Kanji Code).
Unfortunately, Shift-JIS is widely used with MS-DOS, Windows,
and Macintosh, but I think Shift-JIS is rubbish
and should not be used anymore!!!
We never use this encoding on this server, except for this example.
________________________________________________________________________
TAKADA Toshihiro