International standard ISO/IEC 8859-8, Information technology – 8-bit single-byte coded graphic character sets – Part 8: Latin/Hebrew alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. ISO/IEC 8859-8:1999 is its second and current revision; the first edition, ISO/IEC 8859-8:1988, was published in 1988. It is informally referred to as Latin/Hebrew.
ISO/IEC 8859-8 covers all the Hebrew letters, but no Hebrew vowel signs. IBM assigned code page 916 (CCSIDs 916 and 5012) to it.[2][3][4] This character set was also adopted by Israeli Standard SI1311:2002, with some extensions.
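The coverage described above can be checked with a short sketch. It assumes that Python's built-in `iso8859_8` codec is an acceptable stand-in for the standard's mapping table; the helper name `can_encode` is hypothetical and used only for illustration.

```python
# Sketch: Hebrew letters are encodable in ISO/IEC 8859-8, vowel points are not.
# Assumption: Python's "iso8859_8" codec mirrors the standard's mapping.

# shin, lamed, vav, final mem, written with escapes to avoid bidi
# rendering confusion in the source text
word = "\u05e9\u05dc\u05d5\u05dd"
encoded = word.encode("iso8859_8")  # one byte per letter, in the 0xE0-0xFA range


def can_encode(text: str) -> bool:
    """Return True if every character of `text` has a code in ISO/IEC 8859-8."""
    try:
        text.encode("iso8859_8")
        return True
    except UnicodeEncodeError:
        return False


has_letters = can_encode("\u05d0")       # HEBREW LETTER ALEF
has_vowel_points = can_encode("\u05b0")  # HEBREW POINT SHEVA
```

Text that carries vowel points therefore needs another encoding, such as Windows-1255 or UTF-8.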
ISO-8859-8 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The text is (usually) in logical order, so bidi processing is required for display. Nominally, ISO-8859-8 (code page 28598) is for "visual order", and ISO-8859-8-I (code page 38598) is for logical order. In practice, however, and as required for XML documents,[citation needed] ISO-8859-8 usually also denotes logical-order text. The WHATWG Encoding Standard used by HTML5 treats ISO-8859-8 and ISO-8859-8-I as distinct encodings with the same mapping, because the ISO-8859-8 label influences the layout direction; it notes that this no longer applies to ISO-8859-6 (Arabic), only to ISO-8859-8.[5] There is also ISO-8859-8-E, which nominally requires directionality to be specified explicitly with special control characters; this variant is unused in practice.
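The difference between visual and logical order can be illustrated with a deliberately naive sketch. It assumes a line that contains only Hebrew letters and spaces, in which case the visual-order bytes are simply the logical-order bytes reversed. The function name `visual_to_logical` is hypothetical, and real mixed-direction text needs a full bidirectional algorithm rather than a plain reversal.

```python
# Naive sketch: convert a visual-order ISO-8859-8 line to logical order.
# Assumption: the line holds only Hebrew letters and spaces (no digits,
# Latin text, or punctuation), so reversing the characters is sufficient.

def visual_to_logical(line: bytes) -> str:
    """Decode a pure-Hebrew visual-order line and return it in logical order."""
    return line.decode("iso8859_8")[::-1]


# Visual order stores the leftmost glyph first: final mem, vav, lamed, shin.
visual_bytes = b"\xed\xe5\xec\xf9"
logical_text = visual_to_logical(visual_bytes)  # shin, lamed, vav, final mem
```

Both labels use the same byte-to-character mapping; only the interpretation of the character order differs.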
The Microsoft Windows code page for Hebrew, Windows-1255, is mostly an extension of ISO/IEC 8859-8 without C1 controls, except that it omits the double underscore and replaces the generic currency sign (¤) with the sheqel sign (₪). It adds support for vowel points as combining characters, and some additional punctuation.
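A few of these differences can be observed directly. The following sketch assumes that Python's built-in `iso8859_8` and `cp1255` codecs reflect the two code pages accurately; the variable names are illustrative only.

```python
# Sketch: compare selected code points of ISO/IEC 8859-8 and Windows-1255.
# Assumption: Python's "iso8859_8" and "cp1255" codecs match the code pages.

CURRENCY_BYTE = bytes([0xA4])
iso_currency = CURRENCY_BYTE.decode("iso8859_8")  # generic currency sign
win_currency = CURRENCY_BYTE.decode("cp1255")     # sheqel sign

# The double underscore (double low line) sits at 0xDF in ISO/IEC 8859-8.
iso_underscore = bytes([0xDF]).decode("iso8859_8")

# The Hebrew letters occupy the same bytes, 0xE0-0xFA, in both code pages.
letter_bytes = bytes(range(0xE0, 0xFB))
same_letters = letter_bytes.decode("iso8859_8") == letter_bytes.decode("cp1255")

# Windows-1255 can encode a vowel point such as U+05B0 HEBREW POINT SHEVA.
sheva_byte = "\u05b0".encode("cp1255")
```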
Over a decade after the publication of that standard, Unicode is preferred, at least for the Internet[6] (meaning UTF-8, the dominant encoding for web pages). ISO-8859-8 is used by less than 0.1% of websites.[7]
Code page layout
FD is the left-to-right mark (U+200E) and FE is the right-to-left mark (U+200F), as specified in the newer edition, ISO/IEC 8859-8:1999.
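The two directional marks can be inspected with a short sketch, again assuming that Python's built-in `iso8859_8` codec follows the 1999 edition of the mapping:

```python
# Sketch: decode the directional marks at 0xFD and 0xFE.
# Assumption: Python's "iso8859_8" codec follows ISO/IEC 8859-8:1999.
import unicodedata

lrm = bytes([0xFD]).decode("iso8859_8")
rlm = bytes([0xFE]).decode("iso8859_8")

# Look up the Unicode character names of the decoded marks.
lrm_name = unicodedata.name(lrm)
rlm_name = unicodedata.name(rlm)
```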
2002 Israeli Standard extensions
Israeli Standard SI1311:2002 matches ISO/IEC 8859-8:1999 except for a number of additional character allocations for the euro sign, the new shekel sign and more advanced explicit bidirectional formatting.[12]
Absent from ISO/IEC 8859-8:1999, added in SI1311:2002.
References
- [1] Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12.
- [2] "Code page 916 information document". Archived from the original on 2017-02-16.
- [3] "CCSID 916 information document". Archived from the original on 2014-11-29.
- [4] "CCSID 5012 information document". Archived from the original on 2016-03-27.
- [5] van Kesteren, Anne. "9. Legacy single-byte encodings". Encoding Standard. WHATWG. Note: ISO-8859-8 and ISO-8859-8-I are distinct encoding names, because ISO-8859-8 has influence on the layout direction. And although historically this might have been the case for ISO-8859-6 and "ISO-8859-6-I" as well, that is no longer true.
- [6] John, Nicholas A. (2013). "The Construction of the Multilingual Internet: Unicode, Hebrew, and Globalization". Journal of Computer-Mediated Communication. 18 (3): 321–338. doi:10.1111/jcc4.12015. ISSN 1083-6101. Background: the problem of Hebrew and the Internet.
- [7] "Usage Statistics of ISO-8859-8 for Websites, January 2019". w3techs.com. Retrieved 2019-01-17.
- [8] Code Page CPGID 00916 (PDF), IBM.
- [9] Code Page CPGID 00916 (txt), IBM.
- [10] International Components for Unicode (ICU), ibm-916_P100-1995.ucm, 2002-12-03.
- [11] International Components for Unicode (ICU), ibm-5012_P100-1999.ucm, 2002-12-03.
- [12] Standards Institution of Israel. ISO-IR-234: Latin/Hebrew character set for 8-bit codes (PDF). ITSCJ/IPSJ.