The Hebrew alphabet has 22 characters, as shown in the following table. Each letter is considered to have a numerical value which is used in writing numbers and for numerological interpretations of words.
Sequence order | Numerical value | Character | Unicode code point | Unicode character name |
---|---|---|---|---|
1 | 1 | א | U+05D0 | HEBREW LETTER ALEF |
2 | 2 | ב | U+05D1 | HEBREW LETTER BET |
3 | 3 | ג | U+05D2 | HEBREW LETTER GIMEL |
4 | 4 | ד | U+05D3 | HEBREW LETTER DALET |
5 | 5 | ה | U+05D4 | HEBREW LETTER HE |
6 | 6 | ו | U+05D5 | HEBREW LETTER VAV |
7 | 7 | ז | U+05D6 | HEBREW LETTER ZAYIN |
8 | 8 | ח | U+05D7 | HEBREW LETTER HET |
9 | 9 | ט | U+05D8 | HEBREW LETTER TET |
10 | 10 | י | U+05D9 | HEBREW LETTER YOD |
11 | 20 | כ | U+05DB | HEBREW LETTER KAF |
12 | 30 | ל | U+05DC | HEBREW LETTER LAMED |
13 | 40 | מ | U+05DE | HEBREW LETTER MEM |
14 | 50 | נ | U+05E0 | HEBREW LETTER NUN |
15 | 60 | ס | U+05E1 | HEBREW LETTER SAMEKH |
16 | 70 | ע | U+05E2 | HEBREW LETTER AYIN |
17 | 80 | פ | U+05E4 | HEBREW LETTER PE |
18 | 90 | צ | U+05E6 | HEBREW LETTER TSADI |
19 | 100 | ק | U+05E7 | HEBREW LETTER QOF |
20 | 200 | ר | U+05E8 | HEBREW LETTER RESH |
21 | 300 | ש | U+05E9 | HEBREW LETTER SHIN |
22 | 400 | ת | U+05EA | HEBREW LETTER TAV |
Five letters have alternative glyphs when they occur at the end of words. These are encoded in Unicode as separate code points before the respective base characters, as follows:
Numerical value | Character | Unicode code point | Unicode character name |
---|---|---|---|
500 | ך | U+05DA | HEBREW LETTER FINAL KAF |
600 | ם | U+05DD | HEBREW LETTER FINAL MEM |
700 | ן | U+05DF | HEBREW LETTER FINAL NUN |
800 | ף | U+05E3 | HEBREW LETTER FINAL PE |
900 | ץ | U+05E5 | HEBREW LETTER FINAL TSADI |
As the table shows, the final letters are sometimes assigned numerical values of their own which can be used in numerology, but they are rarely if ever used to express numbers so they will not concern us here.
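The two tables can be turned into a lookup structure, which also shows how a numerological word value is computed. This is a minimal Python sketch: the names `LETTER_VALUE`, `FINAL_VALUES` and `word_value` are hypothetical, and treating a final form as its base letter by default is an assumption rather than a rule stated here.

```python
# Code points of the five final forms, with the values sometimes given to them.
FINAL_VALUES = {0x05DA: 500, 0x05DD: 600, 0x05DF: 700, 0x05E3: 800, 0x05E5: 900}

# The 22 base letters in sequence order: U+05D0..U+05EA minus the final forms.
LETTERS = [chr(cp) for cp in range(0x05D0, 0x05EB) if cp not in FINAL_VALUES]

# 1-9, 10-90 and 100-400, in the same order as the letters.
VALUES = list(range(1, 10)) + list(range(10, 100, 10)) + list(range(100, 500, 100))
LETTER_VALUE = dict(zip(LETTERS, VALUES))


def word_value(word, finals_have_own_values=False):
    """Sum the numerical values of the Hebrew letters in `word`."""
    total = 0
    for ch in word:
        cp = ord(ch)
        if ch in LETTER_VALUE:
            total += LETTER_VALUE[ch]
        elif cp in FINAL_VALUES:
            if finals_have_own_values:
                total += FINAL_VALUES[cp]
            else:
                # Each final form is encoded immediately before its base letter.
                total += LETTER_VALUE[chr(cp + 1)]
        # Anything else (spaces, punctuation, vowel points) is ignored.
    return total


# HET YOD (8 + 10)
print(word_value("\u05D7\u05D9"))
```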
There are also two punctuation marks which we will be referring to:
Character | Unicode code point | Unicode character name |
---|---|---|
׳ | U+05F3 | HEBREW PUNCTUATION GERESH |
״ | U+05F4 | HEBREW PUNCTUATION GERSHAYIM |
These punctuation marks may not be available in all fonts (and legacy encodings), so an implementation should be prepared to degrade gracefully. U+0027 APOSTROPHE for GERESH and U+0022 QUOTATION MARK for GERSHAYIM are acceptable fallbacks.
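For legacy encodings, one way to degrade gracefully is to try the target encoding first and substitute the fallbacks only if that fails. The Python sketch below uses a hypothetical helper name, `encode_with_fallback`. It does not attempt font detection, which depends on the rendering environment.

```python
GERESH = "\u05F3"
GERSHAYIM = "\u05F4"

# GERESH -> APOSTROPHE, GERSHAYIM -> QUOTATION MARK
FALLBACK = str.maketrans({GERESH: "'", GERSHAYIM: '"'})


def encode_with_fallback(text, encoding):
    """Encode `text`, replacing GERESH/GERSHAYIM only if the encoding lacks them."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        # Still raises if some other character cannot be encoded.
        return text.translate(FALLBACK).encode(encoding)


# ISO 8859-8 has the Hebrew letters but not the two punctuation marks.
data = encode_with_fallback("\u05D0" + GERESH, "iso8859_8")
print(data.decode("iso8859_8"))
```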
Classical Hebrew has no numerals. The letters of the alphabet are used to express numbers and to index lists of items. For indexing there are two possible systems, the alphabetical system and the numerical system. To express numbers, only the numerical system is relevant. Both systems are written from right to left, like other Hebrew text.
This section applies to both systems, but I use the term “number” for the sake of simplicity.
When numbers appear in isolation, e.g. as page numbers or as list indexes, they should be written with the letters alone. If they appear embedded in other text, punctuation marks are added to clarify that they are numbers and not words. The most common convention is as follows:
If a number is written as a single character, add GERESH after this character:
יום א׳ (Day 1, i.e. Sunday)
If a number is written as more than one character, insert GERSHAYIM before the last character:
כ״ב אותיות ‎(22 letters)
דף קע״ו ‎(Page 176)
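The convention above can be sketched in Python as follows. The function name `punctuate` is an assumption for illustration, and the input is taken to be the bare letters of one number in logical (reading) order.

```python
GERESH = "\u05F3"      # HEBREW PUNCTUATION GERESH
GERSHAYIM = "\u05F4"   # HEBREW PUNCTUATION GERSHAYIM


def punctuate(letters):
    """Mark a run of letters as a number embedded in running text."""
    if not letters:
        raise ValueError("expected at least one letter")
    if len(letters) == 1:
        # Single character: GERESH goes after it.
        return letters + GERESH
    # Several characters: GERSHAYIM goes before the last one.
    return letters[:-1] + GERSHAYIM + letters[-1]


print(punctuate("\u05D0"))              # ALEF
print(punctuate("\u05E7\u05E2\u05D5"))  # QOF AYIN VAV
```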
The expression of numerals in speech is rather inconsistent. Sometimes they are spelt out letter by letter, sometimes pronounced as if they were words, and sometimes as the number they represent.
ל״ו צדיקים “Lamed Vav Tzaddikim” (36 righteous ones)
ט״ו בשבט “Tu Bishvat” (The date 15th of Shevat)
ד׳ אמות “Arba amot” (four cubits)
A text-to-speech implementation should use the first possibility and spell numbers out letter-by-letter. This may not always be the most natural option to a native speaker, but it will never sound as wrong as the other options will if misplaced.
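A letter-by-letter reading could be prepared along these lines. This Python sketch uses the standard `unicodedata` module and returns the Unicode character names (ALEF, BET, ...) as stand-ins for the spoken letter names. A real text-to-speech system would map each letter to its own pronunciation data, and the function name `spell_out` is hypothetical.

```python
import unicodedata

# Marks that flag a number but are not themselves read aloud.
PUNCTUATION = {"\u05F3", "\u05F4", "'", '"'}
PREFIX = "HEBREW LETTER "


def spell_out(numeral):
    """Return the letter names of a Hebrew numeral, one per letter."""
    names = []
    for ch in numeral:
        if ch in PUNCTUATION or ch.isspace():
            continue
        name = unicodedata.name(ch)
        if not name.startswith(PREFIX):
            raise ValueError(f"not a Hebrew letter: {ch!r}")
        names.append(name[len(PREFIX):])
    return names


# LAMED, GERSHAYIM, VAV (36)
print(spell_out("\u05DC\u05F4\u05D5"))
```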
The alphabetical system simply uses the 22 letters of the alphabet in sequence:
א ב ג … ר ש ת
This can be extended to arbitrary length by chaining:
א ב ג … ר ש ת אא אב אג … תר תש תת אאא אאב אאג …
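The chained sequence behaves like spreadsheet column names (a bijective base-22 numbering), so an index can be computed directly. The Python sketch below assumes that reading of the sequence. The function name `alphabetic_index` is hypothetical.

```python
# Final forms are skipped; they are positional variants, not separate letters.
FINALS = {0x05DA, 0x05DD, 0x05DF, 0x05E3, 0x05E5}
LETTERS = [chr(cp) for cp in range(0x05D0, 0x05EB) if cp not in FINALS]


def alphabetic_index(n):
    """Return the n-th label (1-based) of the alphabetical system."""
    if n < 1:
        raise ValueError("index must be 1 or greater")
    out = []
    while n > 0:
        # Subtracting 1 first makes the numbering bijective (no zero digit).
        n, r = divmod(n - 1, len(LETTERS))
        out.append(LETTERS[r])
    # Letters were produced last-first; return them in reading order.
    return "".join(reversed(out))


for i in (1, 22, 23, 24):
    print(i, alphabetic_index(i))
```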
The basis of the numerical system is quite simple, at least for numbers less than 1,000. Numbers are expressed using the numerical values in the table, written from greatest to least. For numbers greater than 499, TAV (400) is followed by further hundreds letters and repeated as necessary, so 500 is TAV QOF (400+100) and 900 is TAV TAV QOF (400+400+100).
Number | Hebrew numeral |
---|---|
1 | א |
2 | ב |
3 | ג |
… | … |
9 | ט |
10 | י |
11 | יא |
… | … |
19 | יט |
20 | כ |
21 | כא |
… | … |
99 | צט |
100 | ק |
101 | קא |
… | … |
499 | תצט |
500 | תק |
501 | תקא |
… | … |
997 | תתקצז |
998 | תתקצח |
999 | תתקצט |
If the last two digits of a number are 15 or 16, they should be expressed not as YOD HE (10+5) and YOD VAV (10+6), but as TET VAV (9+6) and TET ZAYIN (9+7). This is done to avoid a close resemblance to the Tetragrammaton (the four-letter name of God), YOD HE VAV HE. Although this convention is originally derived from religious practice, it is universally used even in completely secular contexts.
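For numbers below 1,000, the greatest-to-least rule and the 15/16 exception can be sketched in Python as follows. The function name `hebrew_numeral` is hypothetical, and the result is the bare letters in logical order, without GERESH or GERSHAYIM.

```python
FINALS = {0x05DA, 0x05DD, 0x05DF, 0x05E3, 0x05E5}
LETTERS = [chr(cp) for cp in range(0x05D0, 0x05EB) if cp not in FINALS]
VALUES = list(range(1, 10)) + list(range(10, 100, 10)) + list(range(100, 500, 100))

# (value, letter) pairs from TAV (400) down to ALEF (1).
DESCENDING = sorted(zip(VALUES, LETTERS), reverse=True)

TET, VAV, ZAYIN = "\u05D8", "\u05D5", "\u05D6"


def hebrew_numeral(n):
    """Convert 1..999 to Hebrew letters, greatest value first."""
    if not 1 <= n <= 999:
        raise ValueError("this sketch only handles 1..999")
    tail = ""
    if n % 100 in (15, 16):
        # 15 -> TET VAV (9+6), 16 -> TET ZAYIN (9+7)
        tail = TET + (VAV if n % 100 == 15 else ZAYIN)
        n -= n % 100
    out = []
    for value, letter in DESCENDING:
        # The loop repeats TAV for 800 and above; other letters occur once.
        while n >= value:
            out.append(letter)
            n -= value
    return "".join(out) + tail


for n in (11, 15, 16, 499, 500, 999):
    print(n, hebrew_numeral(n))
```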
The numerical value of each letter is fixed and not determined by position, so reordering a number will not change its value. This may be done when a number spells out a word with negative connotations (e.g. 298: RESH TSADI HET is the Hebrew for “murder”, so it is sometimes written as RESH HET TSADI), or when the reordered form has especially positive connotations (e.g. 18: YOD HET is often written as HET YOD, the Hebrew for “alive”). Unlike the previous exception, using the regular form in these cases is not considered an error.
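To extend the numerical system to numbers of 1,000 and above, the same numerical values are used, multiplied by 1,000, 1,000,000, etc. Each group of three decimal digits is written as its own group of letters, with the most significant group first.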
ה תשסג ‎5,763
א רלד תקסז ‎1,234,567
Or with the GERESH and GERSHAYIM where appropriate:
שנת ה׳ תשס״ג The year 5,763
א׳ רל״ד תקס״ז דגים ‎1,234,567 fish
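The grouping by thousands can be sketched in Python as follows. The names `group_letters`, `punctuate` and `hebrew_numeral` are hypothetical. Applying the 15/16 convention inside every group is an assumption. A group of three zero digits is rejected, because no letter stands for zero.

```python
GERESH, GERSHAYIM = "\u05F3", "\u05F4"
FINALS = {0x05DA, 0x05DD, 0x05DF, 0x05E3, 0x05E5}
LETTERS = [chr(cp) for cp in range(0x05D0, 0x05EB) if cp not in FINALS]
VALUES = list(range(1, 10)) + list(range(10, 100, 10)) + list(range(100, 500, 100))
DESCENDING = sorted(zip(VALUES, LETTERS), reverse=True)


def group_letters(n):
    """Letters for one group, 1..999, greatest value first."""
    tail = ""
    if n % 100 in (15, 16):
        # TET VAV / TET ZAYIN instead of YOD HE / YOD VAV
        tail = "\u05D8" + ("\u05D5" if n % 100 == 15 else "\u05D6")
        n -= n % 100
    out = []
    for value, letter in DESCENDING:
        while n >= value:
            out.append(letter)
            n -= value
    return "".join(out) + tail


def punctuate(letters):
    """GERESH after a single letter, GERSHAYIM before the last of several."""
    if len(letters) == 1:
        return letters + GERESH
    return letters[:-1] + GERSHAYIM + letters[-1]


def hebrew_numeral(n, punctuated=False):
    """Write n as space-separated groups, most significant group first."""
    if n < 1:
        raise ValueError("expected a positive integer")
    groups = []
    while n:
        n, group = divmod(n, 1000)
        groups.append(group)
    groups.reverse()
    if 0 in groups:
        raise ValueError("a group of three digits is zero; no letter stands for zero")
    parts = [group_letters(g) for g in groups]
    if punctuated:
        parts = [punctuate(p) for p in parts]
    return " ".join(parts)


print(hebrew_numeral(5763))
print(hebrew_numeral(1234567, punctuated=True))
```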
Because there is no symbol for zero, this will cause ambiguity in some cases.
ג Is this 3? 3,000? 3,000,000?
Possible solutions to this problem:
Adding the word for “thousands”. This will need special handling for 1,000 and 2,000 because of grammatical considerations.
Number | Hebrew |
---|---|
1000 | אלף |
2000 | ב׳ אלפים or אלפיים |
3000 | ג׳ אלפים |
4000 | ד׳ אלפים |
1000000 | אלף אלפים or מיליון |
2000000 | ב׳ אלפי אלפים or ב׳ מיליון |
3000000 | ג׳ אלפי אלפים or ג׳ מיליון |
3000000123 | ג׳ אלפי אלפי אלפים קל״ג or ג׳ מיליארד קל״ג |
Many reference books (e.g. Even Shoshan's Hebrew Dictionary and "A Universal History of Numbers" by Georges Ifrah) prescribe two dots above the letters to represent thousands, but I have never seen this usage in print and I doubt if it would be widely understood unless the context made it very clear.
It could also cause problems for implementation. I'm not sure if it's a limitation in fonts or a bug in browsers, but the character I have tried to use for this in the following examples, U+0308 COMBINING DIAERESIS, doesn't combine well with right-to-left characters, so I have had to use <bdo dir="ltr"> to make them align correctly.
Number | Hebrew |
---|---|
1000 | א̈ |
2000 | ב̈ |
3000 | ג̈ |
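At the character level, the examples above are simply each letter followed by the combining mark. The minimal Python sketch below uses a hypothetical helper name, `mark_thousands`. It assumes U+0308 is an acceptable stand-in for the two dots. How the result renders depends on the font and the browser.

```python
COMBINING_DIAERESIS = "\u0308"


def mark_thousands(letters):
    """Put two dots over every letter of a thousands group."""
    # A combining mark follows the base character it attaches to.
    return "".join(ch + COMBINING_DIAERESIS for ch in letters)


print(mark_thousands("\u05D0"))  # ALEF with two dots
```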
This convention suggests a possible extension to three dots over the letters to represent millions, etc., which is worth considering, though all the previous caveats would apply with no less force.
At some order of magnitude this numbering system just becomes too clumsy, and one should be prepared to bail out to another system when the numbers get too large.
TODO: Add references
Updated October 2005, with thanks to Jacob Wexler