| Khmer | ||
|---|---|---|
| Type | Abugida | |
| Spoken languages | Khmer | |
| Time period | c. An abugida ( from Ge‘ez አቡጊዳ ’äbugida or Amharic አቡጊዳ ’abugida is a segmental Writing system which Khmer (ភាសាខ្មែរ or Cambodian, is the language of the Khmer people and the official language of Cambodia. 600–present | |
| Parent systems | Proto-Canaanite alphabet (disputed) → Phoenician alphabet (disputed) → Aramaic alphabet (disputed) → Brāhmī → Pallava → Khmer |
|
| Child systems | Thai Lao |
|
| Sister systems | Mon Old Kawi |
|
| Unicode range | U+1780–U+17FF, U+19E0–U+19FF |
|
| ISO 15924 | Khmr | |
![]() |
||
| Note: This page may contain IPA phonetic symbols in Unicode. Events By Place World The population of the Earth rises to about 208 million people The Proto-Canaanite alphabet is a consonantal alphabet of twenty-two acrophonic glyphs found in Levantine texts of the Late Bronze Age (from ca The Phoenician alphabet is a continuation of the Proto-Canaanite alphabet, by convention taken to originate around 1050 BC The Aramaic alphabet is an Abjad, a Consonantal Alphabet, used for writing Aramaic. Brāhmī script refers to the oldest members of the Brahmic family of alphabets. Vatteluttu () or "rounded writing" is an Abugida Writing system originating from the Dravidian peoples of Southern India and The Thai Alphabet (อักษรไทย àksŏn thai) is used to write the Thai language and other minority languages in Thailand The Lao script is used mainly to write the Lao language. The minority languages of Laos are also written in the Lao script and officially it is the only script Unicode ’s ISO 15924, Codes for the representation of names of scripts, defines two sets of codes for a number of Writing systems (scripts In Computing, Unicode is an Industry standard allowing Computers to consistently represent and manipulate text expressed in most of the world's | ||
| This article contains Indic text. The Brahmic family is a family of syllabaries (writing systems used in South Asia, Southeast Asia, and parts of Central Asia and East Asia, Without rendering support, you may see question marks, boxes or other symbols instead of Indic characters; or irregular vowel positioning and a lack of conjuncts. Mojibake is the happenstance of incorrect unreadable characters (garbage characters shown when Computer software fails to render a text correctly according to its associated |
| The Brahmic script and its descendants |
|---|
|
| History of the alphabet |
|---|
|
Middle Bronze Age 19 c. The history of the Alphabet begins in Ancient Egypt, more than a millennium into the History of writing. The Middle Bronze Age alphabets are two similar Undeciphered scripts dated to be from the Middle Bronze Age (2000-1500 BCE and believed to be ancestral BCE
|
| Meroitic 3 c. The Meroitic script is an Alphabetic script originally derived from Egyptian hieroglyphs used to write the Meroitic language of the Kingdom of Meroë / BCE |
| Ogham 4 c. Ogham (ogam ˈɔɣam Modern Irish or, English) is an Early Medieval Alphabet used primarily to represent the Old Irish language (and CE |
| Hangul 1443 CE |
| Canadian syllabics 1840 CE |
| Zhuyin 1913 CE |
| complete genealogy |
The Khmer script (អក្ខរក្រមខេមរភាសា; âkkhârâkrâm khémârâ phéasa) is used to write the Khmer language which is the official language of Cambodia. Canadian Aboriginal syllabic writing', or simply syllabics, is a family of Abugidas {dubious}} used to write a number of Aboriginal Canadian Nearly all the segmental scripts (loosely " Alphabets " but see below for more precise terminology used around the globe appear to have derived from the Khmer (ភាសាខ្មែរ or Cambodian, is the language of the Khmer people and the official language of Cambodia. The Kingdom of Cambodia ( formerly known as Kampuchea (, transliterated: Preăh Réachéanachâkr Kâmpŭchea) is a country in South East It is generally thought that the Khmer script developed from the Pallava script of India. Grantha ( Tamil: கிரந்த ௭ழுத்து Bengali: গ্রন্থলিপি Malayalam: ml ഗ്രന്ഥലിപി Sanskrit [1] The oldest dated inscription in Khmer was found at Angkor Borei in Takev Province south of Phnom Penh and dates from 611 AD. Angkor Borei District ( Khmer: ស្រុកអង្គរបុរី is a district located in Takéo Province, in southern Cambodia. Takéo is a province of Cambodia. Its capital is Takéo. Takeo town is an easygoing place that possesses a fair amount of natural and manmade beauty [2] Those inscriptions that have survived are engraved in stone and the evolution of Khmer script is as follows:
The Khmer alphabet has fewer symbols for vowels than the language has vowel phonemes. In Phonetics, a vowel is a Sound in spoken Language, such as English ah! or oh!, pronounced with an open Vocal tract The phoneME project is Sun Microsystems reference implementation of Java virtual machine and associated libraries of Java ME with source licensed under the GNU To account for this, each consonant belongs to one of two series, and the vowel produced depends on which series the consonant belongs to (making it an abugida rather than a true alphabet). An abugida ( from Ge‘ez አቡጊዳ ’äbugida or Amharic አቡጊዳ ’abugida is a segmental Writing system which An alphabet is a standardized set of letters basic written symbols each of which roughly represents a Phoneme, a Spoken language, either Therefore, most vowel signs have two possible pronunciations, depending on which series the consonant belongs to. When no vowel sign is present, usually the inherent vowel of the consonant is used. Vowels signs can be divided into two groups: dependent vowel signs, which are written around a consonant letter, and independent vowel letters, which can stand alone. Dependent vowel signs are used more frequently than independent vowels and all independent vowel letters can be phonetically rendered with a dependent vowel. Khmer also has a number of diacritics, which can change the series of the consonant or change the pronunciation of the vowel. A diacritic ( also called a diacritic or diacritical mark, point, or sign, is a small sign added to a letter to alter pronunciation
Contents |
There are several styles of Khmer script which are used for different purposes.
The last two styles, when handwritten, are usually pencil-line width, however, in printed form and on computer fonts, they are usually seen in wider widths. Most Khmer computer fonts depict neither style correctly; in fact, some may meld elements of 'âksâr mul' and 'âksâr khâm' into one style, so generally either is referred to as 'âksâr mul'.
There are 35 Khmer consonants symbols, although modern Khmer only uses 33, two having become obsolete. Subscript consonants are special forms used to form consonant clusters. Also sometimes referred to as "sub-consonants", subscript consonants often resemble the corresponding consonant symbol, only smaller. In Khmer, they are known as 'cheung âksâr' (ជើងអក្សរ), meaning the foot of a letter. In forming consonant clusters, the second (and where necessary, the third) consonant sound of the cluster is written as a subscript which cancels the inherent vowel of the preceding consonant. Most subscript consonants are written directly below consonant which they follow, although subscript /r/ is written before while a few others have ascending elements which appear after.
Listed in the table below are the pronunciations of the consonants when recited. Although Khmer spelling is very regular, the pronunciation of some consonants may be slightly different from the recited version in a few words. This is especially true in loan words. The IPA values given are for consonants in the initial or medial position. The International Phonetic Alphabet (IPA is a system of phonetic notation based on the Latin alphabet, devised by the International Phonetic Because of Khmer phonology, in which final stops are unreleased and possible finals are limited, word-final values may differ. An unreleased stop or unreleased plosive is a Plosive consonant without an audible release burst For example, word-final /s/ is pronounced /h/ and, in most dialects, word-final /r/ is silent. The inherent vowels of consonants in the final position are almost never pronounced. The two obsolete consonants are highlighted in gray.
| Consonants | Subscript form | Transliteration | IPA |
|---|---|---|---|
| ក | ្ក | kâ | kɑ |
| ខ | ្ខ | khâ | kʰɑ |
| គ | ្គ | kô | kɔ |
| ឃ | ្ឃ | khô | kʰɔ |
| ង | ្ង | ngô | ŋɔ |
| ច | ្ច | châ | cɑ |
| ឆ | ្ឆ | chhâ | cʰɑ |
| ជ | ្ជ | chô | cɔ |
| ឈ | ្ឈ | chhô | cʰɔ |
| ញ | ្ញ | nhô | ɲɔ |
| ដ | ្ដ | dâ | ɗɑ |
| ឋ | ្ឋ | thâ | tʰɑ |
| ឌ | ្ឌ | dô | ɗɔ |
| ឍ | ្ឍ | thô | tʰɔ |
| ណ | ្ណ | nâ | nɑ |
| ត | ្ត | tâ | tɑ |
| ថ | ្ថ | thâ | tʰɑ |
| ទ | ្ទ | tô | tɔ |
| ធ | ្ធ | thô | tʰɔ |
| ន | ្ន | nô | nɔ |
| ប | ្ប | bâ | ɓɑ |
| ផ | ្ផ | phâ | pʰɑ |
| ព | ្ព | pô | pɔ |
| ភ | ្ភ | phô | pʰɔ |
| ម | ្ម | mô | mɔ |
| យ | ្យ | yô | jɔ |
| រ | ្រ | rô | rɔ |
| ល | ្ល | lô | lɔ |
| វ | ្វ | vô | vɔ |
| ឝ | ្ឝ | shâ | - |
| ឞ | ្ឞ | ssô | - |
| ស | ្ស | sâ | sɑ |
| ហ | ្ហ | hâ | hɑ |
| ឡ | ្ឡ* | lâ | lɑ |
| អ | ្អ | qâ | ʔɑ |
* The subscript for the consonant lâ is included in Unicode although its usage in modern Khmer is generally non-existent.
For some phonemes in loanwords, the Khmer writing system has 'created' supplementary consonants. A loanword (or loan word) is a word directly taken into one Language from another with little or no translation Most of these consonants are created by stacking a subscript under the character for/hɑ/ to form digraphs. The consonant for /pɑ/, however, is created by using the diacritical sign called musĕkâtônd over the consonant for /bɑ/. These additional consonants are mainly used to represent sounds in French and Thai loanwords.
| Digraph consonants | Transliteration | IPA |
|---|---|---|
| ហ្គ | gâ | gɑ |
| ហ្ន | nâ | nɑ |
| ប៉ | pâ | pɑ |
| ហ្ម | mâ | mɑ |
| ហ្ល | lâ | lɑ |
| ហ្វ | fâ, wâ | fɑ, wɑ |
| ហ្ស | žâ | ʒɑ |
There are 16 unique dependent vowel symbols. Although this name can be added up to 24 when dependent vowels with diacritical symbols are included. A diacritic ( also called a diacritic or diacritical mark, point, or sign, is a small sign added to a letter to alter pronunciation Dependent vowels are known in Khmer as srăk nissăy (ស្រៈនិស្ស័យ) or srăk phsâm (ស្រៈផ្សំ). Dependent vowels always have to be combined with a consonant in orthography. The orthography of a language specifies the correct way of using a specific Writing system to write the language For most the vowel symbols, there are two sounds (registers). The sound of the vowel used depends on the series (the inherent vowel) of the dominant consonant in a syllable cluster.
| Dependent vowels |
Transliteration | IPA | ||
|---|---|---|---|---|
| a-series | o-series | a-series | o-series | |
| អា | a | éa | aː | iːə |
| អិ | ĕ | ĭ | e | i |
| អី | ei | i | əj | iː |
| អឹ | ŏe | ə | ɨ | |
| អឺ | œ | əːɨ | ɨː | |
| អុ | ŏ | ŭ | o | u |
| អូ | o | u | oːu | uː |
| អួ | uŏ | uːə | ||
| អើ | aeu | eu | aːə | əː |
| អឿ | eua | ɨːə | ||
| អៀ | iĕ | iːə | ||
| អេ | é | eːi | eː | |
| អែ | ê | aːe | ɛː | |
| អៃ | ai | ey | aj | ɨj |
| អោ | aô | oŭ | aːo | oː |
| អៅ | au | ŏu | aw | ɨw |
| Dependent vowels & diacritics |
Transliteration | IPA | ||
|---|---|---|---|---|
| a-series | o-series | a-series | o-series | |
| អុំ | om | ŭm | om | um |
| អំ | âm | um | ɑm | um |
| អាំ | ăm | ŏâm | am | oəm |
| អះ | ăh | eăh | aʰ | eəʰ |
| អុះ | ŏh | uh | oʰ | uʰ |
| អេះ | éh | eiʰ | eʰ | |
| អែ | êh | aeʰ | ɛʰ | |
| អោះ | aŏh | uŏh | ɑʰ | ʊəʰ |
Independent vowels are vowels that do not have to be paired with a consonant in a syllable, hence the name. In Khmer they are called srăk penhtuŏ (ស្រៈពេញតួ) which means complete vowels.
| Independent vowels |
Transliteration | IPA |
|---|---|---|
| ឣ | â | ʔɑʔ |
| ឤ | a | ʔa |
| ឥ | ĕ | ʔe |
| ឦ | ei | ʔəj |
| ឧ | ŏ | ʔ |
| ឨ | ||
| ឩ | ŭ | ʔu |
| ឪ | ŏu | ʔɨw |
| ឫ | rŏe | ʔrɨ |
| ឬ | rœ | ʔrɨː |
| ឭ | lŏe | ʔlɨ |
| ឮ | lœ | ʔlɨː |
| ឯ | é | ʔeː |
| ឰ | ai | ʔaj |
| ឱ, ឲ | aô | ʔaːo |
| ឳ | âu | ʔaw |
| Diacritics | Name | Notes |
|---|---|---|
| ំ | nĭkkôhĕt (និគ្គហិត) | niggahita; nasalizes the inherent vowels and some of the dependent vowels, see anusvara, sometimes used to represent [aɲ] in Sanskrit loanwords |
| ះ | reăhmŭkh (រះមុខ) | shining face; adds final aspiration to dependent or inherent vowels, usually omitted, corresponds to the visarga diacritic, it maybe included as dependent vowel symbol |
| ៈ | yŭkôleăkpĭntŭ (យុគលពិន្ទុ) | yugalabindu (pair of dots); adds final glottalness to dependent or inherent vowels, usually omitted, a relatively new diacritic |
| ៉ | musĕkâtônd (មូសិកទន្ដ) | musikadanta (mouse teeth); used to convert some o-series consonants to the a-series |
| ៊ | trei sâpt (ត្រីសព្ទ) | trisabda; used to convert some a-series consonants to the o-series |
| ុ | kbiĕh kraôm (ក្បៀសក្រោម) | also known as bŏkcheung (បុកជើង); used in place when the diacritics trei sâpt and musĕkâtônd impede with superscript vowels |
| ់ | bântăk (បន្តក់) | used to shorten some vowels |
| ៌ | rôbat (របាទ), répheăk (រេផៈ) | rapada, repha; behaves similarly to the tôndâkhéat, corresponds to the Devanagari diacritic 'repha', however it lost its original function which was to represent a vocalic r |
| ៍ | tôndâkhéat (ទណ្ឌឃាដ) | daṇḍaghata; used to render some letters as unpronounced |
| ៎ | kakâbat (កាកបាទ) | kakapada (the crow's foot); more a punctuation mark than a diacritic; used in writing to indicate the rising intonation of an exclamation or interjection; often placed on particles such as /na/, /nɑː/, /nɛː/, /vəːj/, and the feminine response /cah/ |
| ័ | sanhyoŭk sannha (សំយោគសញ្ញា) | represents a short inherent vowel in Sanskrit and Pali words; usually omitted |
| ៑ | vĭréam (វិរាម) | a mostly obsolete diacritic, corresponds to the virama |
| ្ | cheung (ជើង) | a. Anusvara (Dev अनुस्वार anusvāra) is the diacritic used to mark a type of Nasalization used in a number of Indic languages. Visarga ( visarga) is a Sanskrit word meaning "sending forth discharge" This article is about the sound in spoken language For the letter see Glottal stop (letter. In Phonetics, vocalic r refers to the phenomenon of a Rhotic segment such as or occurring as the Syllable nucleus. In Linguistics, the term particle is a word lacking a strict definition but has the function of changing the relation of the parts of the sentence to one another and is therefore Virama is a generic term for the Diacritic character in many Brahmic scripts that is used to suppress an inherent Vowel sound that occurs with every consonant w. coeng; a sign developed for Unicode to input subscript consonants, appearance of this sign varies among fonts |
The Khmer script uses several unique punctuation marks as well as some borrowed from the Latin script such as the question mark. The question mark (? also known as an interrogation point, question point, query, or eroteme, is a punctuation mark that replaces The period in the Khmer language "។" resembles an eighth rest in music writing. In Music notation, a note value indicates the relative Duration of a note, using the color or shape of the Note head, the presence
Most consonants, including a few of the subscripts, form ligatures with all dependent vowels that contain the symbol used for the vowel a (ា). A lot of these ligatures are easily recognizable, however a few may not be. One of the more unrecognizable is the ligature for the bâ and a which was created to differentiate it from the consonant symbol hâ as well as the ligature for châ and a. It is not always necessary to connect consonants with the dependent vowel a.
Examples of ligatured symbols:

Ligatured consonant subscript and vowel combination:

The numerals of the Khmer script, similar to that used by other civilizations in Southeast Asia, are also derived from the southern Indian script. Khmer numerals are the numerals used in the Khmer language of Cambodia. Arabic numerals are also used, but to a lesser extent. The arabic numerals (often capitalized are the ten Digits (0 1 2 3 4 5 6 7 8 9 which—along with the system
| Khmer numerals | ០ | ១ | ២ | ៣ | ៤ | ៥ | ៦ | ៧ | ៨ | ៩ |
|---|---|---|---|---|---|---|---|---|---|---|
| Arabic numerals | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
The Unicode range for Khmer consists of two ranges: U+1780 . In Computing, Unicode is an Industry standard allowing Computers to consistently represent and manipulate text expressed in most of the world's . . U+17FF for the basic characters, and U+19E0 - U+19FF for additional symbols. Grey areas indicate non-assigned code points.
| Khmer Unicode.org chart (PDF) |
||||||||||||||||
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| U+178x | ក | ខ | គ | ឃ | ង | ច | ឆ | ជ | ឈ | ញ | ដ | ឋ | ឌ | ឍ | ណ | ត |
| U+179x | ថ | ទ | ធ | ន | ប | ផ | ព | ភ | ម | យ | រ | ល | វ | ឝ | ឞ | ស |
| U+17Ax | ហ | ឡ | អ | ឣ | ឤ | ឥ | ឦ | ឧ | ឨ | ឩ | ឪ | ឫ | ឬ | ឭ | ឮ | ឯ |
| U+17Bx | ឰ | ឱ | ឲ | ឳ | ឴ | ឵ | ា | ិ | ី | ឹ | ឺ | ុ | ូ | ួ | ើ | ឿ |
| U+17Cx | ៀ | េ | ែ | ៃ | ោ | ៅ | ំ | ះ | ៈ | ៉ | ៊ | ់ | ៌ | ៍ | ៎ | ៏ |
| U+17Dx | ័ | ៑ | ្ | ៓ | ។ | ៕ | ៖ | ៗ | ៘ | ៙ | ៚ | ៛ | ៜ | ៝ | ||
| U+17Ex | ០ | ១ | ២ | ៣ | ៤ | ៥ | ៦ | ៧ | ៨ | ៩ | ||||||
| U+17Fx | ៰ | ៱ | ៲ | ៳ | ៴ | ៵ | ៶ | ៷ | ៸ | ៹ | ||||||
| Khmer Symbols Unicode.org chart (PDF) |
||||||||||||||||
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
| U+19Ex | ᧠ | ᧡ | ᧢ | ᧣ | ᧤ | ᧥ | ᧦ | ᧧ | ᧨ | ᧩ | ᧪ | ᧫ | ᧬ | ᧭ | ᧮ | ᧯ |
| U+19Fx | ᧰ | ᧱ | ᧲ | ᧳ | ᧴ | ᧵ | ᧶ | ᧷ | ᧸ | ᧹ | ᧺ | ᧻ | ᧼ | ᧽ | ᧾ | ᧿ |