IBM PC or MS-DOS code page 437, often abbreviated CP437 and also known as DOS-US or OEM-US, and sometimes misleadingly referred to as the OEM font, High ASCII or Extended ASCII,[1][2] is the original character set of the IBM PC, circa 1981.
In a stricter sense, this character set was not born as a real code page (in the present sense of the term) but was merely the graphical glyph repertoire available in the ROM of the IBM Monochrome Display Adapter (MDA) and Color Graphics Adapter (CGA) video output cards of the original IBM PC; that is, it was implemented in hardware. The "Original Equipment Manufacturer" (OEM) designation arises from this fact. Today, it is still the primary font in the core of any EGA and VGA compatible graphics card, i.e. the text you can see on screen when a PC reboots is rendered with this code page.
All these display adapters have a basic 80-column text mode, in which every character cell is represented in the video RAM as a single byte (plus an additional byte which carries information about its colour and/or effect), giving 256 possible values for graphic characters. Thus, beyond the original ASCII graphical character set (values 32 to 126, 95 in total), the implementors put in ROM a handful of miscellaneous characters even for the range 0 to 31, which ASCII reserves for control (non-graphical) purposes.
So this code page has two main uses: as an information interchange code (through files and telecommunications), in which the values 0 to 127 play the same role as in ASCII and the values 128 to 175 add the international text characters (see the table below); and as a graphical resource for screens and printers (by merely writing the appropriate code into the video RAM character cell, or sending it down the line), in which the full range can be used to build fine presentations.
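The text-mode cell layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 25-row screen height, the helper names `cell_offset` and `put_char`, and the default attribute value `0x07` are not taken from the article, and the buffer is an ordinary `bytearray` rather than real video memory.

```python
# Minimal model of an 80-column text-mode buffer: two bytes per cell,
# the character code first and the colour/effect attribute second.
COLUMNS = 80
ROWS = 25  # assumption: the article only states the 80-column width


def cell_offset(row: int, col: int) -> int:
    """Byte offset of the character byte for the cell at (row, col)."""
    return (row * COLUMNS + col) * 2


def new_screen() -> bytearray:
    """Return an empty buffer large enough for every cell."""
    return bytearray(COLUMNS * ROWS * 2)


def put_char(vram: bytearray, row: int, col: int, code: int,
             attribute: int = 0x07) -> None:
    """Store a CP437 code (0-255) and its attribute in one cell.

    The default attribute 0x07 is an illustrative assumption.
    """
    if not 0 <= code <= 255:
        raise ValueError("a character cell holds a single byte")
    offset = cell_offset(row, col)
    vram[offset] = code
    vram[offset + 1] = attribute


screen = new_screen()
put_char(screen, 0, 0, 0xC9)  # value 201, a double-line box corner
put_char(screen, 0, 1, 0xCD)  # value 205, a double horizontal line
```

Because each cell stores the code in one byte, any of the 256 values can be displayed simply by writing it, which is why the full range, including 0 to 31, was usable for on-screen graphics.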
The following is a table representing CP437 using the equivalent Unicode characters. Each cell gives the character, its Unicode code point in hexadecimal and its CP437 value in decimal. Standard ASCII and ISO 8859-1 (Latin-1) character glyphs, along with the Greek letters, are shown as coloured cells.
Due to the dual use of values in the range 0 to 31 (00h to 1Fh), there are two sets for these: the first gives their meanings as ASCII control characters and the second their graphical output on screen or printer.
For value 127 (7Fh), the graphical output is shown in the last table; its meaning as an ASCII control character is "DEL" (delete), Unicode value U+007F.
| | —0 | —1 | —2 | —3 | —4 | —5 | —6 | —7 | —8 | —9 | —A | —B | —C | —D | —E | —F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0− | NUL 0000 0 | SOH 0001 1 | STX 0002 2 | ETX 0003 3 | EOT 0004 4 | ENQ 0005 5 | ACK 0006 6 | BEL 0007 7 | BS 0008 8 | HT 0009 9 | LF 000A 10 | VT 000B 11 | FF 000C 12 | CR 000D 13 | SO 000E 14 | SI 000F 15 |
| 1− | DLE 0010 16 | DC1 0011 17 | DC2 0012 18 | DC3 0013 19 | DC4 0014 20 | NAK 0015 21 | SYN 0016 22 | ETB 0017 23 | CAN 0018 24 | EM 0019 25 | SUB 001A 26 | ESC 001B 27 | FS 001C 28 | GS 001D 29 | RS 001E 30 | US 001F 31 |
| | —0 | —1 | —2 | —3 | —4 | —5 | —6 | —7 | —8 | —9 | —A | —B | —C | —D | —E | —F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0− | FSP 2007 0 | ☺ 263A 1 | ☻ 263B 2 | ♥ 2665 3 | ♦ 2666 4 | ♣ 2663 5 | ♠ 2660 6 | • 2022 7 | ◘ 25D8 8 | ○ 25CB 9 | ◙ 25D9 10 | ♂ 2642 11 | ♀ 2640 12 | ♪ 266A 13 | ♫ 266B 14 | ☼ 263C 15 |
| 1− | ► 25BA 16 | ◄ 25C4 17 | ↕ 2195 18 | ‼ 203C 19 | ¶ 00B6 20 | § 00A7 21 | ▬ 25AC 22 | ↨ 21A8 23 | ↑ 2191 24 | ↓ 2193 25 | → 2192 26 | ← 2190 27 | ∟ 221F 28 | ↔ 2194 29 | ▲ 25B2 30 | ▼ 25BC 31 |
| 2− | SP 0020 32 | ! 0021 33 | " 0022 34 | # 0023 35 | $ 0024 36 | % 0025 37 | & 0026 38 | ' 0027 39 | ( 0028 40 | ) 0029 41 | * 002A 42 | + 002B 43 | , 002C 44 | - 002D 45 | . 002E 46 | / 002F 47 |
| 3− | 0 0030 48 | 1 0031 49 | 2 0032 50 | 3 0033 51 | 4 0034 52 | 5 0035 53 | 6 0036 54 | 7 0037 55 | 8 0038 56 | 9 0039 57 | : 003A 58 | ; 003B 59 | < 003C 60 | = 003D 61 | > 003E 62 | ? 003F 63 |
| 4− | @ 0040 64 | A 0041 65 | B 0042 66 | C 0043 67 | D 0044 68 | E 0045 69 | F 0046 70 | G 0047 71 | H 0048 72 | I 0049 73 | J 004A 74 | K 004B 75 | L 004C 76 | M 004D 77 | N 004E 78 | O 004F 79 |
| 5− | P 0050 80 | Q 0051 81 | R 0052 82 | S 0053 83 | T 0054 84 | U 0055 85 | V 0056 86 | W 0057 87 | X 0058 88 | Y 0059 89 | Z 005A 90 | [ 005B 91 | \ 005C 92 | ] 005D 93 | ^ 005E 94 | _ 005F 95 |
| 6− | ` 0060 96 | a 0061 97 | b 0062 98 | c 0063 99 | d 0064 100 | e 0065 101 | f 0066 102 | g 0067 103 | h 0068 104 | i 0069 105 | j 006A 106 | k 006B 107 | l 006C 108 | m 006D 109 | n 006E 110 | o 006F 111 |
| 7− | p 0070 112 | q 0071 113 | r 0072 114 | s 0073 115 | t 0074 116 | u 0075 117 | v 0076 118 | w 0077 119 | x 0078 120 | y 0079 121 | z 007A 122 | { 007B 123 | \| 007C 124 | } 007D 125 | ~ 007E 126 | ⌂ 2302 127 |
| 8− | Ç 00C7 128 | ü 00FC 129 | é 00E9 130 | â 00E2 131 | ä 00E4 132 | à 00E0 133 | å 00E5 134 | ç 00E7 135 | ê 00EA 136 | ë 00EB 137 | è 00E8 138 | ï 00EF 139 | î 00EE 140 | ì 00EC 141 | Ä 00C4 142 | Å 00C5 143 |
| 9− | É 00C9 144 | æ 00E6 145 | Æ 00C6 146 | ô 00F4 147 | ö 00F6 148 | ò 00F2 149 | û 00FB 150 | ù 00F9 151 | ÿ 00FF 152 | Ö 00D6 153 | Ü 00DC 154 | ¢ 00A2 155 | £ 00A3 156 | ¥ 00A5 157 | ₧ 20A7 158 | ƒ 0192 159 |
| A− | á 00E1 160 | í 00ED 161 | ó 00F3 162 | ú 00FA 163 | ñ 00F1 164 | Ñ 00D1 165 | ª 00AA 166 | º 00BA 167 | ¿ 00BF 168 | ⌐ 2310 169 | ¬ 00AC 170 | ½ 00BD 171 | ¼ 00BC 172 | ¡ 00A1 173 | « 00AB 174 | » 00BB 175 |
| B− | ░ 2591 176 | ▒ 2592 177 | ▓ 2593 178 | │ 2502 179 | ┤ 2524 180 | ╡ 2561 181 | ╢ 2562 182 | ╖ 2556 183 | ╕ 2555 184 | ╣ 2563 185 | ║ 2551 186 | ╗ 2557 187 | ╝ 255D 188 | ╜ 255C 189 | ╛ 255B 190 | ┐ 2510 191 |
| C− | └ 2514 192 | ┴ 2534 193 | ┬ 252C 194 | ├ 251C 195 | ─ 2500 196 | ┼ 253C 197 | ╞ 255E 198 | ╟ 255F 199 | ╚ 255A 200 | ╔ 2554 201 | ╩ 2569 202 | ╦ 2566 203 | ╠ 2560 204 | ═ 2550 205 | ╬ 256C 206 | ╧ 2567 207 |
| D− | ╨ 2568 208 | ╤ 2564 209 | ╥ 2565 210 | ╙ 2559 211 | ╘ 2558 212 | ╒ 2552 213 | ╓ 2553 214 | ╫ 256B 215 | ╪ 256A 216 | ┘ 2518 217 | ┌ 250C 218 | █ 2588 219 | ▄ 2584 220 | ▌ 258C 221 | ▐ 2590 222 | ▀ 2580 223 |
| E− | α 03B1 224 | β 03B2 225 | Γ 0393 226 | π 03C0 227 | Σ 03A3 228 | σ 03C3 229 | µ 00B5 230 | τ 03C4 231 | Φ 03A6 232 | Θ 0398 233 | Ω 03A9 234 | δ 03B4 235 | ∞ 221E 236 | ∅ 2205 237 | ∈ 2208 238 | ∩ 2229 239 |
| F− | ≡ 2261 240 | ± 00B1 241 | ≥ 2265 242 | ≤ 2264 243 | ⌠ 2320 244 | ⌡ 2321 245 | ÷ 00F7 246 | ≈ 2248 247 | ° 00B0 248 | ∙ 2219 249 | · 00B7 250 | √ 221A 251 | ⁿ 207F 252 | ² 00B2 253 | ■ 25A0 254 | NBSP 00A0 255 |
| | —0 | —1 | —2 | —3 | —4 | —5 | —6 | —7 | —8 | —9 | —A | —B | —C | —D | —E | —F |
NOTE: the graphical output for characters 0 (00h), 32 (20h) and 255 (FFh) is a blank cell, without marks of any kind.
NOTE: the Unicode character chosen to represent the graphical output of character number 0 is U+2007 FIGURE SPACE (FSP), a space of the same width as the digits in variable-pitch fonts.
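The dual reading of values 0 to 31 and 127 can be made concrete with a short Python sketch. It assumes Python's built-in `cp437` codec, which decodes those values as control characters, and adds a hypothetical helper, `cp437_screen_text`, that substitutes the graphical glyphs listed in the table above. The helper name and the `LOW_GLYPHS` string are illustrative, not part of any standard library.

```python
# Glyphs for values 0-31 in table order; value 0 is represented by
# U+2007 FIGURE SPACE, as in the table above.
LOW_GLYPHS = "\u2007☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"
HOUSE = "\u2302"  # graphical output for value 127


def cp437_interchange_text(data: bytes) -> str:
    """Decode bytes as an interchange code: 0-31 and 127 stay controls."""
    return data.decode("cp437")


def cp437_screen_text(data: bytes) -> str:
    """Decode bytes as they would look on screen: every value is a glyph."""
    chars = []
    for value in data:
        if value < 32:
            chars.append(LOW_GLYPHS[value])
        elif value == 127:
            chars.append(HOUSE)
        else:
            chars.append(bytes([value]).decode("cp437"))
    return "".join(chars)


sample = bytes([3, 65, 13, 127])
print(repr(cp437_interchange_text(sample)))  # controls are preserved
print(cp437_screen_text(sample))             # controls become glyphs
```

For values 225, 237 and 238 the built-in codec returns ß, φ and ε rather than the characters shown in the table above, so a screen-faithful renderer may want further overrides for those values.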
In DOS and Windows, most characters from the currently active DOS code page can be inserted by holding down the Alt key and entering the character's three-digit decimal code on the numpad. This technique is called Windows Alt keycodes. One can find out which DOS code page is currently active by issuing the DOS command mode con or chcp.
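As a rough illustration, the decimal code to type on the numpad can be looked up by encoding the character with Python's built-in `cp437` codec. The function name `alt_code` is hypothetical, and the sketch assumes CP437 is the active DOS code page; with another active code page the numbers differ.

```python
def alt_code(character: str) -> int:
    """Return the decimal CP437 value to enter while holding Alt.

    Raises UnicodeEncodeError if the character has no CP437 encoding
    in Python's cp437 codec.
    """
    if len(character) != 1:
        raise ValueError("expected exactly one character")
    return character.encode("cp437")[0]


for ch in "éñ½":
    print(ch, alt_code(ch))
```

The graphical glyphs for values 0 to 31 cannot be looked up this way, because the codec treats those values as control characters.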
CP437 is based on ASCII, with the following modifications:
The repertoire of CP437 was taken from the character set of Wang word-processing machines, according to Bill Gates in an interview with Gates and Paul Allen that appeared in the 2 October 1995 edition of Fortune Magazine:
The selection of graphic characters, often accused of being somewhat bizarre, has some internal logic:
CP437 has a series of international characters, mainly values 128 to 175 (80h to AFh). However, it lacks many characters important to several Western languages:
Along with the cent (¢), pound sterling (£) and yen/yuan (¥) currency symbols, it has a couple of European currency symbols: the florin (ƒ, Netherlands) and the peseta (₧, Spain). The presence of the last is surprising, since the Spanish peseta was never an internationally relevant currency and never had a symbol of its own; it was simply abbreviated as "Pt", "Pta", "Pts" or "Ptas". The only related fact is that Spanish models of the IBM electric typewriter also had a single type devoted to it.
Later MS-DOS character sets, such as CP850 (DOS Latin-1), CP852 (DOS Central-European) and CP737 (DOS Greek), filled the gaps for international use while keeping some compatibility with CP437 by retaining the single and double box-drawing characters and discarding the mixed ones (e.g. horizontal double/vertical single). All CP437 characters are in Unicode and in Microsoft's WGL4 character set, and are therefore available in most of the fonts on Microsoft Windows, in the default VGA font of the Linux kernel, and in the ISO 10646 fonts for X11.
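The box-drawing compatibility can be checked for CP850 with a short Python sketch that compares the built-in `cp437` and `cp850` codecs byte by byte. The function names are illustrative, and identifying "mixed" characters by the word SINGLE in their Unicode names is a convenience of this sketch, not an official classification.

```python
import unicodedata


def box_drawing_bytes(codec: str = "cp437") -> dict:
    """Map byte values to characters in the Box Drawing block (U+2500-U+257F)."""
    found = {}
    for value in range(128, 256):
        ch = bytes([value]).decode(codec)
        if 0x2500 <= ord(ch) <= 0x257F:
            found[value] = ch
    return found


def split_by_survival(other: str = "cp850"):
    """Split CP437 box-drawing characters into those kept at the same
    byte value in another code page and those replaced there."""
    kept, dropped = [], []
    for value, ch in box_drawing_bytes().items():
        if bytes([value]).decode(other) == ch:
            kept.append(ch)
        else:
            dropped.append(ch)
    return kept, dropped


kept, dropped = split_by_survival()
# Mixed single/double characters carry both words in their Unicode names,
# e.g. "BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE".
mixed = [ch for ch in dropped if "SINGLE" in unicodedata.name(ch)]
print("kept:   ", "".join(kept))
print("dropped:", "".join(dropped))
print("all dropped characters are mixed:", len(mixed) == len(dropped))
```

Passing `"cp852"` as the `other` argument repeats the comparison for the Central European code page.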
Along with the characters in the range 0 to 31, which can be interpreted as ASCII controls as well as graphical dingbats, some characters with an ambiguous look (to the eyes of their implementors, not to the eyes of a typographer) have overloaded meanings, depending upon context:
Value 237 (EDh), shown in the table above as the empty set sign (U+2205, ∅), was also used as the Greek phi symbol in italics (U+03D5, ϕ) to name angles, as the diameter sign (U+2300, ⌀) and as an approximate surrogate for the Latin lowercase O with stroke (U+00F8, ø), but rarely as the Greek lowercase phi (U+03C6, φ), because its original IBM shape, which seems to be merely a circle crossed by a slash, does not closely resemble that letter.

The main reason for this proliferation of meanings is that the CP437 character set of the original IBM PC MDA and CGA display adapters, as well as that of compatible printers, was fixed in ROM and could not be changed by software, so developers and users tried to take maximum advantage of the available resources.
Implementors of mapping tables to Unicode should note that these "unified" characters may not have a unique, single meaning: the correct choice depends upon context.
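One way to leave that choice to the caller is sketched below in Python for value 237 (EDh). The function name `decode_cp437`, the `meaning_237` parameter and the keys of `MEANINGS_237` are all hypothetical. The sketch relies on Python's built-in `cp437` codec, which decodes EDh as U+03C6, and then swaps in the reading requested by the caller.

```python
# Candidate Unicode readings of CP437 value 237 (EDh). The key names
# are illustrative labels, not standard terminology.
MEANINGS_237 = {
    "small_phi": "\u03c6",   # GREEK SMALL LETTER PHI (codec default)
    "empty_set": "\u2205",   # EMPTY SET
    "phi_symbol": "\u03d5",  # GREEK PHI SYMBOL
    "diameter": "\u2300",    # DIAMETER SIGN
    "o_stroke": "\u00f8",    # LATIN SMALL LETTER O WITH STROKE
}


def decode_cp437(data: bytes, meaning_237: str = "small_phi") -> str:
    """Decode CP437 bytes, reading value 237 as the caller requests."""
    replacement = MEANINGS_237[meaning_237]
    text = data.decode("cp437")
    # In the cp437 codec only byte 0xED decodes to U+03C6, so replacing
    # that character affects exactly the occurrences of value 237.
    return text.replace("\u03c6", replacement)


print(decode_cp437(b"A \xef B = \xed", meaning_237="empty_set"))
print(decode_cp437(b"\xed 12 mm", meaning_237="diameter"))
```

A fuller mapper would apply the same idea to every overloaded value; only 237 is covered here.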
In the Microsoft reference documentation, the following CP437 characters have Unicode values assigned which depart from the values given in the table above:
00h = U+0000 NULL
7Fh = U+007F DELETE
E1h = U+00DF LATIN SMALL LETTER SHARP S
EDh = U+03C6 GREEK SMALL LETTER PHI
EEh = U+03B5 GREEK SMALL LETTER EPSILON
| Fixed control values− | NUL 0000 0 | DEL 007F 127 | |
|---|---|---|---|
| Alternate character values− | ß 00DF 225 | φ 03C6 237 | ε 03B5 238 |
It should be noted that the Unicode character U+03D5 GREEK PHI SYMBOL (ϕ) would be a better choice[4] for value number 237 (EDh) of CP437.
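The five assignments listed above can be compared with an existing decoder. The Python sketch below checks them against Python's built-in `cp437` codec. The dictionary and function names are illustrative, and the `MAIN_TABLE` values are copied from the main table earlier in this article.

```python
# CP437 value -> Unicode code point, as listed for the Microsoft
# reference documentation.
MICROSOFT = {0x00: 0x0000, 0x7F: 0x007F, 0xE1: 0x00DF,
             0xED: 0x03C6, 0xEE: 0x03B5}

# The same values as given in the main table of this article.
MAIN_TABLE = {0x00: 0x2007, 0x7F: 0x2302, 0xE1: 0x03B2,
              0xED: 0x2205, 0xEE: 0x2208}


def codec_code_point(value: int) -> int:
    """Unicode code point that Python's cp437 codec assigns to a byte."""
    return ord(bytes([value]).decode("cp437"))


def report() -> list:
    """Return (value, codec, microsoft, main_table) tuples for comparison."""
    return [(value, codec_code_point(value), MICROSOFT[value], MAIN_TABLE[value])
            for value in sorted(MICROSOFT)]


for value, codec, microsoft, table in report():
    print(f"{value:02X}h  codec U+{codec:04X}  "
          f"Microsoft U+{microsoft:04X}  table U+{table:04X}")
```

For these five values the built-in codec agrees with the Microsoft assignments, so software built on it reads 237 as φ unless it applies its own override.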