Optical Character Recognition is a Unicode block containing signal characters for OCR and MICR standards.
Block
Optical Character Recognition[1][2]
Official Unicode Consortium code chart (PDF)
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+244x | ⑀ | ⑁ | ⑂ | ⑃ | ⑄ | ⑅ | ⑆ | ⑇ | ⑈ | ⑉ | ⑊ | | | | | |
U+245x | | | | | | | | | | | | | | | | |
Notes
1. As of Unicode version 16.0
2. Grey areas (shown here as empty cells) indicate non-assigned code points
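The chart can also be reproduced programmatically. The following is a minimal Python sketch, assuming the Unicode Character Database bundled with the standard library's `unicodedata` module covers this block (the characters date from Unicode 1.0.0). The range U+2440..U+245F is read off the two chart rows above; the constant and function names are illustrative only.

```python
import unicodedata

# Block range taken from the chart rows U+244x and U+245x.
BLOCK_START, BLOCK_END = 0x2440, 0x245F

def assigned_in_block():
    """Return (code point, formal name) pairs for assigned characters in the block."""
    pairs = []
    for cp in range(BLOCK_START, BLOCK_END + 1):
        # name() returns the supplied default (None) for unassigned code points.
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            pairs.append((cp, name))
    return pairs

for cp, name in assigned_in_block():
    print(f"U+{cp:04X} {chr(cp)} {name}")
```

Unassigned code points are skipped rather than reported, which mirrors the empty cells in the chart.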
Subheadings
The Optical Character Recognition block has three informal subheadings (groupings) within its character collection: OCR-A, MICR, and OCR.[1]
OCR-A
Further information: OCR-A
The OCR-A subheading contains six characters taken from the OCR-A font described in the ISO 1073-1:1976 standard: U+2440 ⑀ OCR HOOK, U+2441 ⑁ OCR CHAIR, U+2442 ⑂ OCR FORK, U+2443 ⑃ OCR INVERTED FORK, U+2444 ⑄ OCR BELT BUCKLE, and U+2445 ⑅ OCR BOW TIE. The OCR bow tie is given the informative alias "unique asterisk".
The hook, chair and fork, together with a long vertical bar, are included in the most basic "numeric" implementation level of OCR-A, which includes digits but excludes letters and conventional punctuation.[2] The most basic implementation level of OCR-B, by contrast, includes the digits, plus sign, less-than sign, greater-than sign, long vertical bar and seven of the capital letters;[3] as such, there are no characters specific to OCR-B in the Optical Character Recognition block.
MICR
Further information: Magnetic ink character recognition
The MICR subheading contains four punctuation characters for bank cheque identifiers, taken from the magnetic ink character recognition E-13B font (codified in the ISO 1004:1995 standard): U+2446 ⑆ OCR BRANCH BANK IDENTIFICATION, U+2447 ⑇ OCR AMOUNT OF CHECK, U+2448 ⑈ OCR DASH, and U+2449 ⑉ OCR CUSTOMER ACCOUNT NUMBER.
The last two of these characters are misnamed: their names were inadvertently switched when they were named in the 1993 (first) edition of ISO/IEC 10646,[4] a mistake which had been present since Unicode 1.0.0.[5] Although their formal names remain unchanged due to the Unicode stability policy, both have corrected normative aliases: U+2448 ⑈ is MICR ON US SYMBOL, and U+2449 ⑉ is MICR DASH SYMBOL[6] (the standard notes that "the Unicode character names include several misnomers").
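The difference between the formal names and the corrected aliases can be observed directly. The sketch below assumes Python 3.3 or later, where `unicodedata.lookup` is documented to accept name aliases in addition to formal names; the variable names are illustrative.

```python
import unicodedata

on_us = "\u2448"  # glyph is the MICR "on us" symbol
dash = "\u2449"   # glyph is the MICR dash symbol

# name() reports the immutable formal names, which are the swapped ones.
formal_names = (unicodedata.name(on_us), unicodedata.name(dash))

# lookup() resolves the corrected aliases back to the same code points.
resolved = (
    unicodedata.lookup("MICR ON US SYMBOL"),
    unicodedata.lookup("MICR DASH SYMBOL"),
)

print(formal_names)
print([f"U+{ord(ch):04X}" for ch in resolved])
```

Because the formal names cannot change, code that needs a meaningful label for these two characters may prefer to key on the code point or the alias rather than on the output of `name()`.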
These symbols had previously been encoded in ISO-IR-98, the character set defined by ISO 2033:1983, in which they were simply named SYMBOL ONE through SYMBOL FOUR.[7] All four characters have informative aliases in the Unicode charts: "transit", "amount", "on us", and "dash" respectively.
OCR
Further information: JIS X 9008
The OCR subheading consists of a single character: U+244A ⑊ OCR DOUBLE BACKSLASH.
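The three groupings described above can be expressed as a small lookup. This is a minimal Python sketch that only restates the ranges given in this section; the table and function names are hypothetical, and the subheadings are informal chart groupings rather than a character property exposed by any standard API.

```python
# Inclusive code point ranges for the informal subheadings described above.
SUBHEADINGS = [
    (0x2440, 0x2445, "OCR-A"),  # hook through bow tie
    (0x2446, 0x2449, "MICR"),   # the four E-13B symbols
    (0x244A, 0x244A, "OCR"),    # double backslash
]

def subheading(ch):
    """Return the subheading label for a character, or None if it has none."""
    cp = ord(ch)
    for first, last, label in SUBHEADINGS:
        if first <= cp <= last:
            return label
    return None

print(subheading("\u2445"))  # OCR BOW TIE
print(subheading("\u2447"))  # OCR AMOUNT OF CHECK
```

Characters outside U+2440..U+244A, including the unassigned remainder of the block, return `None`.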
History
The following Unicode-related documents record the purpose and process of defining specific characters in the Optical Character Recognition block:
Version | Final code points[8] | Count | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|
1.0.0 | U+2440..244A | 11 | | | (to be determined) |
 | | | L2/10-416R | | Moore, Lisa (2010-11-09), "Consensus 125-C39", UTC #125 / L2 #222 Minutes, Create two formal aliases, U+2448 MICR ON US SYMBOL and U+2449 MICR DASH SYMBOL for Unicode 6.1. |
 | | | | N4103 | "T.3. Optical Character Recognition", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03 |
 | | | L2/22-065 | | Whistler, Ken (2022-04-13), "Opt Subject: Unicode 14.0 "Optical Character Recognition" code chart [Affects U+2447]", Editorial Committee Report and Recommendations for UTC #171 Meeting |
References
1. "Unicode Code Charts: Optical Character Recognition" (PDF). The Unicode Standard, Version 6.3. Retrieved 27 February 2014. https://www.unicode.org/charts/PDF/U2440.pdf
2. European Computer Manufacturers Association (1977). "Nominal Character Dimensions of the Numeric OCR-A Font" (PDF) (2nd ed.). ECMA-8.
3. ISO/IEC JTC1/SC2/WG3 (1998-09-28). "9.1: Subset 1: Minimal alphanumeric subset" (PDF). Proposal for Type 3 Technical Report, TR 15907, Information technology—Revision of OCR-B standard (ISO 1073-2:1976). p. 8. ISO/IEC JTC1/SC2/WG3 N470.
4. ISO/IEC JTC 1/SC 2/WG 2 (2012-01-03). "T.3. Optical Character Recognition". Unconfirmed minutes of WG 2 meeting 58 (PDF). p. 29. SC2 N4188 / WG2 N4103. "These Magnetic Ink Character Recognition (MICR) symbols are used by banks on checks. The names of these characters were inadvertently mixed up in the 1993 edition of ISO/IEC 10646."
5. "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. version 1.0. Unicode Consortium. https://www.unicode.org/versions/Unicode1.0.0/CodeCharts2.pdf
6. Freytag, Asmus; McGowan, Rick; Whistler, Ken (2017-04-10). Known Anomalies in Unicode Character Names (4 ed.). Unicode Consortium. Unicode Technical Note #27. https://www.unicode.org/notes/tn27/tn27-4.html
7. ISO/TC97/SC2 (1985-08-01). ISO-IR-98: E13B Graphic Character Set (PDF). ITSCJ/IPSJ.
8. Proposed code points and character names may differ from final code points and names.