Windows-1252 - Reference.org

MIME / IANA: windows-1252

Language(s): All languages supported by ISO/IEC 8859-1 (e.g. English, Irish, Italian, Norwegian, Portuguese, Spanish, Swedish), plus German, Finnish, Icelandic, and French; Dutch except the Ĳ character; Slovenian except the č character.

Created by: Microsoft

Standard: WHATWG Encoding Standard

Classification: extended ASCII, Windows-125x

Extends: ISO 8859-1 (excluding C1 controls)

Transforms / Encodes: ISO 8859-15

Windows-1252

Code page used for the Latin alphabets of Western European languages

Windows-1252, also known as Windows code page 1252, is a legacy single-byte character encoding used by default in Microsoft Windows across the Americas, Western Europe, Oceania, and parts of Africa. Originally matching ISO 8859-1, it diverged in Windows 2.0 by adding printable characters in the 0x80–0x9F range, including curly quotation marks and characters from ISO 8859-15. Despite the widespread adoption of UTF-8, as of 2024 about 1.4% of websites still declare ISO 8859-1 or Windows-1252, with higher usage in countries such as Brazil (2.9%) and Germany (2.5%).

Name

It is known to Windows by the code page number 1252, and by the IANA-approved name "windows-1252".

Historically, the phrase "ANSI Code Page" was used in Windows to refer to non-DOS encodings; the intention was that most of these would be ANSI standards such as ISO-8859-1. Even though Windows-1252 was the first and by far most popular code page named so in Microsoft Windows parlance, the code page has never been an ANSI standard. Microsoft explains, "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community."¹⁰

LaTeX can input Windows-1252 by using inputenc.sty with the parameter ansinew (and, more recently, cp1252).¹¹ ¹²

IBM uses code page 1252 (CCSID 1252 and euro sign extended CCSID 5348) for Windows-1252.¹³ ¹⁴ ¹⁵

It is called "WE8MSWIN1252" by Oracle Database.¹⁶
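As an illustration of how these names are treated outside Windows, the sketch below uses Python's standard codecs module, which the article does not mention and which is used here only as an example. It looks up three labels for the encoding and records the codec name Python resolves each one to. The variable names are hypothetical.

```python
import codecs

# Labels under which the encoding is commonly known: the IANA name,
# the "cp" form, and the bare code page number.
labels = ("windows-1252", "cp1252", "1252")

# Map each label to the canonical codec name Python resolves it to.
resolved = {label: codecs.lookup(label).name for label in labels}

for label, name in resolved.items():
    print(f"{label} -> {name}")
```

In CPython all three labels resolve to the same codec, so text tagged with any of them decodes identically.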

History

The first version of the codepage was used in Microsoft Windows 1.0. It matched the ISO-8859-1 standard (including leaving code points 0xD7 and 0xF7 undefined, as they were not in the standard at that time).
The second version of the codepage was introduced in Microsoft Windows 2.0. This version defined code points 0xD7, 0xF7, 0x91, and 0x92.
The third version of the codepage was introduced in Microsoft Windows 3.1. It defined all code points used in the final version except the euro sign and the Z with caron character pair.
The final version (shown below) was introduced in Microsoft Windows 98.

Starting in the 1990s, many Microsoft products that could produce HTML included Windows-1252-exclusive characters, but marked the encoding as ISO-8859-1, ASCII, or undeclared. Characters exclusive to Windows-1252 would render incorrectly on non-Windows operating systems (often as question marks).¹⁷ ¹⁸ In particular, typographers' quotes—curly variants of the standard straight apostrophes and quotation marks in US-ASCII—were commonly used in files produced in Windows applications such as Microsoft Word due to the smart quotes feature, which can automatically convert straight apostrophes and quotation marks to the curly variants.¹⁹ To fix this, by 2000 most web browsers and e-mail clients treated the charsets ISO-8859-1 and US-ASCII as Windows-1252—this behavior is now required by the HTML5 specification.²⁰ Undeclared charsets in HTML are also assumed to be Windows-1252.²¹ ²²
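The mislabeling problem described above can be reproduced in a few lines. The following is a minimal sketch in Python (chosen for illustration; the byte string is a made-up example) that decodes the same bytes once as ISO-8859-1 and once as Windows-1252. Under ISO-8859-1 the bytes 0x93 and 0x94 become invisible C1 control characters, while under Windows-1252 they become curly quotation marks.

```python
# Bytes as a Windows application might write them: curly double quotes
# stored at 0x93 and 0x94, which only Windows-1252 assigns to printable
# characters.
raw = b"\x93smart quotes\x94"

# Decoded strictly as ISO-8859-1, the two bytes become the C1 control
# characters U+0093 and U+0094.
as_latin1 = raw.decode("iso-8859-1")

# Decoded as Windows-1252, they become U+201C and U+201D.
as_cp1252 = raw.decode("windows-1252")

print(repr(as_latin1))
print(repr(as_cp1252))
```

A decoder that treats the ISO-8859-1 label as Windows-1252, as browsers do, therefore recovers the quotation marks the author intended.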

Although Windows NT supported Unicode and attempted to encourage programs to use it, it provided only the 16-bit code units of UCS-2/UTF-16, despite the existing support for other multibyte character encodings such as Shift-JIS. As many applications preferred to use 8-bit strings, Windows-1252 remained the most popular encoding on Windows. UTF-8 has been supported since Windows 10, so this is gradually changing.

Codepage layout

The following table lists the Windows-1252 code points that differ from ISO-8859-1, each with its Unicode code point, based on the Unicode.org mapping of Windows-1252 with "best fit". Positions outside 0x80–0x9F match ISO-8859-1, including the no-break space (NBSP) at 0xA0 and the soft hyphen (SHY) at 0xAD.

Windows-1252 (CP1252)²³ ²⁴ ²⁵ ²⁶ ²⁷

Byte   Character   Unicode
0x80   €           U+20AC
0x82   ‚           U+201A
0x83   ƒ           U+0192
0x84   „           U+201E
0x85   …           U+2026
0x86   †           U+2020
0x87   ‡           U+2021
0x88   ˆ           U+02C6
0x89   ‰           U+2030
0x8A   Š           U+0160
0x8B   ‹           U+2039
0x8C   Œ           U+0152
0x8E   Ž           U+017D
0x91   ‘           U+2018
0x92   ’           U+2019
0x93   “           U+201C
0x94   ”           U+201D
0x95   •           U+2022
0x96   –           U+2013
0x97   —           U+2014
0x98   ˜           U+02DC
0x99   ™           U+2122
0x9A   š           U+0161
0x9B   ›           U+203A
0x9C   œ           U+0153
0x9E   ž           U+017E
0x9F   Ÿ           U+0178

According to the information on Microsoft's and the Unicode Consortium's websites, positions 0x81, 0x8D, 0x8F, 0x90, and 0x9D are unused; however, the Windows API function MultiByteToWideChar maps these to the corresponding C1 control codes. The "best fit" mapping documents this behavior, too.²⁸
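The difference between the two treatments of the unused positions can be sketched in Python, whose built-in cp1252 codec leaves those five bytes undefined and rejects them. The helper decode_cp1252_with_c1 below is a hypothetical function written for this illustration, not a Windows or Python API; it mimics the MultiByteToWideChar behavior described above by passing the five bytes through as C1 control codes.

```python
# The five byte values that the published Windows-1252 charts leave unused.
UNUSED = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def decode_cp1252_with_c1(data: bytes) -> str:
    """Decode Windows-1252, mapping unused bytes to C1 controls.

    Hypothetical helper: each unused byte N becomes the control
    character U+00NN; every other byte uses the standard mapping.
    """
    chars = []
    for byte in data:
        if byte in UNUSED:
            chars.append(chr(byte))
        else:
            chars.append(bytes([byte]).decode("cp1252"))
    return "".join(chars)


sample = b"\x80 \x81"  # euro sign, space, unused position 0x81

# Python's strict codec refuses the unused byte.
try:
    sample.decode("cp1252")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

lenient = decode_cp1252_with_c1(sample)
print(strict_ok, repr(lenient))
```

Whether to reject or pass through these bytes is a design choice: strict decoding surfaces corrupt input early, while the pass-through mapping keeps the decoder total over all 256 byte values.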

Notes

External links

Microsoft's code charts for Windows-1252 ("Code Page 1252 Windows Latin 1 (ANSI)")
Unicode mapping table and code page definition with best fit mappings for Windows-1252

References

"Encoding. Living Standard". WHATWG. 13 June 2024. § 9. Legacy single-byte encodings. Retrieved 2024-06-28. https://encoding.spec.whatwg.org/ ↩
Karl-Bridge-Microsoft (2021-10-26). "Code Pages - Win32 apps". learn.microsoft.com. Retrieved 2024-10-09. https://learn.microsoft.com/en-us/windows/win32/intl/code-pages ↩
"Historical trends in the usage statistics of character encodings for websites, December 2024". w3techs.com. Retrieved 2024-12-16. https://w3techs.com/technologies/history_overview/character_encoding ↩
"Encoding". WHATWG. 27 January 2015. sec. 5.2 Names and labels. Archived from the original on 4 February 2015. Retrieved 4 February 2015. https://encoding.spec.whatwg.org/#names-and-labels ↩
"Historical trends in the usage statistics of character encodings for websites, December 2024". w3techs.com. Retrieved 2024-12-16. https://w3techs.com/technologies/history_overview/character_encoding ↩
"Frequenty Asked Questions". w3techs.com. https://w3techs.com/faq ↩
"Distribution of Character Encodings among websites that use Brazil". W3Techs. Retrieved 2024-12-16. https://w3techs.com/technologies/segmentation/sl-br-/character_encoding ↩
"Distribution of Character Encodings among websites that use .de". W3Techs. Retrieved 2024-12-16. https://w3techs.com/technologies/segmentation/tld-de-/character_encoding ↩
"Distribution of Character Encodings among websites that use German". W3Techs. Archived from the original on 4 April 2024. Retrieved 2024-12-16. https://w3techs.com/technologies/segmentation/cl-de-/character_encoding ↩
Wissink, Cathy (5 April 2002). "Unicode and Windows XP" (PDF). Microsoft. p. 1. Archived from the original (PDF) on 4 February 2015. Retrieved 4 February 2015. https://web.archive.org/web/20150204175931/http://download.microsoft.com/download/5/6/8/56803da0-e4a0-4796-a62c-ca920b73bb17/21-Unicode_WinXP.pdf ↩
"LaTeX News, Issue 28" (PDF; 379 KB). The LaTeX Project. Apr 2018. Retrieved 2024-07-27. https://www.latex-project.org/news/latex2e-news/ltnews28.pdf ↩
"Inputenc – Accept different input encodings". The LaTeX Project. 2024-02-08. Retrieved 2024-07-27. https://ctan.org/pkg/inputenc ↩
"Code page 1252 information document". IBM. 30 September 1997. Archived from the original on 2016-03-03. https://web.archive.org/web/20160303215813/http://www-01.ibm.com/software/globalization/cp/cp01252.html ↩
"CCSID 1252 information document". IBM. Archived from the original on 2016-03-26. https://web.archive.org/web/20160326201651/http://www-01.ibm.com/software/globalization/ccsid/ccsid1252.html ↩
"CCSID 5348 information document". IBM. Archived from the original on 2014-11-29. https://web.archive.org/web/20141129215139/http://www-01.ibm.com/software/globalization/ccsid/ccsid5348.html ↩
"Database Client Installation Guide". Oracle. Retrieved 2021-02-14. https://docs.oracle.com/cd/B19306_01/install.102/b14312/gblsupp.htm ↩
Texin, Tex. "Comparing Characters in Windows-1252, ISO-8859-1, ISO-8859-15". I18nQA.com. https://www.i18nqa.com/debug/table-iso8859-1-vs-windows-1252.html ↩
van Emden, Eva (28 January 2011). "How to make typographers' quotes in HTML". vancouvereditor.com. Retrieved 7 January 2024. If you use typographers' quotes without specifying the right character encoding for your HTML file, some of your viewers are going to see question marks, boxes, or other crazy symbols instead of the beautiful curly quotes you intended them to see. https://blog.vancouvereditor.com/2011/01/how-to-make-typographers-quotes-in-html.html ↩
"Smart quotes in Word". Microsoft Support. Microsoft. Retrieved 7 January 2024. https://support.microsoft.com/en-us/office/smart-quotes-in-word-702fc92e-b723-4e3d-b2cc-71dedaf2f343 ↩
"Encoding". WHATWG. 27 January 2015. sec. 5.2 Names and labels. Archived from the original on 4 February 2015. Retrieved 4 February 2015. https://encoding.spec.whatwg.org/#names-and-labels ↩
"NetWare Web Search: Understanding Character Set Encodings". Novell Documentation. Novell. if a document does not contain a CHARSET encoding value, the default encoding for HTML documents is ISO-8859-1, also known as Latin1. The default encoding for plain text documents is US-ASCII. https://www.novell.com/documentation/webserv/?page=/documentation/webserv/nsrchenu/data/a30k3eo.html ↩
Observed behavior in Chrome; this may be UTF-8 in some browsers. ↩
"Unicode mappings of Windows-1252 with 'Best Fit'". Unicode. Archived from the original on 4 February 2015. Retrieved 4 February 2015. https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1252.txt ↩
Code Page 01252 (PDF), IBM, 1998, archived (PDF) from the original on 27 October 2023 https://public.dhe.ibm.com/software/globalization/gcoc/attachments/CP01252.pdf ↩
Code Page (CPGID) 01252 (txt), IBM, 1998, archived from the original on 8 April 2023 https://public.dhe.ibm.com/software/globalization/gcoc/attachments/CP01252.txt ↩
International Components for Unicode (ICU), ibm-1252_P100-2000.ucm, 2002-12-03 https://github.com/unicode-org/icu/blob/master/icu4c/source/data/mappings/ibm-1252_P100-2000.ucm ↩
International Components for Unicode (ICU), ibm-5348_P100-1997.ucm, 2002-12-03 https://github.com/unicode-org/icu/blob/master/icu4c/source/data/mappings/ibm-5348_P100-1997.ucm ↩
"Unicode mappings of Windows-1252 with 'Best Fit'". Unicode. Archived from the original on 4 February 2015. Retrieved 4 February 2015. https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1252.txt ↩