The header of an MHTML file contains metadata such as a timestamp, the page title, the source URL, and a unique, randomized boundary string that separates the resources contained in the file. The boundary string is defined at the beginning and used throughout the file.
The page resources then follow sequentially, starting with the page's rendered HTML source code. Each resource has its own metadata header, which specifies its MIME type and original location.
The MHTML file ends with a boundary string that is not followed by any data.[2]
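The layout described above can be sketched with Python's standard `email` package, which produces the same kind of multipart container. This is a minimal illustration, not a reproduction of any browser's output: the function names `build_mhtml` and `list_parts`, the boundary prefix, the `example.com` URLs, and the choice of top-level headers are assumptions made for the example.

```python
import secrets
from email import message_from_bytes
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formatdate


def build_mhtml(page_url, title, html, images):
    """Assemble a minimal MHTML-like archive (hypothetical helper).

    `images` maps an original URL to a (subtype, raw bytes) pair.
    """
    # Randomized boundary: declared once in the top-level header and
    # reused between every part of the file.
    boundary = "----ExampleBoundary" + secrets.token_hex(12)
    root = MIMEMultipart("related", boundary=boundary, type="text/html")
    root["Date"] = formatdate()
    root["Subject"] = title
    root["Content-Location"] = page_url  # header choice is an assumption

    # The page's HTML comes first, followed by its resources.
    page = MIMEText(html, "html", "utf-8")
    page["Content-Location"] = page_url
    root.attach(page)

    for url, (subtype, data) in images.items():
        part = MIMEImage(data, _subtype=subtype)
        part["Content-Location"] = url  # original location of the resource
        root.attach(part)

    return root.as_bytes()


def list_parts(raw):
    """Return (MIME type, original location) for each embedded resource."""
    msg = message_from_bytes(raw)
    return [
        (part.get_content_type(), part["Content-Location"])
        for part in msg.walk()
        if not part.is_multipart()
    ]


archive = build_mhtml(
    "https://example.com/",
    "Example page",
    '<html><body><img src="https://example.com/logo.png"></body></html>',
    {"https://example.com/logo.png": ("png", b"\x89PNG\r\n\x1a\n")},
)
for mime_type, location in list_parts(archive):
    print(mime_type, location)
```

Reading the archive back with `list_parts` shows the per-resource headers: each part reports its own MIME type and the URL it was originally loaded from, in the order the parts were written.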
The MIME type for MHTML is not well agreed upon; several different MIME types are in use.
Some browsers support the MHTML format, either directly or through third-party extensions, but the process for saving a web page along with its resources as an MHTML file is not standardized. As a result, a web page saved as an MHTML file in one browser may render differently in another.
In May 2015, a researcher noted that attackers could build malicious documents by creating an MHT file, appending an MSO object to the end (MSO is a file format used by the Microsoft Outlook e-mail application), and renaming the resulting file with a .doc extension.[7] Such documents would be delivered through spam emails.[8]
In April 2019, a security researcher published details of an XML external entity (XXE) vulnerability that could be exploited when a user opens an MHT file. Because Windows by default opens all MHT files in Internet Explorer, the exploit could be triggered when a user double-clicked a file received via email, instant messaging, or another vector, including a different browser.[9]
The data URI scheme offers an alternative for including separate elements such as images, style sheets, and scripts inline when serving an HTML request or saving an HTML resource for offline use. Like the embedded content within MHTML, data URIs can use Base64 encoding of the external resources (which may be binary or text) to embed them inline within the HTML markup. HTML pages saved with external elements embedded using the data URI scheme are standard web pages and can be opened by any modern browser, including browsers that do not support MHTML, such as Mozilla Firefox.[10] Unlike saving as MHTML, saving web pages with their external resources embedded as data URIs requires a third-party browser extension.[11]
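As a rough illustration of the embedding described above, the following Python sketch Base64-encodes a resource into a data URI and substitutes it for the external reference in the markup. The helper names `to_data_uri` and `inline_resource` and the sample file name are hypothetical, and the plain string replacement is a simplification that assumes the `src` attribute appears exactly as given.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a Base64 data URI of the given MIME type."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_resource(html: str, src: str, data: bytes, mime_type: str) -> str:
    """Replace an external src reference with an inline data URI.

    Simplified: assumes the attribute is written exactly as src="...".
    """
    return html.replace(f'src="{src}"', f'src="{to_data_uri(data, mime_type)}"')


page = '<img src="logo.png">'
inlined = inline_resource(page, "logo.png", b"\x89PNG\r\n\x1a\n", "image/png")
print(inlined)
print(to_data_uri(b"hello", "text/plain"))  # data:text/plain;base64,aGVsbG8=
```

The result is still an ordinary HTML document: the resource travels inside the attribute value rather than in a separate MIME part, which is why no MHTML support is needed to open it.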
The Mozilla Archive Format (MAFF) is a legacy web archive file format that was supported by Firefox from 2004 to 2018 through an add-on.[12] Unlike both MHTML and data URIs, MAFF uses a ZIP container to preserve both the HTML file and its external elements. In October 2017, the add-on developer announced that the format would no longer be supported in future versions of Firefox.[13]
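The ZIP-container approach can be sketched in Python as follows. This only illustrates the general idea of storing a page and its resources as separate entries in one ZIP file; it does not reproduce the actual MAFF directory layout or metadata, and the `page/` folder, the file names, and the function name `build_zip_archive` are assumptions made for the example.

```python
import io
import zipfile


def build_zip_archive(html: str, resources: dict) -> bytes:
    """Store a page and its resources as separate entries in one ZIP file.

    The "page/" folder and entry names are illustrative, not the MAFF layout.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("page/index.html", html)
        for name, data in resources.items():
            # Each resource stays a separate binary file; no Base64 step.
            archive.writestr(f"page/{name}", data)
    return buffer.getvalue()


zipped = build_zip_archive(
    '<html><body><img src="logo.png"></body></html>',
    {"logo.png": b"\x89PNG\r\n\x1a\n"},
)
with zipfile.ZipFile(io.BytesIO(zipped)) as archive:
    print(archive.namelist())
```

In contrast to MHTML parts and data URIs, the resources here are stored as ordinary files inside the container, so the HTML can keep referring to them by relative path.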
1. Holden, Amanda. "Difference of HTML & MHTML". Archived from the original on 17 November 2017. Retrieved 17 November 2017. https://web.archive.org/web/20171117122700/https://www.techwalla.com/articles/difference-of-html-mhtml
2. "2. The MHTML File Format - Hunchly Knowledge Base". support.hunch.ly. 17 October 2018. Retrieved 24 September 2022. https://support.hunch.ly/article/51-1-the-mhtml-file-format
3. Santambrogio, Claudio (10 March 2006). "…and one more weekly!". Opera Software. Archived from the original on 15 January 2010. Retrieved 15 May 2009. https://web.archive.org/web/20100115001636/http://my.opera.com/desktopteam/blog/show.dml/172375
4. von Tetzchner, Jon (6 February 2019). "Vivaldi Update | Auto-Stacking Tabs". Vivaldi (in French). Retrieved 16 May 2019. https://vivaldi.com/fr/blog/auto-stacking-tabs/
5. "Bug 40873 - Save as rfc 2557 MHTML; complete webpage in one file". https://bugzilla.mozilla.org/show_bug.cgi?id=40873
6. "NEWS · master · GNOME / Epiphany". 28 July 2023. https://gitlab.gnome.org/GNOME/epiphany/blob/master/NEWS#L1061
7. Kovacs, Eduard (11 May 2015). "Attackers Hide Malicious Macros in MHTML Documents". SecurityWeek.Com. Retrieved 19 April 2019. https://www.securityweek.com/attackers-hide-malicious-macros-mhtml-documents
8. Mosuela, Lordian (10 July 2015). "New Tricks of Macro Malware". Cyren. Retrieved 19 April 2019. https://www.cyren.com/blog/articles/new-tricks-of-macro-malware
9. Cimpanu, Catalin (12 April 2019). "Internet Explorer zero-day lets hackers steal files from Windows PCs". ZDNet. Retrieved 19 April 2019. https://www.zdnet.com/article/internet-explorer-zero-day-lets-hackers-steal-files-from-windows-pcs/
10. "Data URLs - HTTP". MDN. Retrieved 2 April 2023. https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs#browser_compatibility
11. Brinkmann, Martin (3 September 2018). "Save any webpage as a single file in Chrome or Firefox - gHacks Tech News". ghacks.net. Retrieved 2 April 2023. https://www.ghacks.net/2018/09/03/save-any-webpage-as-a-single-file-in-chrome-or-firefox/
12. "Mozilla Archive Format Add-on - File Format Overview". amadzone. Retrieved 2 April 2023. https://www.amadzone.org/mozilla-archive-format/
13. "Firefox Addon: MAF - Mozilla Archive Format". Archived from the original on 2 November 2017. Retrieved 2 April 2023. https://web.archive.org/web/20171102005204/https://addons.mozilla.org/en-US/firefox/addon/mozilla-archive-format/