AMX was introduced by Intel in June 2020 and first shipped with the Sapphire Rapids microarchitecture for Xeon servers, released in January 2023.[2][3] It introduces two-dimensional registers called tiles, on which accelerators can perform operations. It is intended as an extensible architecture; the first accelerator implemented is the tile matrix multiply unit (TMUL).[4][5]
Revision 46 of Intel Architecture Instruction Set Extensions and Future Features, published in September 2022, documented a new AMX-FP16 extension, which adds support for half-precision floating-point numbers. Revision 48, from March 2023, documented AMX-COMPLEX, which adds support for half-precision floating-point complex numbers. Both extensions are available in the Granite Rapids set of server processors, with AMX-COMPLEX available only in Granite Rapids-D.[6]
The TMUL unit supports BF16 and INT8 input types.[7] AMX-FP16 and AMX-COMPLEX add support for real and complex FP16 numbers, respectively. The register file consists of 8 tiles, each with 16 rows of 64 bytes (32 BF16/FP16 or 64 INT8 elements per row). The only supported operation is matrix multiplication with accumulation, $C_{nm} \mathrel{+}= \sum_{j=1}^{J} A_{nj} B_{jm}$.[8]
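The multiply-accumulate operation and the tile dimensions above can be illustrated with a small software model. This is a minimal sketch for the INT8 case, not Intel's implementation: the function name `tile_matmul_accumulate` and the constants are hypothetical, the matrices are treated as plain logical arrays, and neither the in-register data layout nor the accumulator width is modelled.

```python
# Illustrative software model of one TMUL multiply-accumulate (INT8 case).
# All names here are hypothetical; the sizes come from the text above.

TILE_ROWS = 16   # rows per tile
ROW_BYTES = 64   # bytes per tile row
NUM_TILES = 8    # tiles in the register file
INT8_BYTES = 1   # size of one INT8 element


def tile_matmul_accumulate(c, a, b):
    """Update c in place so that c[n][m] += sum over j of a[n][j] * b[j][m]."""
    inner = len(b)  # J, the shared dimension of A and B
    for n in range(len(a)):
        for m in range(len(b[0])):
            c[n][m] += sum(a[n][j] * b[j][m] for j in range(inner))
    return c


J = ROW_BYTES // INT8_BYTES                       # 64 INT8 elements per row
a = [[1] * J for _ in range(TILE_ROWS)]           # A is 16 x J
b = [[2] * TILE_ROWS for _ in range(J)]           # B is J x 16
c = [[5] * TILE_ROWS for _ in range(TILE_ROWS)]   # C is 16 x 16, preloaded with 5

tile_matmul_accumulate(c, a, b)

# Total capacity of the register file: 8 tiles x 16 rows x 64 bytes.
register_file_bytes = NUM_TILES * TILE_ROWS * ROW_BYTES
```

Because the operation accumulates rather than overwrites, every element of `c` ends up as its initial value plus the dot product of one row of `a` with one column of `b`.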
A 4th Gen Intel Xeon Scalable processor can perform 2048 INT8 or 1024 BF16 operations per cycle per core:[9][10] the maximal input sizes are $16 \times J$ for A and $J \times 16$ for B, where J is 64 for INT8 and 32 for BF16. The matrix multiplication requires $256J$ multiplications and $256J$ additions, thus performing $512J$ operations in 16 cycles.[11]
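The per-cycle figures follow directly from the operation count. The short calculation below reproduces them; the helper name `tmul_ops_per_cycle` and its parameters are hypothetical and serve only to spell out the arithmetic stated in the text.

```python
def tmul_ops_per_cycle(inner_dim, rows=16, cols=16, cycles=16):
    """Operations per cycle for one (rows x J) by (J x cols) tile multiplication."""
    multiplications = rows * cols * inner_dim   # 256 * J
    additions = rows * cols * inner_dim         # 256 * J
    return (multiplications + additions) // cycles   # 512 * J / 16


int8_ops = tmul_ops_per_cycle(64)   # J = 64 for INT8
bf16_ops = tmul_ops_per_cycle(32)   # J = 32 for BF16
print(int8_ops, bf16_ops)           # prints "2048 1024"
```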
Hemsoth, Nicole (August 19, 2021). "With AMX, Intel Adds AI/ML Sparkle to Sapphire Rapids". The Next Platform. https://www.nextplatform.com/2021/08/19/with-amx-intel-adds-ai-ml-sparkle-to-sapphire-rapids/
heise online (28 June 2020). "Intel AMX: Erste Informationen zur Advanced Matrix Extensions Architecture" (in German). https://www.heise.de/news/Intel-AMX-Erste-Informationen-zur-Advanced-Matrix-Extensions-Architecture-4797415.html
Cutress, Ian. "Intel Xeon Sapphire Rapids: How To Go Monolithic with Tiles". AnandTech. https://www.anandtech.com/show/16921/intel-sapphire-rapids-nextgen-xeon-scalable-gets-a-tiling-upgrade
"Intel® Architecture Instruction Set Extensions and Future Features". https://www.intel.com/content/www/us/en/content-details/790021/intel-architecture-instruction-set-extensions-programming-reference.html
Schor, David (June 29, 2020). "The x86 Advanced Matrix Extension (AMX) Brings Matrix Operations; To Debut with Sapphire Rapids". https://fuse.wikichip.org/news/3600/the-x86-advanced-matrix-extension-amx-brings-matrix-operations-to-debut-with-sapphire-rapids/
Larabel, Michael (July 12, 2023). "Intel Granite Rapids D Support Merged Into GCC 14". Phoronix. https://www.phoronix.com/news/Granite-Rapids-D-GCC-14
"Advanced Matrix Extension (AMX) - x86 - WikiChip". en.wikichip.org. https://en.wikichip.org/wiki/x86/amx
"Accelerate Artificial Intelligence (AI) Workloads with Intel Advanced Matrix Extensions (Intel AMX)" (PDF). Intel. Retrieved 2023-04-13. https://www.intel.com/content/dam/www/central-libraries/us/en/documents/2022-12/accelerate-ai-with-amx-sb.pdf
"Intel® 64 and IA-32 Architectures Optimization Reference Manual Volume 1". Intel. https://www.intel.com/content/www/us/en/content-details/671488/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html
"What's New in LLVM for 4th Gen Intel® Xeon® & Max Series CPUs". Retrieved 21 April 2023. https://www.intel.com/content/www/us/en/developer/articles/technical/whats-new-in-llvm-for-4th-gen-intel-xeon-processor.html
Larabel, Michael (2020-07-02). "Intel AMX Support Begins Landing In LLVM". Phoronix. Retrieved 2020-07-02. https://www.phoronix.com/scan.php?page=news_item&px=Intel-AMX-LLVM-Starts
"[X86-64] Support Intel AMX instructions". GitHub. 2020-07-02. Retrieved 2020-07-02. https://github.com/llvm/llvm-project/commit/aded4f0cc070fcef6763c9a3c2ba764d652b692e
Larabel, Michael (2020-07-02). "Intel AMX Support Lands In The GNU Assembler". Phoronix. Retrieved 2020-07-02. https://www.phoronix.com/scan.php?page=news_item&px=Intel-AMX-Gas
"GCC 11 Release Series — Changes, New Features, and Fixes - GNU Project". Retrieved 21 April 2023. https://gcc.gnu.org/gcc-11/changes.html
"[PATCH] Enable GCC support for AMX". 2020-07-06. Retrieved 2020-07-09. https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549415.html
"Enable GCC support for AMX-TILE, AMX-INT8, AMX-BF16. · gcc-mirror/gcc@5c60984". GitHub. Retrieved 2022-09-05. https://github.com/gcc-mirror/gcc/commit/5c609842d13a4c9c6be1a10f7980a74d27daeb85
"commits with Intel AMX". 2020-07-02. Retrieved 2020-07-02. https://sourceware.org/git/?p=binutils-gdb.git&a=search&st=commit&s=Intel+AMX
"x86: Detect Intel Advanced Matrix Extensions". 2020-07-02. Retrieved 2020-07-02. https://sourceware.org/git/?p=glibc.git;a=commit;h=4fdd4d41a17dda26c854ed935658154a17d4b906
"Linux 5.16 Features Include FUTEX2, Intel AMX, Folios, DG2/Alchemist, More Apple Silicon Support". Phoronix. https://www.phoronix.com/review/linux-516-features
"Accessing Sapphire Rapids AMX instructions on vSphere". Earl C. Ruby III. 2023-08-24. https://earlruby.org/2023/08/accessing-sapphire-rapids-amx-instructions-on-vsphere/