Comment
Author: Admin | 2025-04-27
Web. Dumb.) In contrast, AMD/Xilinx do not presently implement hard block floating point support in their FPGA fabric, instead providing Versal AI Engine (AIE) vector processor arrays.

MX Shared Microexponents, ISCA 2023

Rouhani et al., With Shared Microexponents, A Little Shifting Goes a Long Way. DOI (PDF paywalled). PDF (arxiv.org).

This paper describes Block Data Representations (BDR), a framework for comparing the efficiency and quantization loss of a diverse design space of vector block quantization granularities, with various hierarchies of shared scaling factors. The space includes the 1-level Brainwave block floating point (BFP) formats MSFP*, and novel 2-level shared-microexponent (MX) formats, including MX4, MX6, and MX9.

These MX formats are not the MX Alliance formats! Consider Table 2:

If I understand correctly, MX9 is a block of k1=16 elements, all scaled by 2^d1 (d1 is 8b), subdivided into 8 subblocks of k2=2 elements, each further scaled by 2^d2 (d2 is 1b), with each element stored as sign × 7b magnitude. In all, an MX9 block of 16 elements occupies (8b + 8×1b + 16×8b)/16 = 9b per element (see the little Python check at the end of this comment).

The shared microexponents here contrast with the present MX Alliance spec, in which each element of an MXINT8, MXFP8 (E5M2 or E4M3), MXFP6 (E3M2 or E2M3), or MXFP4 (E2M1) block has a standalone private microexponent (0, 5, 4, 3, 2, 2 bits, respectively).

Or, if you prefer, through the lens of this paper, the MX Alliance MXFP[468] formats are 2-level designs with k1=32 elements (d1 is 8b) and k2=1 element (d2 variously 5, 4, 3, 2, 2 bits, respectively).

I wonder what transpired between this ISCA 2023 paper and the OCP MX Alliance spec and announcement. I imagine intense multiparty discussions, “can we all please get behind a common MX data schema that evaluates reasonably well on our existing diverse CPUs, GPUs, AIEs, NPUs, etc., that we can evolve future silicon towards, so that
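Here is the promised sanity check of the bit-count arithmetic. The helper name and parameterization are mine, not the paper's: d1 is amortized over the k1 elements of the block, each subblock's d2 over its k2 elements, and the sign and magnitude bits are paid per element.

```python
def bits_per_element(k1, d1, k2, d2, sign, mantissa):
    """Amortized storage cost of a 2-level BDR-style block format.

    k1, d1: level-1 block size and shared-exponent width (bits)
    k2, d2: level-2 subblock size and per-subblock exponent width (bits)
    sign, mantissa: per-element sign and magnitude widths (bits)
    """
    return d1 / k1 + d2 / k2 + sign + mantissa

# Paper's MX9, per my reading of Table 2: 16 elements, 8b shared exponent,
# 8 subblocks of 2 elements with a 1b microexponent each, sign + 7b magnitude.
assert bits_per_element(k1=16, d1=8, k2=2, d2=1, sign=1, mantissa=7) == 9.0

# MX Alliance MXFP8 (E4M3) through the same lens: k1=32 with an 8b shared
# scale, k2=1 with a private 4b exponent per element, sign + 3b mantissa.
print(bits_per_element(k1=32, d1=8, k2=1, d2=4, sign=1, mantissa=3))  # 8.25
```

Note the 8.25 for MXFP8: the "8-bit" Alliance elements carry a quarter bit of amortized shared scale on top.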
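And to make "a little shifting" concrete, a minimal quantizer sketch in the same spirit. This is my own simplification, not the paper's algorithm: it assumes round-to-nearest, derives exponents via frexp rather than raw FP exponent fields, and ignores the paper's actual rounding/saturation rules. The 1b microexponent lets a quieter subblock shift its magnitudes up by one position relative to the block's shared exponent.

```python
import numpy as np

def quantize_mx9_ish(block):
    """Quantize 16 floats into MX9-like fields: a shared 8b exponent d1,
    a 1b microexponent d2 per 2-element subblock, sign x 7b magnitudes."""
    assert block.shape == (16,)
    _, e1 = np.frexp(np.abs(block).max())      # level-1 exponent of the block max
    d1 = int(e1)
    subs = []
    for sub in block.reshape(8, 2):            # 8 subblocks of k2=2
        _, e2 = np.frexp(np.abs(sub).max())
        d2 = min(max(d1 - int(e2), 0), 1)      # 1-bit shift: 0 or 1
        q = np.round(sub / 2.0 ** (d1 - d2) * 2**7)
        q = np.clip(q, -(2**7 - 1), 2**7 - 1)  # sign x 7b magnitude
        subs.append((d2, q.astype(int)))
    return d1, subs

def dequantize_mx9_ish(d1, subs):
    return np.concatenate([q / 2**7 * 2.0 ** (d1 - d2) for d2, q in subs])

x = np.random.randn(16).astype(np.float32)
d1, subs = quantize_mx9_ish(x)
err = np.abs(x - dequantize_mx9_ish(d1, subs)).max()
print(err)  # quantization error, small relative to max |x|
```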