VUMMLA -- AArch32

VUMMLA

The widening integer matrix multiply-accumulate instruction multiplies the 2x8 matrix of unsigned 8-bit integer values held in the first source vector by the 8x2 matrix of unsigned 8-bit integer values in the second source vector. The resulting 2x2 32-bit integer matrix product is destructively added to the 32-bit integer matrix accumulator held in the destination vector. This is equivalent to performing an 8-way dot product per destination element.
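As a rough illustration of the arithmetic described above, the following Python sketch multiplies a 2x8 matrix by an 8x2 matrix of unsigned 8-bit values and adds the product to a 2x2 accumulator. The function name `ummla_reference` and the list-of-lists representation are assumptions made for illustration. The sketch also assumes that accumulator elements wrap modulo 2**32, and it does not model how the matrices are packed into vector registers.

```python
def ummla_reference(acc, a, b):
    """Return acc + a @ b for a 2x2 accumulator, 2x8 `a`, and 8x2 `b`.

    Hypothetical helper: inputs are plain lists of Python ints. Elements of
    `a` and `b` must fit in unsigned 8 bits; results wrap modulo 2**32
    (an assumption of this sketch).
    """
    assert len(a) == 2 and all(len(row) == 8 for row in a)
    assert len(b) == 8 and all(len(row) == 2 for row in b)
    assert all(0 <= x <= 0xFF for row in a + b for x in row)
    result = [[0, 0], [0, 0]]
    for i in range(2):
        for j in range(2):
            # Each destination element is an 8-way dot product.
            dot = sum(a[i][k] * b[k][j] for k in range(8))
            result[i][j] = (acc[i][j] + dot) & 0xFFFFFFFF
    return result


a = [[1] * 8, [2] * 8]          # 2x8 first source matrix
b = [[1, 3]] * 8                # 8x2 second source matrix
acc = [[10, 20], [30, 40]]      # 2x2 accumulator
print(ummla_reference(acc, a, b))  # prints [[18, 44], [46, 88]]
```

Each output element combines one row of the first matrix with one column of the second, which is the "8-way dot product per destination element" mentioned in the description.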

From Armv8.2, this is an OPTIONAL instruction. ID_ISAR6.I8MM indicates whether this instruction is supported in the T32 and A32 instruction sets.

It has encodings from the following instruction sets: A32 (A1) and T32 (T1).

A1
(FEAT_AA32I8MM)

Bits   31-24     23  22  21-20  19-16  15-12  11  10   9  8    7  6  5  4  3-0
Value  11111100  0   D   10     Vn     Vd     1   1    0  0    N  1  M  1  Vm
Field            B       op2                      op3     op4     Q     U

Encoding

VUMMLA{<q>}.U8 <Qd>, <Qn>, <Qm>

Decode for this encoding

if !IsFeatureImplemented(FEAT_AA32I8MM) then Undefined(); end;
var op1_unsigned : boolean;
var op2_unsigned : boolean;
case B::U of
    when '00' => op1_unsigned = FALSE; op2_unsigned = FALSE;
    when '01' => op1_unsigned = TRUE; op2_unsigned = TRUE;
    when '10' => op1_unsigned = TRUE; op2_unsigned = FALSE;
    when '11' => Undefined();
end;
if Vd[0] == '1' || Vn[0] == '1' || Vm[0] == '1' then Undefined(); end;
let d : integer = UInt(D::Vd);
let n : integer = UInt(N::Vn);
let m : integer = UInt(M::Vm);
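To make the decode steps more concrete, here is a hedged Python sketch that extracts the fields of a 32-bit A1 instruction word and applies the same checks. The bit positions used (B at bit 23, D at 22, Vn at 19:16, Vd at 15:12, N at 7, M at 5, U at 4, Vm at 3:0) follow one reading of the A1 encoding diagram and should be checked against it. The function name `decode_vummla_a1`, the use of `ValueError` in place of `Undefined()`, and the hand-assembled example word are assumptions for illustration. The feature check and the fixed opcode bits are not modeled.

```python
def decode_vummla_a1(word):
    """Decode an A32 (A1) matrix multiply-accumulate word into register numbers.

    Illustrative only: raises ValueError where the pseudocode calls
    Undefined(). Fixed opcode bits are not verified here.
    """
    def bits(hi, lo):
        # Extract word[hi:lo] as an unsigned integer.
        return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

    b, u = bits(23, 23), bits(4, 4)
    d_bit, n_bit, m_bit = bits(22, 22), bits(7, 7), bits(5, 5)
    vn, vd, vm = bits(19, 16), bits(15, 12), bits(3, 0)

    # Mirrors "case B::U of ..." in the decode pseudocode.
    signedness = {
        (0, 0): (False, False),
        (0, 1): (True, True),    # VUMMLA: both operands unsigned
        (1, 0): (True, False),
    }
    if (b, u) not in signedness:
        raise ValueError("B::U == '11' is undefined")
    op1_unsigned, op2_unsigned = signedness[(b, u)]

    # Bit 0 of each register field must be clear.
    if (vd | vn | vm) & 1:
        raise ValueError("odd Vd, Vn or Vm is undefined")

    return {
        "d": (d_bit << 4) | vd,   # UInt(D::Vd)
        "n": (n_bit << 4) | vn,   # UInt(N::Vn)
        "m": (m_bit << 4) | vm,   # UInt(M::Vm)
        "op1_unsigned": op1_unsigned,
        "op2_unsigned": op2_unsigned,
    }


# Hand-assembled word for VUMMLA.U8 q0, q1, q2 (an assumption of this sketch).
print(decode_vummla_a1(0xFC220C54))
```

With this word the sketch yields d = 0, n = 2 and m = 4, so the operation would use Q(d>>1) = Q0, Q(n>>1) = Q1 and Q(m>>1) = Q2.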

T1
(FEAT_AA32I8MM)

First halfword
Bits   15-8      7  6  5-4  3-0
Value  11111100  0  D  10   Vn
Field            B     op2

Second halfword
Bits   15-12  11  10   9  8    7  6  5  4  3-0
Value  Vd     1   1    0  0    N  1  M  1  Vm
Field             op3     op4     Q     U

Encoding

VUMMLA{<q>}.U8 <Qd>, <Qn>, <Qm>

Decode for this encoding

if InITBlock() then UnpredictableProcedure(); end;
if !IsFeatureImplemented(FEAT_AA32I8MM) then Undefined(); end;
var op1_unsigned : boolean;
var op2_unsigned : boolean;
case B::U of
    when '00' => op1_unsigned = FALSE; op2_unsigned = FALSE;
    when '01' => op1_unsigned = TRUE; op2_unsigned = TRUE;
    when '10' => op1_unsigned = TRUE; op2_unsigned = FALSE;
    when '11' => Undefined();
end;
if Vd[0] == '1' || Vn[0] == '1' || Vm[0] == '1' then Undefined(); end;
let d : integer = UInt(D::Vd);
let n : integer = UInt(N::Vn);
let m : integer = UInt(M::Vm);

Assembler Symbols

<q>

See Standard assembler syntax fields.

<Qd>

Is the 128-bit name of the SIMD&FP third source and destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qn>

Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.

<Qm>

Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.

Operation

CheckAdvSIMDEnabled();
let operand1 : bits(128) = Q(n>>1);
let operand2 : bits(128) = Q(m>>1);
let addend : bits(128) = Q(d>>1);
Q(d>>1) = MatMulAdd(addend, operand1, operand2, op1_unsigned, op2_unsigned);
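The `MatMulAdd` helper is not defined in this section. As a hedged sketch of what the operation might compute on packed registers, the Python below treats each 128-bit value as an integer. It assumes that elements are numbered from the least significant end, that row `i` of the first operand occupies byte elements `8*i` to `8*i+7`, that column `j` of the second operand's 8x2 matrix occupies byte elements `8*j` to `8*j+7`, and that accumulator element `2*i+j` holds entry (i, j) of the 2x2 result. These layout choices and the names `elem`, `to_int` and `mat_mul_add` are assumptions for illustration, not taken from the text above.

```python
def elem(value, index, size):
    """Return element `index` of `size` bits, counting from the low end."""
    return (value >> (index * size)) & ((1 << size) - 1)


def to_int(raw, size, unsigned):
    """Interpret a raw `size`-bit field as unsigned or two's-complement."""
    if unsigned or raw < (1 << (size - 1)):
        return raw
    return raw - (1 << size)


def mat_mul_add(addend, op1, op2, op1_unsigned, op2_unsigned):
    """Sketch of a 2x8 by 8x2 multiply-accumulate on 128-bit integers.

    The element layout is an assumption of this sketch.
    """
    result = 0
    for i in range(2):
        for j in range(2):
            total = elem(addend, 2 * i + j, 32)
            for k in range(8):
                x = to_int(elem(op1, 8 * i + k, 8), 8, op1_unsigned)
                y = to_int(elem(op2, 8 * j + k, 8), 8, op2_unsigned)
                total += x * y
            # Keep only the low 32 bits of each accumulator element.
            result |= (total & 0xFFFFFFFF) << (32 * (2 * i + j))
    return result


# Every byte of op1 is 1; op2 holds 2s in its low 8 bytes and 3s in its high 8.
op1 = int.from_bytes(bytes([1] * 16), "little")
op2 = int.from_bytes(bytes([2] * 8 + [3] * 8), "little")
acc = mat_mul_add(0, op1, op2, True, True)
print([elem(acc, e, 32) for e in range(4)])  # prints [16, 24, 16, 24]
```

For VUMMLA both `op1_unsigned` and `op2_unsigned` are TRUE according to the decode pseudocode, so the signed branch of `to_int` is only exercised by the other B::U cases.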


2026-03_rel 2026-03-26 20:48:11

Copyright © 2010-2026 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.