VDOT (vector) -- AArch32

VDOT (vector)

BFloat16 floating-point (BF16) dot product (vector). This instruction divides each source vector into pairs of adjacent 16-bit BF16 elements. Within each pair, the elements in the first source vector are multiplied by the corresponding elements in the second source vector. The two resulting single-precision products are summed, and that sum is added destructively to the single-precision element of the destination vector that aligns with the pair of BF16 values in the first source vector. The instruction does not update the FPSCR exception status.
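Read lane by lane, the description above can be written as a single update rule. This is only a restatement under assumed notation: the symbols $\otimes$ and $\oplus$ are shorthand introduced here for the BFMulH and FPAdd_BF16 functions named in the Operation pseudocode, $n_k$ and $m_k$ are the 16-bit BF16 elements of the two sources, and $d_i$ is the 32-bit single-precision element of the destination.

```latex
d_i \;\leftarrow\; d_i \oplus \bigl( (n_{2i} \otimes m_{2i}) \oplus (n_{2i+1} \otimes m_{2i+1}) \bigr),
\qquad i = 0, 1 \text{ for each 64-bit register processed}
```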

It has encodings from the following instruction sets: A32 (A1) and T32 (T1).

A1
(FEAT_AA32BF16)

Decode for all variants of this encoding

if !IsFeatureImplemented(FEAT_AA32BF16) then Undefined(); end;
if Q == '1' && (Vd[0] == '1' || Vn[0] == '1' || Vm[0] == '1') then Undefined(); end;
let d : integer = UInt(D::Vd);
let n : integer = UInt(N::Vn);
let m : integer = UInt(M::Vm);
let regs : integer = if Q == '1' then 2 else 1;
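The decode step can be illustrated with a small Python sketch. It is not part of the architecture text: the helper names `field` and `decode_vdot_a1` are hypothetical, the bit positions are taken from the A1 encoding diagram on this page, the FEAT_AA32BF16 check is omitted because it depends on the implementation, and the UNDEFINED case is modelled as a Python exception.

```python
def field(word: int, hi: int, lo: int) -> int:
    """Return bits hi..lo (inclusive) of word as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_vdot_a1(word: int) -> dict:
    """Hypothetical helper: decode a 32-bit A1 VDOT (vector) instruction word."""
    # Constant bits as shown in the A1 encoding diagram.
    fixed_ok = (
        field(word, 31, 25) == 0b1111110
        and field(word, 24, 23) == 0b00
        and field(word, 21, 20) == 0b00
        and field(word, 11, 8) == 0b1101
        and field(word, 4, 4) == 0
    )
    if not fixed_ok:
        raise ValueError("not an A1 VDOT (vector) encoding")

    D, Vn, Vd = field(word, 22, 22), field(word, 19, 16), field(word, 15, 12)
    N, Q, M = field(word, 7, 7), field(word, 6, 6), field(word, 5, 5)
    Vm = field(word, 3, 0)

    # The 128-bit form needs even D-register numbers, so bit 0 of each
    # 4-bit register field must be clear; otherwise the encoding is UNDEFINED.
    if Q == 1 and ((Vd & 1) or (Vn & 1) or (Vm & 1)):
        raise ValueError("UNDEFINED: odd register field with Q == 1")

    return {
        "d": (D << 4) | Vd,           # UInt(D::Vd)
        "n": (N << 4) | Vn,           # UInt(N::Vn)
        "m": (M << 4) | Vm,           # UInt(M::Vm)
        "regs": 2 if Q == 1 else 1,   # number of 64-bit registers processed
    }
```

Under this field layout, the word `0xFC020D44` decodes to `d = 0`, `n = 2`, `m = 4`, `regs = 2`, which corresponds to the 128-bit form with Q0 as destination and Q1 and Q2 as sources.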

T1
(FEAT_AA32BF16)

Decode for all variants of this encoding

if InITBlock() then UnpredictableProcedure(); end;
if !IsFeatureImplemented(FEAT_AA32BF16) then Undefined(); end;
if Q == '1' && (Vd[0] == '1' || Vn[0] == '1' || Vm[0] == '1') then Undefined(); end;
let d : integer = UInt(D::Vd);
let n : integer = UInt(N::Vn);
let m : integer = UInt(M::Vm);
let regs : integer = if Q == '1' then 2 else 1;

Assembler Symbols

<q>	See Standard assembler syntax fields.

<Dd>	Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dn>	Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.

<Dm>	Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.

<Qd>	Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qn>	Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.

<Qm>	Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
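The register-to-field mapping described above can be sketched as follows. The function name `encode_simd_reg` and the lowercase `d17` / `q3` spelling of register names are assumptions made for this illustration; only the rule itself (a D register number is used directly, a Q register number is doubled, and the 5-bit result is split into a high bit and a 4-bit field) comes from the symbol descriptions.

```python
def encode_simd_reg(name: str) -> tuple[int, int]:
    """Hypothetical helper: map a register name such as 'd17' or 'q3'
    to the (high bit, 4-bit field) pair, e.g. (D, Vd) for the destination."""
    kind, index = name[0].lower(), int(name[1:])
    if kind == "d" and 0 <= index <= 31:
        value = index            # <Dd> is encoded directly
    elif kind == "q" and 0 <= index <= 15:
        value = index * 2        # <Qd> is encoded as <Qd>*2
    else:
        raise ValueError(f"unsupported register name: {name!r}")
    # Split the 5-bit value into its top bit and low four bits.
    return value >> 4, value & 0xF
```

For example, `encode_simd_reg("d17")` gives `(1, 1)` and `encode_simd_reg("q3")` gives `(0, 6)`. Because a Q register number is always doubled, the low bit of the 4-bit field is always 0 in the 128-bit form, which matches the UNDEFINED check in the decode pseudocode.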

Operation

CheckAdvSIMDEnabled();
let fpcr : FPCR_Type = StandardFPCR();
var operand1 : bits(64);
var operand2 : bits(64);
var result : bits(64);
for r = 0 to regs-1 do
    operand1 = Din(n+r);
    operand2 = Din(m+r);
    result = Din(d+r);
    for e = 0 to 1 do
        let elt1_a : bits(16) = operand1[(2 * e + 0)*:16];
        let elt1_b : bits(16) = operand1[(2 * e + 1)*:16];
        let elt2_a : bits(16) = operand2[(2 * e + 0)*:16];
        let elt2_b : bits(16) = operand2[(2 * e + 1)*:16];
        let sum : bits(32) = FPAdd_BF16(BFMulH(elt1_a, elt2_a, fpcr), BFMulH(elt1_b, elt2_b, fpcr), fpcr);
        result[e*:32] = FPAdd_BF16(result[e*:32], sum, fpcr);
    end;
    D(d+r) = result;
end;
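The data flow of the inner loop can be modelled with a short Python sketch operating on one 64-bit register at a time. This is an approximation under stated assumptions, not a bit-exact model: all function names are hypothetical, intermediate values are rounded to single precision with Python's `struct` conversion, and the specific rounding, denormal, and NaN behaviour of BFMulH and FPAdd_BF16 is not reproduced.

```python
import struct


def f32_from_bits(w: int) -> float:
    """Interpret the low 32 bits of w as an IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<I", w & 0xFFFFFFFF))[0]


def f32_to_bits(x: float) -> int:
    """Round x to single precision and return its 32-bit pattern."""
    try:
        return struct.unpack("<I", struct.pack("<f", x))[0]
    except OverflowError:
        # Finite value too large for single precision: use infinity.
        return 0xFF800000 if x < 0 else 0x7F800000


def round_f32(x: float) -> float:
    """Round a Python float to the nearest single-precision value."""
    return f32_from_bits(f32_to_bits(x))


def bf16_to_float(h: int) -> float:
    """A BF16 value is the upper 16 bits of a single-precision value."""
    return f32_from_bits((h & 0xFFFF) << 16)


def vdot_bf16_d(acc: int, op1: int, op2: int) -> int:
    """Approximate model of one 64-bit register step (hypothetical helper).

    acc, op1 and op2 are 64-bit integers holding Din(d+r), Din(n+r), Din(m+r).
    """
    result = 0
    for e in range(2):
        # Elements 2e and 2e+1 sit at bit offsets 32e and 32e+16.
        a0 = bf16_to_float(op1 >> (32 * e))
        a1 = bf16_to_float(op1 >> (32 * e + 16))
        b0 = bf16_to_float(op2 >> (32 * e))
        b1 = bf16_to_float(op2 >> (32 * e + 16))
        # Stand-ins for BFMulH and FPAdd_BF16: round after every step.
        pair_sum = round_f32(round_f32(a0 * b0) + round_f32(a1 * b1))
        lane = round_f32(f32_from_bits(acc >> (32 * e)) + pair_sum)
        result |= f32_to_bits(lane) << (32 * e)
    return result
```

For the 128-bit form (`regs == 2`), the same function would be applied once to each of the two consecutive 64-bit registers, mirroring the outer `for r` loop. As a quick check with exactly representable values: sources holding (1.0, 2.0, 3.0, 4.0) and (0.5, 0.5, 2.0, 1.0) with accumulator lanes (10.0, 0.0) give lanes (11.5, 10.0).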

2026-03_rel 2026-03-26 20:48:11

A1 encoding (32-bit instruction word)

Bits	31:25	24:23	22	21:20	19:16	15:12	11	10	9	8	7	6	5	4	3:0
Value	1111110	00	D	00	Vn	Vd	1	1	0	1	N	Q	M	0	Vm
Label		op1		op2				op3		op4				U	

T1 encoding (two 16-bit halfwords)

First halfword
Bits	15:9	8:7	6	5:4	3:0
Value	1111110	00	D	00	Vn
Label		op1		op2	

Second halfword
Bits	15:12	11	10	9	8	7	6	5	4	3:0
Value	Vd	1	1	0	1	N	Q	M	0	Vm
Label			op3		op4				U	
