coopVecOuterProductAccumulate
Description
Atomically accumulates the outer product of two cooperative vectors into a matrix in memory. Given an M-element vector a and an N-element vector b, the function computes their outer product, an M-row by N-column matrix whose element (i, j) is a[i] * b[j]. Each element of that product is then atomically added to the corresponding element at the memory location specified by matrix.
Signature
/// Requires Capability Set 1:
void coopVecOuterProductAccumulate<T, int M, int N>(
    CoopVec<T, M> a,
    CoopVec<T, N> b,
    RWByteAddressBuffer matrix,
    int matrixOffset,
    uint matrixStride,
    CoopVecMatrixLayout memoryLayout,
    CoopVecComponentType matrixInterpretation)
    where T : __BuiltinArithmeticType;

/// Requires Capability Set 2:
void coopVecOuterProductAccumulate<T, int M, int N, IgnoredBufferElementType>(
    CoopVec<T, M> a,
    CoopVec<T, N> b,
    RWStructuredBuffer<IgnoredBufferElementType, DefaultDataLayout> matrix,
    int matrixOffset,
    uint matrixStride,
    CoopVecMatrixLayout memoryLayout,
    CoopVecComponentType matrixInterpretation)
    where T : __BuiltinArithmeticType;

/// Requires Capability Set 2:
void coopVecOuterProductAccumulate<T, int M, int N, U, int IgnoredBufferSize>(
    CoopVec<T, M> a,
    CoopVec<T, N> b,
    U[IgnoredBufferSize] matrix,
    int matrixOffset,
    uint matrixStride,
    CoopVecMatrixLayout memoryLayout,
    CoopVecComponentType matrixInterpretation)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

/// Requires Capability Set 2:
void coopVecOuterProductAccumulate<T, int M, int N>(
    CoopVec<T, M> a,
    CoopVec<T, N> b,
    Ptr<void> matrixPtr,
    uint matrixStride,
    CoopVecMatrixLayout memoryLayout,
    CoopVecComponentType matrixInterpretation)
    where T : __BuiltinArithmeticType;
Generic Parameters
T: __BuiltinArithmeticType
M : int
N : int
IgnoredBufferElementType
U: __BuiltinArithmeticType
IgnoredBufferSize : int
Parameters
a : CoopVec<T, M>
The first cooperative vector.
b : CoopVec<T, N>
The second cooperative vector.
matrix : RWByteAddressBuffer
The matrix buffer to accumulate the result into.
matrixOffset : int
Byte offset into the matrix buffer.
matrixStride : uint
The stride between matrix rows/columns in bytes.
memoryLayout : CoopVecMatrixLayout
Specifies the memory layout of the matrix (row-major or column-major).
matrixInterpretation : CoopVecComponentType
Specifies how to interpret the values in the matrix.
matrix : RWStructuredBuffer<IgnoredBufferElementType, DefaultDataLayout>
The matrix buffer to accumulate the result into.
matrix : U [ IgnoredBufferSize ]
The matrix buffer to accumulate the result into.
matrixPtr : Ptr<void>
Pointer to the matrix memory to accumulate the result into.
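To make the roles of matrixOffset and matrixStride concrete, the following C++ sketch computes the byte offset of element (row, col) for the two linear layouts. It is an illustration under stated assumptions, not part of the Slang API: LinearLayout and elementByteOffset are hypothetical names, elemSize stands for the stored size of one matrix element, and the column-major formula assumes that matrixStride is then the distance between columns, as the matrixStride description suggests. Optimal layouts such as TrainingOptimal are opaque and are not covered.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the two linear layouts; not the Slang enum itself.
enum class LinearLayout { RowMajor, ColumnMajor };

// Byte offset of element (row, col), measured from the start of the buffer.
// matrixOffset and matrixStride are in bytes; elemSize is the stored size of
// one matrix element in bytes (an assumption of this sketch).
inline std::size_t elementByteOffset(std::size_t matrixOffset,
                                     std::size_t matrixStride,
                                     std::size_t elemSize,
                                     LinearLayout layout,
                                     std::size_t row,
                                     std::size_t col)
{
    assert(elemSize > 0);
    if (layout == LinearLayout::RowMajor) {
        // Stride separates consecutive rows; columns are packed within a row.
        return matrixOffset + row * matrixStride + col * elemSize;
    }
    // Stride separates consecutive columns; rows are packed within a column.
    return matrixOffset + col * matrixStride + row * elemSize;
}
```

For instance, with matrixOffset = 16, matrixStride = 64, and 4-byte elements, element (2, 3) sits at byte 16 + 2 * 64 + 3 * 4 in the row-major case.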
Remarks
On current hardware, memoryLayout must be TrainingOptimal.
When memoryLayout is RowMajor, this function is conceptually equivalent to the following pseudocode:
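// Start of the matrix: base address plus the byte offset.
uint8_t* matrixPtr = matrix + matrixOffset;
for (int i = 0; i < M; i++)
{
    for (int j = 0; j < N; j++)
    {
        // Element (i, j) of the outer product.
        let elem = a[i] * b[j];
        // Row i starts i * matrixStride bytes in; column j is j elements further.
        atomicAdd(matrixPtr + i * matrixStride + j * sizeof(T), elem);
    }
}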
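The pseudocode above can be mirrored by a small CPU-side reference model, for example to check shader results in a test. The sketch below is a single-threaded C++ approximation, assuming float elements and a row-major matrix stored in a raw byte buffer. The function name outerProductAccumulateRef is hypothetical, and a plain read-modify-write stands in for the atomic add, so it models the arithmetic only and not the concurrency behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Single-threaded reference model of the RowMajor case with float elements.
// matrix holds raw bytes; matrixOffset and matrixStride are in bytes.
inline void outerProductAccumulateRef(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::vector<std::uint8_t>& matrix,
                                      std::size_t matrixOffset,
                                      std::size_t matrixStride)
{
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < b.size(); ++j) {
            // Same addressing as the pseudocode: row i, then column j.
            const std::size_t at =
                matrixOffset + i * matrixStride + j * sizeof(float);
            assert(at + sizeof(float) <= matrix.size());

            // memcpy avoids alignment and aliasing problems on the byte buffer.
            float current;
            std::memcpy(&current, matrix.data() + at, sizeof(float));
            current += a[i] * b[j];  // plain add; stands in for the atomic add
            std::memcpy(matrix.data() + at, &current, sizeof(float));
        }
    }
}
```

Calling the function twice with the same vectors doubles every accumulated element, which is the property that distinguishes accumulation from a plain store.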
Availability and Requirements
Capability Set 1
Defined for the following targets:
hlsl
Available in all stages.
Requires capability: hlsl_coopvec_poc.
cpp
Available in all stages.
cuda
Available in all stages.
Requires capability: optix_coopvec.
spirv
Available in all stages.
Requires capability: spvCooperativeVectorNV.
Capability Set 2
Defined for the following targets:
spirv
Available in all stages.
Requires capability: spvCooperativeVectorTrainingNV.