• Slang Standard Library Reference

dot4add_u8packed

Description

Treats x and y as packed 4-component vectors of unsigned 8-bit integers (four UInt8 lanes per 32-bit uint) and computes dot(x, y) + acc.
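The computation described above can be sketched on the CPU as a small reference function. This is an illustrative emulation only, not Slang's implementation: the name dot4add_u8packed_ref is hypothetical, lane 0 is assumed to be the least significant byte (the lane order does not change the result as long as x and y use the same one), and the accumulation is assumed to wrap modulo 2^32, as unsigned arithmetic does in C++.

```cpp
#include <cstdint>

// Hypothetical CPU-side reference (not part of Slang): emulates the
// described behavior of dot4add_u8packed.
// Assumption: each 32-bit input packs four unsigned 8-bit lanes, with
// lane 0 in the least significant byte.
// Assumption: the sum wraps modulo 2^32 (standard C++ unsigned behavior).
constexpr std::uint32_t dot4add_u8packed_ref(std::uint32_t x,
                                             std::uint32_t y,
                                             std::uint32_t acc)
{
    std::uint32_t sum = acc;
    for (int lane = 0; lane < 4; ++lane)
    {
        // Extract the 8-bit lane from each packed operand.
        const std::uint32_t xi = (x >> (8 * lane)) & 0xFFu;
        const std::uint32_t yi = (y >> (8 * lane)) & 0xFFu;
        // Each product is at most 255 * 255 = 65025, so it fits in 32 bits.
        sum += xi * yi;
    }
    return sum;
}

// x packs the lanes (1, 2, 3, 4) and y packs (5, 6, 7, 8):
// 1*5 + 2*6 + 3*7 + 4*8 = 70, plus acc = 10 gives 80.
static_assert(dot4add_u8packed_ref(0x04030201u, 0x08070605u, 10u) == 80u,
              "packed dot product plus accumulator");
```

In shader code the equivalent call would be dot4add_u8packed(x, y, acc) with the same packed operands. A reference like this can help when checking GPU results against host-side expectations.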

Signature

uint dot4add_u8packed(
    uint x,
    uint y,
    uint acc);

Parameters

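x : uint
First operand, interpreted as four packed UInt8 components.

y : uint
Second operand, interpreted as four packed UInt8 components.

acc : uint
Accumulator value added to the dot product.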

Availability and Requirements

Defined for the following targets:

hlsl

Available in all stages.

glsl

Available in all stages.

cpp

Available in all stages.

cuda

Available in all stages.

metal

Available in all stages.

wgsl

Available in all stages.

spirv

Available in all stages.

Requires capabilities: SPV_KHR_non_semantic_info, SPV_GOOGLE_user_type, spvDerivativeControl, spvImageQuery, spvImageGatherExtended, spvSparseResidency, spvMinLod, spvFragmentBarycentricKHR, spvFragmentFullyCoveredEXT, spvGroupNonUniformBallot, spvGroupNonUniformShuffle, spvGroupNonUniformArithmetic, spvGroupNonUniformQuad, spvGroupNonUniformVote, spvRayTracingPositionFetchKHR, spvShaderNonUniformEXT.