coopVecMatMulAddPacked

Description

Multiply a cooperative vector with a matrix and add a bias vector.
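
As a rough sketch of the arithmetic, assume the matrix is read as M rows by k columns once memoryLayout and transpose have been applied. Write x for the k values unpacked from input, W for the matrix and b for the bias. These symbols are illustrative only and are not part of the API.

```latex
y_i = b_i + \sum_{j=0}^{k-1} W_{ij}\, x_j, \qquad 0 \le i < M
```

Here y corresponds to the returned CoopVec<T, M>.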

Signature

/// Requires Capability Set 1:
CoopVec<T, M> coopVecMatMulAddPacked<T, M:int, PackedK:int, U>(
    CoopVec<U, PackedK> input,
    CoopVecComponentType inputInterpretation,
    int k,
    RWByteAddressBuffer matrix,
    int matrixOffset,
    CoopVecComponentType matrixInterpretation,
    RWByteAddressBuffer bias,
    int biasOffset,
    CoopVecComponentType biasInterpretation,
    CoopVecMatrixLayout memoryLayout,
    bool transpose,
    uint matrixStride)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

/// Requires Capability Set 1:
CoopVec<T, M> coopVecMatMulAddPacked<T, M:int, PackedK:int, U>(
    CoopVec<U, PackedK> input,
    CoopVecComponentType inputInterpretation,
    int k,
    ByteAddressBuffer matrix,
    int matrixOffset,
    CoopVecComponentType matrixInterpretation,
    ByteAddressBuffer bias,
    int biasOffset,
    CoopVecComponentType biasInterpretation,
    CoopVecMatrixLayout memoryLayout,
    bool transpose,
    uint matrixStride)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

/// Requires Capability Set 2:
CoopVec<T, M> coopVecMatMulAddPacked<T, M:int, PackedK:int, U, IgnoredBufferElementType>(
    CoopVec<U, PackedK> input,
    CoopVecComponentType inputInterpretation,
    int k,
    RWStructuredBuffer<IgnoredBufferElementType, DefaultDataLayout> matrix,
    int matrixOffset,
    CoopVecComponentType matrixInterpretation,
    RWStructuredBuffer<IgnoredBufferElementType, DefaultDataLayout> bias,
    int biasOffset,
    CoopVecComponentType biasInterpretation,
    CoopVecMatrixLayout memoryLayout,
    bool transpose,
    uint matrixStride)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

/// Requires Capability Set 2:
CoopVec<T, M> coopVecMatMulAddPacked<T, M:int, PackedK:int, U, IgnoredBufferElementType>(
    CoopVec<U, PackedK> input,
    CoopVecComponentType inputInterpretation,
    int k,
    StructuredBuffer<IgnoredBufferElementType, DefaultDataLayout> matrix,
    int matrixOffset,
    CoopVecComponentType matrixInterpretation,
    StructuredBuffer<IgnoredBufferElementType, DefaultDataLayout> bias,
    int biasOffset,
    CoopVecComponentType biasInterpretation,
    CoopVecMatrixLayout memoryLayout,
    bool transpose,
    uint matrixStride)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

Generic Parameters

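T : __BuiltinArithmeticType

The element type of the returned cooperative vector.

M : int

The number of elements in the returned cooperative vector.

PackedK : int

The number of elements in the packed input vector.

U : __BuiltinArithmeticType

The element type of the input cooperative vector.

IgnoredBufferElementType

The element type of the structured buffers in the StructuredBuffer and RWStructuredBuffer overloads.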

Parameters

input : CoopVec<U, PackedK>

The input cooperative vector to multiply with the matrix.

inputInterpretation : CoopVecComponentType

Specifies how to interpret the values in the input vector (e.g. as packed values).

k : int

The number of columns in the matrix, which is also the number of unpacked values read from the packed input.

matrix : RWByteAddressBuffer

The matrix buffer to multiply with.

matrixOffset : int

Byte offset into the matrix buffer.

matrixInterpretation : CoopVecComponentType

Specifies how to interpret the values in the matrix.

bias : RWByteAddressBuffer

The bias buffer to add after multiplication.

biasOffset : int

Byte offset into the bias buffer.

biasInterpretation : CoopVecComponentType

Specifies how to interpret the values in the bias vector.

memoryLayout : CoopVecMatrixLayout

Specifies the memory layout of the matrix (for example, row-major or column-major).

transpose : bool

Whether to transpose the matrix before multiplication.

matrixStride : uint

The stride, in bytes, between consecutive rows (row-major layout) or consecutive columns (column-major layout) of the matrix.

The remaining overloads differ only in the buffer type of matrix and bias; both parameters have the same meaning as above.

matrix, bias : ByteAddressBuffer

matrix, bias : RWStructuredBuffer<IgnoredBufferElementType, DefaultDataLayout>

matrix, bias : StructuredBuffer<IgnoredBufferElementType, DefaultDataLayout>

Return value

A new cooperative vector containing the result of the matrix multiplication with added bias.

Remarks

Unlike coopVecMatMulAdd, this function supports packed input interpretations where multiple values can be packed into each element of the input vector. The k parameter specifies the actual number of values to use from the packed input.
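
The relationship between PackedK and k can be illustrated with a minimal sketch of a call to the ByteAddressBuffer overload. It makes several assumptions: the helper name evalInt8Layer and the sizes (16 inputs, 8 outputs) are hypothetical, and the enum member names SignedInt8Packed, SignedInt8, SignedInt32 and RowMajor are assumed rather than taken from this page. The stride of 16 bytes further assumes a tightly packed row-major matrix of 8-bit weights.

```slang
// Hypothetical helper: one int8 layer mapping 16 inputs to 8 outputs.
// Sixteen 8-bit inputs are assumed to be packed four per uint,
// so the input vector has PackedK = 4 elements while k = 16.
CoopVec<int, 8> evalInt8Layer(
    CoopVec<uint, 4> packedInput,
    ByteAddressBuffer weights,
    ByteAddressBuffer biases)
{
    // Generic arguments are <T, M, PackedK, U>.
    return coopVecMatMulAddPacked<int, 8, 4, uint>(
        packedInput,
        CoopVecComponentType.SignedInt8Packed, // assumed enum member name
        16,                                    // k: unpacked input count
        weights,
        0,                                     // matrixOffset in bytes
        CoopVecComponentType.SignedInt8,       // assumed enum member name
        biases,
        0,                                     // biasOffset in bytes
        CoopVecComponentType.SignedInt32,      // assumed enum member name
        CoopVecMatrixLayout.RowMajor,          // assumed enum member name
        false,                                 // transpose
        16);                                   // matrixStride: 16 one-byte columns per row
}
```

With coopVecMatMulAdd, by contrast, no separate k argument appears in the description above, because each input element carries a single value.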

Availability and Requirements

Capability Set 1

Defined for the following targets:

hlsl

Available in all stages.

cpp

Available in all stages.

cuda

Available in all stages.

spirv

Available in all stages.

Requires capability: spvCooperativeVectorNV.

Capability Set 2

Defined for the following targets:

spirv

Available in all stages.

Requires capability: spvCooperativeVectorNV.