coopVecMatMulAddPacked

Description

Multiplies a matrix by a cooperative vector and adds a bias vector to the result. Given an M-row by K-column matrix, a K-element column vector input, and an M-element vector bias, computes matrix*input+bias and returns an M-element vector.
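The arithmetic described above can be sketched as a small host-side reference model. This is not the Slang implementation. It is a minimal Python sketch that assumes, purely for illustration, that each 32-bit input element packs four signed 8-bit values with the lowest byte first, and that the matrix is given as a plain list of rows. The helper names `unpack_s8x4` and `mat_mul_add_reference` are hypothetical and not part of the Slang API.

```python
def unpack_s8x4(word):
    """Split one 32-bit word into four signed 8-bit values, lowest byte first.

    The packing order is an assumption made for this sketch.
    """
    values = []
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        # Reinterpret the unsigned byte as a two's-complement signed value.
        values.append(byte - 256 if byte >= 128 else byte)
    return values


def mat_mul_add_reference(matrix, packed_input, k, bias):
    """Return matrix * input + bias for an M x K matrix given as a list of rows."""
    # Unpack every word, then keep only the first k logical elements.
    unpacked = [v for word in packed_input for v in unpack_s8x4(word)][:k]
    if len(unpacked) != k:
        raise ValueError("packed input holds fewer than k elements")
    if len(matrix) != len(bias):
        raise ValueError("matrix and bias must both have M rows")
    result = []
    for row, b in zip(matrix, bias):
        if len(row) != k:
            raise ValueError("every matrix row needs k columns")
        # One output element: dot product of the row with the input, plus bias.
        result.append(sum(m * x for m, x in zip(row, unpacked)) + b)
    return result


# M = 2 rows, K = 4 columns; the input (1, -2, 3, 4) is packed into one word.
packed = [(1 & 0xFF) | ((-2 & 0xFF) << 8) | ((3 & 0xFF) << 16) | ((4 & 0xFF) << 24)]
matrix = [[1, 0, 0, 0],
          [1, 1, 1, 1]]
bias = [10, 20]
output = mat_mul_add_reference(matrix, packed, 4, bias)
```

In this sketch the number of packed input elements is smaller than K, which is why K is passed separately, mirroring the distinct `PackedK` and `k` parameters in the signatures below.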

Signature

/// Requires Capability Set 1:
CoopVec<T, M> coopVecMatMulAddPacked<T, int M, int PackedK, U>(
    CoopVec<U, PackedK> input,
    CoopVecComponentType inputInterpretation,
    int k,
    RWByteAddressBuffer matrix,
    int matrixOffset,
    CoopVecComponentType matrixInterpretation,
    RWByteAddressBuffer bias,
    int biasOffset,
    CoopVecComponentType biasInterpretation,
    CoopVecMatrixLayout memoryLayout,
    bool transpose,
    uint matrixStride)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

/// Requires Capability Set 1:
CoopVec<T, M> coopVecMatMulAddPacked<T, int M, int PackedK, U>(
    CoopVec<U, PackedK> input,
    CoopVecComponentType inputInterpretation,
    int k,
    ByteAddressBuffer matrix,
    int matrixOffset,
    CoopVecComponentType matrixInterpretation,
    ByteAddressBuffer bias,
    int biasOffset,
    CoopVecComponentType biasInterpretation,
    CoopVecMatrixLayout memoryLayout,
    bool transpose,
    uint matrixStride)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

/// Requires Capability Set 2:
CoopVec<T, M> coopVecMatMulAddPacked<T, int M, int PackedK, U, IgnoredBufferElementType>(
    CoopVec<U, PackedK> input,
    CoopVecComponentType inputInterpretation,
    int k,
    RWStructuredBuffer<IgnoredBufferElementType, DefaultDataLayout> matrix,
    int matrixOffset,
    CoopVecComponentType matrixInterpretation,
    RWStructuredBuffer<IgnoredBufferElementType, DefaultDataLayout> bias,
    int biasOffset,
    CoopVecComponentType biasInterpretation,
    CoopVecMatrixLayout memoryLayout,
    bool transpose,
    uint matrixStride)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

/// Requires Capability Set 2:
CoopVec<T, M> coopVecMatMulAddPacked<T, int M, int PackedK, U, IgnoredBufferElementType>(
    CoopVec<U, PackedK> input,
    CoopVecComponentType inputInterpretation,
    int k,
    StructuredBuffer<IgnoredBufferElementType, DefaultDataLayout> matrix,
    int matrixOffset,
    CoopVecComponentType matrixInterpretation,
    StructuredBuffer<IgnoredBufferElementType, DefaultDataLayout> bias,
    int biasOffset,
    CoopVecComponentType biasInterpretation,
    CoopVecMatrixLayout memoryLayout,
    bool transpose,
    uint matrixStride)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

/// Requires Capability Set 2:
CoopVec<T, M> coopVecMatMulAddPacked<T, int M, int PackedK, U>(
    CoopVec<U, PackedK> input,
    CoopVecComponentType inputInterpretation,
    int k,
    Ptr<void> matrixPtr,
    CoopVecComponentType matrixInterpretation,
    Ptr<void> biasPtr,
    CoopVecComponentType biasInterpretation,
    CoopVecMatrixLayout memoryLayout,
    bool transpose,
    uint matrixStride)
    where T : __BuiltinArithmeticType
    where U : __BuiltinArithmeticType;

Generic Parameters

T : __BuiltinArithmeticType

M : int

PackedK : int

U : __BuiltinArithmeticType

IgnoredBufferElementType

Parameters

input : CoopVec<U, PackedK>

The input cooperative vector to multiply with the matrix.

inputInterpretation : CoopVecComponentType

Specifies how to interpret the values in the input vector (e.g. as packed values).

k : int

The number of columns in the matrix, which is also the number of values actually used from the (possibly packed) input vector.

matrix : RWByteAddressBuffer

The matrix buffer to multiply with.

matrixOffset : int

Byte offset into the matrix buffer.

matrixInterpretation : CoopVecComponentType

Specifies how to interpret the values in the matrix.

bias : RWByteAddressBuffer

The bias buffer to add after multiplication.

biasOffset : int

Byte offset into the bias buffer.

biasInterpretation : CoopVecComponentType

Specifies how to interpret the values in the bias vector.

memoryLayout : CoopVecMatrixLayout

Specifies the memory layout of the matrix (e.g. RowMajor, ColumnMajor, InferencingOptimal, or TrainingOptimal).

transpose : bool

Whether to transpose the matrix before multiplication.

matrixStride : uint

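The stride, in bytes, between consecutive matrix rows (row-major layout) or columns (column-major layout). Ignored when memoryLayout is InferencingOptimal or TrainingOptimal.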

matrix : ByteAddressBuffer

The matrix buffer to multiply with.

bias : ByteAddressBuffer

The bias buffer to add after multiplication.

matrix : RWStructuredBuffer<IgnoredBufferElementType, DefaultDataLayout>

The matrix buffer to multiply with.

bias : RWStructuredBuffer<IgnoredBufferElementType, DefaultDataLayout>

The bias buffer to add after multiplication.

matrix : StructuredBuffer<IgnoredBufferElementType, DefaultDataLayout>

The matrix buffer to multiply with.

bias : StructuredBuffer<IgnoredBufferElementType, DefaultDataLayout>

The bias buffer to add after multiplication.

matrixPtr : Ptr<void>

Pointer to the matrix data to multiply with.

biasPtr : Ptr<void>

Pointer to the bias data to add after multiplication.

Return value

A new cooperative vector containing the result of the matrix multiplication with added bias.

Remarks

Unlike coopVecMatMulAdd, this function supports packed input interpretations where multiple values can be packed into each element of the input vector. The k parameter specifies the actual number of values to use from the packed input.
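To illustrate how k relates to a packed input, the sketch below unpacks signed 8-bit values from 32-bit elements on the CPU. The helper name unpackSignedInt8 is hypothetical, and the packing scheme (four 8-bit values per 32-bit element, least significant byte first) is an assumption made for illustration; the real layout depends on inputInterpretation and the target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expands the first `k` signed 8-bit values from a
// packed vector. Assumption: four values per 32-bit element, lowest byte first.
std::vector<int> unpackSignedInt8(const std::vector<std::uint32_t>& packed,
                                  std::size_t k)
{
    // The packed vector needs at least ceil(k / 4) elements.
    assert(k <= packed.size() * 4);

    std::vector<int> values(k);
    for (std::size_t i = 0; i < k; ++i) {
        const std::uint32_t word = packed[i / 4];
        const unsigned byte = (word >> (8 * (i % 4))) & 0xFFu;
        // Reinterpret the byte as a two's-complement signed value.
        values[i] = byte >= 128u ? static_cast<int>(byte) - 256
                                 : static_cast<int>(byte);
    }
    return values;
}
```

Under this assumed scheme, a packed vector of PackedK elements can carry up to 4 * PackedK values, and k selects how many of them take part in the multiplication. Any slots beyond k in the last element are unused.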

Depending on target hardware, some combinations of inputInterpretation, matrixInterpretation and memoryLayout may not be supported. For example, CoopVecComponentType.Float32 is not widely supported. Developers should query device properties through the host graphics API to find out which interpretations are supported.

Transposing is not supported when memoryLayout is RowMajor or ColumnMajor; with those layouts, transpose must be false. Not all component types support transposing. When memoryLayout is InferencingOptimal or TrainingOptimal, matrixStride is ignored.
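For the two linear layouts, the role of matrixOffset and matrixStride can be sketched as a byte-offset calculation. The enum Layout and the function elementByteOffset below are hypothetical names, and the addressing formula is the conventional one for strided matrices, assumed here rather than taken from the Slang specification. The optimal layouts are opaque and cannot be addressed this way.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the two linear values of CoopVecMatrixLayout.
enum class Layout { RowMajor, ColumnMajor };

// Byte offset of element (row, col) inside the matrix buffer, assuming
// conventional strided addressing:
//   row-major:    stride separates consecutive rows
//   column-major: stride separates consecutive columns
std::size_t elementByteOffset(std::size_t matrixOffset,
                              std::size_t row,
                              std::size_t col,
                              std::size_t elementSize,
                              std::size_t matrixStride,
                              Layout layout)
{
    assert(elementSize > 0 && matrixStride >= elementSize);
    if (layout == Layout::RowMajor)
        return matrixOffset + row * matrixStride + col * elementSize;
    return matrixOffset + col * matrixStride + row * elementSize;
}
```

For example, with 4-byte elements, a 32-byte stride, and a 64-byte matrixOffset, element (2, 3) sits at byte 140 in the row-major case and at byte 168 in the column-major case.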

Availability and Requirements

Capability Set 1

Defined for the following targets:

hlsl

Available in all stages.

Requires capability: hlsl_coopvec_poc.

cpp

Available in all stages.

cuda

Available in all stages.

spirv

Available in all stages.

Requires capability: spvCooperativeVectorNV.

Capability Set 2

Defined for the following targets:

spirv

Available in all stages.

Requires capability: spvCooperativeVectorNV.