wave_sum

The wave_sum function computes the sum of a value across all active lanes (threads) in a wave/subgroup. Every active lane receives the same result.

Signature

bwsl

wave_sum :: (T x) -> T {...}

Where T can be float, float2, float3, or float4.

Parameters

Parameter	Type	Description
`x`	`T`	Value to sum across lanes

Return Value

Returns the sum of x from all active lanes in the wave.
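To make the return-value semantics concrete, here is a minimal CPU-side sketch in Python (not BWSL) that models a wave as a list of per-lane values. The `wave_sum_model` helper is hypothetical: it only mirrors the behavior described above and says nothing about how a GPU performs the reduction.

```python
from typing import List


def wave_sum_model(lane_values: List[float]) -> List[float]:
    """Hypothetical CPU model of wave_sum: every lane gets the same total."""
    total = sum(lane_values)
    return [total for _ in lane_values]


# A toy four-lane wave; real waves are larger (see Wave Size).
per_lane = wave_sum_model([1.0, 2.0, 3.0, 4.0])
print(per_lane)
```

Each entry of `per_lane` is the value that the corresponding lane would see as the result of `wave_sum(x)`.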

Example

bwsl

pipeline AverageLuminance {
  pass "Main" {
    compute "Main" [64, 1, 1] {
      uint localIdx = input.local_index;

      // Sample luminance at this thread's location
      float luminance = computeLuminance(input.global_id);

      // Sum luminance across all active lanes in this thread's wave
      float waveTotal = wave_sum(luminance);

      // Calculate average (assumes a fully active 32-lane wave; the actual wave size varies by GPU)
      float waveAvg = waveTotal / 32.0;

      // Use the result on one invocation
      if (localIdx == 0u) {
        float _ = waveAvg;
      }
    }
  }
}

Common Use Cases

Parallel Reduction

bwsl

// Sum values across wave for reduction
float localSum = computeValue(input.local_index);
float waveSum = wave_sum(localSum);
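A single `wave_sum` call covers only one wave, so a workgroup that spans several waves needs a second step to combine the per-wave totals. The Python sketch below models that two-level reduction on the CPU. The 32-lane wave size and the `wave_sum_model` and `workgroup_sum` helpers are assumptions for illustration. How a real shader combines the partial sums (shared memory, atomics, or a second pass) is not specified here.

```python
from typing import List

WAVE_SIZE = 32   # assumed wave size; varies by GPU
GROUP_SIZE = 64  # matches the [64, 1, 1] workgroup used in the example


def wave_sum_model(lane_values: List[float]) -> List[float]:
    """Hypothetical CPU model of wave_sum: every lane gets the same total."""
    total = sum(lane_values)
    return [total] * len(lane_values)


def workgroup_sum(values: List[float], wave_size: int = WAVE_SIZE) -> float:
    """Reduce a workgroup by summing one partial result per wave."""
    partials = []
    for start in range(0, len(values), wave_size):
        wave = values[start:start + wave_size]
        # Every lane holds the wave total, so one lane per wave can publish it.
        partials.append(wave_sum_model(wave)[0])
    return sum(partials)


values = [float(i) for i in range(GROUP_SIZE)]
group_total = workgroup_sum(values)
print(group_total)
```

With a 32-lane wave the 64-thread group produces two partial sums. With a 64-lane wave it produces one, and the final total is the same.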

Average Calculation

bwsl

// Compute average across wave (waveSize is the lane count; assumes every lane is active)
float value = sample(tex, uv).r;
float sum = wave_sum(value);
float average = sum / float(waveSize);
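Dividing by a fixed lane count gives the wrong average whenever the real wave size differs from it. One way to avoid this, sketched here as a CPU model in Python, is to obtain the divisor from the same reduction by summing `1.0` across the lanes. The `wave_sum_model` and `wave_average` helpers are hypothetical, and whether this is preferable to a built-in wave-size query in BWSL is not established by this page.

```python
from typing import List


def wave_sum_model(lane_values: List[float]) -> List[float]:
    """Hypothetical CPU model of wave_sum: every lane gets the same total."""
    total = sum(lane_values)
    return [total] * len(lane_values)


def wave_average(lane_values: List[float]) -> float:
    """Average that derives its divisor from the wave itself."""
    total = wave_sum_model(lane_values)[0]
    # Summing 1.0 per lane yields the number of participating lanes.
    lane_count = wave_sum_model([1.0] * len(lane_values))[0]
    return total / lane_count


wave32 = [2.0] * 32
wave64 = [2.0] * 64

avg32 = wave_average(wave32)
avg64 = wave_average(wave64)

# Hard-coding 32 on a 64-lane wave doubles the result.
hardcoded_on_wave64 = wave_sum_model(wave64)[0] / 32.0
print(avg32, avg64, hardcoded_on_wave64)
```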

Histogram Contribution

bwsl

// Count how many lanes have a value in range
float inRange = (value >= rangeMin && value < rangeMax) ? 1.0 : 0.0;
float count = wave_sum(inRange);
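The counting pattern above works because each lane contributes either 1.0 or 0.0, so the wave total equals the number of lanes whose value falls in the bin. A small Python model of that idea follows. The helper names and sample values are made up for illustration.

```python
from typing import List


def wave_sum_model(lane_values: List[float]) -> List[float]:
    """Hypothetical CPU model of wave_sum: every lane gets the same total."""
    total = sum(lane_values)
    return [total] * len(lane_values)


def count_in_range(lane_values: List[float], range_min: float, range_max: float) -> float:
    """Count lanes whose value lies in the half-open bin [range_min, range_max)."""
    flags = [1.0 if range_min <= v < range_max else 0.0 for v in lane_values]
    return wave_sum_model(flags)[0]


samples = [0.1, 0.25, 0.5, 0.75, 0.9, 0.3, 0.49, 1.0]
count = count_in_range(samples, 0.25, 0.5)
print(count)
```

Because every lane ends up with the same count, a single lane can add it to the histogram bin on behalf of the whole wave.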

Prefix Sum Building Block

bwsl

// The wave-level total is one building block of a prefix sum: it is what this wave contributes to every later wave
float local = data[input.local_index];
float waveTotal = wave_sum(local);
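To show where the wave total fits, the following Python sketch models an inclusive prefix sum over data that spans several waves. Each wave scans its own values, and the running sum of earlier wave totals is added as a carry. This is a sequential CPU model under assumed names (`wave_sum_model`, `inclusive_scan`) and an assumed four-lane wave, not a description of how BWSL or any GPU implements scans.

```python
from itertools import accumulate
from typing import List


def wave_sum_model(lane_values: List[float]) -> List[float]:
    """Hypothetical CPU model of wave_sum: every lane gets the same total."""
    total = sum(lane_values)
    return [total] * len(lane_values)


def inclusive_scan(data: List[float], wave_size: int) -> List[float]:
    """Inclusive prefix sum built from per-wave scans plus wave totals."""
    out: List[float] = []
    carry = 0.0  # sum of all values in earlier waves
    for start in range(0, len(data), wave_size):
        wave = data[start:start + wave_size]
        local_scan = list(accumulate(wave))  # prefix sum within this wave
        out.extend(carry + x for x in local_scan)
        carry += wave_sum_model(wave)[0]     # wave total feeds later waves
    return out


data = [float(i + 1) for i in range(8)]  # 1.0 .. 8.0
scanned = inclusive_scan(data, wave_size=4)
print(scanned)
```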

Statistics Gathering

bwsl

// Gather statistics across wave
float luminance = getLuminance(color);
float totalLum = wave_sum(luminance);

Wave Size

Wave size varies by GPU architecture: 32 on NVIDIA, 64 on AMD GCN (32 or 64 on RDNA), and 8, 16, or 32 on Intel. Use built-in constants or queries to get the actual wave size rather than hard-coding it.

Active Lanes

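Only lanes that are active at the call site participate in wave operations. If wave_sum is called inside a branch that only some lanes take, the result covers only those lanes, so be aware of control flow divergence within a wave.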
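The effect of divergence can be modeled on the CPU with an explicit active mask. In the Python sketch below, a hypothetical `wave_sum_model` takes the mask as a second argument. Inactive lanes get `None` to indicate that they never execute the call. The eight-lane wave and the branch condition are invented for illustration.

```python
from typing import List, Optional


def wave_sum_model(lane_values: List[float], active: List[bool]) -> List[Optional[float]]:
    """Hypothetical CPU model: only active lanes contribute and receive a result."""
    total = sum(v for v, is_active in zip(lane_values, active) if is_active)
    return [total if is_active else None for is_active in active]


values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# Models a branch such as: if (value > 4.0) { ... wave_sum(value) ... }
active = [v > 4.0 for v in values]

inside_branch = wave_sum_model(values, active)
active_count = wave_sum_model([1.0] * len(values), active)

# Average over the lanes that actually took the branch.
branch_average = inside_branch[-1] / active_count[-1]
print(inside_branch, branch_average)
```

Only the four lanes that take the branch contribute, so the sum is 26.0 rather than 36.0, and dividing by the full lane count of 8 would understate the branch average.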

Compiled Output

When compiled to GLSL:

glsl

// Requires GL_KHR_shader_subgroup_arithmetic
subgroupAdd(x)

When compiled to HLSL:

hlsl

WaveActiveSum(x)

When compiled to SPIR-V:

Uses OpGroupNonUniformFAdd with the Reduce group operation.
