
wave_sum

Computes the sum of a value across all active lanes in a wave.

The wave_sum function calculates the sum of a value across all active lanes (threads) in a wave/subgroup, with all lanes receiving the same result.

Signature

bwsl
wave_sum :: (T x) -> T {...}

Where T can be float, float2, float3, or float4.

Parameters

Parameter    Type    Description
x            T       Value to sum across lanes

Return Value

Returns the sum of x from all active lanes in the wave.
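The semantics described above can be modeled on the CPU. The following is a minimal Python sketch, not the compiler's implementation: `wave_sum_model` is a hypothetical helper that takes one value per lane plus an optional activity mask, and it represents "no result" for inactive lanes as `None`, which is an assumption made purely for illustration.

```python
from typing import List, Optional, Sequence


def wave_sum_model(
    values: Sequence[float],
    active: Optional[Sequence[bool]] = None,
) -> List[Optional[float]]:
    """Hypothetical CPU model of wave_sum: one input value per lane."""
    if active is None:
        active = [True] * len(values)
    # Only active lanes contribute to the total.
    total = sum(v for v, a in zip(values, active) if a)
    # Every active lane receives the same total; inactive lanes get nothing.
    return [total if a else None for a in active]


lanes = [1.0, 2.0, 3.0, 4.0]
result = wave_sum_model(lanes)
masked = wave_sum_model(lanes, [True, True, False, False])
```

With all four lanes active, every lane sees the same total. With only the first two lanes active, those two lanes see the sum of their own values only.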

Example

bwsl
pipeline AverageLuminance {
    pass "Main" {
        compute "Main" [64, 1, 1] {
            uint localIdx = input.local_index;
            // Sample luminance at this thread's location
            float luminance = computeLuminance(input.global_id);
            // Sum luminance across all active lanes in the wave
            float waveTotal = wave_sum(luminance);
            // Calculate the average. This hard-codes a 32-lane wave with
            // every lane active; see Wave Size for why that is not portable.
            float waveAvg = waveTotal / 32.0;
            // Use the result on one invocation
            if (localIdx == 0u) {
                float _ = waveAvg;
            }
        }
    }
}

Common Use Cases

Parallel Reduction

bwsl
// Sum values across wave for reduction
float localSum = computeValue(input.local_index);
float waveSum = wave_sum(localSum);
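A wave-level sum covers only one wave, so a workgroup-wide reduction usually needs a second step that combines the per-wave totals. The Python sketch below models that two-level pattern on the CPU. `workgroup_sum` and `wave_sum_model` are hypothetical helpers, and it assumes that lanes are assigned to waves in contiguous index order, which real hardware does not guarantee in every case.

```python
def wave_sum_model(values):
    """Hypothetical model: every lane in the wave receives the same total."""
    total = sum(values)
    return [total] * len(values)


def workgroup_sum(values, wave_size):
    """Reduce a whole workgroup by combining one partial sum per wave."""
    partials = []
    for start in range(0, len(values), wave_size):
        wave = values[start:start + wave_size]
        per_lane = wave_sum_model(wave)
        # One lane per wave publishes the wave total (e.g. to shared memory).
        partials.append(per_lane[0])
    # Second level: combine the per-wave partial sums.
    return sum(partials)


values = [float(i) for i in range(64)]
total_32 = workgroup_sum(values, 32)  # two waves of 32 lanes
total_64 = workgroup_sum(values, 64)  # one wave of 64 lanes
```

The final total is the same for either wave size; only the number of partial sums changes.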

Average Calculation

bwsl
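// Compute the average across the wave.
// waveSize must be the actual wave size of the target GPU, and the
// result is only exact when every lane in the wave is active.
float value = sample(tex, uv).r;
float sum = wave_sum(value);
float average = sum / float(waveSize);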

Histogram Contribution

bwsl
// Count how many lanes have a value in [rangeMin, rangeMax)
float inRange = (value >= rangeMin && value < rangeMax) ? 1.0 : 0.0;
float count = wave_sum(inRange);

Prefix Sum Building Block

bwsl
// Wave-level sum is part of prefix sum algorithm
float local = data[input.local_index];
float waveTotal = wave_sum(local);
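To illustrate how a wave total fits into a prefix sum, the Python sketch below computes an exclusive scan in two parts: a scan within each wave, plus an offset equal to the totals of all earlier waves. `exclusive_prefix_sum` is a hypothetical CPU model, and the in-wave scan is written as a plain loop because this page does not document a wave-level prefix intrinsic.

```python
def exclusive_prefix_sum(values, wave_size):
    """Exclusive scan built from per-wave scans plus per-wave totals."""
    out = []
    wave_offset = 0.0  # sum of the totals of all earlier waves
    for start in range(0, len(values), wave_size):
        wave = values[start:start + wave_size]
        running = 0.0
        for v in wave:
            # Lane result = offset from earlier waves + sum of earlier lanes.
            out.append(wave_offset + running)
            running += v
        # This is the value wave_sum would return for the wave.
        wave_offset += sum(wave)
    return out


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
scan = exclusive_prefix_sum(data, wave_size=4)
```

The first lane of the second wave starts at 10.0, the total of the first wave.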

Statistics Gathering

bwsl
// Gather statistics across wave
float luminance = getLuminance(color);
float totalLum = wave_sum(luminance);

Wave Size

Wave size varies by GPU architecture: typically 32 on NVIDIA, 64 on AMD GCN (32 or 64 on RDNA), and 8, 16, or 32 on Intel. Do not hard-code it; use built-in constants or queries to get the actual wave size.
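Because the wave size differs between GPUs, the number of waves in a workgroup differs too. The short Python sketch below shows the arithmetic for the 64-thread workgroup used in the example on this page. `waves_per_workgroup` is a hypothetical helper, and the wave sizes listed are sample values rather than an exhaustive set.

```python
def waves_per_workgroup(workgroup_size, wave_size):
    """Number of waves needed to cover a workgroup (rounded up)."""
    return (workgroup_size + wave_size - 1) // wave_size


# A [64, 1, 1] workgroup on hardware with different wave sizes.
layout = {ws: waves_per_workgroup(64, ws) for ws in (16, 32, 64)}
```

A single wave_sum call covers the whole workgroup only when there is one wave; with smaller waves, each wave produces its own separate total.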

Active Lanes

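Only active (non-diverged) lanes participate in wave operations. Lanes that are inactive at the call site, for example because they took a different branch or exited early, contribute nothing to the sum. Inside divergent control flow, wave_sum therefore covers only the lanes that reached the call, so dividing the result by the full wave size gives the wrong average.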
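The effect of inactive lanes on an average can be sketched on the CPU. In the Python model below, `wave_sum_model` is a hypothetical helper that returns the single total every active lane would see. The sketch counts active lanes by summing 1.0 per lane, which is one possible approach and not necessarily how a given shader should do it.

```python
def wave_sum_model(values, active):
    """Hypothetical model: the total that every active lane would receive."""
    return sum(v for v, a in zip(values, active) if a)


wave_size = 8
values = [2.0] * wave_size
active = [lane < 6 for lane in range(wave_size)]  # lanes 6 and 7 diverged

total = wave_sum_model(values, active)
# Summing 1.0 per lane counts the lanes that actually participated.
active_count = wave_sum_model([1.0] * wave_size, active)

naive_avg = total / wave_size       # divides by lanes that never contributed
correct_avg = total / active_count  # divides by participating lanes only
```

Every participating lane holds 2.0, so the true average is 2.0. Dividing by the full wave size understates it.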

Compiled Output

When compiled to GLSL:

glsl
// Requires GL_KHR_shader_subgroup_arithmetic
subgroupAdd(x)

When compiled to HLSL:

hlsl
// Requires Shader Model 6.0 or later
WaveActiveSum(x)

When compiled to SPIR-V:

Uses OpGroupNonUniformFAdd with the Reduce group operation, which requires the GroupNonUniformArithmetic capability.

See Also