Module amdgpu

Available on target_arch=amdgpu only.

amdgpu intrinsics

The reference is the LLVM amdgpu guide and the LLVM implementation. The order of intrinsics here follows the order in the LLVM implementation.

Functions

llvm_ballot 🔒
llvm_dispatch_id 🔒
llvm_ds_bpermute 🔒 ⚠
llvm_ds_permute 🔒 ⚠
llvm_endpgm 🔒
llvm_groupstaticsize 🔒
llvm_inverse_ballot 🔒
llvm_mbcnt_hi 🔒
llvm_mbcnt_lo 🔒
llvm_perm 🔒 ⚠
llvm_permlane16_swap 🔒 ⚠
llvm_permlane16_u32 🔒 ⚠
llvm_permlane16_var 🔒 ⚠
llvm_permlane32_swap 🔒 ⚠
llvm_permlane64_u32 🔒 ⚠
llvm_permlanex16_u32 🔒 ⚠
llvm_permlanex16_var 🔒 ⚠
llvm_readfirstlane_u32 🔒
llvm_readfirstlane_u64 🔒
llvm_readlane_u32 🔒 ⚠
llvm_readlane_u64 🔒 ⚠
llvm_s_barrier 🔒
llvm_s_barrier_signal 🔒 ⚠
llvm_s_barrier_signal_isfirst 🔒 ⚠
llvm_s_barrier_wait 🔒 ⚠
llvm_s_get_barrier_state 🔒 ⚠
llvm_s_get_waveid_in_workgroup 🔒
llvm_s_getpc 🔒
llvm_s_memrealtime 🔒
llvm_s_sethalt 🔒
llvm_s_sleep 🔒
llvm_sched_barrier 🔒 ⚠
llvm_sched_group_barrier 🔒 ⚠
llvm_update_dpp 🔒 ⚠
llvm_wave_barrier 🔒
llvm_wave_id 🔒
llvm_wave_reduce_add 🔒
llvm_wave_reduce_and 🔒
llvm_wave_reduce_max 🔒
llvm_wave_reduce_min 🔒
llvm_wave_reduce_or 🔒
llvm_wave_reduce_umax 🔒
llvm_wave_reduce_umin 🔒
llvm_wave_reduce_xor 🔒
llvm_wavefrontsize 🔒
llvm_workgroup_id_x 🔒
llvm_workgroup_id_y 🔒
llvm_workgroup_id_z 🔒
llvm_workitem_id_x 🔒
llvm_workitem_id_y 🔒
llvm_workitem_id_z 🔒
llvm_writelane_u32 🔒 ⚠
llvm_writelane_u64 🔒 ⚠
ballot Experimental
Returns a bitfield (u32 or u64) containing the result of its i1 argument in all active lanes, and zero in all inactive lanes.
dispatch_id Experimental
Returns the id of the dispatch that is currently executing.
ds_bpermute ⚠ Experimental
Gather data across all lanes in a wavefront.
ds_permute ⚠ Experimental
Scatter data across all lanes in a wavefront.
endpgm Experimental
Stop execution of the wavefront.
groupstaticsize Experimental
Returns the size of statically allocated shared memory for this program in bytes.
inverse_ballot Experimental
Indexes into the value with the current lane id and returns, for each lane, whether the corresponding bit is set.
mbcnt_hi Experimental
Masked bit count, high 32 lanes.
mbcnt_lo Experimental
Masked bit count, low 32 lanes.
perm ⚠ Experimental
Permute a 64-bit value.
permlane16_swap ⚠ Experimental
Provides direct access to the v_permlane16_swap_b32 instruction on supported targets.
permlane16_u32 ⚠ Experimental
Performs an arbitrary gather-style operation within a row (16 contiguous lanes) of the second input operand.
permlane16_var ⚠ Experimental
Performs an arbitrary gather-style operation within a row (16 contiguous lanes) of the second input operand.
permlane32_swap ⚠ Experimental
Provides direct access to the v_permlane32_swap_b32 instruction on supported targets.
permlane64_u32 ⚠ Experimental
Swap value between upper and lower 32 lanes in a wavefront.
permlanex16_u32 ⚠ Experimental
Performs an arbitrary gather-style operation across two rows (each 16 contiguous lanes) of the second input operand.
permlanex16_var ⚠ Experimental
Performs an arbitrary gather-style operation across two rows (each 16 contiguous lanes) of the second input operand.
readfirstlane_u32 Experimental
Get value from the first active lane in the wavefront.
readfirstlane_u64 Experimental
Get value from the first active lane in the wavefront.
readlane_u32 ⚠ Experimental
Get value from the lane at index lane in the wavefront.
readlane_u64 ⚠ Experimental
Get value from the lane at index lane in the wavefront.
s_barrier Experimental
Synchronize all wavefronts in a workgroup.
s_barrier_signal ⚠ Experimental
Signal a specific barrier type.
s_barrier_signal_isfirst ⚠ Experimental
Signal a specific barrier type and return whether the current wavefront was the first to signal it.
s_barrier_wait ⚠ Experimental
Wait for a specific barrier type.
s_get_barrier_state ⚠ Experimental
Get the state of a specific barrier type.
s_get_waveid_in_workgroup Experimental
Get the index of the current wavefront in the workgroup.
s_getpc Experimental
Returns the current program counter.
s_memrealtime Experimental
Measures time based on a fixed frequency.
s_sethalt Experimental
Stop execution of the kernel.
s_sleep Experimental
Sleeps for approximately COUNT * 64 cycles.
sched_barrier ⚠ Experimental
Prevent movement of some instruction types.
sched_group_barrier ⚠ Experimental
Creates schedule groups with specific properties to create custom scheduling pipelines.
update_dpp ⚠ Experimental
The update_dpp intrinsic represents the update.dpp operation in AMDGPU. It takes an old value, a source operand, a DPP control operand, a row mask, a bank mask, and a bound control. This operation is equivalent to a sequence of v_mov_b32 operations.
wave_barrier Experimental
A barrier for only the threads within the current wavefront.
wave_id Experimental
Get the index of the current wavefront in the workgroup.
wave_reduce_add Experimental
Performs an arithmetic add reduction on the values provided by each lane in the wavefront.
wave_reduce_and Experimental
Performs a logical and reduction on the unsigned values provided by each lane in the wavefront.
wave_reduce_max Experimental
Performs an arithmetic max reduction on the signed values provided by each lane in the wavefront.
wave_reduce_min Experimental
Performs an arithmetic min reduction on the signed values provided by each lane in the wavefront.
wave_reduce_or Experimental
Performs a logical or reduction on the unsigned values provided by each lane in the wavefront.
wave_reduce_umax Experimental
Performs an arithmetic max reduction on the unsigned values provided by each lane in the wavefront.
wave_reduce_umin Experimental
Performs an arithmetic min reduction on the unsigned values provided by each lane in the wavefront.
wave_reduce_xor Experimental
Performs a logical xor reduction on the unsigned values provided by each lane in the wavefront.
wavefrontsize Experimental
Returns the number of threads in a wavefront.
workgroup_id_x Experimental
Returns the x coordinate of the workgroup index within the dispatch.
workgroup_id_y Experimental
Returns the y coordinate of the workgroup index within the dispatch.
workgroup_id_z Experimental
Returns the z coordinate of the workgroup index within the dispatch.
workitem_id_x Experimental
Returns the x coordinate of the workitem index within the workgroup.
workitem_id_y Experimental
Returns the y coordinate of the workitem index within the workgroup.
workitem_id_z Experimental
Returns the z coordinate of the workitem index within the workgroup.
writelane_u32 ⚠ Experimental
Return value for the lane at index lane in the wavefront. Return default for all other lanes.
writelane_u64 ⚠ Experimental
Return value for the lane at index lane in the wavefront. Return default for all other lanes.
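
The mask-oriented entries (ballot, inverse_ballot, mbcnt_lo, mbcnt_hi, readfirstlane_u32) are easier to follow with a small model. The sketch below runs on the host and simulates a 64-lane wavefront with plain integers and arrays. It does not call the intrinsics, and every name in it (the `_model` functions, `exec`, `WAVE_SIZE`) is hypothetical. The `mbcnt` model assumes the usual hardware meaning: count the mask bits that belong to lanes below the current one and add a base value.

```rust
/// Host-side model of a 64-lane wavefront. These helpers are illustrative
/// only and are NOT the intrinsics exported by this module.
const WAVE_SIZE: usize = 64;

/// Model of `ballot`: bit i is set when lane i is active (bit i of `exec`)
/// and its predicate is true. Inactive lanes contribute zero.
fn ballot_model(exec: u64, pred: &[bool; WAVE_SIZE]) -> u64 {
    let mut mask = 0u64;
    for lane in 0..WAVE_SIZE {
        let active = (exec >> lane) & 1 == 1;
        if active && pred[lane] {
            mask |= 1u64 << lane;
        }
    }
    mask
}

/// Model of `inverse_ballot`: each lane reads its own bit of `mask`.
fn inverse_ballot_model(mask: u64, lane: usize) -> bool {
    (mask >> lane) & 1 == 1
}

/// Model of `mbcnt_lo` (assumed semantics): count the bits of `mask` that
/// belong to lanes 0..32 and lie below `lane`, then add `base`.
fn mbcnt_lo_model(mask: u32, base: u32, lane: u32) -> u32 {
    let below = if lane >= 32 { u32::MAX } else { (1u32 << lane) - 1 };
    base + (mask & below).count_ones()
}

/// Model of `mbcnt_hi` (assumed semantics): same as above for lanes 32..64.
fn mbcnt_hi_model(mask: u32, base: u32, lane: u32) -> u32 {
    let below = if lane <= 32 {
        0
    } else if lane >= 64 {
        u32::MAX
    } else {
        (1u32 << (lane - 32)) - 1
    };
    base + (mask & below).count_ones()
}

/// Model of `readfirstlane_u32`: the value held by the lowest active lane.
/// Returns `None` when no lane is active, a case the model makes explicit.
fn readfirstlane_model(exec: u64, values: &[u32; WAVE_SIZE]) -> Option<u32> {
    if exec == 0 {
        None
    } else {
        Some(values[exec.trailing_zeros() as usize])
    }
}
```

Under these assumptions, chaining the two counts with an all-ones mask, `mbcnt_hi_model(u32::MAX, mbcnt_lo_model(u32::MAX, 0, lane), lane)`, yields `lane` itself, which is a common way to recover the lane index within a wavefront.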
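
The data-movement entries differ mainly in direction: ds_bpermute gathers, ds_permute scatters, permlane64_u32 swaps the two halves of the wavefront, and readlane and writelane touch a single lane. The following host-side sketch illustrates that difference with arrays indexed by lane. It is a simplified model rather than the real intrinsics: it uses plain lane indices, resolves conflicting scatter writes by letting the highest lane win, and leaves unwritten lanes at zero. All three choices are assumptions of the model, and all function names are hypothetical.

```rust
/// Host-side model of cross-lane data movement in a 64-lane wavefront.
/// Illustrative only; these are NOT the intrinsics exported by this module.
const WAVE_SIZE: usize = 64;

/// Gather (models `ds_bpermute`): lane i READS the value of lane `index[i]`.
fn bpermute_model(index: &[usize; WAVE_SIZE], src: &[u32; WAVE_SIZE]) -> [u32; WAVE_SIZE] {
    let mut out = [0u32; WAVE_SIZE];
    for lane in 0..WAVE_SIZE {
        out[lane] = src[index[lane] % WAVE_SIZE];
    }
    out
}

/// Scatter (models `ds_permute`): lane i WRITES its value to lane `index[i]`.
/// Model assumptions: unwritten lanes stay 0, the highest writing lane wins.
fn permute_model(index: &[usize; WAVE_SIZE], src: &[u32; WAVE_SIZE]) -> [u32; WAVE_SIZE] {
    let mut out = [0u32; WAVE_SIZE];
    for lane in 0..WAVE_SIZE {
        out[index[lane] % WAVE_SIZE] = src[lane];
    }
    out
}

/// Models `permlane64_u32`: lane i exchanges its value with lane i ^ 32,
/// which swaps the lower and upper 32 lanes.
fn permlane64_model(src: &[u32; WAVE_SIZE]) -> [u32; WAVE_SIZE] {
    let mut out = [0u32; WAVE_SIZE];
    for lane in 0..WAVE_SIZE {
        out[lane] = src[lane ^ 32];
    }
    out
}

/// Models `readlane_u32`: every lane receives the value held by `lane`.
fn readlane_model(src: &[u32; WAVE_SIZE], lane: usize) -> u32 {
    src[lane % WAVE_SIZE]
}

/// Models `writelane_u32`: `value` lands in `lane`, all other lanes keep
/// the corresponding entry of `default_values`.
fn writelane_model(value: u32, lane: usize, default_values: &[u32; WAVE_SIZE]) -> [u32; WAVE_SIZE] {
    let mut out = *default_values;
    out[lane % WAVE_SIZE] = value;
    out
}
```

With `index[i] = (i + 1) % 64`, the gather rotates values toward lower lanes (lane 0 receives lane 1's value), while the scatter rotates them toward higher lanes (lane 1 receives lane 0's value).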
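
The wave_reduce_* family folds one value per lane into a single result, and the signed and unsigned variants differ only in how the same 32 bits are compared. Below is a minimal host-side sketch of that idea. It assumes that only active lanes participate and that the add reduction wraps on overflow; neither is stated above. The names `wave_reduce_model`, `reduce_max_signed_model`, `reduce_umax_model`, and `exec` are hypothetical.

```rust
/// Host-side model of the wave_reduce_* family on a 64-lane wavefront.
/// Illustrative only; these are NOT the intrinsics exported by this module.
const WAVE_SIZE: usize = 64;

/// Folds the values of all active lanes (bits set in `exec`) with `op`.
/// Returns `None` when no lane is active.
fn wave_reduce_model<T: Copy>(
    exec: u64,
    values: &[T; WAVE_SIZE],
    op: impl Fn(T, T) -> T,
) -> Option<T> {
    let mut acc: Option<T> = None;
    for lane in 0..WAVE_SIZE {
        if (exec >> lane) & 1 == 1 {
            acc = Some(match acc {
                Some(a) => op(a, values[lane]),
                None => values[lane],
            });
        }
    }
    acc
}

/// Signed max (models `wave_reduce_max`): compares the bits as i32.
fn reduce_max_signed_model(exec: u64, values: &[u32; WAVE_SIZE]) -> Option<u32> {
    wave_reduce_model(exec, values, |a: u32, b: u32| (a as i32).max(b as i32) as u32)
}

/// Unsigned max (models `wave_reduce_umax`): compares the bits as u32.
fn reduce_umax_model(exec: u64, values: &[u32; WAVE_SIZE]) -> Option<u32> {
    wave_reduce_model(exec, values, |a: u32, b: u32| a.max(b))
}
```

For lanes holding `1`, `0xFFFF_FFFF`, and `6`, the unsigned max is `0xFFFF_FFFF`, whereas the signed max is `6` because `0xFFFF_FFFF` reads as `-1`. Other reductions can be sketched the same way, for example `wave_reduce_model(exec, &values, u32::wrapping_add)` for the add variant.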
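
The workgroup_id_* and workitem_id_* entries are typically combined into a global index for each workitem. The arithmetic is sketched below on the host. The workgroup size is passed in as a parameter because this module lists no intrinsic for it, so where that value comes from is left open. `Dim3` and `global_id_model` are hypothetical names, and the values would come from the corresponding intrinsics in real device code.

```rust
/// A three-component index, used here for ids and sizes alike.
/// Hypothetical helper type; not part of this module.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Dim3 {
    x: u32,
    y: u32,
    z: u32,
}

/// Computes the global index of a workitem within the dispatch:
/// global = workgroup_id * workgroup_size + workitem_id, per dimension.
/// `workgroup_size` is an assumed input supplied by the caller.
fn global_id_model(workgroup_id: Dim3, workgroup_size: Dim3, workitem_id: Dim3) -> Dim3 {
    Dim3 {
        x: workgroup_id.x * workgroup_size.x + workitem_id.x,
        y: workgroup_id.y * workgroup_size.y + workitem_id.y,
        z: workgroup_id.z * workgroup_size.z + workitem_id.z,
    }
}
```

For example, workitem (5, 0, 3) of workgroup (2, 0, 1) with a workgroup size of (64, 1, 4) maps to the global index (133, 0, 7).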