to_buffer
1. Hardware Background
to_buffer converts a tl.tensor object into a bl.buffer object specific to Ascend hardware. It is the core conversion interface between tensors and hardware memory buffers.
2. Interface Definition
```python
def to_buffer(
    tensor: tl.tensor,
    space: address_space = None,
    bind_buffer: buffer = None,
    _builder=None
) -> buffer:
```
3. Parameter Description
| Parameter | Type | Required | Description |
| tensor | tl.tensor | Yes | The input tensor to be converted into a buffer |
| space | bl.address_space | No | Specifies the Ascend hardware address space where the target buffer resides |
| bind_buffer | bl.buffer | No | An existing buffer to which the tensor is bound directly |
| _builder | - | Internal parameter | Automatically passed by the compiler; users do not need to use it |
4. Return Value
- Returns the bl.buffer object corresponding to the input tensor.
- If a bind_buffer parameter is provided, the function returns that bound buffer directly.
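The two return rules above can be modeled with a small pure-Python stand-in. This is only a sketch of the documented behavior, not the Ascend implementation: `SketchTensor`, `SketchBuffer`, and `to_buffer_sketch` are hypothetical names introduced for illustration, and the code runs without Triton or Ascend hardware.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SketchTensor:
    """Hypothetical stand-in for tl.tensor (shape and dtype only)."""
    shape: Tuple[int, ...]
    dtype: str


@dataclass
class SketchBuffer:
    """Hypothetical stand-in for bl.buffer."""
    shape: Tuple[int, ...]
    dtype: str
    space: Optional[str] = None


def to_buffer_sketch(tensor, space=None, bind_buffer=None):
    """Model the documented return rule (assumption: not the real code)."""
    if bind_buffer is not None:
        # Rule 2: a supplied bind_buffer is returned directly.
        return bind_buffer
    # Rule 1: otherwise return a buffer corresponding to the tensor.
    return SketchBuffer(tensor.shape, tensor.dtype, space)


t = SketchTensor((32, 2, 4), "int64")
fresh = to_buffer_sketch(t, space="UB")
existing = SketchBuffer((32, 2, 4), "int64", "L1")
bound = to_buffer_sketch(t, bind_buffer=existing)
print(fresh.space, bound is existing)
```

The text does not say what happens when both space and bind_buffer are passed. The sketch simply ignores space in that case, which is an assumption and not documented behavior.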
5. Constraints
- The interface follows the same constraint rules as bl.allocate_local_buffer.
- The address-space parameter must strictly match memory regions supported by Ascend hardware (UB/L1/L0A/L0B/L0C).
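The address-space constraint can be illustrated with a host-side check. This is a sketch under stated assumptions: `SketchAddressSpace` and `parse_address_space` are hypothetical helpers that mirror the five region names listed above; they are not part of the bl or al modules.

```python
from enum import Enum


class SketchAddressSpace(Enum):
    """Hypothetical mirror of the address spaces listed in the constraints."""
    UB = "UB"
    L1 = "L1"
    L0A = "L0A"
    L0B = "L0B"
    L0C = "L0C"


def parse_address_space(name):
    """Return the matching member, or raise ValueError for unsupported names."""
    try:
        return SketchAddressSpace[name]
    except KeyError:
        supported = "/".join(member.name for member in SketchAddressSpace)
        raise ValueError(
            f"unsupported address space {name!r}; expected one of {supported}"
        ) from None


print(parse_address_space("L0A").name)
```

A name such as "L0" is rejected by this check because only the specific regions L0A, L0B, and L0C are listed.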
6. Complete Example
Basic Usage (Kernel Definition + Compilation Verification)
```python
import triton
import triton.language as tl
from triton.compiler import ASTSource
import triton.extension.buffer.language as bl
import triton.language.extra.cann.extension as al

# Get the current hardware compilation target
target = triton.runtime.driver.active.get_current_target()


@triton.jit
def to_buffer_kernel():
    # 1. Basic conversion: no address space specified
    a = tl.full((32, 2, 4), 0, dtype=tl.int64)
    a_buf = bl.to_buffer(a)

    # 2. Convert and specify the UB address space
    b = tl.full((32, 2, 4), 0, dtype=tl.int64)
    b_buf = bl.to_buffer(b, al.ascend_address_space.UB)

    # 3. Convert and specify the L1 address space
    c = tl.full((32, 2, 4), 0, dtype=tl.int64)
    c_buf = bl.to_buffer(c, al.ascend_address_space.L1)

    # 4. Convert and specify the L0A address space
    d = tl.full((32, 2, 4), 0, dtype=tl.int64)
    d_buf = bl.to_buffer(d, al.ascend_address_space.L0A)

    # 5. Convert and specify the L0B address space
    e = tl.full((32, 2, 4), 0, dtype=tl.int64)
    e_buf = bl.to_buffer(e, al.ascend_address_space.L0B)

    # 6. Convert and specify the L0C address space
    f = tl.full((32, 2, 4), 0, dtype=tl.int64)
    f_buf = bl.to_buffer(f, al.ascend_address_space.L0C)


# Compilation test function
def test_to_buffer():
    src = ASTSource(
        fn=to_buffer_kernel,
        constants={},
        signature={},
    )
    # Compile the kernel (to validate API legality)
    triton.compile(src=src, target=target)
    print("✅ to_buffer API compilation verified successfully")


if __name__ == "__main__":
    test_to_buffer()
```
Advanced Usage (Compilation + IR Printing)
```python
# Compile and print Triton IR (recommended for debugging).
# Reuses the imports, `target`, and `to_buffer_kernel` from the basic example.
def test_to_buffer_print_ir():
    src = ASTSource(
        fn=to_buffer_kernel,
        constants={},
        signature={},
    )
    # Enable IR dumping
    compile_options = {"dump_ir": True, "optimization_level": 0}
    compiled_kernel = triton.compile(src=src, target=target, options=compile_options)
    print("\n📄 Kernel IR dump complete")


if __name__ == "__main__":
    test_to_buffer_print_ir()
```
7. Key Notes
- This interface is the core entry point for converting tensors into hardware buffers.
- Supports manual selection of any of the Ascend hardware address spaces (UB/L1/L0A/L0B/L0C).
- Supports binding to existing buffers to satisfy fine-grained memory management requirements.
- Can only be used inside kernel functions decorated with @triton.jit.