TINTERLEAVE
Tile Operation Diagram
Introduction
Interleave two source tiles (src0 and src1) into two destination tiles (dst0 and dst1). For each row, the operation builds an interleaved stream in which elements of src0 occupy the even positions and elements of src1 occupy the odd positions. The stream is then split at its midpoint: dst0 receives the first half and dst1 receives the second half.
TInterleave is the inverse of TDeInterleave.
Math Interpretation
Two-source form
Given two source tiles src0 and src1 with the same valid shape (validRows, validCols), construct an interleaved stream of length 2 × validCols per row:
$$
\mathrm{interleaved}_{2k} = \mathrm{src0}_{i, k}, \quad \mathrm{interleaved}_{2k+1} = \mathrm{src1}_{i, k}, \quad 0 \le k < \mathrm{validCols}
$$
Then split the interleaved stream into two halves:
$$
\mathrm{dst0}_{i, j} = \mathrm{interleaved}_{j}, \quad 0 \le j < \mathrm{validCols}
$$

$$
\mathrm{dst1}_{i, j} = \mathrm{interleaved}_{\mathrm{validCols} + j}, \quad 0 \le j < \mathrm{validCols}
$$
Here i is the row index, 0 ≤ i < validRows, with validRows = dst0.GetValidRow() and validCols = dst0.GetValidCol().
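The two steps above can be sketched as a scalar reference model. This is only an illustration of the math, not the device implementation: `Matrix` and `interleave_reference` are hypothetical names, and a tile is modeled as a plain row-major `std::vector` holding only the valid elements.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side model; not part of the PTO API.
// A "tile" is a row-major matrix containing only its valid elements.
using Matrix = std::vector<std::vector<float>>;

// Returns {dst0, dst1} for two sources with the same valid shape.
std::pair<Matrix, Matrix> interleave_reference(const Matrix &src0, const Matrix &src1) {
    assert(src0.size() == src1.size());
    Matrix dst0(src0.size()), dst1(src0.size());
    for (std::size_t i = 0; i < src0.size(); ++i) {
        const std::size_t validCols = src0[i].size();
        assert(src1[i].size() == validCols);

        // Step 1: build the per-row interleaved stream of length 2 * validCols.
        std::vector<float> interleaved(2 * validCols);
        for (std::size_t k = 0; k < validCols; ++k) {
            interleaved[2 * k] = src0[i][k];      // even positions come from src0
            interleaved[2 * k + 1] = src1[i][k];  // odd positions come from src1
        }

        // Step 2: split at the midpoint; first half -> dst0, second half -> dst1.
        const auto mid = interleaved.begin() + static_cast<std::ptrdiff_t>(validCols);
        dst0[i].assign(interleaved.begin(), mid);
        dst1[i].assign(mid, interleaved.end());
    }
    return {dst0, dst1};
}
```

For a single row with src0 = [1, 2, 3, 4] and src1 = [10, 20, 30, 40], the interleaved stream is [1, 10, 2, 20, 3, 30, 4, 40], so this model yields dst0 = [1, 10, 2, 20] and dst1 = [3, 30, 4, 40].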
Assembly Syntax
PTO-AS form: see PTO-AS Specification.
Synchronous form:
%dst0, %dst1 = tinterleave %src0, %src1 : !pto.tile<...>
AS Level 1 (SSA)
%dst0, %dst1 = pto.tinterleave %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> (!pto.tile<...>, !pto.tile<...>)
AS Level 2 (DPS)
pto.tinterleave ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst0, %dst1 : !pto.tile_buf<...>, !pto.tile_buf<...>)
C++ Intrinsic
Declared in include/pto/common/pto_instr.hpp:
template <typename TileDataDst, typename TileDataSrc, typename... WaitEvents>
PTO_INST RecordEvent TInterleave(TileDataDst &dst1, TileDataDst &dst0, TileDataSrc &src1, TileDataSrc &src0,
                                 WaitEvents &...events);
Note: The parameter order is (dst1, dst0, src1, src0). dst0 receives the first half of the interleaved stream (positions 0 … validCols-1); dst1 receives the second half (positions validCols … 2×validCols-1).
Constraints
- Implementation checks (A5):
  - TileData::DType must be one of: int32_t, uint32_t, float, int16_t, uint16_t, half, bfloat16_t, uint8_t, int8_t.
  - Tile layout must be row-major (TileData::isRowMajor).
  - All tiles (dst0, dst1, src0, src1) must have the same DType and the same valid shape.
  - The valid column count of all tiles must be even. Since all tiles share the same valid shape, this is equivalent to requiring dst0.GetValidCol() % 2 == 0.
- Valid region:
  - The op uses dst0.GetValidRow() / dst0.GetValidCol() as the iteration domain; src0, src1, and dst1 are assumed to be compatible.
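The constraints listed above can be mirrored in a small host-side check, which may help when validating shapes before issuing the instruction. This is a sketch under assumptions: `TileInfo`, `is_supported_dtype`, and `check_interleave_constraints` are hypothetical names that do not exist in the PTO headers, and the device-specific types half and bfloat16_t are left out because they are not standard C++ types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a tile's shape metadata; not the PTO Tile type.
struct TileInfo {
    std::size_t validRows;
    std::size_t validCols;
    bool isRowMajor;
};

// Element types from the list above that are available in standard C++.
// half and bfloat16_t are toolchain-specific and intentionally omitted.
template <typename T>
constexpr bool is_supported_dtype() {
    return std::is_same<T, std::int32_t>::value || std::is_same<T, std::uint32_t>::value ||
           std::is_same<T, float>::value || std::is_same<T, std::int16_t>::value ||
           std::is_same<T, std::uint16_t>::value || std::is_same<T, std::uint8_t>::value ||
           std::is_same<T, std::int8_t>::value;
}

inline bool same_valid_shape(const TileInfo &a, const TileInfo &b) {
    return a.validRows == b.validRows && a.validCols == b.validCols;
}

// Returns true when the layout, shape, and even-column constraints all hold.
inline bool check_interleave_constraints(const TileInfo &dst0, const TileInfo &dst1,
                                         const TileInfo &src0, const TileInfo &src1) {
    const bool rowMajor = dst0.isRowMajor && dst1.isRowMajor && src0.isRowMajor && src1.isRowMajor;
    const bool sameShape = same_valid_shape(dst0, dst1) && same_valid_shape(dst0, src0) &&
                           same_valid_shape(dst0, src1);
    // All tiles share one valid shape, so checking dst0 is sufficient.
    const bool evenCols = (dst0.validCols % 2 == 0);
    return rowMajor && sameShape && evenCols;
}
```

The data-type check is kept separate from the shape check because, in this sketch, the element type is a compile-time property while the valid shape may only be known at run time.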
Examples
Auto
#include <pto/pto-inst.hpp>
using namespace pto;
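void example_auto() {
    // All four tiles share the same element type and valid shape (16 x 64).
    using TileT = Tile<TileType::Vec, float, 16, 64>;
    TileT src0(16, 64), src1(16, 64);
    TileT dst0(16, 64), dst1(16, 64);
    // Argument order is (dst1, dst0, src1, src0).
    TInterleave(dst1, dst0, src1, src0);
}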
Manual
#include <pto/pto-inst.hpp>
using namespace pto;
void example_manual() {
    using TileT = Tile<TileType::Vec, half, 16, 256, BLayout::RowMajor, 16, 256>;
    TileT src0, src1, dst0, dst1;
    // Bind each tile to an explicit buffer address before issuing the instruction.
    TASSIGN(src0, 0x1000);
    TASSIGN(src1, 0x2000);
    TASSIGN(dst0, 0x3000);
    TASSIGN(dst1, 0x4000);
    // Argument order is (dst1, dst0, src1, src0).
    TInterleave(dst1, dst0, src1, src0);
}
ASM Form Examples
Auto Mode
# Auto mode: compiler/runtime-managed placement and scheduling.
%dst0, %dst1 = pto.tinterleave %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> (!pto.tile<...>, !pto.tile<...>)
Manual Mode
# Manual mode: resources must be bound explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %src0, @tile(0x1000)
# pto.tassign %src1, @tile(0x2000)
# pto.tassign %dst0, @tile(0x3000)
# pto.tassign %dst1, @tile(0x4000)
%dst0, %dst1 = pto.tinterleave %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> (!pto.tile<...>, !pto.tile<...>)
PTO Assembly Form
%dst0, %dst1 = tinterleave %src0, %src1 : !pto.tile<...>
# AS Level 2 (DPS)
pto.tinterleave ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst0, %dst1 : !pto.tile_buf<...>, !pto.tile_buf<...>)
Related Instructions
- TDeInterleave - De-interleave two tiles back into the original even/odd streams (inverse of TInterleave).
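The inverse relationship can be illustrated with a one-row round trip on the host. This is a sketch under stated assumptions: `interleave_row` follows the definition in Math Interpretation, while `deinterleave_row` is a hypothetical model of the inverse (concatenate the two halves, then pick even and odd positions). It is not taken from the TDeInterleave specification, and neither function is part of the PTO API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<int>;

// Forward model for one row: build the interleaved stream, then split it in half.
void interleave_row(const Row &src0, const Row &src1, Row &dst0, Row &dst1) {
    const std::size_t n = src0.size();
    assert(src1.size() == n);
    Row stream(2 * n);
    for (std::size_t k = 0; k < n; ++k) {
        stream[2 * k] = src0[k];
        stream[2 * k + 1] = src1[k];
    }
    const auto mid = stream.begin() + static_cast<std::ptrdiff_t>(n);
    dst0.assign(stream.begin(), mid);
    dst1.assign(mid, stream.end());
}

// ASSUMPTION: inverse model for one row. Rejoin the halves, then route
// even positions to out0 and odd positions to out1.
void deinterleave_row(const Row &in0, const Row &in1, Row &out0, Row &out1) {
    const std::size_t n = in0.size();
    assert(in1.size() == n);
    Row stream(in0);
    stream.insert(stream.end(), in1.begin(), in1.end());
    out0.resize(n);
    out1.resize(n);
    for (std::size_t k = 0; k < n; ++k) {
        out0[k] = stream[2 * k];
        out1[k] = stream[2 * k + 1];
    }
}
```

Under this model, applying `deinterleave_row` to the outputs of `interleave_row` returns the original src0 and src1 rows.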