Beaver.MLIR.Dialect.Linalg (beaver v0.4.7)
Summary
Functions
linalg.abs - Applies abs(x) elementwise.
linalg.add - Adds two tensors elementwise.
linalg.batch_matmul - Performs a batched matrix multiplication of two 3D inputs.
linalg.batch_matvec - Performs a batched matrix-vector multiplication.
linalg.batch_mmt4d - Performs a batched matrix-matrix-transpose multiplication of two
batched-4D (5D) inputs.
linalg.batch_reduce_matmul - Performs a batch-reduce matrix multiplication on two inputs.
linalg.batch_vecmat - Performs a batched vector-matrix multiplication.
linalg.broadcast - Static broadcast operator
linalg.ceil - Applies ceil(x) elementwise.
linalg.contract - Performs a contraction on two inputs, accumulating into the third.
linalg.conv_1d - Performs 1-D convolution with no channels.
linalg.conv_1d_ncw_fcw - Performs 1-D convolution.
linalg.conv_1d_nwc_wcf - Performs 1-D convolution.
linalg.conv_2d - Performs 2-D convolution with no channels.
linalg.conv_2d_nchw_fchw - Performs 2-D convolution.
linalg.conv_2d_nchw_fchw_q - Performs 2-D convolution with zero point offsets.
linalg.conv_2d_ngchw_fgchw - Performs 2-D grouped convolution.
linalg.conv_2d_ngchw_gfchw - Performs 2-D grouped convolution.
linalg.conv_2d_ngchw_gfchw_q - Performs 2-D grouped convolution with zero-point offsets.
linalg.conv_2d_nhwc_fhwc - Performs 2-D convolution.
linalg.conv_2d_nhwc_fhwc_q - Performs 2-D convolution with zero point offsets.
linalg.conv_2d_nhwc_hwcf - Performs 2-D convolution.
linalg.conv_2d_nhwc_hwcf_q - Performs 2-D convolution with zero point offsets.
linalg.conv_2d_nhwgc_gfhwc - Performs 2-D grouped convolution.
linalg.conv_2d_nhwgc_gfhwc_q - Performs 2-D grouped convolution with zero point offsets.
linalg.conv_3d - Performs 3-D convolution with no channels.
linalg.conv_3d_ncdhw_fcdhw - Performs 3-D convolution.
linalg.conv_3d_ndhwc_dhwcf - Performs 3-D convolution.
linalg.conv_3d_ndhwc_dhwcf_q - Performs 3-D convolution with zero point offsets.
linalg.copy - Copies the tensor elementwise.
linalg.depthwise_conv_1d_ncw_cw - Performs depth-wise 1-D convolution.
linalg.depthwise_conv_1d_nwc_wc - Performs depth-wise 1-D convolution.
linalg.depthwise_conv_1d_nwc_wcm - Performs depth-wise 1-D convolution.
linalg.depthwise_conv_2d_nchw_chw - Performs depth-wise 2-D convolution.
linalg.depthwise_conv_2d_nhwc_hwc - Performs depth-wise 2-D convolution.
linalg.depthwise_conv_2d_nhwc_hwc_q - Performs depth-wise 2-D convolution.
linalg.depthwise_conv_2d_nhwc_hwcm - Performs depth-wise 2-D convolution.
linalg.depthwise_conv_2d_nhwc_hwcm_q - Performs depth-wise 2-D convolution.
linalg.depthwise_conv_3d_ncdhw_cdhw - Performs depth-wise 3-D convolution.
linalg.depthwise_conv_3d_ndhwc_dhwc - Performs depth-wise 3-D convolution.
linalg.depthwise_conv_3d_ndhwc_dhwcm - Performs depth-wise 3-D convolution.
linalg.div - Divides the first tensor by the second tensor, elementwise.
linalg.div_unsigned - Divides the first tensor by the second tensor, elementwise. For integer
types, performs an unsigned division.
linalg.dot - Performs a dot product of two vectors to a scalar result.
linalg.elementwise - Performs element-wise operation
linalg.erf - Applies erf(x) elementwise.
linalg.exp - Applies exp(x) elementwise.
linalg.fill - Fills the output tensor with the given value.
linalg.fill_rng_2d - Fills the output tensor with pseudo random numbers.
linalg.floor - Applies floor(x) elementwise.
linalg.generic
linalg.index
linalg.log - Applies log(x) elementwise.
linalg.map - Elementwise operations
linalg.matmul -
linalg.matvec - Performs a matrix-vector multiplication.
linalg.max - Takes the max (signed) between two inputs, elementwise.
linalg.min - Takes the min (signed) between two inputs, elementwise.
linalg.mmt4d - Performs a matrix-matrix-transpose multiplication of two 4D inputs.
linalg.mul - Multiplies two tensors elementwise.
linalg.negf - Applies negf(x) elementwise.
linalg.pack - linalg.pack operation
linalg.pooling_nchw_max - Performs max pooling.
linalg.pooling_nchw_sum - Performs sum pooling.
linalg.pooling_ncw_max - Performs max pooling.
linalg.pooling_ncw_sum - Performs sum pooling.
linalg.pooling_ndhwc_max - Performs 3D max pooling.
linalg.pooling_ndhwc_min - Performs 3D min pooling.
linalg.pooling_ndhwc_sum - Performs 3D sum pooling.
linalg.pooling_nhwc_max - Performs max pooling.
linalg.pooling_nhwc_max_unsigned - Performs unsigned max pooling.
linalg.pooling_nhwc_min - Performs min pooling.
linalg.pooling_nhwc_min_unsigned - Performs unsigned min pooling.
linalg.pooling_nhwc_sum - Performs sum pooling.
linalg.pooling_nwc_max - Performs max pooling.
linalg.pooling_nwc_max_unsigned - Performs unsigned max pooling.
linalg.pooling_nwc_min - Performs min pooling.
linalg.pooling_nwc_min_unsigned - Performs unsigned min pooling.
linalg.pooling_nwc_sum - Performs sum pooling.
linalg.powf - Takes the powf(lhs, rhs) between two inputs, elementwise. For powf(arg, 2) use linalg.square.
linalg.quantized_batch_matmul - Performs a batched matrix multiplication of two 3D inputs.
linalg.quantized_matmul - Performs a matrix multiplication of two 2D inputs.
linalg.reciprocal - Applies reciprocal(x) elementwise.
linalg.reduce - Reduce operator
linalg.round - Applies round(x) elementwise.
linalg.rsqrt - Applies rsqrt(x) elementwise.
linalg.select - Chooses one value based on a binary condition supplied as its first operand.
linalg.softmax - Softmax operator
linalg.sqrt - Applies sqrt(x) elementwise.
linalg.square - Applies square(x) elementwise.
linalg.sub - Subtracts two tensors elementwise.
linalg.tanh - Applies tanh(x) elementwise.
linalg.transpose - Transpose operator
linalg.unpack - linalg.unpack operation
linalg.vecmat - Performs a vector-matrix multiplication.
linalg.winograd_filter_transform - Winograd filter transform operator
linalg.winograd_input_transform - Winograd input transform operator
linalg.winograd_output_transform - Winograd output transform operator
linalg.yield - Linalg yield operation
Functions
linalg.abs - Applies abs(x) elementwise.
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.add - Adds two tensors elementwise.
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
The shapes and element types must be identical. The appropriate casts, broadcasts and reductions should be done before calling this op.
This means reduction/broadcast/element cast semantics are explicit. Further
passes can take that into account when lowering this code. For example,
a linalg.broadcast + linalg.add sequence can be lowered to a
linalg.generic with different affine maps for the two operands.
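The explicit-broadcast rule above can be illustrated with a minimal plain-Python sketch. The helper names `broadcast_rows` and `add_same_shape` are hypothetical and exist only for this illustration; nested lists stand in for tensors.

```python
def broadcast_rows(vec, cols):
    """Replicate a length-N vector into an N x cols matrix (adds a trailing dim)."""
    return [[v] * cols for v in vec]


def add_same_shape(lhs, rhs):
    """Elementwise add that rejects mismatched shapes, as linalg.add requires."""
    if len(lhs) != len(rhs) or any(len(a) != len(b) for a, b in zip(lhs, rhs)):
        raise ValueError("shapes must be identical; broadcast explicitly first")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(lhs, rhs)]


bias = [10, 20]
matrix = [[1, 2, 3], [4, 5, 6]]
# The broadcast is a separate, explicit step before the add.
result = add_same_shape(broadcast_rows(bias, 3), matrix)
```

Keeping the broadcast as its own step is what lets a later pass fold both steps into one generic op.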
linalg.batch_matmul - Performs a batched matrix multiplication of two 3D inputs.
Attributes
indexing_maps - Optional, AffineMapArrayAttr, AffineMap array attribute
cast - Optional, TypeFnAttr, allowed 32-bit signless integer cases: 0, 1
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting
them to the same data type as the accumulator/output.
Broadcast and Transpose semantics can be applied by specifying the explicit attribute
'indexing_maps' as shown below. This is a list attribute, so it must include maps for all
arguments if specified.
Example Transpose:
```mlir
linalg.batch_matmul
indexing_maps = [affine_map<(batch, m, n, k) -> (batch, k, m)>, // transpose
affine_map<(batch, m, n, k) -> (batch, k, n)>,
affine_map<(batch, m, n, k) -> (batch, m, n)>]
ins(%arg0, %arg1 : memref<2x5x3xf32>,memref<2x5x7xf32>)
outs(%arg2: memref<2x3x7xf32>)
```
Example Broadcast:
```mlir
linalg.batch_matmul
indexing_maps = [affine_map<(batch, m, n, k) -> (k)>, // broadcast
affine_map<(batch, m, n, k) -> (batch, k, n)>,
affine_map<(batch, m, n, k) -> (batch, m, n)>]
ins(%arg0, %arg1 : memref<5xf32>, memref<2x5x7xf32>)
outs(%arg2: memref<2x3x7xf32>)
```
Example Broadcast and Transpose:
```mlir
linalg.batch_matmul
indexing_maps = [affine_map<(batch, m, n, k) -> (m, k)>, // broadcast
affine_map<(batch, m, n, k) -> (batch, n, k)>, // transpose
affine_map<(batch, m, n, k) -> (batch, m, n)>]
ins(%arg0, %arg1 : memref<3x5xf32>, memref<2x7x5xf32>)
outs(%arg2: memref<2x3x7xf32>)
```
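To make the role of 'indexing_maps' concrete, here is a hedged reference sketch in plain Python. The function `batch_matmul` and its lambda-based maps are illustrative stand-ins for the affine maps, not part of Beaver or MLIR. It evaluates the "Broadcast and Transpose" variant on small nested lists.

```python
def batch_matmul(a, b, c, map_a, map_b, map_c, batch, m, n, k):
    """Accumulate into c over the (batch, m, n, k) iteration space.

    Each map turns an iteration point into the index tuple of one operand,
    mirroring one affine_map entry of indexing_maps.
    """
    def at(x, idx):
        for i in idx:
            x = x[i]
        return x

    for bi in range(batch):
        for mi in range(m):
            for ni in range(n):
                for ki in range(k):
                    point = (bi, mi, ni, ki)
                    ob, om, on = map_c(*point)
                    c[ob][om][on] += at(a, map_a(*point)) * at(b, map_b(*point))
    return c


# A is (m, k) and broadcast over batch; B is stored transposed as (batch, n, k).
a = [[1, 2], [3, 4]]
b = [[[1, 0], [0, 1]], [[2, 0], [0, 2]]]
c = [[[0, 0], [0, 0]] for _ in range(2)]
batch_matmul(a, b, c,
             lambda bt, m, n, k: (m, k),       # broadcast
             lambda bt, m, n, k: (bt, n, k),   # transpose
             lambda bt, m, n, k: (bt, m, n),
             batch=2, m=2, n=2, k=2)
```

Only the maps change between the default, transposed, and broadcast forms; the loop nest stays the same.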
linalg.batch_matvec - Performs a batched matrix-vector multiplication.
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.batch_mmt4d - Performs a batched matrix-matrix-transpose multiplication of two
batched-4D (5D) inputs.
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Apart from the outermost batch dimension, which has the same semantics as in linalg.batch_matmul, the differences from linalg.batch_matmul in the non-batch dimensions are the same as those between linalg.mmt4d and linalg.matmul. See the description of linalg.mmt4d.
linalg.batch_reduce_matmul - Performs a batch-reduce matrix multiplication on two inputs.
The partial multiplication results are reduced into a 2D output.
Attributes
indexing_maps - Optional, AffineMapArrayAttr, AffineMap array attribute
cast - Optional, TypeFnAttr, allowed 32-bit signless integer cases: 0, 1
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
Broadcast and Transpose semantics can be applied by specifying the explicit attribute 'indexing_maps' as shown below. This is a list attribute, so it must include maps for all arguments if specified.
Example Transpose:
linalg.batch_reduce_matmul
indexing_maps = [affine_map<(batch, m, n, k) -> (batch, k, m)>, // transpose
affine_map<(batch, m, n, k) -> (batch, k, n)>,
affine_map<(batch, m, n, k) -> (m, n)>]
ins(%arg0, %arg1 : memref<2x5x3xf32>,memref<2x5x7xf32>)
outs(%arg2: memref<3x7xf32>)
Example Broadcast:
linalg.batch_reduce_matmul
indexing_maps = [affine_map<(batch, m, n, k) -> (k)>, // broadcast
affine_map<(batch, m, n, k) -> (batch, k, n)>,
affine_map<(batch, m, n, k) -> (m, n)>]
ins(%arg0, %arg1 : memref<5xf32>, memref<2x5x7xf32>)
outs(%arg2: memref<3x7xf32>)
Example Broadcast and Transpose:
linalg.batch_reduce_matmul
indexing_maps = [affine_map<(batch, m, n, k) -> (m, k)>, // broadcast
affine_map<(batch, m, n, k) -> (batch, n, k)>, // transpose
affine_map<(batch, m, n, k) -> (m, n)>]
ins(%arg0, %arg1 : memref<3x5xf32>, memref<2x7x5xf32>)
outs(%arg2: memref<3x7xf32>)
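As a rough illustration of how the batch dimension is reduced into the 2D output, the following plain-Python sketch uses the default (non-transposed, non-broadcast) layout. The function name mirrors the op only for readability; it is not an API of Beaver or MLIR.

```python
def batch_reduce_matmul(a, b, c):
    """c[m][n] += sum over batch and k of a[batch][m][k] * b[batch][k][n]."""
    for a_mat, b_mat in zip(a, b):
        for mi, a_row in enumerate(a_mat):
            for ki, a_val in enumerate(a_row):
                for ni, b_val in enumerate(b_mat[ki]):
                    c[mi][ni] += a_val * b_val
    return c


a = [[[1, 2]], [[3, 4]]]        # batch=2, m=1, k=2
b = [[[1], [1]], [[2], [2]]]    # batch=2, k=2, n=1
out = batch_reduce_matmul(a, b, [[0]])
out_with_init = batch_reduce_matmul(a, b, [[100]])
```

The second call shows that the existing contents of the output act as the initial accumulator value.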
linalg.batch_vecmat - Performs a batched vector-matrix multiplication.
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
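The difference between the vector-matrix and matrix-vector forms is easy to lose in the names, so here is a small plain-Python sketch of both. It assumes a zero-initialized accumulator and the operand order implied by the op names; the functions are illustrative only.

```python
def batch_vecmat(vecs, mats):
    """out[b][n] = sum_k vecs[b][k] * mats[b][k][n]."""
    return [[sum(v[k] * mat[k][n] for k in range(len(v)))
             for n in range(len(mat[0]))]
            for v, mat in zip(vecs, mats)]


def batch_matvec(mats, vecs):
    """out[b][m] = sum_k mats[b][m][k] * vecs[b][k]."""
    return [[sum(row[k] * v[k] for k in range(len(v))) for row in mat]
            for mat, v in zip(mats, vecs)]


mats = [[[1, 2], [3, 4]]]   # batch=1, 2x2 matrix
vecs = [[1, 10]]            # batch=1, length-2 vector
vm = batch_vecmat(vecs, mats)
mv = batch_matvec(mats, vecs)
```

For a non-symmetric matrix the two results differ, which is why the ops are distinct.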
linalg.broadcast - Static broadcast operator
Attributes
dimensions - Single, DenseI64ArrayAttr, i64 dense array attribute
Operands
input - Single, TensorOrMemref, memref of any type values or ranked tensor of any type values
init - Single, TensorOrMemref, memref of any type values or ranked tensor of any type values
Results
result - Variadic, AnyTensor, variadic of tensor of any type values
Description
Broadcast the input into the given shape by adding dimensions.
Example:
%bcast = linalg.broadcast
ins(%input:tensor<16xf32>)
outs(%init:tensor<16x64xf32>)
dimensions = [1]
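A minimal plain-Python sketch of what `dimensions` means, assuming it lists the output dims that are added while the remaining output dims index the input in order. The `broadcast` helper and its dict-based result are illustrative choices, not Beaver or MLIR APIs.

```python
from itertools import product


def broadcast(inp, out_shape, dimensions):
    """Map every output index to the input value it is broadcast from."""
    kept = [d for d in range(len(out_shape)) if d not in dimensions]
    out = {}
    for idx in product(*(range(s) for s in out_shape)):
        src = inp
        for d in kept:
            src = src[idx[d]]   # only the non-added dims select from the input
        out[idx] = src
    return out


# Analogue of tensor<2xf32> -> tensor<2x3xf32> with dimensions = [1].
res = broadcast([7, 8], (2, 3), dimensions=[1])
```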
linalg.ceil - Applies ceil(x) elementwise.
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.contract - Performs a contraction on two inputs, accumulating into the third.
Attributes
indexing_maps - Single, AffineMapArrayAttr, AffineMap array attribute
cast - Optional, TypeFnAttr, allowed 32-bit signless integer cases: 0, 1
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyShaped, variadic of shaped of any type values
Description
The semantics of contracting inputs A and B on top of C to produce
output D is given by
D[H] = (SUM_{(I ∪ J) \ H} A[I] * B[J]) + C[H]
where I, J, and H are tuples of (pairwise distinct) dimension
identifiers - meant to range over valid indices - corresponding to the
results of the mandatory (projected permutation) indexing_maps for A,
B and C. SUM_{dims} means reduce over all valid indices for the
dimensions in the set dims (with I, J, and H treated as sets of
dim identifiers).
The iteration space consists of all dimensions in I, J and H, i.e. the
domain of each of the affine_maps. Like for einsums, the iteration type of
each dim is inferred and is either:
- reduction: the dim is used to index into A and B but not C. Per the above semantics, these dims will be contracted, i.e. reduced over.
- parallel: the dim is used to index into C and at least one of A and B, and - deriving from matmul terminology - is either an "M-like" dim (if used on A and C), an "N-like" dim (if used on B and C) or a "batch"-dim (if used to index into A, B, and C).
For example, batch-matmul is given by I = ⟨ b, m, k ⟩, J = ⟨ b, k, n ⟩,
H = ⟨ b, m, n ⟩ (with k as a contracting reduction-dimension while m,
n and b have parallel iteration-type) and gets represented as:
%D = linalg.contract
indexing_maps = [affine_map<(batch, m, n, k) -> (batch, m, k)>,
affine_map<(batch, m, n, k) -> (batch, k, n)>,
affine_map<(batch, m, n, k) -> (batch, m, n)>]
ins(%A, %B: tensor<?x?x?xf32>, tensor<?x?x?xf32>)
outs(%C: tensor<?x?x?xf32>) -> tensor<?x?x?xf32>
Note that by permuting dims in the affine_maps' results, accesses
to the inputs and output can be arbitrarily transposed. Similarly, arbitrary
broadcasts can be achieved through leaving out dims on either input operand.
For example, the following is a variant of batch-matmul with a transposition
applied to A while B's 2D-matrix gets broadcast along the batch dim:
linalg.contract
indexing_maps = [affine_map<(batch, m, n, k) -> (batch, k, m)>,
affine_map<(batch, m, n, k) -> (k, n)>,
affine_map<(batch, m, n, k) -> (batch, m, n)>]
ins(%A, %B: memref<?x?x?xf32>, memref<?x?xf32>)
outs(%C: memref<?x?x?xf32>)
Numeric casting is performed on the operands to the inner multiplication, promoting/truncating them to the same data type as the accumulator/output.
TODO: Allow control over the combining/accumulating op and possibly the
multiplication op.
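The iteration-type inference described above can be sketched in a few lines of plain Python. The `classify_dims` helper is hypothetical: it takes the result tuples of the three indexing maps as dim names, and rejecting dims that fit neither rule is an assumption made for this sketch.

```python
def classify_dims(map_a, map_b, map_c):
    """Infer the iteration type of each dim from the indexing-map results."""
    a, b, c = set(map_a), set(map_b), set(map_c)
    kinds = {}
    for dim in sorted(a | b | c):
        if dim in c:
            if dim in a and dim in b:
                kinds[dim] = "parallel (batch)"
            elif dim in a:
                kinds[dim] = "parallel (M-like)"
            elif dim in b:
                kinds[dim] = "parallel (N-like)"
            else:
                # Assumption: a dim that only indexes the output is rejected.
                raise ValueError(f"{dim} indexes only the output")
        elif dim in a and dim in b:
            kinds[dim] = "reduction"
        else:
            # Assumption: a dim on a single input and not on C is rejected.
            raise ValueError(f"{dim} is not a valid contraction dim")
    return kinds


# Batch-matmul: I = (batch, m, k), J = (batch, k, n), H = (batch, m, n).
kinds = classify_dims(("batch", "m", "k"), ("batch", "k", "n"), ("batch", "m", "n"))
# Variant with A transposed and B broadcast along batch.
variant = classify_dims(("batch", "k", "m"), ("k", "n"), ("batch", "m", "n"))
```

In the variant, `batch` is no longer used by B, so under the rules above it is classified as M-like rather than as a batch dim.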
linalg.conv_1d - Performs 1-D convolution with no channels.
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
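For orientation, a minimal plain-Python sketch of the channel-free 1-D case. It assumes the usual Linalg form without a kernel flip, i.e. out[ow] += inp[ow + kw] * kernel[kw], and the function is illustrative only.

```python
def conv_1d(inp, kernel, out):
    """Accumulate out[ow] += inp[ow + kw] * kernel[kw] for every (ow, kw)."""
    for ow in range(len(out)):
        for kw, weight in enumerate(kernel):
            out[ow] += inp[ow + kw] * weight
    return out


res = conv_1d([1, 2, 3, 4], [1, 10], [0, 0, 0])
```

The output length is the input length minus the kernel length plus one, since no padding is applied.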
linalg.conv_1d_ncw_fcw - Performs 1-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NCW.
- Kernel: FCW.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
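The effect of `strides` and `dilations` on the NCW/FCW layout can be sketched in plain Python as follows. The indexing formula in the docstring is an assumption based on the usual Linalg convolution form, and the function itself is illustrative rather than any library's API.

```python
def conv_1d_ncw_fcw(inp, kernel, out, stride=1, dilation=1):
    """out[n][f][ow] += inp[n][c][ow*stride + kw*dilation] * kernel[f][c][kw]."""
    for n, image in enumerate(inp):
        for f, filt in enumerate(kernel):
            for ow in range(len(out[n][f])):
                for c, taps in enumerate(filt):
                    for kw, weight in enumerate(taps):
                        out[n][f][ow] += image[c][ow * stride + kw * dilation] * weight
    return out


inp = [[[1, 2, 3, 4, 5]]]   # N=1, C=1, W=5
kernel = [[[1, 1]]]         # F=1, C=1, KW=2
out = conv_1d_ncw_fcw(inp, kernel, [[[0, 0]]], stride=2, dilation=2)
```

With a dilation of 2 the two kernel taps are two elements apart, and with a stride of 2 the window advances two elements per output position.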
linalg.conv_1d_nwc_wcf - Performs 1-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_2d - Performs 2-D convolution with no channels.
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_2d_nchw_fchw - Performs 2-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NCHW.
- Kernel: FCHW.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_2d_nchw_fchw_q - Performs 2-D convolution with zero point offsets.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NCHW.
- Kernel: FCHW.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. This includes the zero point offsets common to quantized operations.
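A hedged plain-Python sketch of how the zero point offsets enter the computation. It assumes each input and kernel value has its zero point subtracted before the multiply, with unit strides and dilations; the function and its parameter names `in_zp` and `k_zp` are illustrative.

```python
def conv_2d_nchw_fchw_q(inp, kernel, in_zp, k_zp, out):
    """out[n][f][oh][ow] += (inp[...] - in_zp) * (kernel[...] - k_zp)."""
    for n, image in enumerate(inp):
        for f, filt in enumerate(kernel):
            for oh, out_row in enumerate(out[n][f]):
                for ow in range(len(out_row)):
                    for c, plane in enumerate(filt):
                        for kh, taps in enumerate(plane):
                            for kw, weight in enumerate(taps):
                                x = image[c][oh + kh][ow + kw]
                                out_row[ow] += (x - in_zp) * (weight - k_zp)
    return out


inp = [[[[130, 131], [132, 133]]]]   # N=1, C=1, H=2, W=2
kernel = [[[[3, 4]]]]                # F=1, C=1, KH=1, KW=2
out = conv_2d_nchw_fchw_q(inp, kernel, in_zp=128, k_zp=2, out=[[[[0], [0]]]])
```

Python integers do not overflow, so the promotion to the accumulator type is implicit here; in the op it is done by the numeric cast.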
linalg.conv_2d_ngchw_fgchw - Performs 2-D grouped convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NGCHW.
- Kernel: FGCHW.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
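The grouped layouts differ only in the order of the leading kernel dims. The plain-Python sketch below assumes the output is laid out as N, G, F, H, W, that F counts filters per group, and that strides and dilations are 1; the function and its `kernel_layout` switch are hypothetical.

```python
def conv_2d_ngchw(inp, kernel, out, kernel_layout="fgchw"):
    """Grouped 2-D convolution: each group only sees its own input channels.

    The kernel is indexed kernel[f][g][c][kh][kw] for "fgchw" and
    kernel[g][f][c][kh][kw] for "gfchw".
    """
    for n, image in enumerate(inp):
        for g, group in enumerate(image):
            for f, out_plane in enumerate(out[n][g]):
                filt = kernel[f][g] if kernel_layout == "fgchw" else kernel[g][f]
                for oh, out_row in enumerate(out_plane):
                    for ow in range(len(out_row)):
                        for c, taps2d in enumerate(filt):
                            for kh, taps in enumerate(taps2d):
                                for kw, weight in enumerate(taps):
                                    out_row[ow] += group[c][oh + kh][ow + kw] * weight
    return out


inp = [[[[[1, 2]]], [[[3, 4]]]]]              # N=1, G=2, C=1, H=1, W=2
kernel_fg = [[[[[10]]], [[[100]]]]]           # F=1, G=2, C=1, KH=1, KW=1
kernel_gf = [[[[[10]]]], [[[[100]]]]]         # G=2, F=1, C=1, KH=1, KW=1
out_fg = conv_2d_ngchw(inp, kernel_fg, [[[[[0, 0]]], [[[0, 0]]]]])
out_gf = conv_2d_ngchw(inp, kernel_gf, [[[[[0, 0]]], [[[0, 0]]]]], "gfchw")
```

Both kernel layouts produce the same output; only the storage order of the filter differs.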
linalg.conv_2d_ngchw_gfchw - Performs 2-D grouped convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NGCHW.
- Kernel: GFCHW.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_2d_ngchw_gfchw_q - Performs 2-D grouped convolution with zero-point offsets.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NGCHW.
- Kernel: GFCHW.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. This includes the zero point offsets common to quantized operations.
linalg.conv_2d_nhwc_fhwc - Performs 2-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NHWC.
- Kernel: FHWC.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_2d_nhwc_fhwc_q - Performs 2-D convolution with zero point offsets.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NHWC.
- Kernel: FHWC.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. This includes the zero point offsets common to quantized operations.
linalg.conv_2d_nhwc_hwcf - Performs 2-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NHWC.
- Kernel: HWCF.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_2d_nhwc_hwcf_q - Performs 2-D convolution with zero point offsets.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NHWC.
- Kernel: HWCF.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. This includes the zero point offsets common to quantized operations.
linalg.conv_2d_nhwgc_gfhwc - Performs 2-D grouped convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NHWGC.
- Kernel: GFHWC.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_2d_nhwgc_gfhwc_q - Performs 2-D grouped convolution with zero point offsets.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NHWGC.
- Kernel: GFHWC.
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. This includes the zero point offsets common to quantized operations.
linalg.conv_3d - Performs 3-D convolution with no channels.
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_3d_ncdhw_fcdhw - Performs 3-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_3d_ndhwc_dhwcf - Performs 3-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.conv_3d_ndhwc_dhwcf_q - Performs 3-D convolution with zero point offsets.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. This includes the zero point offsets common to quantized operations.
linalg.copy - Copies the tensor elementwise.
Attributes
cast - Optional, TypeFnAttr, allowed 32-bit signless integer cases: 0, 1
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
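The two `cast` cases can be illustrated with raw bit patterns. This plain-Python sketch assumes case 0 means a signed cast and case 1 an unsigned cast (named `cast_signed` and `cast_unsigned` here); that mapping and both helper functions are assumptions for illustration, not taken from this page.

```python
def cast_to_wider(bits, src_width, cast="cast_signed"):
    """Widen an integer given as a raw bit pattern of src_width bits."""
    bits &= (1 << src_width) - 1
    if cast == "cast_signed" and bits >> (src_width - 1):
        return bits - (1 << src_width)   # sign bit set: sign-extend
    return bits                          # otherwise zero-extend


def linalg_copy(src, src_width, cast="cast_signed"):
    """Copy elementwise, promoting each element to the wider output type."""
    return [cast_to_wider(x, src_width, cast) for x in src]


signed = linalg_copy([0x7F, 0xFF], 8)
unsigned = linalg_copy([0x7F, 0xFF], 8, cast="cast_unsigned")
```

The same 8-bit pattern 0xFF widens to -1 under the signed cast and to 255 under the unsigned cast.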
linalg.depthwise_conv_1d_ncw_cw - Performs depth-wise 1-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. The multiplier is set to 1, which is a special case for most depthwise convolutions.
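With the multiplier fixed at 1, each channel is convolved with its own single filter and channels are never mixed. A minimal plain-Python sketch, assuming the NCW input and CW kernel layouts implied by the op name; the function is illustrative only.

```python
def depthwise_conv_1d_ncw_cw(inp, kernel, out, stride=1, dilation=1):
    """out[n][c][ow] += inp[n][c][ow*stride + kw*dilation] * kernel[c][kw]."""
    for n, image in enumerate(inp):
        for c, channel in enumerate(image):
            for ow in range(len(out[n][c])):
                for kw, weight in enumerate(kernel[c]):
                    out[n][c][ow] += channel[ow * stride + kw * dilation] * weight
    return out


inp = [[[1, 2, 3], [4, 5, 6]]]   # N=1, C=2, W=3
kernel = [[1, 1], [1, -1]]       # C=2, KW=2
out = depthwise_conv_1d_ncw_cw(inp, kernel, [[[0, 0], [0, 0]]])
```

Unlike a regular convolution, there is no sum over input channels: the output keeps the input's channel count.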
linalg.depthwise_conv_1d_nwc_wc - Performs depth-wise 1-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. The multiplier is set to 1, which is a special case for most depthwise convolutions.
linalg.depthwise_conv_1d_nwc_wcm - Performs depth-wise 1-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
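The trailing M in the kernel layout is the channel multiplier: every input channel produces M output values per position. The plain-Python sketch below assumes an output laid out as N, W, C, M and the indexing formula shown in the docstring; the function is illustrative only.

```python
def depthwise_conv_1d_nwc_wcm(inp, kernel, out, stride=1, dilation=1):
    """out[n][ow][c][m] += inp[n][ow*stride + kw*dilation][c] * kernel[kw][c][m]."""
    for n, image in enumerate(inp):
        for ow, out_pos in enumerate(out[n]):
            for kw, taps in enumerate(kernel):
                pixel = image[ow * stride + kw * dilation]
                for c, mults in enumerate(taps):
                    for m, weight in enumerate(mults):
                        out_pos[c][m] += pixel[c] * weight
    return out


inp = [[[1], [2], [3]]]            # N=1, W=3, C=1
kernel = [[[1, 10]], [[1, 10]]]    # KW=2, C=1, M=2
out = depthwise_conv_1d_nwc_wcm(inp, kernel, [[[[0, 0]], [[0, 0]]]])
```

Setting M to 1 and dropping that dim recovers the multiplier-free variants.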
linalg.depthwise_conv_2d_nchw_chw - Performs depth-wise 2-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. The multiplier is set to 1, which is a special case for most depthwise convolutions.
linalg.depthwise_conv_2d_nhwc_hwc - Performs depth-wise 2-D convolution.
Attributes
strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs - Variadic, AnyType, variadic of any type
outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. The multiplier is set to 1, which is a special case for most depthwise convolutions.
linalg.depthwise_conv_2d_nhwc_hwc_q - Performs depth-wise 2-D convolution.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.depthwise_conv_2d_nhwc_hwcm - Performs depth-wise 2-D convolution.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.depthwise_conv_2d_nhwc_hwcm_q - Performs depth-wise 2-D convolution.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.depthwise_conv_3d_ncdhw_cdhw - Performs depth-wise 3-D convolution.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. Multiplier is set to 1 which is a special case for most depthwise convolutions.
linalg.depthwise_conv_3d_ndhwc_dhwc - Performs depth-wise 3-D convolution.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. Multiplier is set to 1 which is a special case for most depthwise convolutions.
linalg.depthwise_conv_3d_ndhwc_dhwcm - Performs depth-wise 3-D convolution.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.div - Divides the first tensor by the second tensor, elementwise.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
The shapes and element types must be identical. Any required casts, broadcasts and reductions must be performed before calling this op.
This makes reduction, broadcast and element-cast semantics explicit. Later
passes can take that into account when lowering this code. For example,
a linalg.broadcast + linalg.div sequence can be lowered to a
linalg.generic with different affine maps for the two operands.
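The "explicit broadcast first, then divide" contract can be sketched in plain Python. The helper names below are hypothetical stand-ins for linalg.broadcast and linalg.div, operating on nested lists; they only illustrate that the divide step rejects mismatched shapes rather than broadcasting implicitly.

```python
def broadcast_rows(vec, rows):
    """Explicitly broadcast a 1-D list to shape [rows, len(vec)]."""
    return [list(vec) for _ in range(rows)]


def div_elementwise(lhs, rhs):
    """Elementwise division that requires identical 2-D shapes."""
    if len(lhs) != len(rhs) or any(len(a) != len(b) for a, b in zip(lhs, rhs)):
        raise ValueError("shapes must be identical; broadcast explicitly first")
    return [[a / b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(lhs, rhs)]


lhs = [[2.0, 9.0], [4.0, 12.0]]
scale = [2.0, 3.0]
# The caller performs the broadcast; the divide never does it implicitly.
result = div_elementwise(lhs, broadcast_rows(scale, rows=2))
```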
linalg.div_unsigned - Divides the first tensor by the second tensor, elementwise. For integer
types, performs an unsigned division.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
The shapes and element types must be identical. Any required casts, broadcasts and reductions must be performed before calling this op.
This makes reduction, broadcast and element-cast semantics explicit. Later
passes can take that into account when lowering this code. For example,
a linalg.broadcast + linalg.div_unsigned sequence can be lowered to a
linalg.generic with different affine maps for the two operands.
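The difference between signed and unsigned integer division only shows up when the sign bit is set. The sketch below models signless integers as Python ints and reinterprets the bit pattern for the unsigned case; the helper names and the default 8-bit width are assumptions made for illustration, and the signed variant is assumed to truncate toward zero.

```python
def as_unsigned(value, bits):
    """Reinterpret a two's-complement value as an unsigned integer."""
    return value & ((1 << bits) - 1)


def div_unsigned(lhs, rhs, bits=8):
    """Elementwise division on the unsigned interpretation of each element."""
    return [as_unsigned(a, bits) // as_unsigned(b, bits) for a, b in zip(lhs, rhs)]


def div_signed(lhs, rhs):
    """Elementwise signed division, truncating toward zero (assumed behavior)."""
    return [
        (abs(a) // abs(b)) * (1 if (a < 0) == (b < 0) else -1)
        for a, b in zip(lhs, rhs)
    ]


# -8 as an 8-bit pattern is 0xF8, i.e. 248 when read as unsigned.
unsigned_result = div_unsigned([-8, 100], [2, 7])
signed_result = div_signed([-8, 100], [2, 7])
```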
linalg.dot - Performs a dot product of two vectors to a scalar result.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.elementwise - Performs element-wise operation
Attributes
kind- Single,ElementwiseKindAttr, allowed 32-bit signless integer cases: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
indexing_maps- Optional,AffineMapArrayAttr, AffineMap array attribute
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
The attribute kind describes the arithmetic operation to perform. The
operation kind can be unary (e.g. exp), binary (e.g. add) or ternary
(e.g. select).
By default, all indexing maps are identities. With the default indexing maps, all input and output shapes must match. The number of dims in each identity map equals the rank of the output type.
The user must provide affine maps for the operands and the result when a transpose and/or broadcast is needed on any operand. When maps are not provided, default identity maps are inferred for each operand.
Iterator types are always all parallel.
Iterator types are needed for constructing the underlying structured op.
The number of dims of the iterator types is inferred from the rank of the result type.
Example:
Defining a unary linalg.elementwise with default indexing-map:
%exp = linalg.elementwise
kind=#linalg.elementwise_kind<exp>
ins(%x : tensor<4x16x8xf32>)
outs(%y: tensor<4x16x8xf32>) -> tensor<4x16x8xf32>
Defining a binary linalg.elementwise with user-defined indexing-map:
%add = linalg.elementwise
kind=#linalg.elementwise_kind<add>
indexing_maps = [#transpose, #broadcast, #identity]
ins(%exp, %arg1 : tensor<4x16x8xf32>, tensor<4x16xf32>)
outs(%arg2: tensor<4x8x16xf32>) -> tensor<4x8x16xf32>
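The binary example does not spell out #transpose, #broadcast or #identity. A plausible reading, assumed here, is (d0, d1, d2) -> (d0, d2, d1) for the first input, (d0, d1, d2) -> (d0, d2) for the second, and the identity for the output. The sketch below applies those assumed maps to small nested lists to show how each output element picks its operands; the function and helper names are hypothetical.

```python
def at(tensor, index):
    """Read a nested-list tensor at a multi-dimensional index."""
    for i in index:
        tensor = tensor[i]
    return tensor


def elementwise_add(in0, in1, out_shape, map0, map1):
    """out[d0][d1][d2] = in0[map0(d0,d1,d2)] + in1[map1(d0,d1,d2)]."""
    d0_, d1_, d2_ = out_shape
    return [
        [
            [at(in0, map0(d0, d1, d2)) + at(in1, map1(d0, d1, d2)) for d2 in range(d2_)]
            for d1 in range(d1_)
        ]
        for d0 in range(d0_)
    ]


# Assumed maps; the source names them but does not define them.
transpose = lambda d0, d1, d2: (d0, d2, d1)
broadcast = lambda d0, d1, d2: (d0, d2)

x = [[[100 * i + 10 * j + k for k in range(3)] for j in range(4)] for i in range(2)]  # 2x4x3
y = [[1000 * i + j for j in range(4)] for i in range(2)]  # 2x4
out = elementwise_add(x, y, (2, 3, 4), transpose, broadcast)  # 2x3x4
```

The shapes here (2x4x3, 2x4 and 2x3x4) are scaled-down analogues of the 4x16x8, 4x16 and 4x8x16 tensors in the MLIR example.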
linalg.erf - Applies erf(x) elementwise.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.exp - Applies exp(x) elementwise.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.fill - Fills the output tensor with the given value.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Works for arbitrary ranked output tensors since the operation performs scalar accesses only and is thus rank polymorphic. Numeric casting is performed on the value operand, promoting it to the same data type as the output.
linalg.fill_rng_2d - Fills the output tensor with pseudo random numbers.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
The operation generates pseudo random numbers using a linear congruential generator. It provides no guarantees regarding the distribution of the generated random numbers. Instead of generating the random numbers sequentially, it instantiates one random number generator per data element and runs them in parallel. The seed operand and the indices of the data element seed the random number generation. The min and max operands limit the range of the generated random numbers.
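The "one generator per element" idea can be sketched as follows. This is an illustration only: the multiplier and increment are classic textbook LCG constants chosen for the example, and the way the seed and indices are mixed and the way the state is scaled into the range are assumptions, not a statement of what the op actually computes.

```python
MULTIPLIER = 1103515245  # illustrative LCG constants (assumption)
INCREMENT = 12345
MASK32 = 0xFFFFFFFF


def lcg_step(state):
    """One linear congruential step on a 32-bit state."""
    return (state * MULTIPLIER + INCREMENT) & MASK32


def fill_rng_2d(rows, cols, seed, lo, hi):
    """Fill a rows x cols list with values in [lo, hi)."""
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Every element derives its own state from (seed, i, j), so no
            # element depends on another and all could run in parallel.
            state = lcg_step((seed + i) & MASK32)
            state = lcg_step((state + j) & MASK32)
            row.append(lo + (hi - lo) * state / 2**32)
        out.append(row)
    return out


values = fill_rng_2d(3, 4, seed=7, lo=-1.0, hi=1.0)
```

Because each element's value is a pure function of the seed and its indices, rerunning with the same seed reproduces the same tensor regardless of evaluation order.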
linalg.floor - Applies floor(x) elementwise.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.generic
Attributes
indexing_maps- Single,AffineMapArrayAttr, AffineMap array attribute
iterator_types- Single,IteratorTypeArrayAttr, Iterator type should be an enum.
doc- Optional,StrAttr, string attribute
library_call- Optional,StrAttr, string attribute
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Generic Linalg op form where the key properties of the computation are
specified as attributes. In pretty form, a linalg.generic op is written
as:
linalg.generic #trait_attribute
ins(%A, %B : memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>)
outs(%C : memref<?x?xf32, stride_specification>)
attrs = {other-optional-attributes}
{region}
Where #trait_attribute is an alias of a dictionary attribute containing:
- doc [optional]: a documentation string
- indexing_maps: a list of AffineMapAttr, one AffineMapAttr per each input and output view. Such AffineMapAttr specifies the mapping between the loops and the indexing within each view.
- library_call [optional]: a StringAttr containing the name of an external library function that the linalg.generic operation maps to. The external library is assumed to be dynamically linked and no strong compile-time guarantees are provided. In the absence of such a library call, linalg.generic will always lower to loops.
- iterator_types: an ArrayAttr specifying the type of the enclosing loops. Each element of the list represents an iterator of one of the following types: parallel, reduction, window
Example: Defining a #matmul_trait attribute in MLIR can be done as follows:
#matmul_accesses = [
affine_map<(m, n, k) -> (m, k)>,
affine_map<(m, n, k) -> (k, n)>,
affine_map<(m, n, k) -> (m, n)>
]
#matmul_trait = {
doc = "C(m, n) += A(m, k) * B(k, n)",
indexing_maps = #matmul_accesses,
library_call = "linalg_matmul",
iterator_types = ["parallel", "parallel", "reduction"]
}
And can be reused in multiple places as:
linalg.generic #matmul_trait
ins(%A, %B : memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>)
outs(%C : memref<?x?xf32, stride_specification>)
attrs = {other-optional-attributes} {
^bb0(%a: f32, %b: f32, %c: f32) :
%d = arith.mulf %a, %b: f32
%e = arith.addf %c, %d: f32
linalg.yield %e : f32
}
This may lower to either:
call @linalg_matmul(%A, %B, %C) :
(memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>)
-> ()
or IR resembling:
scf.for %m = %c0 to %M step %c1 {
scf.for %n = %c0 to %N step %c1 {
scf.for %k = %c0 to %K step %c1 {
%a = memref.load %A[%m, %k] : memref<?x?xf32, stride_specification>
%b = memref.load %B[%k, %n] : memref<?x?xf32, stride_specification>
%c = memref.load %C[%m, %n] : memref<?x?xf32, stride_specification>
%d = arith.mulf %a, %b: f32
%e = arith.addf %c, %d: f32
memref.store %e, %C[%m, %n] : memref<?x?xf32, stride_specification>
}
}
}
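The relationship between indexing maps, the region and the loop nest can be mimicked with a small interpreter. The sketch below is a simplified model, not the Linalg lowering: the names generic, load and store are hypothetical, loop bounds are passed explicitly (the real op infers them from operand shapes), and iterator types are omitted because a sequential loop nest does not need them.

```python
from itertools import product


def load(tensor, index):
    """Read a nested-list tensor at a multi-dimensional index."""
    for i in index:
        tensor = tensor[i]
    return tensor


def store(tensor, index, value):
    """Write a nested-list tensor at a multi-dimensional index."""
    for i in index[:-1]:
        tensor = tensor[i]
    tensor[index[-1]] = value


def generic(indexing_maps, loop_bounds, inputs, output, body):
    """Run `body` at every point of the iteration space, updating `output`."""
    operands = list(inputs) + [output]
    for point in product(*(range(b) for b in loop_bounds)):
        # Each map turns the loop indices into an index for its operand.
        scalars = [load(t, m(*point)) for t, m in zip(operands, indexing_maps)]
        store(output, indexing_maps[-1](*point), body(*scalars))
    return output


# Python analogue of #matmul_accesses: C(m, n) += A(m, k) * B(k, n).
matmul_accesses = [
    lambda m, n, k: (m, k),
    lambda m, n, k: (k, n),
    lambda m, n, k: (m, n),
]
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[0.0, 0.0], [0.0, 0.0]]
generic(matmul_accesses, (2, 2, 2), [A, B], C, lambda a, b, c: c + a * b)
```

The lambda passed as body plays the role of the region: it receives one scalar per operand and returns the value that linalg.yield would produce.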
linalg.index
linalg.log - Applies log(x) elementwise.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.map - Elementwise operations
Operands
inputs- Variadic,TensorOrMemref, variadic of memref of any type values or ranked tensor of any type values
init- Single,TensorOrMemref, memref of any type values or ranked tensor of any type values
Results
result- Variadic,AnyTensor, variadic of tensor of any type values
Description
Models elementwise operations on tensors in terms of arithmetic operations on the corresponding elements.
Example:
%add = linalg.map
ins(%lhs, %rhs : tensor<64xf32>, tensor<64xf32>)
outs(%init: tensor<64xf32>)
(%lhs_elem: f32, %rhs_elem: f32) {
%0 = arith.addf %lhs_elem, %rhs_elem: f32
linalg.yield %0: f32
}
Shortened print form is available for simple maps where the body contains exactly two operations (the payload operation and a yield), the payload operation has the same number of operands as block arguments with operands matching block arguments in order, and the yield operand is the result of the payload operation.
The example above will be printed using the shortened form as:
%add = linalg.map { arith.addf }
ins(%lhs, %rhs : tensor<64xf32>, tensor<64xf32>)
outs(%init: tensor<64xf32>)
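In scalar terms, the map example amounts to applying the body to corresponding elements. A minimal 1-D sketch in Python, with the hypothetical helper linalg_map standing in for the op and a lambda standing in for the region:

```python
def linalg_map(body, inputs, init):
    """Apply `body` to corresponding elements of equally sized 1-D inputs."""
    if any(len(x) != len(init) for x in inputs):
        raise ValueError("all inputs must match the shape of init")
    # One body invocation per element position, as in the region above.
    return [body(*elems) for elems in zip(*inputs)]


lhs = [1.0, 2.0, 3.0]
rhs = [10.0, 20.0, 30.0]
init = [0.0, 0.0, 0.0]
added = linalg_map(lambda a, b: a + b, [lhs, rhs], init)
```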
linalg.matmul - Performs a matrix multiplication of two 2D inputs without broadcast or transpose.
Attributes
indexing_maps- Optional,AffineMapArrayAttr, AffineMap array attribute
cast- Optional,TypeFnAttr, allowed 32-bit signless integer cases: 0, 1
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
Broadcast and transpose semantics can be applied by specifying the explicit attribute 'indexing_maps' as shown below. This is a list attribute, so the list must include all the maps if specified.
Example Transpose:
linalg.matmul
indexing_maps = [affine_map<(m, n, k) -> (k, m)>, // transpose
affine_map<(m, n, k) -> (k, n)>,
affine_map<(m, n, k) -> (m, n)>]
ins(%arg0, %arg1 : memref<5x3xf32>, memref<5x7xf32>)
outs(%arg2: memref<3x7xf32>)
Example Broadcast:
linalg.matmul
indexing_maps = [affine_map<(m, n, k) -> (k)>, // broadcast
affine_map<(m, n, k) -> (k, n)>,
affine_map<(m, n, k) -> (m, n)>]
ins(%arg0, %arg1 : memref<5xf32>, memref<5x7xf32>)
outs(%arg2: memref<3x7xf32>)
Example Broadcast and transpose:
linalg.matmul
indexing_maps = [affine_map<(m, n, k) -> (k, m)>, // transpose
affine_map<(m, n, k) -> (k)>, // broadcast
affine_map<(m, n, k) -> (m, n)>]
ins(%arg0, %arg1 : memref<5x3xf32>, memref<5xf32>)
outs(%arg2: memref<3x7xf32>)
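The effect of the transpose and broadcast maps can be checked with reference loops. The sketch below is a plain-Python model with hypothetical names; it takes the (m, n, k) bounds explicitly and indexes each input through its map, so a map of (k, m) reads a transposed operand and a map of (k) reads a vector whose length is the reduction size.

```python
def at(tensor, index):
    """Read a nested-list tensor at a multi-dimensional index."""
    for i in index:
        tensor = tensor[i]
    return tensor


def matmul_with_maps(a, b, bounds, map_a, map_b):
    """C[m][n] += A[map_a(m,n,k)] * B[map_b(m,n,k)] over the given bounds."""
    m_, n_, k_ = bounds
    c = [[0.0] * n_ for _ in range(m_)]
    for m in range(m_):
        for n in range(n_):
            for k in range(k_):
                c[m][n] += at(a, map_a(m, n, k)) * at(b, map_b(m, n, k))
    return c


# Transpose: A is stored as K x M (2 x 3) and read through (k, m).
a_t = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
eye = [[1.0, 0.0], [0.0, 1.0]]
transposed = matmul_with_maps(
    a_t, eye, (3, 2, 2), lambda m, n, k: (k, m), lambda m, n, k: (k, n)
)

# Broadcast: A is a length-K vector reused for every row m.
vec = [1.0, 2.0]
b = [[1.0, 2.0], [3.0, 4.0]]
broadcasted = matmul_with_maps(
    vec, b, (3, 2, 2), lambda m, n, k: (k,), lambda m, n, k: (k, n)
)
```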
linalg.matvec - Performs a matrix-vector multiplication.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
linalg.max - Takes the max (signed) between two inputs, elementwise.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
The shapes and element types must be identical. Any required casts, broadcasts and reductions must be performed before calling this op.
This makes reduction, broadcast and element-cast semantics explicit. Later
passes can take that into account when lowering this code. For example,
a linalg.broadcast + linalg.max sequence can be lowered to a
linalg.generic with different affine maps for the two operands.
linalg.min - Takes the min (signed) between two inputs, elementwise.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
The shapes and element types must be identical. Any required casts, broadcasts and reductions must be performed before calling this op.
This makes reduction, broadcast and element-cast semantics explicit. Later
passes can take that into account when lowering this code. For example,
a linalg.broadcast + linalg.min sequence can be lowered to a
linalg.generic with different affine maps for the two operands.
linalg.mmt4d - Performs a matrix-matrix-transpose multiplication of two 4D inputs.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Differences from linalg.matmul:
- The right hand side is transposed, whence the 't' in 'mmt'.
- The input and output tensors have a 4D shape instead of a 2D shape. They are interpreted as 2D matrices with one level of 2D tile subdivision, whence the 2+2=4 dimensions. The inner tile dimensions are identified with '0' suffixes below, for instance the LHS matrix shape (M, K, M0, K0) reads as: MxK tiles, each of shape M0xK0.
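The tiled, transposed-RHS layout can be written out as six nested loops. The LHS shape (M, K, M0, K0) comes from the description above; the RHS shape (N, K, N0, K0) and accumulator shape (M, N, M0, N0) used below follow from the RHS being transposed and are stated here as assumptions of this sketch, which works on nested Python lists.

```python
def mmt4d(lhs, rhs, acc):
    """acc[m][n][m0][n0] += lhs[m][k][m0][k0] * rhs[n][k][n0][k0]."""
    M, K = len(lhs), len(lhs[0])
    M0, K0 = len(lhs[0][0]), len(lhs[0][0][0])
    N, N0 = len(rhs), len(rhs[0][0])
    for m in range(M):
        for n in range(N):
            for k in range(K):  # outer loops walk the tiles
                for m0 in range(M0):
                    for n0 in range(N0):
                        for k0 in range(K0):  # inner loops walk one tile
                            acc[m][n][m0][n0] += lhs[m][k][m0][k0] * rhs[n][k][n0][k0]
    return acc


# One tile per operand: M = N = K = 1, M0 = N0 = K0 = 2.
lhs = [[[[1, 2], [3, 4]]]]
rhs = [[[[5, 6], [7, 8]]]]  # stored N0 x K0, i.e. already transposed
acc = [[[[0, 0], [0, 0]]]]
mmt4d(lhs, rhs, acc)
```

Both operands are indexed with k0 in the innermost position, which is what makes the RHS "transposed" relative to a plain matmul.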
linalg.mul - Multiplies two tensors elementwise.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
The shapes and element types must be identical. Any required casts, broadcasts and reductions must be performed before calling this op.
This makes reduction, broadcast and element-cast semantics explicit. Later
passes can take that into account when lowering this code. For example,
a linalg.broadcast + linalg.mul sequence can be lowered to a
linalg.generic with different affine maps for the two operands.
linalg.negf - Applies negf(x) elementwise.
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.pack - linalg.pack operation
This op has support for result type inference.
Attributes
outer_dims_perm- Optional,DenseI64ArrayAttr, i64 dense array attribute
inner_dims_pos- Single,DenseI64ArrayAttr, i64 dense array attribute
static_inner_tiles- Single,DenseI64ArrayAttr, i64 dense array attribute
Operands
source- Single,AnyRankedTensor, ranked tensor of any type values
dest- Single,AnyRankedTensor, ranked tensor of any type values
padding_value- Optional,AnyType, any type
inner_tiles- Variadic,Index, index
Results
result- Single,AnyRankedTensor, ranked tensor of any type values
Description
The "pack" operation converts a source tensor of rank n into a result
tensor of rank n + k with a tiled and packed layout (maybe with padding)
and optionally transposes the tiled source tensor dimensions.
inner_tiles (mandatory) specifies k tile sizes. These tile sizes
correspond to the least significant ("inner") result tensor dimension sizes,
in the same order. Tile sizes can be static or dynamic.
inner_dims_pos (mandatory) specifies k source tensor dimensions that are
being tiled, where 0 <= k <= n.
- inner_dims_pos[i] specifies the source tensor dimension tiled by inner_tiles[i] where 0 <= i < k. All the values in inner_dims_pos are within [0, n).
- The tiled dimensions (of size inner_tiles) are added to the end of the result tensor in the order in which they appear, i.e. shape(result)[rank(source) + i] = inner_tiles[i] for 0 <= i < k.
- The following relationship for the tiled dimensions holds: shape(result)[inner_dims_pos[i]] = shape(source)[inner_dims_pos[i]] ⌈/⌉ inner_tiles[i], where ⌈/⌉ indicates CeilDiv.
Example: If inner_tiles = [16, 32], the result tensor has a shape of
...x16x32. If inner_dims_pos = [0, 1], the 0th source dimension is tiled
by 16 and the 1st source dimension is tiled by 32. Other source dimensions
(if any) are not tiled. If inner_dims_pos = [1, 0], the 1st dimension is
tiled by 16 and the 0th dimension is tiled by 32.
Example:
// NC to NCnc
%0 = linalg.pack %source inner_dims_pos = [0, 1] inner_tiles = [8, 32]
into %dest : tensor<128x256xf32> -> tensor<16x8 x 8x32 xf32>
// \ / \ /
// Outer Dims: 16x8 Inner Dims: 8x32
// CHW to CHWhw
%0 = linalg.pack %source inner_dims_pos = [2, 1] inner_tiles = [4, 2]
into %dest : tensor<3x20x24xf32> -> tensor<3x10x6 x 4x2 xf32>
// \ / \ /
// Outer Dims: 3x10x6 Inner Dims: 4x2
// HCW to HCWhw
%0 = linalg.pack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2]
into %dest : tensor<18x3x32xf32> -> tensor<9x3x8 x 4x2 xf32>
// \ / \ /
// Outer Dims: 9x3x8 Inner Dims: 4x2
outer_dims_perm (optional) specifies a permutation for the outer
dimensions. If specified, it must have n elements.
Example:
// CK to KCck
%0 = linalg.pack %source outer_dims_perm = [1, 0] inner_dims_pos = [0, 1]
inner_tiles = [8, 32] into %dest
: tensor<128x256xf32> -> tensor<8x16 x 8x32 xf32>
// \ /
// compare with "NC to NCnc": outer dims are transposed
padding_value specifies a padding value at the boundary on non-perfectly
divisible dimensions. Padding is optional:
- If absent, it is assumed that for all inner tiles, shape(source)[inner_dims_pos[i]] % inner_tiles[i] == 0, i.e. each inner tile size evenly divides the corresponding source dimension. It is undefined behavior (UB) if the tile does not perfectly divide the dimension.
- If present, it will pad along high dimensions (high-padding) to make the tile complete. Note that it is not allowed to have artificial padding that is not strictly required by linalg.pack (i.e., padding past what is needed to complete the last tile along each packed dimension). It is UB if extra padding is requested. It is not possible to verify the requirements statically with dynamic shapes, so they are treated as UB.
Example:
%0 = linalg.pack %arg0 padding_value(%pad : f32) outer_dims_perm = [2, 1, 0]
inner_dims_pos = [1] inner_tiles = [2] into %arg1
: tensor<200x127x256xf32> -> tensor<256x64x200x2xf32>
// \
// padded and tiled dim
//
// Source dimension 1 is tiled. The tile size 2 does not divide 127 evenly, so 1 padded
// element is added at the end.
//
// Note: Only tiled dimensions can be padded.
Invalid example that has artificial padding:
%0 = linalg.pack %src padding_value(%cst : f32) inner_dims_pos = [0]
inner_tiles = [8] into %dest
: tensor<9xf32> -> tensor<3x8xf32>
// \
// expect tensor<2x8xf32> because CeilDiv(9, 8) = 2
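The shape rules above can be collected into a small calculator. This is a sketch of the shape arithmetic only, for static shapes, with the hypothetical function name pack_result_shape; it assumes outer_dims_perm is applied as result[i] = outer[perm[i]], which matches every example in this section.

```python
def pack_result_shape(source_shape, inner_dims_pos, inner_tiles, outer_dims_perm=None):
    """Return the static result shape of a pack with the given parameters."""
    outer = list(source_shape)
    for pos, tile in zip(inner_dims_pos, inner_tiles):
        outer[pos] = -(-outer[pos] // tile)  # CeilDiv of the tiled dimension
    if outer_dims_perm is not None:
        outer = [outer[p] for p in outer_dims_perm]
    # Tile sizes are appended as the innermost dimensions, in order.
    return outer + list(inner_tiles)


nc_to_ncnc = pack_result_shape([128, 256], [0, 1], [8, 32])
ck_to_kcck = pack_result_shape([128, 256], [0, 1], [8, 32], outer_dims_perm=[1, 0])
padded = pack_result_shape([200, 127, 256], [1], [2], outer_dims_perm=[2, 1, 0])
```

For the invalid example, the same function returns [2, 8] for a 9-element source with tile size 8, which is why a 3x8 destination counts as artificial padding.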
linalg.pooling_nchw_max - Performs max pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nchw_sum - Performs sum pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NCHW.
- Kernel: HW.
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
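For the NCHW input and HW kernel layout, sum pooling reduces each window of a single channel. The sketch below is a plain-Python reference on nested lists with a hypothetical signature: it takes the kernel as a (height, width) pair because only the kernel's shape matters here, computes the unpadded output extent itself (the real op takes it from outs), and assumes a zero-initialized output.

```python
def pooling_nchw_sum(inp, kernel_hw, strides=(1, 1), dilations=(1, 1)):
    """O[n,c,oh,ow] = sum over (kh,kw) of I[n,c, oh*sh + kh*dh, ow*sw + kw*dw]."""
    kh_, kw_ = kernel_hw
    sh, sw = strides
    dh, dw = dilations
    h_, w_ = len(inp[0][0]), len(inp[0][0][0])
    oh_ = (h_ - dh * (kh_ - 1) - 1) // sh + 1
    ow_ = (w_ - dw * (kw_ - 1) - 1) // sw + 1
    out = []
    for image in inp:  # N
        out_image = []
        for channel in image:  # C: channels are pooled independently
            plane = [
                [
                    sum(
                        channel[oh * sh + kh * dh][ow * sw + kw * dw]
                        for kh in range(kh_)
                        for kw in range(kw_)
                    )
                    for ow in range(ow_)
                ]
                for oh in range(oh_)
            ]
            out_image.append(plane)
        out.append(out_image)
    return out


inp = [[[[0, 1, 2], [3, 4, 5], [6, 7, 8]]]]  # 1x1x3x3
pooled = pooling_nchw_sum(inp, (2, 2))
```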
linalg.pooling_ncw_max - Performs max pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_ncw_sum - Performs sum pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NCW.
- Kernel: W.
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_ndhwc_max - Performs 3D max pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_ndhwc_min - Performs 3D min pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_ndhwc_sum - Performs 3D sum pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [3]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nhwc_max - Performs max pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nhwc_max_unsigned - Performs unsigned max pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nhwc_min - Performs min pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nhwc_min_unsigned - Performs unsigned min pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nhwc_sum - Performs sum pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [2]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NHWC.
- Kernel: HW.
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nwc_max - Performs max pooling.
Attributes
strides- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
dilations- Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
inputs- Variadic,AnyType, variadic of any type
outputs- Variadic,AnyShaped, variadic of shaped of any type values
Results
result_tensors- Variadic,AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nwc_max_unsigned - Performs unsigned max pooling.
Attributes
- strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
- dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nwc_min - Performs min pooling.
Attributes
- strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
- dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nwc_min_unsigned - Performs unsigned min pooling.
Attributes
- strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
- dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
linalg.pooling_nwc_sum - Performs sum pooling.
Attributes
- strides - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
- dilations - Optional, anonymous/composite constraint, 64-bit signless int elements attribute of shape [1]
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Layout:
- Input: NWC.
- Kernel: W.
Numeric casting is performed on the input operand, promoting it to the same data type as the accumulator/output.
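The 1-D NWC pooling ops share one loop structure and differ only in the combining operation. The Python sketch below shows that structure with the combiner passed in as a parameter. The function name, the nested-list representation, and the index formula (output index times stride plus kernel index times dilation) are assumptions made for illustration.

```python
import operator


def pooling_nwc(inp, kernel_w, out, combine, stride=1, dilation=1):
    """Fold a 1-D window of `inp` (NWC) into `out` (NWC) using `combine`."""
    for n in range(len(out)):
        for ow in range(len(out[n])):
            for c in range(len(out[n][ow])):
                for kw in range(kernel_w):
                    # The input element is cast to the accumulator type (float here).
                    value = float(inp[n][ow * stride + kw * dilation][c])
                    out[n][ow][c] = combine(out[n][ow][c], value)
    return out


inp = [[[1], [5], [2], [8], [3]]]  # N=1, W=5, C=1

# Sum pooling: start from zeros; window width 2 with dilation 2 gives 3 outputs.
sums = pooling_nwc(inp, 2, [[[0.0], [0.0], [0.0]]], operator.add, dilation=2)
print(sums)

# Max pooling: start from negative infinity so any input wins the first comparison.
neg_inf = float("-inf")
maxes = pooling_nwc(inp, 2, [[[neg_inf], [neg_inf], [neg_inf]]], max, dilation=2)
print(maxes)
```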
linalg.powf - Takes the powf(lhs, rhs) between two inputs, elementwise. For powf(arg, 2) use linalg.square.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Only applies to floating point values.
The shapes and element types must be identical. Any required casts, broadcasts, and reductions must be applied before calling this op.
This makes reduction, broadcast, and element-cast semantics explicit. Later
passes can take that into account when lowering this code. For example,
a linalg.broadcast + linalg.powf sequence can be lowered to a
linalg.generic with different affine maps for the two operands.
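To make the "explicit broadcast first" rule concrete, here is a small Python sketch in which the broadcast is a separate step and the elementwise power rejects mismatched shapes. Both helper names are hypothetical, and nested lists stand in for tensors.

```python
import math


def broadcast_rows(vec, rows):
    """Explicitly broadcast a 1-D list to a 2-D list with `rows` copies."""
    return [list(vec) for _ in range(rows)]


def powf(lhs, rhs):
    """Elementwise lhs ** rhs on 2-D lists; shapes must already match."""
    if len(lhs) != len(rhs) or any(len(a) != len(b) for a, b in zip(lhs, rhs)):
        raise ValueError("shapes must be identical; broadcast explicitly first")
    return [[math.pow(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(lhs, rhs)]


base = [[2.0, 9.0], [4.0, 16.0]]
exponents = broadcast_rows([2.0, 0.5], rows=2)  # the separate broadcast step
print(powf(base, exponents))
```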
linalg.quantized_batch_matmul - Performs a batched matrix multiplication of two 3D inputs.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. The quantized variant includes zero-point adjustments for the left and right operands of the matmul.
linalg.quantized_matmul - Performs a matrix multiplication of two 2D inputs.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. The quantized variant includes zero-point adjustments for the left and right operands of the matmul.
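The zero-point adjustment can be sketched in Python as follows. This assumes the zero points are scalars that are subtracted from each operand element before the multiply, and that the product is accumulated into the existing output. The function and parameter names are hypothetical.

```python
def quantized_matmul(a, b, a_zp, b_zp, acc):
    """Accumulate (a - a_zp) @ (b - b_zp) into `acc` (all nested lists)."""
    for m in range(len(a)):
        for n in range(len(b[0])):
            for k in range(len(b)):
                # Subtract each operand's zero point, then multiply and accumulate.
                acc[m][n] += (a[m][k] - a_zp) * (b[k][n] - b_zp)
    return acc


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
acc = [[0, 0], [0, 0]]
print(quantized_matmul(a, b, a_zp=1, b_zp=5, acc=acc))
```

With a zero point of 0 for both operands this reduces to an ordinary matrix multiplication.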
linalg.reciprocal - Applies reciprocal(x) elementwise.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.reduce - Reduce operator
Attributes
- dimensions - Single, DenseI64ArrayAttr, i64 dense array attribute should be in increasing order
Operands
- inputs - Variadic, TensorOrMemref, variadic of memref of any type values or ranked tensor of any type values
- inits - Variadic, TensorOrMemref, variadic of memref of any type values or ranked tensor of any type values
Results
- anonymous - Variadic, AnyTensor, variadic of tensor of any type values
Description
Executes combiner on the dimensions of inputs and returns the
reduced result. The dimensions attribute needs to list the reduction
dimensions in increasing order.
Example:
%reduce = linalg.reduce
    ins(%input:tensor<16x32x64xf32>)
    outs(%init:tensor<16x64xf32>)
    dimensions = [1]
    (%in: f32, %out: f32) {
      %0 = arith.addf %out, %in: f32
      linalg.yield %0: f32
    }
Shortened print form is available for simple reduces where the body contains exactly two operations (the payload operation and a yield), the payload operation has the same number of operands as block arguments, the first block argument (init) is the last operand of the payload operation with remaining operands matching remaining block arguments in order, and the yield operand is the result of the payload operation.
The example above will be printed using the shortened form as:
%reduce = linalg.reduce { arith.addf }
    ins(%input:tensor<16x32x64xf32>)
    outs(%init:tensor<16x64xf32>)
    dimensions = [1]
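For readers who prefer a loop view, the following Python sketch mirrors the example above on a small rank-3 input: dimension 1 is reduced and the combiner receives the input element and the running output value. The function name is hypothetical, and nested lists stand in for tensors.

```python
import operator


def reduce_dim1(inp, init, combiner):
    """Reduce dimension 1 of a rank-3 nested list into a rank-2 result."""
    out = [row[:] for row in init]  # start from the init values
    for i, plane in enumerate(inp):
        for row in plane:                      # the reduced dimension
            for k, value in enumerate(row):
                # combiner(in, out) plays the role of the region body.
                out[i][k] = combiner(value, out[i][k])
    return out


inp = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
       [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]]   # shape 2x3x2
init = [[0.0, 0.0], [0.0, 0.0]]                   # shape 2x2
print(reduce_dim1(inp, init, operator.add))
```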
linalg.round - Applies round(x) elementwise.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.rsqrt - Applies rsqrt(x) elementwise.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.select - Chooses one value based on a binary condition supplied as its first operand.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
The shapes and element types must be identical. Any required casts, broadcasts, and reductions must be applied before calling this op.
This makes reduction, broadcast, and element-cast semantics explicit. Later
passes can take that into account when lowering this code. For example,
a linalg.broadcast + linalg.select sequence can be lowered to a
linalg.generic with different affine maps for the two operands.
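A minimal Python sketch of the elementwise behaviour, assuming the condition is the first input and the two candidate values follow it. The function name is hypothetical, and flat lists stand in for tensors of identical shape.

```python
def select(cond, on_true, on_false):
    """Pick on_true[i] where cond[i] is truthy, otherwise on_false[i]."""
    if not (len(cond) == len(on_true) == len(on_false)):
        raise ValueError("all operands must have the same shape")
    return [t if c else f for c, t, f in zip(cond, on_true, on_false)]


print(select([True, False, True], [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
```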
linalg.softmax - Softmax operator
Attributes
- dimension - Single, I64Attr, 64-bit signless integer attribute
Operands
- input - Single, AnyShaped, shaped of any type values
- output - Single, AnyShaped, shaped of any type values
Results
- result - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
linalg.softmax computes a numerically stable version of softmax.
For a given input tensor and a specified dimension d, compute:
- the max m along that dimension d
- f(x) = exp(x - m)
- sum f(x) along dimension d to get l(x).
- compute the final result f(x) / l(x).
This is an aggregate linalg operation that further reduces to a small DAG of structured operations.
Warning: Regarding the tiling capabilities, the implementation doesn't check that the provided dimensions make sense. It is the responsibility of the transformation calling the tiling to ensure that the provided sizes for each dimension make sense with respect to the semantics of softmax.
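The four steps of the numerically stable computation can be sketched in Python for a single 1-D slice taken along dimension d. The function name is hypothetical, and this is an illustration of the arithmetic only, not of how the op is lowered.

```python
import math


def softmax_1d(xs):
    """Numerically stable softmax over one slice along the chosen dimension."""
    m = max(xs)                           # step 1: the max m along the dimension
    f = [math.exp(x - m) for x in xs]     # step 2: f(x) = exp(x - m)
    l = sum(f)                            # step 3: l(x) = sum of f(x)
    return [v / l for v in f]             # step 4: f(x) / l(x)


# Subtracting the max keeps exp() in range even for large inputs.
print(softmax_1d([1000.0, 1000.0]))
print(softmax_1d([1.0, 2.0, 3.0]))
```

Without the subtraction of m, math.exp(1000.0) would raise an OverflowError in Python; with it, the largest exponent is always 0.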
linalg.sqrt - Applies sqrt(x) elementwise.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.square - Applies square(x) elementwise.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.sub - Subtracts two tensors elementwise.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
The shapes and element types must be identical. Any required casts, broadcasts, and reductions must be applied before calling this op.
This makes reduction, broadcast, and element-cast semantics explicit. Later
passes can take that into account when lowering this code. For example,
a linalg.broadcast + linalg.sub sequence can be lowered to a
linalg.generic with different affine maps for the two operands.
linalg.tanh - Applies tanh(x) elementwise.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
No numeric casting is performed on the input operand.
linalg.transpose - Transpose operator
Attributes
- permutation - Single, DenseI64ArrayAttr, i64 dense array attribute
Operands
- input - Single, TensorOrMemref, memref of any type values or ranked tensor of any type values
- init - Single, TensorOrMemref, memref of any type values or ranked tensor of any type values
Results
- result - Variadic, AnyTensor, variadic of tensor of any type values
Description
Permutes the dimensions of input according to the given permutation.
dim(result, i) = dim(input, permutation[i])
This op actually moves data, unlike memref.transpose, which is a metadata-only
operation that produces a transposed "view".
Example:
%transpose = linalg.transpose
    ins(%input:tensor<16x64xf32>)
    outs(%init:tensor<64x16xf32>)
    permutation = [1, 0]
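The relation dim(result, i) = dim(input, permutation[i]) can be checked with a small Python sketch that works on a flat, row-major buffer. The function name and the flat-buffer representation are assumptions made for illustration; the sketch copies every element, matching the statement that the op moves data.

```python
from itertools import product


def transpose(flat, shape, permutation):
    """Return (data, shape) of the transposed row-major buffer."""
    out_shape = [shape[p] for p in permutation]  # dim(result, i) = dim(input, permutation[i])
    # Row-major strides of the input buffer.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    out = []
    for out_idx in product(*(range(s) for s in out_shape)):
        # Result index i walks input dimension permutation[i].
        offset = sum(out_idx[i] * strides[permutation[i]] for i in range(len(shape)))
        out.append(flat[offset])
    return out, out_shape


# [[0, 1, 2], [3, 4, 5]] transposed with permutation [1, 0].
print(transpose([0, 1, 2, 3, 4, 5], [2, 3], [1, 0]))
```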
linalg.unpack - linalg.unpack operation
This op has support for result type inference.
Attributes
- outer_dims_perm - Optional, DenseI64ArrayAttr, i64 dense array attribute
- inner_dims_pos - Single, DenseI64ArrayAttr, i64 dense array attribute
- static_inner_tiles - Single, DenseI64ArrayAttr, i64 dense array attribute
Operands
- source - Single, AnyRankedTensor, ranked tensor of any type values
- dest - Single, AnyRankedTensor, ranked tensor of any type values
- inner_tiles - Variadic, Index, index
Results
- result - Single, AnyRankedTensor, ranked tensor of any type values
Description
The "unpack" operation converts a source tensor of rank n with a tiled and
packed layout to a result tensor of rank n - k.
inner_tiles (mandatory) specifies k tile sizes. These tile sizes
correspond to the least significant ("inner") source tensor dimension sizes.
The behavior of this op is undefined if:
- inner_tiles do not exactly match with the corresponding source tensor dimension sizes.
- Or, inner_tiles[i] does not divide the size of dimension inner_dims_pos[i] (assuming that outer_dims_perm is not specified) evenly.
inner_dims_pos (mandatory) specifies k result tensor (i.e. unpacked
tensor) dimensions that were tiled with the inner_tiles to create the
packed source tensor. The source tensor (i.e. packed tensor) dimensions can
be unpacked given inner_dims_pos as follows.
- For 0 <= i < k the following relationship holds: shape(result)[inner_dims_pos[i]] <= shape(source)[n-k+i] * shape(source)[inner_dims_pos[i]].
- For 0 <= j < n-k and j not in inner_dims_pos the following relationship holds: shape(result)[j] = shape(source)[j].
outer_dims_perm (optional) specifies a permutation for the outer
dimensions. If specified, it must have n - k elements. If specified, this
permutation is applied before combining any dimensions.
Note, the unpack operation may drop any padding introduced by the pack
operation and hence the following holds:
NumElementsOf(source) >= NumElementsOf(result).
Examples:
// NCnc to NC:
%0 = linalg.unpack %source inner_dims_pos = [0, 1] inner_tiles = [8, 32]
    into %dest : tensor<16x8 x 8x32 xf32> -> tensor<128x256xf32>
//                      \  /   \  /
//          Outer Dims: 16x8   Inner Dims: 8x32
// KCck to CK:
%0 = linalg.unpack %source outer_dims_perm = [1, 0] inner_dims_pos = [0, 1]
    inner_tiles = [8, 32]
    into %dest : tensor<8x16 x 8x32 xf32> -> tensor<128x256xf32>
//                      \  /   \  /
//          Outer Dims: 8x16   Inner Dims: 8x32
// CHWhw to CHW:
%0 = linalg.unpack %source inner_dims_pos = [2, 1] inner_tiles = [4, 2]
    into %dest : tensor<3x10x6 x 4x2 xf32> -> tensor<3x20x24xf32>
//                      \    /   \ /
//        Outer Dims: 3x10x6     Inner Dims: 4x2
// HCWhw to HCW:
%0 = linalg.unpack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2]
    into %dest : tensor<9x3x8 x 4x2 xf32> -> tensor<18x3x32xf32>
//                      \   /   \ /
//         Outer Dims: 9x3x8    Inner Dims: 4x2
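The shape relationships above can be sanity-checked with a short Python sketch that computes the largest possible result shape (that is, with no padding dropped) from the source shape. The function name is hypothetical. The way outer_dims_perm is undone here (source outer dimension i maps back to position outer_dims_perm[i]) is an assumption; the [1, 0] example is its own inverse and so cannot confirm the direction.

```python
def unpacked_shape_upper_bound(source_shape, inner_dims_pos, outer_dims_perm=None):
    """Largest result shape an unpack of `source_shape` could produce."""
    k = len(inner_dims_pos)
    if k == 0:
        return list(source_shape)
    outer = list(source_shape[:-k])   # the n - k outer dimensions
    tiles = list(source_shape[-k:])   # the k inner tile sizes
    if outer_dims_perm is not None:
        # Assumed direction: undo the permutation of the outer dimensions.
        unpermuted = [0] * len(outer)
        for i, p in enumerate(outer_dims_perm):
            unpermuted[p] = outer[i]
        outer = unpermuted
    result = outer[:]
    for tile, pos in zip(tiles, inner_dims_pos):
        result[pos] *= tile           # outer size * tile size bounds the result size
    return result


print(unpacked_shape_upper_bound([16, 8, 8, 32], [0, 1]))
print(unpacked_shape_upper_bound([8, 16, 8, 32], [0, 1], outer_dims_perm=[1, 0]))
print(unpacked_shape_upper_bound([3, 10, 6, 4, 2], [2, 1]))
print(unpacked_shape_upper_bound([9, 3, 8, 4, 2], [2, 0]))
```

For the four examples in this section, the computed bound coincides with the declared result type, so no padding is dropped in any of them.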
linalg.vecmat - Performs a vector-matrix multiplication.
Operands
- inputs - Variadic, AnyType, variadic of any type
- outputs - Variadic, AnyShaped, variadic of shaped of any type values
Results
- result_tensors - Variadic, AnyRankedTensor, variadic of ranked tensor of any type values
Description
Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output.
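A minimal Python sketch of the computation, assuming the vector is the first input, the matrix is the second, and the product is accumulated into the output. Integer operands are promoted to the accumulator type (float here) before the multiply. The function name is hypothetical.

```python
def vecmat(vec, mat, acc):
    """Accumulate vec @ mat into `acc`, promoting operands to float first."""
    for m in range(len(mat)):
        for n in range(len(mat[m])):
            # Cast both operands to the accumulator type before multiplying.
            acc[n] += float(vec[m]) * float(mat[m][n])
    return acc


print(vecmat([1, 2], [[1, 2, 3], [4, 5, 6]], [0.0, 0.0, 0.0]))
```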
linalg.winograd_filter_transform - Winograd filter transform operator
Attributes
- fmr - Single, WinogradConv2DFmr, allowed 32-bit signless integer cases: 0, 1, 2
Operands
- filter - Single, anonymous/composite constraint, 4D tensor of any type values
- output - Single, anonymous/composite constraint, 4D tensor of any type values
Results
- result - Single, anonymous/composite constraint, 4D tensor of any type values
Description
The Winograd Conv2D algorithm converts a linalg Conv2D operator into a batched matrix multiply. Before the matrix multiply, it converts the filter and input into a format suitable for batched matrix multiply. After the matrix multiply, it converts the output into the final result tensor.
The algorithm F(m x m, r x r) is
Y = A^T x [(G x g x G^T) @ (B^T x d x B)] x A
The size of output Y is m x m. The size of filter g is r x r. The size of input d is (m + r - 1) x (m + r - 1). A^T, A, G^T, G, B^T, and B are transformation matrices.
This operator is defined to represent the high level concept of filter transformation (G x g x G^T) in the Winograd Conv2D algorithm.
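To see how the three transforms fit together, the following Python sketch evaluates Y = A^T x [(G x g x G^T) @ (B^T x d x B)] x A for F(2 x 2, 3 x 3) on a single tile and compares it with a direct sliding-window computation. It reads @ as elementwise multiplication and uses the commonly published F(2 x 2, 3 x 3) matrices; both are assumptions, and the constants MLIR actually uses may be scaled differently. All function names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transposed(a):
    return [list(col) for col in zip(*a)]


def hadamard(a, b):
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Textbook transformation matrices for F(2 x 2, 3 x 3) (assumed, see above).
G = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0.0, 0.0, 1.0]]
BT = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 1.0, 0.0],
      [0.0, -1.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
AT = [[1.0, 1.0, 1.0, 0.0], [0.0, 1.0, -1.0, -1.0]]


def winograd_tile(d, g):
    """Compute the 2x2 output tile from a 4x4 input d and a 3x3 filter g."""
    u = matmul(matmul(G, g), transposed(G))      # filter transform: G x g x G^T
    v = matmul(matmul(BT, d), transposed(BT))    # input transform:  B^T x d x B
    m = hadamard(u, v)                           # elementwise product
    return matmul(matmul(AT, m), transposed(AT)) # output transform: A^T x m x A


def direct_tile(d, g):
    """Reference: slide the 3x3 filter over the 4x4 input without flipping."""
    return [[sum(d[i + p][j + q] * g[p][q] for p in range(3) for q in range(3))
             for j in range(2)] for i in range(2)]


d = [[float(4 * r + c) for c in range(4)] for r in range(4)]
g = [[float(3 * r + c + 1) for c in range(3)] for r in range(3)]
print(winograd_tile(d, g))
print(direct_tile(d, g))
```

The sizes match the description: the output tile is m x m = 2 x 2, the filter is r x r = 3 x 3, and the input tile is (m + r - 1) x (m + r - 1) = 4 x 4.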
linalg.winograd_input_transform - Winograd input transform operator
Attributes
- fmr - Single, WinogradConv2DFmr, allowed 32-bit signless integer cases: 0, 1, 2
Operands
- input - Single, anonymous/composite constraint, 4D tensor of any type values
- output - Single, anonymous/composite constraint, 6D tensor of any type values
Results
- result - Single, anonymous/composite constraint, 6D tensor of any type values
Description
The Winograd Conv2D algorithm converts a linalg Conv2D operator into a batched matrix multiply. Before the matrix multiply, it converts the filter and input into a format suitable for batched matrix multiply. After the matrix multiply, it converts the output into the final result tensor.
The algorithm F(m x m, r x r) is
Y = A^T x [(G x g x G^T) @ (B^T x d x B)] x A
The size of output Y is m x m. The size of filter g is r x r. The size of input d is (m + r - 1) x (m + r - 1). A^T, A, G^T, G, B^T, and B are transformation matrices.
This operator is defined to represent the high level concept of input transformation (B^T x d x B) in the Winograd Conv2D algorithm.
linalg.winograd_output_transform - Winograd output transform operator
Attributes
- fmr - Single, WinogradConv2DFmr, allowed 32-bit signless integer cases: 0, 1, 2
Operands
- value - Single, anonymous/composite constraint, 6D tensor of any type values
- output - Single, anonymous/composite constraint, 4D tensor of any type values
Results
- result - Single, anonymous/composite constraint, 4D tensor of any type values
Description
The Winograd Conv2D algorithm converts a linalg Conv2D operator into a batched matrix multiply. Before the matrix multiply, it converts the filter and input into a format suitable for batched matrix multiply. After the matrix multiply, it converts the output into the final result tensor.
The algorithm F(m x m, r x r) is
Y = A^T x [(G x g x G^T) @ (B^T x d x B)] x A
The size of output Y is m x m. The size of filter g is r x r. The size of input d is (m + r - 1) x (m + r - 1). A^T, A, G^T, G, B^T, and B are transformation matrices.
This operator is defined to represent the high level concept of output transformation (A^T x y x A) in the Winograd Conv2D algorithm.
linalg.yield - Linalg yield operation
Operands
- values - Variadic, AnyType, variadic of any type
Description
linalg.yield is a special terminator operation for blocks inside regions
in linalg generic ops. It returns values to the immediately enclosing
linalg generic op.
Example:
linalg.yield %f0, %f1 : f32, f32