Description
We intend to lower DL models to MLIR TOSA, but we found that MLIR does not fully support half-precision ops. For example, AvgPool2dOp in TOSA only accepts fp32, int8, and int16 element types. We tried using tosa.CastOp to convert fp16/bf16 to fp32 first, but CastOp does not accept half-precision inputs either. Given that half-precision training and inference are very important for large language models, is there any plan in MLIR to support half-precision tensors and ops?
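To make the failure concrete, here is roughly the IR pattern that gets rejected. This is only an illustrative sketch: the function name, shapes, and attribute values are invented, and the exact assembly/attribute syntax varies between MLIR versions.

```mlir
func.func @avg_pool_fp16(%arg0: tensor<1x8x8x4xf16>) -> tensor<1x4x4x4xf16> {
  // Rejected by AvgPool2dOp::verify(): "input/output element types are incompatible."
  %0 = "tosa.avg_pool2d"(%arg0) {
    kernel = array<i64: 2, 2>, stride = array<i64: 2, 2>,
    pad = array<i64: 0, 0, 0, 0>, acc_type = f32
  } : (tensor<1x8x8x4xf16>) -> tensor<1x4x4x4xf16>
  return %0 : tensor<1x4x4x4xf16>
}

// Casting up to fp32 first does not help in our setup, because the cast
// itself is rejected for the half-precision input:
// %1 = "tosa.cast"(%arg0) : (tensor<1x8x8x4xf16>) -> tensor<1x8x8x4xf32>
```

For reference, the check that produces this error is in AvgPool2dOp::verify():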
```cpp
LogicalResult tosa::AvgPool2dOp::verify() {
  auto inputETy = llvm::cast<ShapedType>(getInput().getType()).getElementType();
  auto resultETy = llvm::cast<ShapedType>(getType()).getElementType();

  if (auto quantType =
          llvm::dyn_cast<mlir::quant::UniformQuantizedType>(inputETy))
    inputETy = quantType.getStorageType();

  if (auto quantType =
          llvm::dyn_cast<mlir::quant::UniformQuantizedType>(resultETy))
    resultETy = quantType.getStorageType();

  auto accType = getAccType();
  if (llvm::isa<IntegerType>(inputETy) && !accType.isInteger(32))
    return emitOpError("accumulator type for integer tensor is not i32");

  if ((inputETy.isBF16() || inputETy.isF16()) &&
      !(accType.isF16() || accType.isF32()))
    return emitOpError("accumulator type for f16/bf16 tensor is not f16/f32");

  if (inputETy.isF32() && !accType.isF32())
    return emitOpError("accumulator type for f32 tensor is not f32");

  if (inputETy.isF32() && resultETy.isF32())
    return success();
  if (inputETy.isInteger(8) && resultETy.isInteger(8))
    return success();
  if (inputETy.isInteger(16) && resultETy.isInteger(16))
    return success();

  return emitOpError("input/output element types are incompatible."); // Error for FP16 and BF16 inputs.
}
```
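Purely as an illustration of what we are asking about (not an actual patch), the missing success cases for half precision might look something like the following, assuming the rest of the TOSA lowerings were also taught to handle fp16/bf16:

```cpp
// Hypothetical sketch only: allow f16/bf16 input/output element types,
// mirroring the accumulator-type checks that already exist above.
if (inputETy.isF16() && resultETy.isF16())
  return success();
if (inputETy.isBF16() && resultETy.isBF16())
  return success();
```

These two cases would sit next to the existing f32/i8/i16 checks, before the final emitOpError.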