[mlir][quant] Bump up the MaxStorageBits from 32 to 64. #91706
A 64-bit storage type for quantized types is often used in certain settings, such as on-device deployment or quality-sensitive models. For example, a TFLite micro kernel supports 64-bit quantized types for 16x8 quantized operations. Currently the Quant dialect allows at most 32 storage bits: [MaxStorageBits](https://github.com/llvm/llvm-project/blob/b903badd73a2467fdd4e363231f2bf9b0704b546/mlir/include/mlir/Dialect/Quant/QuantTypes.h#L55). Bump this limit up to 64. Issue llvm#91584
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-quant
Author: Dan Suh (dansuh17)
Full diff: https://github.com/llvm/llvm-project/pull/91706.diff
5 Files Affected:
diff --git a/mlir/include/mlir/Dialect/Quant/QuantTypes.h b/mlir/include/mlir/Dialect/Quant/QuantTypes.h
index de5aed0a91a20..09cddf3e96f4d 100644
--- a/mlir/include/mlir/Dialect/Quant/QuantTypes.h
+++ b/mlir/include/mlir/Dialect/Quant/QuantTypes.h
@@ -52,7 +52,8 @@ class QuantizedType : public Type {
using Type::Type;
/// The maximum number of bits supported for storage types.
- static constexpr unsigned MaxStorageBits = 32;
+ /// NOTE: u64 storage type is not yet supported.
+ static constexpr unsigned MaxStorageBits = 64;
static LogicalResult verify(function_ref<InFlightDiagnostic()> emitError,
unsigned flags, Type storageType,
diff --git a/mlir/lib/Dialect/Quant/IR/QuantTypes.cpp b/mlir/lib/Dialect/Quant/IR/QuantTypes.cpp
index 81e3b914755be..f588d247e7d57 100644
--- a/mlir/lib/Dialect/Quant/IR/QuantTypes.cpp
+++ b/mlir/lib/Dialect/Quant/IR/QuantTypes.cpp
@@ -44,9 +44,16 @@ QuantizedType::verify(function_ref<InFlightDiagnostic()> emitError,
if (integralWidth == 0 || integralWidth > MaxStorageBits)
return emitError() << "illegal storage type size: " << integralWidth;
- // Verify storageTypeMin and storageTypeMax.
bool isSigned =
(flags & QuantizationFlags::Signed) == QuantizationFlags::Signed;
+ // u64 is not yet supported because its full range cannot be represented
+ // by the type of `storageTypeMax`, making it difficult to verify the
+ // storage type.
+ if (!isSigned && integralWidth == 64)
+ return emitError()
+ << "illegal storage type; u64 storage type is not supported";
+
+ // Verify storageTypeMin and storageTypeMax.
int64_t defaultIntegerMin =
getDefaultMinimumForInteger(isSigned, integralWidth);
int64_t defaultIntegerMax =
diff --git a/mlir/test/Dialect/Quant/parse-any-invalid.mlir b/mlir/test/Dialect/Quant/parse-any-invalid.mlir
index 41c5f93070717..a7c7f461846a4 100644
--- a/mlir/test/Dialect/Quant/parse-any-invalid.mlir
+++ b/mlir/test/Dialect/Quant/parse-any-invalid.mlir
@@ -26,12 +26,12 @@
!qalias = !quant.any<i<-4:3>:f32>
// -----
-// Unrecognized storage type: storage size > 32
-// expected-error@+1 {{illegal storage type size: 33}}
-!qalias = !quant.any<i33:f32>
+// Unrecognized storage type: storage size > 64
+// expected-error@+1 {{illegal storage type size: 65}}
+!qalias = !quant.any<i65:f32>
// -----
-// Unrecognized storage type: storage size < 0
+// Unrecognized storage type: storage size > 64
// expected-error@+1 {{illegal storage type size: 1024}}
!qalias = !quant.any<i1024<-4:3>:f32>
diff --git a/mlir/test/Dialect/Quant/parse-uniform-invalid.mlir b/mlir/test/Dialect/Quant/parse-uniform-invalid.mlir
index a82e8efdb1a3c..5f7ac004c49b9 100644
--- a/mlir/test/Dialect/Quant/parse-uniform-invalid.mlir
+++ b/mlir/test/Dialect/Quant/parse-uniform-invalid.mlir
@@ -46,9 +46,9 @@
!qalias = !quant.uniform<i<-4:3>:f32, 0.99872:127>
// -----
-// Unrecognized storage type: storage size > 32
-// expected-error@+1 {{illegal storage type size: 33}}
-!qalias = !quant.uniform<i33:f32, 0.99872:127>
+// Unrecognized storage type: storage size > 64
+// expected-error@+1 {{illegal storage type size: 65}}
+!qalias = !quant.uniform<i65:f32, 0.99872:127>
// -----
// Unrecognized storage type: storage size < 0
@@ -60,6 +60,11 @@
// expected-error@+1 {{invalid integer width}}
!qalias = !quant.uniform<i123123123120<-4:3>:f32, 0.99872:127>
+// -----
+// Illegal storage type: u64
+// expected-error@+1 {{illegal storage type; u64 storage type is not supported}}
+!qalias = !quant.uniform<u64:f32, 0.99782:127>
+
// -----
// Illegal storage min/max: max - min < 0
// expected-error@+1 {{illegal storage min and storage max: (2:1)}}
diff --git a/mlir/test/Dialect/Quant/parse-uniform.mlir b/mlir/test/Dialect/Quant/parse-uniform.mlir
index 4fbe86d935ea3..5bc391e9ea8ca 100644
--- a/mlir/test/Dialect/Quant/parse-uniform.mlir
+++ b/mlir/test/Dialect/Quant/parse-uniform.mlir
@@ -83,6 +83,15 @@ func.func @parse() -> !qalias {
return %0 : !qalias
}
+// -----
+// Storage type: i64
+// CHECK: !quant.uniform<i64:f32, 2.000000e+02>
+!qalias = !quant.uniform<i64:f32, 2.0e+2>
+func.func @parse() -> !qalias {
+ %0 = "foo"() : () -> !qalias
+ return %0 : !qalias
+}
+
// -----
// Expressed type: f32
// CHECK: !quant.uniform<u8:f32, 2.000000e+02>
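The u64 restriction in the `QuantTypes.cpp` hunk follows from `storageTypeMax` being an `int64_t`: the largest unsigned 64-bit value does not fit in it. The following is a minimal standalone sketch of that reasoning, not the MLIR implementation. The names `kMaxStorageBits`, `maxValue`, and `checkStorage` are hypothetical and exist only for illustration; only the two error strings are taken from the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <string>

// Hypothetical stand-in for QuantizedType::MaxStorageBits after this change.
constexpr unsigned kMaxStorageBits = 64;

// Largest value of an integer type with `width` bits, computed in uint64_t so
// that no case overflows. Requires 1 <= width <= 64.
constexpr uint64_t maxValue(bool isSigned, unsigned width) {
  unsigned valueBits = isSigned ? width - 1 : width;
  return valueBits == 64 ? std::numeric_limits<uint64_t>::max()
                         : (uint64_t{1} << valueBits) - 1;
}

// Returns an error message, or std::nullopt if the storage type is accepted.
// This models the order of the checks in the patch, under the assumption that
// the storage bounds are held in int64_t.
std::optional<std::string> checkStorage(bool isSigned, unsigned width) {
  if (width == 0 || width > kMaxStorageBits)
    return "illegal storage type size: " + std::to_string(width);
  // Reject the one case whose maximum does not fit in an int64_t bound.
  if (maxValue(isSigned, width) >
      static_cast<uint64_t>(std::numeric_limits<int64_t>::max()))
    return std::string(
        "illegal storage type; u64 storage type is not supported");
  return std::nullopt;
}
```

In this sketch, `checkStorage(true, 64)` is accepted because the i64 maximum equals the `int64_t` maximum, while `checkStorage(false, 64)` is the only in-range width that gets rejected. The patch itself tests `!isSigned && integralWidth == 64` directly, which is equivalent under the same assumption.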
Typo fix
Co-authored-by: Mehdi Amini <[email protected]>
(First-time contribution via GitHub.) Do I need a separate approval for merging?
@joker-eph It seems I can't merge the branch because the workflow is pending approval. Could you take another look and approve the workflow?
@joker-eph Gentle ping regarding the workflow approval.
@joker-eph Friendly ping for approval.