
RAFT_vaiq_int8 model support #182

Open
AmosLewis opened this issue Apr 19, 2024 · 5 comments

AmosLewis (Collaborator) commented Apr 19, 2024

Failed op: onnx.Expand

Until the expand op is fixed, we cannot get the failure signatures of any other ops; the following command is blocked by onnx.Expand.

torch-mlir-opt -convert-torch-onnx-to-torch ./RAFT_vaiq_int8.default.torch-onnx.mlir 
./RAFT_vaiq_int8.default.torch-onnx.mlir:2045:13: error: failed to legalize operation 'torch.operator' that was explicitly marked illegal
    %2041 = torch.operator "onnx.Expand"(%2002, %2040) : (!torch.vtensor<[1,2,?,?],f32>, !torch.vtensor<[?],si64>) -> !torch.vtensor<[],f32> 
            ^
./RAFT_vaiq_int8.default.torch-onnx.mlir:2045:13: note: see current operation: %7506 = "torch.operator"(%7142, %7505) <{name = "onnx.Expand"}> : (!torch.vtensor<[1,2,?,?],f32>, !torch.vtensor<[?],si64>) -> !torch.vtensor<[],f32>
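For reference, the failure can likely be reproduced in isolation with a one-op file. This is a hypothetical minimal reproducer (function name invented; operand/result types copied from the failing line 2045 above). Note that the original import carries torch.onnx_meta.* attributes, which are omitted here and may affect the opset used by the conversion.

    // expand_repro.mlir -- run with: torch-mlir-opt -convert-torch-onnx-to-torch expand_repro.mlir
    func.func @expand_repro(%arg0: !torch.vtensor<[1,2,?,?],f32>, %arg1: !torch.vtensor<[?],si64>) -> !torch.vtensor<[],f32> {
      // Same op signature as the failing line in RAFT_vaiq_int8.default.torch-onnx.mlir
      %0 = torch.operator "onnx.Expand"(%arg0, %arg1) : (!torch.vtensor<[1,2,?,?],f32>, !torch.vtensor<[?],si64>) -> !torch.vtensor<[],f32>
      return %0 : !torch.vtensor<[],f32>
    }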

zjgarvey (Contributor) commented May 3, 2024

The next failures are:

  1. onnx.Resize (which should be fixed by torch-mlir PR #3013)
  2. A matrix multiplication that is getting half-quantized. Looking into this now, but it's going to be difficult to fix, since the second operand is dequantized before several shape manipulations occur (a rough sketch follows this list).
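To illustrate item 2, here is a hypothetical sketch (shapes, names, and the particular shape op are invented, not taken from the model): the dequantize on the second operand is separated from the matmul by shape manipulations, so by the time the matmul is reached that operand is just a plain float tensor and the quantize/dequantize pair is no longer adjacent to its consumer.

    // Hypothetical illustration only.
    func.func @half_quantized_matmul(%x: !torch.vtensor<[1,64],f32>, %w: !torch.vtensor<[64,64],f32>) -> !torch.vtensor<[1,64],f32> {
      %scale = torch.constant.float 5.000000e-01
      %zp = torch.constant.int 0
      %int12 = torch.constant.int 12
      %int0 = torch.constant.int 0
      %int1 = torch.constant.int 1
      // The second operand is quantized and immediately dequantized...
      %wq = torch.aten.quantize_per_tensor %w, %scale, %zp, %int12 : !torch.vtensor<[64,64],f32>, !torch.float, !torch.int, !torch.int -> !torch.vtensor<[64,64],!torch.qint8>
      %wf = torch.aten.dequantize.self %wq : !torch.vtensor<[64,64],!torch.qint8> -> !torch.vtensor<[64,64],f32>
      // ...then shape manipulation happens on the float tensor, so the
      // dequantize is no longer directly feeding the matmul.
      %wt = torch.aten.transpose.int %wf, %int0, %int1 : !torch.vtensor<[64,64],f32>, !torch.int, !torch.int -> !torch.vtensor<[64,64],f32>
      %mm = torch.aten.matmul %x, %wt : !torch.vtensor<[1,64],f32>, !torch.vtensor<[64,64],f32> -> !torch.vtensor<[1,64],f32>
      return %mm : !torch.vtensor<[1,64],f32>
    }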

AmosLewis (Collaborator, author) commented May 14, 2024

After integrating torch-mlir@ec6d7aa (onnx.resize op) via iree-org/iree#17358:
@zjgarvey this looks like a quantization-related error; it would be better if you take a look.

failed to translate executables
failed to translate executables
failed to translate executables
RAFT_vaiq_int8.default.onnx.torch.mlir:1644:13: error: 'func.func' op exceeded stack allocation limit of 32768 bytes for function. Got 401408 bytes
    %1601 = torch.aten.quantize_per_tensor %1596, %float5.000000e-01, %int0, %int12 : !torch.vtensor<[1024,7,7,1],f32>, !torch.float, !torch.int, !torch.int -> !torch.vtensor<[1024,7,7,1],!torch.qint8>
            ^
RAFT_vaiq_int8.default.onnx.torch.mlir:1644:13: note: called from
    %1601 = torch.aten.quantize_per_tensor %1596, %float5.000000e-01, %int0, %int12 : !torch.vtensor<[1024,7,7,1],f32>, !torch.float, !torch.int, !torch.int -> !torch.vtensor<[1024,7,7,1],!torch.qint8>
            ^

zjgarvey (Contributor) commented:

Maybe we can use --iree-llvmcpu-fail-on-out-of-bounds-stack-allocation=false, but I'm not sure that is the best option. I've been sitting on iree-compile for a while now.
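For the record, the flag would be passed to iree-compile roughly like this (a sketch, not the exact command used here; the llvm-cpu target backend and the output file name are assumptions, and any other flags from the existing compile command would still apply):

    iree-compile RAFT_vaiq_int8.default.onnx.torch.mlir \
      --iree-hal-target-backends=llvm-cpu \
      --iree-llvmcpu-fail-on-out-of-bounds-stack-allocation=false \
      -o RAFT_vaiq_int8.vmfb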

zjgarvey (Contributor) commented:

Related issue in iree: iree-org/iree#17455.

AmosLewis (Collaborator, author) commented:

PR by @IanWood1 to fix this: iree-org/iree#17574
