
RAFT_vaiq_int8 model support #182

Open
AmosLewis opened this issue Apr 19, 2024 · 5 comments

AmosLewis (Collaborator) commented Apr 19, 2024

Failed op: onnx.Expand

Until the expand op is fixed, we cannot get the failure signatures of any other ops; the following command is blocked by onnx.Expand.

torch-mlir-opt -convert-torch-onnx-to-torch ./RAFT_vaiq_int8.default.torch-onnx.mlir 
./RAFT_vaiq_int8.default.torch-onnx.mlir:2045:13: error: failed to legalize operation 'torch.operator' that was explicitly marked illegal
    %2041 = torch.operator "onnx.Expand"(%2002, %2040) : (!torch.vtensor<[1,2,?,?],f32>, !torch.vtensor<[?],si64>) -> !torch.vtensor<[],f32> 
            ^
./RAFT_vaiq_int8.default.torch-onnx.mlir:2045:13: note: see current operation: %7506 = "torch.operator"(%7142, %7505) <{name = "onnx.Expand"}> : (!torch.vtensor<[1,2,?,?],f32>, !torch.vtensor<[?],si64>) -> !torch.vtensor<[],f32>
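For reference, the failure can likely be reproduced in isolation with a one-op file. This is a hypothetical minimal reproducer (function name invented; operand/result types copied from the failing line 2045 above). Note that the original import carries torch.onnx_meta.* attributes, which are omitted here and may affect the opset used by the conversion.

    // expand_repro.mlir -- run with: torch-mlir-opt -convert-torch-onnx-to-torch expand_repro.mlir
    func.func @expand_repro(%arg0: !torch.vtensor<[1,2,?,?],f32>, %arg1: !torch.vtensor<[?],si64>) -> !torch.vtensor<[],f32> {
      // Same op signature as the failing line in RAFT_vaiq_int8.default.torch-onnx.mlir
      %0 = torch.operator "onnx.Expand"(%arg0, %arg1) : (!torch.vtensor<[1,2,?,?],f32>, !torch.vtensor<[?],si64>) -> !torch.vtensor<[],f32>
      return %0 : !torch.vtensor<[],f32>
    }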

zjgarvey (Contributor) commented May 3, 2024

The next failures are:

  1. onnx.Resize (which should be fixed by torch-mlir PR #3013)
  2. A matrix multiplication that is getting half-quantized. Looking into this now, but it's going to be difficult to fix, since the second operand is dequantized before several shape manipulations occur (a rough sketch follows this list).
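To illustrate item 2, here is a hypothetical sketch (shapes, names, and the particular shape op are invented, not taken from the model): the dequantize on the second operand is separated from the matmul by shape manipulations, so by the time the matmul is reached that operand is just a plain float tensor and the quantize/dequantize pair is no longer adjacent to its consumer.

    // Hypothetical illustration only.
    func.func @half_quantized_matmul(%x: !torch.vtensor<[1,64],f32>, %w: !torch.vtensor<[64,64],f32>) -> !torch.vtensor<[1,64],f32> {
      %scale = torch.constant.float 5.000000e-01
      %zp = torch.constant.int 0
      %int12 = torch.constant.int 12
      %int0 = torch.constant.int 0
      %int1 = torch.constant.int 1
      // The second operand is quantized and immediately dequantized...
      %wq = torch.aten.quantize_per_tensor %w, %scale, %zp, %int12 : !torch.vtensor<[64,64],f32>, !torch.float, !torch.int, !torch.int -> !torch.vtensor<[64,64],!torch.qint8>
      %wf = torch.aten.dequantize.self %wq : !torch.vtensor<[64,64],!torch.qint8> -> !torch.vtensor<[64,64],f32>
      // ...then shape manipulation happens on the float tensor, so the
      // dequantize is no longer directly feeding the matmul.
      %wt = torch.aten.transpose.int %wf, %int0, %int1 : !torch.vtensor<[64,64],f32>, !torch.int, !torch.int -> !torch.vtensor<[64,64],f32>
      %mm = torch.aten.matmul %x, %wt : !torch.vtensor<[1,64],f32>, !torch.vtensor<[64,64],f32> -> !torch.vtensor<[1,64],f32>
      return %mm : !torch.vtensor<[1,64],f32>
    }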

AmosLewis (Collaborator, author) commented May 14, 2024

After integrating torch-mlir@ec6d7aa (onnx.resize op) via iree-org/iree#17358:
@zjgarvey this looks like a quantization-related error; it would be better if you take a look.

failed to translate executables
failed to translate executables
failed to translate executables
RAFT_vaiq_int8.default.onnx.torch.mlir:1644:13: error: 'func.func' op exceeded stack allocation limit of 32768 bytes for function. Got 401408 bytes
    %1601 = torch.aten.quantize_per_tensor %1596, %float5.000000e-01, %int0, %int12 : !torch.vtensor<[1024,7,7,1],f32>, !torch.float, !torch.int, !torch.int -> !torch.vtensor<[1024,7,7,1],!torch.qint8>
            ^
RAFT_vaiq_int8.default.onnx.torch.mlir:1644:13: note: called from
    %1601 = torch.aten.quantize_per_tensor %1596, %float5.000000e-01, %int0, %int12 : !torch.vtensor<[1024,7,7,1],f32>, !torch.float, !torch.int, !torch.int -> !torch.vtensor<[1024,7,7,1],!torch.qint8>
            ^

zjgarvey (Contributor) commented:

Maybe we can use --iree-llvmcpu-fail-on-out-of-bounds-stack-allocation=false, but I'm not sure that is the best option. I've been sitting on iree-compile for a while now.
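For the record, the flag would be passed to iree-compile roughly like this (a sketch, not the exact command used here; the llvm-cpu target backend and the output file name are assumptions, and any other flags from the existing compile command would still apply):

    iree-compile RAFT_vaiq_int8.default.onnx.torch.mlir \
      --iree-hal-target-backends=llvm-cpu \
      --iree-llvmcpu-fail-on-out-of-bounds-stack-allocation=false \
      -o RAFT_vaiq_int8.vmfb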

zjgarvey (Contributor) commented:

Related issue in iree: iree-org/iree#17455.

AmosLewis (Collaborator, author) commented:

PR by @IanWood1 to fix this: iree-org/iree#17574
