Runtime tensor error when trying to convert cpu model to tflite #5749

Open
itzjac opened this issue Nov 21, 2024 · 3 comments
Labels: os:windows (MediaPipe issues on Windows), platform:python (MediaPipe Python issues), task:LLM inference (MediaPipe LLM Inference Gen AI setup), type:support (General questions)

Comments

itzjac commented Nov 21, 2024

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

None

OS Platform and Distribution

WSL2

MediaPipe Tasks SDK version

0.10.18

Task name (e.g. Image classification, Gesture recognition etc.)

Model conversion

Programming Language and version (e.g. C++, Python, Java)

Python

Describe the actual behavior

Runtime error when generating the CPU model

Describe the expected behaviour

Convert the model and generate a .tflite file

Standalone code/steps you may have used to try to get what you need

Using the provided LLM Inference example found on GitHub (text-to-text)

Other info / Complete Logs

Running the conversion with the GPU backend works and the model loads on device (though it is very slow). The CPU backend stops the process with this runtime error:

RuntimeError: INTERNAL: ; RET_CHECK failure (external/odml/odml/infra/genai/inference/utils/xnn_utils/model_ckpt_util.cc:116) tensor

I tried different Ubuntu versions; all of them produced the same runtime error with the CPU backend and worked fine with the GPU backend.
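
One way to narrow down a tensor-related RET_CHECK like this is to list the tensor names and shapes the converter actually sees in the checkpoint. A minimal diagnostic sketch (not from the original report), assuming the Hugging Face safetensors export of gemma-2b-it and illustrative paths:

# Diagnostic sketch: print every tensor name and shape in the
# safetensors shards, to compare against what the CPU backend's
# model_ckpt_util expects. The path is an illustrative assumption.
import glob

from safetensors import safe_open

for shard in sorted(glob.glob('/home/me/gemma-2b-it/*.safetensors')):
    with safe_open(shard, framework='np') as f:
        for name in f.keys():
            print(shard, name, f.get_slice(name).get_shape())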


itzjac commented Nov 21, 2024

The WSL2 instance has a default Ubuntu 24 installation:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04.1 LTS
Release:        24.04
Codename:       noble

@kuaashish kuaashish added platform:python MediaPipe Python issues task:LLM inference Issues related to MediaPipe LLM Inference Gen AI setup os:windows MediaPipe issues on Windows type:support General questions labels Nov 22, 2024
kuaashish (Collaborator) commented:

Hi @itzjac,

Could you please share the complete script used for model conversion? Providing the full, standalone code would be very helpful in understanding and potentially reproducing the issue.

Thank you!!

@kuaashish kuaashish added the stat:awaiting response Waiting for user response label Nov 26, 2024

itzjac commented Nov 26, 2024

> Could you please share the complete script used for model conversion? Providing the full, standalone code would be very helpful in understanding and potentially reproducing the issue.
>
> Thank you!!

Hi @kuaashish !

Thanks for the follow up.
The script is like any other reference I could find on the web: https://github.com/google-ai-edge/mediapipe

from mediapipe.tasks.python.genai import converter


def gemma_convert_config(backend):
    # Checkpoint, vocab, and output locations for the gemma-2b-it weights.
    input_ckpt = '/home/me/gemma-2b-it/'
    vocab_model_file = '/home/me/gemma-2b-it/'
    output_dir = '/home/me/gemma-2b-it/intermediate/'
    output_tflite_file = f'/home/me/gemma-2b-it-{backend}.tflite'
    return converter.ConversionConfig(
        input_ckpt=input_ckpt,
        ckpt_format='safetensors',
        model_type='GEMMA_2B',
        backend=backend,
        output_dir=output_dir,
        combine_file_only=False,
        vocab_model_file=vocab_model_file,
        output_tflite_file=output_tflite_file,
    )


config = gemma_convert_config("cpu")
converter.convert_checkpoint(config)

By changing "cpu" to "gpu", I can produce the .tflite and run it on device, though it is very slow. I expect the CPU model to run normally, since that is the general recommendation, correct?
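
For comparison, the working GPU path is the same script with only the backend string swapped; a sketch using the same illustrative paths as above:

# Same config as above with the backend swapped; per the report this
# succeeds and writes /home/me/gemma-2b-it-gpu.tflite.
config = gemma_convert_config("gpu")
converter.convert_checkpoint(config)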

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Waiting for user response label Nov 26, 2024