
Port fixes from master to 2024.5.1 / 2024.6.0 #1239

Merged

Conversation

ilya-lavrenov and others added 12 commits November 20, 2024 19:26
Always tokenize as a batch so that `attention_mask` is returned, preventing the error:
```
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
```
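A hedged sketch of the idea with a Hugging Face tokenizer (the model id is a stand-in tiny test model, not the actual llm_bench code): tokenizing a list of prompts yields a dict with both `input_ids` and `attention_mask`, so the mask can be passed to `generate()` explicitly instead of being inferred.

```
# Hedged sketch, not the actual llm_bench change: tokenize as a batch so the
# result carries `attention_mask`, then pass it to generate() explicitly.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hf-internal-testing/tiny-random-gpt2"  # stand-in tiny model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Batch tokenization returns both `input_ids` and `attention_mask`.
inputs = tokenizer(["hello!"], return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```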

Fixed `test_python_generation_config_validation`:
```
model_tmp_path = ('katuni4ka/tiny-random-phi3', WindowsPath('C:/Users/runneradmin/AppData/Local/Temp/pytest-of-runneradmin/pytest-0/katuni4ka_tiny-random-phi31'))
generation_config = {'eos_token_id': 42, 'ignore_eos': True}

    @pytest.mark.precommit
    @pytest.mark.nightly
    @pytest.mark.parametrize("generation_config", invalid_py_configs)
    def test_python_generation_config_validation(model_tmp_path, generation_config):
        model_id, temp_path = model_tmp_path
        pipe = load_pipe([({"eos_token_id": 37}, "config.json")], temp_path)
    
        # 'unexisting_key_name' key validity is checked in pybind and ValueError will be returned
        #  instead of RuntimeError, which is returned when GenerationConfig values are validated
        return_exception_type = ValueError if 'unexisting_key_name' in generation_config else RuntimeError
>       with pytest.raises(return_exception_type):
E       Failed: DID NOT RAISE <class 'RuntimeError'>
```
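After the fix, the two validation paths raise distinct exception types; a hedged sketch of the expected behavior using this repo's Python bindings (`models_path` is a placeholder, and the invalid config pair comes from the failing test above):

```
# Hedged sketch of the behavior the test now expects.
import pytest
import openvino_genai

models_path = "path/to/exported/model"  # placeholder for an exported model dir
pipe = openvino_genai.LLMPipeline(models_path, "CPU")

# Unknown keys are rejected by the pybind layer with ValueError...
with pytest.raises(ValueError):
    pipe.generate("hi", unexisting_key_name=1)

# ...while invalid GenerationConfig value combinations fail C++ validation
# with RuntimeError (this pair comes from `invalid_py_configs` above).
with pytest.raises(RuntimeError):
    pipe.generate("hi", eos_token_id=42, ignore_eos=True)
```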
Fixed `test_valid_configs`:
```
FAILED tests/python_tests/test_generate_api.py::test_valid_configs - RuntimeError: Check 'eos_token_id != -1 || max_new_tokens != (18446744073709551615UL) || max_length != (18446744073709551615UL)' failed at /home/runner/work/openvino.genai/openvino.genai/src/cpp/src/generation_config.cpp:164:
Either 'eos_token_id', or 'max_new_tokens', or 'max_length' should be defined.
```
See https://github.com/openvinotoolkit/openvino.genai/actions/runs/11797566325/job/32862069017?pr=882
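The requirement itself is straightforward to satisfy; a minimal sketch using the `openvino_genai` Python API:

```
# Minimal sketch: GenerationConfig validation requires at least one of
# `eos_token_id`, `max_new_tokens`, or `max_length` to be defined.
import openvino_genai

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 32  # defining any one of the three satisfies the check
# With all three left at their defaults, validation raises the
# RuntimeError quoted above.
```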
The external chat template is not patched for some reason, but it can still contain constructions that JinjaCpp does not support.
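As a user-side workaround, such a template can be replaced explicitly; a hedged illustration (the path and the template string are placeholders, `set_chat_template` is the method exposed by this repo's `Tokenizer`):

```
# Hedged illustration: override a chat template that uses constructions
# JinjaCpp cannot parse with a simplified, supported one.
import openvino_genai

tokenizer = openvino_genai.Tokenizer("path/to/exported/model")  # placeholder
simplified_template = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)
tokenizer.set_chat_template(simplified_template)
```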
Added a command for exporting compressed VLMs similar to how it is done
for LMs.
…penvinotoolkit#1222)

When using NPU, it seems that the eos token is not initialized correctly (at least for certain models). This causes the chat sample to have a conversation with itself:
```
>chat_sample.exe Meta-Llama-3-8B-Instruct
question:
hello!
Hello! It's nice to meet you! Is there something I can help you with, or would you like to chat?assistant

Nice to meet you too! I'm just a language model, I don't have personal experiences or emotions, but I'm here to help answer any questions you might have or engage in a fun conversation!

What's on your mind? Want to talk about something in particular or just shoot the breeze?assistant

Sounds like fun! I
----------
question:
```

Borrowing some initialization code from *StatefulLLMPipeline*, where the eos token is initialized from the tokenizer within the constructor if it has not been provided, resolves the issue:

```
> chat_sample.exe Meta-Llama-3-8B-Instruct
question:
hello!
Hello! It's nice to meet you! Is there something I can help you with, or would you like to chat?
----------
question:
```
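A hedged Python paraphrase of that C++ change (the function and flow are illustrative; the real fix lives in the static pipeline's constructor):

```
# Hedged paraphrase of the C++ fix: if the generation config does not carry
# an eos token, take it from the tokenizer, as StatefulLLMPipeline does.
# `config` and `tokenizer` stand for openvino_genai GenerationConfig and
# Tokenizer objects; -1 is the "undefined" sentinel from the error above.
def ensure_eos_token(config, tokenizer):
    if config.eos_token_id == -1:  # not provided by the user or model config
        config.set_eos_token_id(tokenizer.get_eos_token_id())
    return config
```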
…antization (openvinotoolkit#1228)

Added a command for exporting quantized diffusion models similar to how
it is done for LMs.
@ilya-lavrenov added this to the 2024.5.1 milestone Nov 20, 2024
@github-actions bot added labels: category: llm_bench, category: text to image, category: visual language, category: continuous batching, category: LLM, category: whisper, category: sampling, category: speculative decoding, category: GHA, category: tokenizers, category: samples on Nov 20, 2024
@ilya-lavrenov force-pushed the port-fixes branch 3 times, most recently from 3013560 to 4aa9ed7 November 20, 2024 18:42
@ilya-lavrenov (Contributor, Author)

@AlexKoff88 please suggest a fix that needs to be ported to resolve https://github.com/openvinotoolkit/openvino.genai/actions/runs/11939760566/job/33280974703?pr=1239

@AlexKoff88 (Collaborator)

> @AlexKoff88 please suggest a fix that needs to be ported to resolve https://github.com/openvinotoolkit/openvino.genai/actions/runs/11939760566/job/33280974703?pr=1239

The tests will be fixed in #1238. I am ok with merging this PR.

@ilya-lavrenov added this pull request to the merge queue Nov 21, 2024
@github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 21, 2024
@ilya-lavrenov added this pull request to the merge queue Nov 21, 2024
Merged via the queue into openvinotoolkit:releases/2024/5 with commit da7a7ca Nov 21, 2024
52 of 54 checks passed
@ilya-lavrenov deleted the port-fixes branch November 21, 2024 13:24