
Port fixes from master to 2024.5.1 / 2024.6.0 #1239

Merged

Conversation

ilya-lavrenov and others added 12 commits November 20, 2024 19:26
Always tokenize as a batch so that `attention_mask` is returned, preventing the error:
```
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
```
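A hedged sketch of the idea with a Hugging Face tokenizer (the model id is a stand-in tiny test model, not the actual llm_bench code): tokenizing a list of prompts yields a dict with both `input_ids` and `attention_mask`, so the mask can be passed to `generate()` explicitly instead of being inferred.

```
# Hedged sketch, not the actual llm_bench change: tokenize as a batch so the
# result carries `attention_mask`, then pass it to generate() explicitly.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hf-internal-testing/tiny-random-gpt2"  # stand-in tiny model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Batch tokenization returns both `input_ids` and `attention_mask`.
inputs = tokenizer(["hello!"], return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```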

Fixed `test_python_generation_config_validation`:
```
model_tmp_path = ('katuni4ka/tiny-random-phi3', WindowsPath('C:/Users/runneradmin/AppData/Local/Temp/pytest-of-runneradmin/pytest-0/katuni4ka_tiny-random-phi31'))
generation_config = {'eos_token_id': 42, 'ignore_eos': True}

    @pytest.mark.precommit
    @pytest.mark.nightly
    @pytest.mark.parametrize("generation_config", invalid_py_configs)
    def test_python_generation_config_validation(model_tmp_path, generation_config):
        model_id, temp_path = model_tmp_path
        pipe = load_pipe([({"eos_token_id": 37}, "config.json")], temp_path)
    
        # 'unexisting_key_name' key validity is checked in pybind and ValueError will be returned
        #  instead of RuntimeError, which is returned when GenerationConfig values are validated
        return_exception_type = ValueError if 'unexisting_key_name' in generation_config else RuntimeError
>       with pytest.raises(return_exception_type):
E       Failed: DID NOT RAISE <class 'RuntimeError'>
```
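After the fix, the two validation paths raise distinct exception types; a hedged sketch of the expected behavior using this repo's Python bindings (`models_path` is a placeholder, and the invalid config pair comes from the failing test above):

```
# Hedged sketch of the behavior the test now expects.
import pytest
import openvino_genai

models_path = "path/to/exported/model"  # placeholder for an exported model dir
pipe = openvino_genai.LLMPipeline(models_path, "CPU")

# Unknown keys are rejected by the pybind layer with ValueError...
with pytest.raises(ValueError):
    pipe.generate("hi", unexisting_key_name=1)

# ...while invalid GenerationConfig value combinations fail C++ validation
# with RuntimeError (this pair comes from `invalid_py_configs` above).
with pytest.raises(RuntimeError):
    pipe.generate("hi", eos_token_id=42, ignore_eos=True)
```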
Fixed `test_valid_configs`:
```
FAILED tests/python_tests/test_generate_api.py::test_valid_configs - RuntimeError: Check 'eos_token_id != -1 || max_new_tokens != (18446744073709551615UL) || max_length != (18446744073709551615UL)' failed at /home/runner/work/openvino.genai/openvino.genai/src/cpp/src/generation_config.cpp:164:
Either 'eos_token_id', or 'max_new_tokens', or 'max_length' should be defined.
```
See https://github.com/openvinotoolkit/openvino.genai/actions/runs/11797566325/job/32862069017?pr=882
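The requirement itself is straightforward to satisfy; a minimal sketch using the `openvino_genai` Python API:

```
# Minimal sketch: GenerationConfig validation requires at least one of
# `eos_token_id`, `max_new_tokens`, or `max_length` to be defined.
import openvino_genai

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 32  # defining any one of the three satisfies the check
# With all three left at their defaults, validation raises the
# RuntimeError quoted above.
```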
The external chat template is not patched for some reason, but it can still contain constructions that JinjaCpp does not support.
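As a user-side workaround, such a template can be replaced explicitly; a hedged illustration (the path and the template string are placeholders, `set_chat_template` is the method exposed by this repo's `Tokenizer`):

```
# Hedged illustration: override a chat template that uses constructions
# JinjaCpp cannot parse with a simplified, supported one.
import openvino_genai

tokenizer = openvino_genai.Tokenizer("path/to/exported/model")  # placeholder
simplified_template = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)
tokenizer.set_chat_template(simplified_template)
```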
Added a command for exporting compressed VLMs similar to how it is done
for LMs.
…penvinotoolkit#1222)

When using NPU, it seems that the eos token is not initialized correctly (at least for certain models). This causes the chat sample to have a conversation with itself:
```
>chat_sample.exe Meta-Llama-3-8B-Instruct
question:
hello!
Hello! It's nice to meet you! Is there something I can help you with, or would you like to chat?assistant

Nice to meet you too! I'm just a language model, I don't have personal experiences or emotions, but I'm here to help answer any questions you might have or engage in a fun conversation!

What's on your mind? Want to talk about something in particular or just shoot the breeze?assistant

Sounds like fun! I
----------
question:
```

Borrowing some initialization code from *StatefulLLMPipeline*, where the eos token is initialized from the tokenizer within the constructor if it has not been provided, resolves the issue:

```
> chat_sample.exe Meta-Llama-3-8B-Instruct
question:
hello!
Hello! It's nice to meet you! Is there something I can help you with, or would you like to chat?
----------
question:
```
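A hedged Python paraphrase of that C++ change (the function and flow are illustrative; the real fix lives in the static pipeline's constructor):

```
# Hedged paraphrase of the C++ fix: if the generation config does not carry
# an eos token, take it from the tokenizer, as StatefulLLMPipeline does.
# `config` and `tokenizer` stand for openvino_genai GenerationConfig and
# Tokenizer objects; -1 is the "undefined" sentinel from the error above.
def ensure_eos_token(config, tokenizer):
    if config.eos_token_id == -1:  # not provided by the user or model config
        config.set_eos_token_id(tokenizer.get_eos_token_id())
    return config
```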
…antization (openvinotoolkit#1228)

Added a command for exporting quantized diffusion models similar to how
it is done for LMs.
@ilya-lavrenov added this to the 2024.5.1 milestone Nov 20, 2024
@github-actions bot added labels: category: llm_bench, category: text to image, category: visual language, category: continuous batching, category: LLM, category: whisper, category: sampling, category: speculative decoding, category: GHA, category: tokenizers, category: samples on Nov 20, 2024
@ilya-lavrenov force-pushed the port-fixes branch 3 times, most recently from 3013560 to 4aa9ed7 November 20, 2024 18:42
@ilya-lavrenov (Contributor, Author)

@AlexKoff88 please suggest a fix that needs to be ported to resolve https://github.com/openvinotoolkit/openvino.genai/actions/runs/11939760566/job/33280974703?pr=1239

@AlexKoff88 (Collaborator)

> @AlexKoff88 please suggest a fix that needs to be ported to resolve https://github.com/openvinotoolkit/openvino.genai/actions/runs/11939760566/job/33280974703?pr=1239

The tests will be fixed in #1238. I am ok with merging this PR.

@ilya-lavrenov added this pull request to the merge queue Nov 21, 2024
@github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 21, 2024
@ilya-lavrenov added this pull request to the merge queue Nov 21, 2024
Merged via the queue into openvinotoolkit:releases/2024/5 with commit da7a7ca Nov 21, 2024
52 of 54 checks passed
@ilya-lavrenov deleted the port-fixes branch November 21, 2024 13:24