[Llama2] Add fix for generating past key values #1829

Open
wants to merge 1 commit into
base: main

Abhishek-Varma
Contributor

-- Calling torch.tensor on a list of np.arrays is very slow.
-- This commit therefore converts the list to a single np.array first and then calls torch.tensor on that array.

This fixes the indefinite hang that occurred after the First Llama is invoked.
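
For readers who have not seen the diff, the conversion described above might look roughly like the sketch below. It is only an illustration: the helper name `past_key_values_to_tensor`, the number of arrays, and their shapes are hypothetical and are not taken from this commit. The idea is that `np.array` packs the equally shaped arrays into one contiguous buffer, so `torch.tensor` copies a single array instead of walking a Python list of arrays.

```python
import numpy as np
import torch


def past_key_values_to_tensor(outputs):
    """Convert a list of equally shaped np.ndarrays to one torch.Tensor.

    Hypothetical helper for illustration; not the function changed in this PR.
    """
    # Stack the list into a single contiguous ndarray first ...
    stacked = np.array(outputs)
    # ... so torch.tensor only has to copy one array.
    return torch.tensor(stacked)


# Made-up shapes standing in for per-layer key/value outputs.
layers = [np.random.rand(1, 4, 8, 16).astype(np.float32) for _ in range(6)]
pkv = past_key_values_to_tensor(layers)
print(tuple(pkv.shape))  # (6, 1, 4, 8, 16)
```

If a copy is not required, `torch.from_numpy(stacked)` would share memory with the stacked array instead of copying it; whether that is appropriate here depends on how the arrays are reused, which this page does not show.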

Signed-off-by: Abhishek Varma [email protected]

-- torch.tensor on list of np.arrays is slow.
-- This commit therefore converts the list to a np.array 
    and then uses torch.tensor on the same.

Signed-off-by: Abhishek Varma <[email protected]>
@Abhishek-Varma
Contributor Author

@Shukla-Gaurav you can cherry-pick this.
This fixes the issue of Vulkan execution getting stuck indefinitely.
With this change, I was able to generate tokens on both the CLI and the WebUI.
