[R-273] 'temperature' parameter in LangchainLLMWrapper.generate_text causing issues #656

Open
Kirushikesh opened this issue Feb 24, 2024 · 8 comments
Labels
bug Something isn't working linear Created by Linear-GitHub Sync

Comments

@Kirushikesh

Kirushikesh commented Feb 24, 2024

Describe the bug

LangchainLLMWrapper has a .generate_text() method, which in turn calls .generate_prompt() on the underlying LLM. LangchainLLMWrapper passes a 'temperature' parameter to .generate_prompt(), which causes the following issues:

  1. The temperature parameter does not affect the response when using a HuggingFace LLM.
  2. Some LangChain extensions, such as IBM Generative AI, do not support passing a temperature parameter to .generate_prompt().

Since the temperature can already be supplied when initialising a LangChain LLM, there is no need to pass it again in LangchainLLMWrapper.

For example, with HuggingFacePipeline you can specify the temperature at initialization:
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, temperature=1)

Or, when using the IBM LLM, you can specify the temperature like this:

llm = LangChainInterface(
        model_id='google/flan-t5-xl',
        client=Client(credentials=Credentials.from_env()),
        parameters=TextGenerationParameters(
                  decoding_method=DecodingMethod.SAMPLE,
                  max_new_tokens=1000,
                  min_new_tokens=1,
                  temperature=0.2,
                  top_k=20,
                  top_p=1,
                  random_seed=42,
                  repetition_penalty = 1.1
              )
)

Ragas version: 0.1.1
Python version: 3.10.6

Code to Reproduce
The following code shows why the 'temperature' parameter does not affect the response with a HuggingFace LLM:

from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

from ragas.llms.base import BaseRagasLLM, LangchainLLMWrapper
from ragas.run_config import RunConfig
from ragas.llms.prompt import PromptValue

model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100)
hf_llm = HuggingFacePipeline(pipeline=pipe)

pv = PromptValue(prompt_str='hi, how are you')
run_config = RunConfig()
ragas_hf_llm = LangchainLLMWrapper(hf_llm, run_config=run_config)
ragas_hf_llm.generate_text(
    prompt=pv,
    stop=None,
    temperature=0
)

# Output
LLMResult(generations=[[Generation(text=' feeling today?!\n\nThe sun was shining, and there was only a small light, but no more than a single drop from the sky. The clouds were white with white fringing, which they looked like a cloud filled with smoke. It covered the place with the smoke, and the man standing before them had a sword, as a symbol of protection. He was only slightly more than a hundred lightyears away from the Sun and the Moon.\n\n"Fuu...what?"\n\n')]], llm_output=None, run=[RunInfo(run_id=UUID('cbd105d1-ab2c-4069-a6f3-20fc9159443e'))])

In the above code I initialised HuggingFacePipeline with the gpt2 model, wrapped it in ragas' LangchainLLMWrapper, and passed 'temperature=0' when calling .generate_text(). Ideally this should raise an error, because a temperature of 0 is not accepted by HuggingFace.

You can also verify this by passing a temperature of 99 to .generate_text(); no exception is raised for this unreasonably high value either. It is therefore evident that the temperature passed to .generate_text() does not affect the HuggingFace LLM. Moreover, the user can already set the temperature in the pipeline() call, so there is no need for an additional temperature argument in .generate_text().
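
For reference, here is a minimal sketch of setting the temperature on the pipeline itself (the values are illustrative; note that do_sample=True is required for temperature to take effect in transformers):

from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Sampling parameters are configured once on the transformers pipeline,
# so the ragas wrapper does not need to pass a temperature at all.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
)
hf_llm = HuggingFacePipeline(pipeline=pipe)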

The following code shows why passing 'temperature' raises an error with the IBM LLM:

from genai import Client, Credentials
from genai.extensions.llama_index import IBMGenAILlamaIndex

from genai.extensions.langchain import LangChainInterface
from genai.extensions.langchain.chat_llm import LangChainChatInterface
from genai.extensions.langchain import LangChainEmbeddingsInterface
from genai.schema import (
    DecodingMethod,
    TextGenerationParameters,
    TextEmbeddingParameters
)

llm = LangChainInterface(
        model_id='google/flan-t5-xl',
        client=Client(credentials=Credentials.from_env()),
        parameters=TextGenerationParameters(
                  decoding_method=DecodingMethod.SAMPLE,
                  max_new_tokens=1000,
                  min_new_tokens=1,
                  temperature=0.2,
                  top_k=20,
                  top_p=1,
                  random_seed=42,
                  repetition_penalty = 1.1
              )
)

from ragas.llms.base import BaseRagasLLM, LangchainLLMWrapper
from ragas.run_config import RunConfig
from ragas.llms.prompt import PromptValue

pv = PromptValue(prompt_str='hi, how are you')
run_config = RunConfig()
ragas_ibm_llm = LangchainLLMWrapper(llm, run_config=run_config)
ragas_ibm_llm.generate_text(
    prompt=pv,
    stop=None,
    temperature=99
)

# Error Trace

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [16], in <cell line: 1>()
----> 1 ragas_ibm_llm.generate_text(
      2     prompt=pv,
      3     stop=None,
      4     temperature=99
      5 )

File /dccstor/kirushikesh/.conda/guardrails/lib/python3.10/site-packages/ragas/llms/base.py:147, in LangchainLLMWrapper.generate_text(self, prompt, n, temperature, stop, callbacks)
    139     return self.langchain_llm.generate_prompt(
    140         prompts=[prompt],
    141         n=n,
   (...)
    144         callbacks=callbacks,
    145     )
    146 else:
--> 147     result = self.langchain_llm.generate_prompt(
    148         prompts=[prompt] * n,
    149         temperature=temperature,
    150         stop=stop,
    151         callbacks=callbacks,
    152     )
    153     # make LLMResult.generation appear as if it was n_completions
    154     # note that LLMResult.runs is still a list that represents each run
    155     generations = [[g[0] for g in result.generations]]

File /dccstor/kirushikesh/.conda/guardrails/lib/python3.10/site-packages/langchain_core/language_models/llms.py:530, in BaseLLM.generate_prompt(self, prompts, stop, callbacks, **kwargs)
    522 def generate_prompt(
    523     self,
    524     prompts: List[PromptValue],
   (...)
    527     **kwargs: Any,
    528 ) -> LLMResult:
    529     prompt_strings = [p.to_string() for p in prompts]
--> 530     return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)

File /dccstor/kirushikesh/.conda/guardrails/lib/python3.10/site-packages/langchain_core/language_models/llms.py:703, in BaseLLM.generate(self, prompts, stop, callbacks, tags, metadata, run_name, **kwargs)
    687         raise ValueError(
    688             "Asked to cache, but no cache found at `langchain.cache`."
    689         )
    690     run_managers = [
    691         callback_manager.on_llm_start(
    692             dumpd(self),
   (...)
    701         )
    702     ]
--> 703     output = self._generate_helper(
    704         prompts, stop, run_managers, bool(new_arg_supported), **kwargs
    705     )
    706     return output
    707 if len(missing_prompts) > 0:

File /dccstor/kirushikesh/.conda/guardrails/lib/python3.10/site-packages/langchain_core/language_models/llms.py:567, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
    565     for run_manager in run_managers:
    566         run_manager.on_llm_error(e, response=LLMResult(generations=[]))
--> 567     raise e
    568 flattened_outputs = output.flatten()
    569 for manager, flattened_output in zip(run_managers, flattened_outputs):

File /dccstor/kirushikesh/.conda/guardrails/lib/python3.10/site-packages/langchain_core/language_models/llms.py:554, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
    544 def _generate_helper(
    545     self,
    546     prompts: List[str],
   (...)
    550     **kwargs: Any,
    551 ) -> LLMResult:
    552     try:
    553         output = (
--> 554             self._generate(
    555                 prompts,
    556                 stop=stop,
    557                 # TODO: support multiple run managers
    558                 run_manager=run_managers[0] if run_managers else None,
    559                 **kwargs,
    560             )
    561             if new_arg_supported
    562             else self._generate(prompts, stop=stop)
    563         )
    564     except BaseException as e:
    565         for run_manager in run_managers:

File /dccstor/kirushikesh/.conda/guardrails/lib/python3.10/site-packages/genai/extensions/langchain/llm.py:190, in LangChainInterface._generate(self, prompts, stop, run_manager, **kwargs)
    187     return final_result
    188 else:
    189     responses = list(
--> 190         self.client.text.generation.create(**self._prepare_request(inputs=prompts, stop=stop, **kwargs))
    191     )
    192     for response in responses:
    193         for result in response.results:

TypeError: GenerationService.create() got an unexpected keyword argument 'temperature'

As the error trace shows, the LangChain-wrapped IBM LLM does not support 'temperature' as an additional parameter to .generate_prompt(). The error goes away when the temperature parameter is not passed. The same error occurs when calling ragas' evaluate() function with the same IBM LLM.
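
For completeness, this is roughly how the same error surfaces through evaluate() (a sketch: the metric choice and dataset contents are illustrative, and llm is the LangChainInterface instance from above):

from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness

# Illustrative single-row dataset using the ragas 0.1.x column names.
data = Dataset.from_dict({
    "question": ["hi, how are you"],
    "answer": ["I'm doing fine, thank you."],
    "contexts": [["A short greeting exchange."]],
})

# The same TypeError about 'temperature' is raised from inside the wrapped LLM.
result = evaluate(data, metrics=[faithfulness], llm=llm)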

Expected behavior
A clear solution to this problem is to remove the temperature parameter from LangchainLLMWrapper:

class LangchainLLMWrapper(BaseRagasLLM):
    ...
    def generate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.Optional[Callbacks] = None,
    ) -> LLMResult:
        if is_multiple_completion_supported(self.langchain_llm):
            return self.langchain_llm.generate_prompt(
                prompts=[prompt],
                n=n,
                stop=stop,
                callbacks=callbacks,
            )
        else:
            result = self.langchain_llm.generate_prompt(
                prompts=[prompt] * n,
                stop=stop,
                callbacks=callbacks,
            )
            # make LLMResult.generation appear as if it was n_completions
            # note that LLMResult.runs is still a list that represents each run
            generations = [[g[0] for g in result.generations]]
            result.generations = generations
            return result
    ...
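
Until such a change is merged, a possible stop-gap on the user side (a sketch, not part of ragas; the class name is hypothetical) is to subclass the wrapper and simply not forward the temperature:

import typing as t

from langchain_core.callbacks import Callbacks
from langchain_core.outputs import LLMResult

from ragas.llms.base import LangchainLLMWrapper
from ragas.llms.prompt import PromptValue


class NoTemperatureLLMWrapper(LangchainLLMWrapper):
    """Hypothetical wrapper that never forwards `temperature` downstream."""

    def generate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: t.Optional[float] = None,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.Optional[Callbacks] = None,
    ) -> LLMResult:
        # `temperature` is accepted for signature compatibility but ignored;
        # the underlying LLM is assumed to be configured at construction time.
        result = self.langchain_llm.generate_prompt(
            prompts=[prompt] * n,
            stop=stop,
            callbacks=callbacks,
        )
        # Mirror the base class: present the generations as n completions of one prompt.
        result.generations = [[g[0] for g in result.generations]]
        return result

An instance of this wrapper can then be passed wherever ragas expects an LLM. Note that the async agenerate_text() path forwards the temperature in the same way, so it would likely need the same treatment.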

Additional context

R-273

@Kirushikesh Kirushikesh added the bug Something isn't working label Feb 24, 2024
@Kirushikesh
Author

Kirushikesh commented Feb 24, 2024

I have also raised a PR to address this issue: #657

@joy13975
Contributor

joy13975 commented Feb 28, 2024

+1, getting the same error when trying out Google Gemini models through langchain-google-genai.

@Kirushikesh, but removing the temperature arg impacts OpenAI behavior, right?

@Kirushikesh
Author

@joy13975, when initialising the OpenAI LLM we already provide the temperature, e.g. llm = ChatOpenAI(temperature=0), and temperature is also an optional parameter in .generate_prompt().
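
In other words, something along these lines (a sketch assuming langchain_openai and an OPENAI_API_KEY in the environment):

from langchain_openai import ChatOpenAI

from ragas.llms.base import LangchainLLMWrapper
from ragas.run_config import RunConfig

# The temperature is fixed when the LLM is constructed, so the wrapper
# gains nothing by forwarding it again in generate_prompt().
llm = ChatOpenAI(temperature=0)
ragas_openai_llm = LangchainLLMWrapper(llm, run_config=RunConfig())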

This was referenced Feb 28, 2024
@RazHadas

RazHadas commented Mar 1, 2024

Does anyone have an update on this bug?

@shahules786
Member

Hey @RazHadas, there are two PRs raised for this issue. You can check them out or wait until we merge them.

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label May 19, 2024
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 1, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Jun 1, 2024
@LostInCode404

This issue should probably not be closed without merging the fixes. I am facing the same issue using langchain-google-genai.
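
For reference, a minimal sketch of the setup that hits the error (assuming langchain-google-genai with a GOOGLE_API_KEY in the environment; the model name is illustrative):

from langchain_google_genai import ChatGoogleGenerativeAI

from ragas.llms.base import LangchainLLMWrapper
from ragas.llms.prompt import PromptValue
from ragas.run_config import RunConfig

gemini_llm = ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.2)
ragas_gemini_llm = LangchainLLMWrapper(gemini_llm, run_config=RunConfig())

# The wrapper forwards its own temperature kwarg to generate_prompt(),
# which is what triggers the error on this backend.
ragas_gemini_llm.generate_text(prompt=PromptValue(prompt_str="hi, how are you"))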

@jjmachan
Member

thanks for bringing it to our attention @LostInCode404, reopening this

@jjmachan jjmachan reopened this Jun 13, 2024
@jjmachan jjmachan added the linear Created by Linear-GitHub Sync label Jun 13, 2024
@jjmachan jjmachan changed the title 'temperature' parameter in LangchainLLMWrapper.generate_text causing issues [R-273] 'temperature' parameter in LangchainLLMWrapper.generate_text causing issues Jun 13, 2024
@Hemang999

thanks for bringing it to our attention @LostInCode404, reopening this

Was this issue fixed? I am also getting the same error when I use langchain-google-genai, but it works fine with langchain_openai.

Please help!
