Is it possible to reduce the computational complexity? #18

Open
HK-SHAO opened this issue Jul 19, 2023 · 4 comments

Comments

@HK-SHAO

HK-SHAO commented Jul 19, 2023

The current complexity is a bit high: $O(m \cdot n^2)$ for $m$ test cases and $n$ prompts.
And in the end only one optimal prompt can be selected from the $n$ candidates.

total_rounds = len(test_cases) * len(prompts) * (len(prompts) - 1) // 2
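For example, with $n = 30$ prompts and $m = 10$ test cases, that is $10 \cdot \frac{30 \cdot 29}{2} = 4350$ rounds.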

Is there a way to iterate on the prompts faster to get them into good shape?
For example, using randomization, generative adversarial approaches, or genetic algorithms?

@Phq-art

Phq-art commented Jul 19, 2023

This is a very interesting point. I was running it yesterday with 30 candidates and it took 4350 iterations, which was something like 12 hours of computation and about 15k calls to the API.

@HK-SHAO

HK-SHAO commented Jul 20, 2023

> This is a very interesting point. I was running it yesterday with 30 candidates and it took 4350 iterations, which was something like 12 hours of computation and about 15k calls to the API.

If it's GPT-4, it consumes too much money and electricity :(

@scmaree

scmaree commented Jul 20, 2023

Nice idea, this automatic prompt engineer!

What you can do to reduce cost: You don't have to check every prompt against every prompt. Just like chess, not all players play against each other all the time. You already know who's going to win when the best player competes against the worst.

For each test case:

  1. Sort all prompts by their rating. (Initially that's just a random order, as they all start at a rating of 1200.)
  2. Compare the first with the second, the third with the fourth, and so on.
  3. Update the ratings of the prompts.

That will reduce an experiment with 30 candidates and 10 test cases from 4350 iterations to 150. Also, can you reduce the cost by running get_score only once instead of twice?

What you can furthermore do is discard the worst candidates after some point, since you're not interested in the complete order anyway, only in the best. You will have to tune a bit how many you can discard, and at what point, so that you still get good results.
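A minimal sketch of steps 1–3 in Python (the `get_score` call signature and its 1 / 0.5 / 0 return convention are assumptions here, not necessarily the repo's actual API):

```python
K = 32  # standard Elo K-factor

def expected(r_a: float, r_b: float) -> float:
    # Expected score of a player rated r_a against one rated r_b.
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

def swiss_round(prompts, ratings, test_case):
    # Step 1: sort by current rating (all start at 1200, so round 1 is random).
    ranked = sorted(prompts, key=lambda p: ratings[p], reverse=True)
    # Step 2: pair neighbours -- 1st vs 2nd, 3rd vs 4th, and so on.
    for a, b in zip(ranked[::2], ranked[1::2]):
        score_a = get_score(a, b, test_case)  # assumed: 1 win, 0.5 draw, 0 loss
        # Step 3: standard Elo update; it is zero-sum, so b gets the negation.
        delta = K * (score_a - expected(ratings[a], ratings[b]))
        ratings[a] += delta
        ratings[b] -= delta

# Usage: one Swiss-style round per test case.
ratings = {p: 1200.0 for p in prompts}
for case in test_cases:
    swiss_round(prompts, ratings, case)
```

Each round is only $\lfloor n/2 \rfloor$ comparisons, which is where the 4350 → 150 reduction comes from.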

@ericfeunekes

I'm curious why you chose the Elo-based competition. My understanding is that Elo is meant to help identify the relative skill of different players, so you can say Annie is 31% more skilled than John (or prompt A is x% better than prompt B). This is in contrast to a tournament-style competition, where you essentially end up with a ranking but don't know the relative skill of first vs. second place, etc.

Given that the goal here is to find the best prompt, why not use a single-elimination tournament? This would significantly decrease the computational complexity and therefore allow far more prompts to be tested.

Personally I love the concept of an Elo-based ranking system for the prompts, and it would likely show you interesting details about their relative quality, but I'm not sure it's worth the computational cost.
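A minimal sketch of that single-elimination idea (Python; `play_match` is a hypothetical helper, and the `get_score` usage inside it is the same assumed 1 / 0.5 / 0 convention as above):

```python
import random

def play_match(a, b, test_cases):
    # Hypothetical match rule: prompt a wins if it takes at least half the
    # points across all test cases (assumes get_score returns 1 / 0.5 / 0).
    points = sum(get_score(a, b, case) for case in test_cases)
    return points >= len(test_cases) / 2

def single_elimination(prompts, test_cases):
    # Returns one winner; no Elo ratings, just a bracket.
    bracket = list(prompts)
    random.shuffle(bracket)  # random seeding
    while len(bracket) > 1:
        next_round = []
        if len(bracket) % 2:             # odd bracket: last prompt gets a bye
            next_round.append(bracket.pop())
        for a, b in zip(bracket[::2], bracket[1::2]):
            next_round.append(a if play_match(a, b, test_cases) else b)
        bracket = next_round
    return bracket[0]
```

A full bracket is only $n - 1$ matches (29 for 30 prompts), so even judging every match on all 10 test cases costs 290 comparisons rather than 4350, at the price of losing the full ranking.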
