Qwen 72B and OpenRouter Google Flash added. #232

Merged: 5 commits, Nov 11, 2024

Sixzero (Collaborator) commented Nov 5, 2024

Google Flash and Qwen 72B are among the best models available, so I added them.
Google Flash is added through OpenRouter.

codecov bot commented Nov 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.95%. Comparing base (bfc8833) to head (4067512).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #232      +/-   ##
==========================================
+ Coverage   91.93%   91.95%   +0.01%     
==========================================
  Files          47       47              
  Lines        4601     4609       +8     
==========================================
+ Hits         4230     4238       +8     
  Misses        371      371              


svilupp (Owner) commented Nov 6, 2024

Thank you, @Sixzero !

Could you please confirm the costing associated with these models and that you have personally reviewed it (not GenAI generated)?

E.g., some of the Google records have 0 cost, which is unlikely (everything has a cost eventually after the launch).

Sixzero (Collaborator, Author) commented Nov 7, 2024

I think the costs are correct: for now, OpenRouter offers these models for free.

The problem with free models is they sometimes throw weird errors:

ai"Tell me the short story of life"orgfexp
ERROR: HTTP.Exceptions.StatusError(429, "POST", "/api/v1/chat/completions", HTTP.Messages.Response:
"""
HTTP/1.1 429 Too Many Requests
Date: Thu, 07 Nov 2024 15:24:39 GMT
Content-Type: application/json; charset=UTF-8
Content-Length: 54
Connection: keep-alive
Access-Control-Allow-Origin: *
Cf-Placement: local-VIE
x-clerk-auth-message: Invalid JWT form. A JWT consists of three parts separated by dots. (reason=token-invalid, token-carrier=header)
x-clerk-auth-reason: token-invalid
x-clerk-auth-status: signed-out
Vary: Accept-Encoding
Server: cloudflare
CF-RAY: 8dee54d85ba43250-VIE

{"error":{"message":"Rate limit exceeded","code":429}}""")
Stacktrace:
  [1] (::HTTP.ConnectionRequest.var"#connections#4"{})(req::HTTP.Messages.Request; proxy::Nothing, socket_type::Type, socket_type_tls::Nothing, readtimeout::Int64, connect_timeout::Int64, logerrors::Bool, logtag::Nothing, closeimmediately::Bool, kw::@Kwargs{})
    @ HTTP.ConnectionRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/ConnectionRequest.jl:144
  [2] connections
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/ConnectionRequest.jl:60 [inlined]
  [3] (::Base.var"#96#98"{})(args::HTTP.Messages.Request; kwargs::@Kwargs{})
    @ Base ./error.jl:308
  [4] (::HTTP.RetryRequest.var"#manageretries#3"{})(req::HTTP.Messages.Request; retry::Bool, retries::Int64, retry_delays::ExponentialBackOff, retry_check::Function, retry_non_idempotent::Bool, kw::@Kwargs{})
    @ HTTP.RetryRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/RetryRequest.jl:75
  [5] manageretries
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/RetryRequest.jl:30 [inlined]
  [6] (::HTTP.CookieRequest.var"#managecookies#4"{})(req::HTTP.Messages.Request; cookies::Bool, cookiejar::HTTP.Cookies.CookieJar, kw::@Kwargs{})
    @ HTTP.CookieRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/CookieRequest.jl:42
  [7] managecookies
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/CookieRequest.jl:19 [inlined]
  [8] (::HTTP.HeadersRequest.var"#defaultheaders#2"{})(req::HTTP.Messages.Request; iofunction::Nothing, decompress::Nothing, basicauth::Bool, detect_content_type::Bool, canonicalize_headers::Bool, kw::@Kwargs{})
    @ HTTP.HeadersRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/HeadersRequest.jl:71
  [9] defaultheaders
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/HeadersRequest.jl:14 [inlined]
 [10] (::HTTP.RedirectRequest.var"#redirects#3"{})(req::HTTP.Messages.Request; redirect::Bool, redirect_limit::Int64, redirect_method::Nothing, forwardheaders::Bool, response_stream::Nothing, kw::@Kwargs{})
    @ HTTP.RedirectRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/RedirectRequest.jl:25
 [11] redirects
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/RedirectRequest.jl:14 [inlined]
 [12] (::HTTP.MessageRequest.var"#makerequest#3"{})(method::String, url::URIs.URI, headers::Vector{…}, body::IOBuffer; copyheaders::Bool, response_stream::Nothing, http_version::HTTP.Strings.HTTPVersion, verbose::Int64, kw::@Kwargs{})
    @ HTTP.MessageRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/MessageRequest.jl:35
 [13] makerequest
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/MessageRequest.jl:24 [inlined]
 [14] request(stack::HTTP.MessageRequest.var"#makerequest#3"{}, method::String, url::String, h::Vector{…}, b::IOBuffer, q::Vector{…}; headers::Vector{…}, body::IOBuffer, query::Vector{…}, kw::@Kwargs{})
    @ HTTP ~/.julia/packages/HTTP/sJD5V/src/HTTP.jl:457
 [15] #request#20
    @ ~/.julia/packages/HTTP/sJD5V/src/HTTP.jl:315 [inlined]
 [16] request (repeats 2 times)
    @ ~/.julia/packages/HTTP/sJD5V/src/HTTP.jl:313 [inlined]
 [17] #request_body#3
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:82 [inlined]
 [18] request_body
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:78 [inlined]
 [19] _request(api::String, provider::PromptingTools.CustomProvider, api_key::String; method::String, query::Nothing, http_kwargs::@NamedTuple{}, streamcallback::Nothing, additional_headers::Vector{…}, kwargs::@Kwargs{})
    @ OpenAI ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:162
 [20] _request (repeats 2 times)
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:141 [inlined]
 [21] #openai_request#15
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:220 [inlined]
 [22] openai_request
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:214 [inlined]
 [23] #create_chat#20
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:383 [inlined]
 [24] create_chat
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:377 [inlined]
 [25] #create_chat#236
    @ ~/repo/PromptingTools.jl/src/llm_openai_schema_defs.jl:98 [inlined]
 [26] create_chat
    @ ~/repo/PromptingTools.jl/src/llm_openai_schema_defs.jl:75 [inlined]
 [27] #create_chat#243
    @ ~/repo/PromptingTools.jl/src/llm_openai_schema_defs.jl:188 [inlined]
 [28] create_chat
    @ ~/repo/PromptingTools.jl/src/llm_openai_schema_defs.jl:181 [inlined]
 [29] macro expansion
    @ ./timing.jl:395 [inlined]
 [30] aigenerate(prompt_schema::PromptingTools.OpenRouterOpenAISchema, prompt::String; verbose::Bool, api_key::String, model::String, return_all::Bool, dry_run::Bool, conversation::Vector{…}, streamcallback::Nothing, no_system_message::Bool, name_user::Nothing, name_assistant::Nothing, http_kwargs::@NamedTuple{}, api_kwargs::@NamedTuple{}, kwargs::@Kwargs{})
    @ PromptingTools ~/repo/PromptingTools.jl/src/llm_openai.jl:323
 [31] aigenerate(prompt::String; model::String, kwargs::@Kwargs{return_all::Bool})
    @ PromptingTools ~/repo/PromptingTools.jl/src/llm_interface.jl:472
 [32] macro expansion
    @ ~/repo/PromptingTools.jl/src/macros.jl:39 [inlined]
 [33] top-level scope
    @ REPL[13]:1

caused by: HTTP.Exceptions.StatusError(429, "POST", "/api/v1/chat/completions", HTTP.Messages.Response:
"""
HTTP/1.1 429 Too Many Requests
Date: Thu, 07 Nov 2024 15:24:39 GMT
Content-Type: application/json; charset=UTF-8
Content-Length: 54
Connection: keep-alive
Access-Control-Allow-Origin: *
Cf-Placement: local-VIE
x-clerk-auth-message: Invalid JWT form. A JWT consists of three parts separated by dots. (reason=token-invalid, token-carrier=header)
x-clerk-auth-reason: token-invalid
x-clerk-auth-status: signed-out
Vary: Accept-Encoding
Server: cloudflare
CF-RAY: 8dee54d85ba43250-VIE

{"error":{"message":"Rate limit exceeded","code":429}}""")
Stacktrace:
 [1] (::HTTP.ExceptionRequest.var"#exceptions#2"{})(stream::HTTP.Streams.Stream{…}; status_exception::Bool, timedout::ConcurrentUtilities.TimedOut{…}, logerrors::Bool, logtag::Nothing, kw::@Kwargs{})
   @ HTTP.ExceptionRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/ExceptionRequest.jl:19
 [2] exceptions
   @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/ExceptionRequest.jl:13 [inlined]
 [3] #2
   @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/TimeoutRequest.jl:22 [inlined]
 [4] macro expansion
   @ ~/.julia/packages/ConcurrentUtilities/QOkoO/src/try_with_timeout.jl:82 [inlined]
 [5] (::ConcurrentUtilities.var"#2#4"{})()
   @ ConcurrentUtilities ~/.julia/packages/ConcurrentUtilities/QOkoO/src/ConcurrentUtilities.jl:9
Stacktrace:
  [1] try_yieldto(undo::typeof(Base.ensure_rescheduled))
    @ Base ./task.jl:945
  [2] wait()
    @ Base ./task.jl:1009
  [3] wait(c::Base.GenericCondition{ReentrantLock}; first::Bool)
    @ Base ./condition.jl:130
  [4] wait
    @ ./condition.jl:125 [inlined]
  [5] take_unbuffered(c::Channel{HTTP.Messages.Response})
    @ Base ./channels.jl:494
  [6] take!
    @ ./channels.jl:471 [inlined]
  [7] try_with_timeout(f::Function, timeout::Int64, ::Type{HTTP.Messages.Response})
    @ ConcurrentUtilities ~/.julia/packages/ConcurrentUtilities/QOkoO/src/try_with_timeout.jl:89
  [8] (::HTTP.TimeoutRequest.var"#timeouts#3"{})(stream::HTTP.Streams.Stream{…}; readtimeout::Int64, logerrors::Bool, logtag::Nothing, kw::@Kwargs{})
    @ HTTP.TimeoutRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/TimeoutRequest.jl:21
  [9] (::HTTP.ConnectionRequest.var"#connections#4"{})(req::HTTP.Messages.Request; proxy::Nothing, socket_type::Type, socket_type_tls::Nothing, readtimeout::Int64, connect_timeout::Int64, logerrors::Bool, logtag::Nothing, closeimmediately::Bool, kw::@Kwargs{})
    @ HTTP.ConnectionRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/ConnectionRequest.jl:122
 [10] connections
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/ConnectionRequest.jl:60 [inlined]
 [11] (::Base.var"#96#98"{})(args::HTTP.Messages.Request; kwargs::@Kwargs{})
    @ Base ./error.jl:308
 [12] (::HTTP.RetryRequest.var"#manageretries#3"{})(req::HTTP.Messages.Request; retry::Bool, retries::Int64, retry_delays::ExponentialBackOff, retry_check::Function, retry_non_idempotent::Bool, kw::@Kwargs{})
    @ HTTP.RetryRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/RetryRequest.jl:75
 [13] manageretries
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/RetryRequest.jl:30 [inlined]
 [14] (::HTTP.CookieRequest.var"#managecookies#4"{})(req::HTTP.Messages.Request; cookies::Bool, cookiejar::HTTP.Cookies.CookieJar, kw::@Kwargs{})
    @ HTTP.CookieRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/CookieRequest.jl:42
 [15] managecookies
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/CookieRequest.jl:19 [inlined]
 [16] (::HTTP.HeadersRequest.var"#defaultheaders#2"{})(req::HTTP.Messages.Request; iofunction::Nothing, decompress::Nothing, basicauth::Bool, detect_content_type::Bool, canonicalize_headers::Bool, kw::@Kwargs{})
    @ HTTP.HeadersRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/HeadersRequest.jl:71
 [17] defaultheaders
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/HeadersRequest.jl:14 [inlined]
 [18] (::HTTP.RedirectRequest.var"#redirects#3"{})(req::HTTP.Messages.Request; redirect::Bool, redirect_limit::Int64, redirect_method::Nothing, forwardheaders::Bool, response_stream::Nothing, kw::@Kwargs{})
    @ HTTP.RedirectRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/RedirectRequest.jl:25
 [19] redirects
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/RedirectRequest.jl:14 [inlined]
 [20] (::HTTP.MessageRequest.var"#makerequest#3"{})(method::String, url::URIs.URI, headers::Vector{…}, body::IOBuffer; copyheaders::Bool, response_stream::Nothing, http_version::HTTP.Strings.HTTPVersion, verbose::Int64, kw::@Kwargs{})
    @ HTTP.MessageRequest ~/.julia/packages/HTTP/sJD5V/src/clientlayers/MessageRequest.jl:35
 [21] makerequest
    @ ~/.julia/packages/HTTP/sJD5V/src/clientlayers/MessageRequest.jl:24 [inlined]
 [22] request(stack::HTTP.MessageRequest.var"#makerequest#3"{}, method::String, url::String, h::Vector{…}, b::IOBuffer, q::Vector{…}; headers::Vector{…}, body::IOBuffer, query::Vector{…}, kw::@Kwargs{})
    @ HTTP ~/.julia/packages/HTTP/sJD5V/src/HTTP.jl:457
 [23] #request#20
    @ ~/.julia/packages/HTTP/sJD5V/src/HTTP.jl:315 [inlined]
 [24] request (repeats 2 times)
    @ ~/.julia/packages/HTTP/sJD5V/src/HTTP.jl:313 [inlined]
 [25] #request_body#3
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:82 [inlined]
 [26] request_body
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:78 [inlined]
 [27] _request(api::String, provider::PromptingTools.CustomProvider, api_key::String; method::String, query::Nothing, http_kwargs::@NamedTuple{}, streamcallback::Nothing, additional_headers::Vector{…}, kwargs::@Kwargs{})
    @ OpenAI ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:162
 [28] _request (repeats 2 times)
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:141 [inlined]
 [29] #openai_request#15
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:220 [inlined]
 [30] openai_request
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:214 [inlined]
 [31] #create_chat#20
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:383 [inlined]
 [32] create_chat
    @ ~/.julia/packages/OpenAI/d65zV/src/OpenAI.jl:377 [inlined]
 [33] #create_chat#236
    @ ~/repo/PromptingTools.jl/src/llm_openai_schema_defs.jl:98 [inlined]
 [34] create_chat
    @ ~/repo/PromptingTools.jl/src/llm_openai_schema_defs.jl:75 [inlined]
 [35] #create_chat#243
    @ ~/repo/PromptingTools.jl/src/llm_openai_schema_defs.jl:188 [inlined]
 [36] create_chat
    @ ~/repo/PromptingTools.jl/src/llm_openai_schema_defs.jl:181 [inlined]
 [37] macro expansion
    @ ./timing.jl:395 [inlined]
 [38] aigenerate(prompt_schema::PromptingTools.OpenRouterOpenAISchema, prompt::String; verbose::Bool, api_key::String, model::String, return_all::Bool, dry_run::Bool, conversation::Vector{…}, streamcallback::Nothing, no_system_message::Bool, name_user::Nothing, name_assistant::Nothing, http_kwargs::@NamedTuple{}, api_kwargs::@NamedTuple{}, kwargs::@Kwargs{})
    @ PromptingTools ~/repo/PromptingTools.jl/src/llm_openai.jl:323
 [39] aigenerate(prompt::String; model::String, kwargs::@Kwargs{return_all::Bool})
    @ PromptingTools ~/repo/PromptingTools.jl/src/llm_interface.jl:472
 [40] macro expansion
    @ ~/repo/PromptingTools.jl/src/macros.jl:39 [inlined]
 [41] top-level scope
    @ REPL[13]:1
Some type information was truncated. Use `show(err)` to see complete types.

Maybe we could handle it, or remove the orgfexp model? This issue sometimes happens with the orgf8b model too, which is also free. However, asyncmap(i -> (ai"Tell me the short story of life"orgf8b), 1:100); just ran without an issue, so I am not sure.
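
One way to "handle it" could be to retry only on rate-limit responses. The following is a minimal sketch, not what PromptingTools.jl does: `with_rate_limit_retry` and `FakeStatusError` are hypothetical names introduced here so the example runs without HTTP.jl. In real use the check would presumably test for `HTTP.Exceptions.StatusError` with `status == 429` instead. It relies only on `retry` and `ExponentialBackOff` from Julia's Base.

```julia
# Hypothetical stand-in for HTTP.Exceptions.StatusError (assumption: only
# the `status` field matters for deciding whether to retry).
struct FakeStatusError <: Exception
    status::Int
end

# Hypothetical helper: call `f`, retrying with exponential backoff only when
# it throws a 429 (rate limit) error. Any other exception is rethrown at once.
function with_rate_limit_retry(f; attempts::Int = 3, first_delay::Real = 0.1)
    # `n` is the number of delays, so total attempts = n + 1.
    delays = ExponentialBackOff(n = attempts - 1, first_delay = first_delay)
    check = (state, err) -> err isa FakeStatusError && err.status == 429
    return retry(f; delays = delays, check = check)()
end

# Usage: fail twice with 429, then succeed on the third attempt.
calls = Ref(0)
result = with_rate_limit_retry(; first_delay = 0.01) do
    calls[] += 1
    calls[] < 3 && throw(FakeStatusError(429))
    "ok"
end
```

A retry like this only helps with short bursts; a model that is rate limited most of the time would still fail once the attempts run out.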

Removed orgfexp because the model is heavily rate limited.

Sixzero (Collaborator, Author) commented Nov 11, 2024

I decided to remove the free model, since it was heavily rate limited and caused crashes.

svilupp (Owner) commented Nov 11, 2024

Cool! One last thing: could you remove any underscores from the aliases? The point of aliases is to make them easy to use with string macros like ai_str. You could just make it ...257 and 2572 (model ver/size). Not many people will use it, and you'll know what it means.

Sixzero (Collaborator, Author) commented Nov 11, 2024

I named them with a b to set off the model size. I think this is a little more descriptive. If you prefer, we can of course switch to 257 and 2572, but then I think only we will know what the code means.

svilupp self-requested a review November 11, 2024 21:58

svilupp (Owner) commented Nov 11, 2024

Thanks!

svilupp merged commit eeb0dfc into svilupp:main Nov 11, 2024
5 checks passed
svilupp pushed a commit that referenced this pull request Nov 26, 2024
* Gwen 72B and open-router Google Flash added.

* Fixing the pricing informations.

* Pricing updated for gemini-flash-1.5-8b

* Removed orgfexp

Removed orgfexp because the model is heavily rate limited.

* Following naming conventions for qwen aliases