Use paws to authenticate a httr2 request? #842

Open
hadley opened this issue Oct 9, 2024 · 16 comments
Labels
enhancement 💡 New feature or request

Comments

@hadley

hadley commented Oct 9, 2024

Is it possible to use paws to authenticate a request that I'm making with httr2? I want to perform a request to the bedrock runtime ConverseStream operation using the (very new) httr2::req_perform_connection().

(Related to #839)

@DyfanJones
Member

Is it possible to use paws to authenticate a request that I'm making with httr2?

In its current state I don't think so 🤔. I believe the call to httr::VERB would have to be exposed so that it could be translated into an httr2 request instead.

paws/paws.common/R/net.R

Lines 128 to 136 in 5a37466

r <- with_paws_verbose(
  httr::VERB(
    method,
    url = url,
    config = c(httr::add_headers(.headers = headers), dest),
    body = body,
    timeout
  )
)

I have been trying to think about how to handle streaming data and was wondering whether a new class would be the best approach 🤔

Currently paws is using httr. I guess to fully utilise streaming functionality paws should update to httr2 and take advantage of all the benefits it offers. I am going on holiday for the next 3 weeks but when I get back I will try an experimental branch to migrate paws to httr2.

Sorry this isn't a full answer to your question.

@hadley
Author

hadley commented Oct 9, 2024

For the project I need it for, I might just bite the bullet and implement the AWS SigV4 signing protocol myself (I'm talking to a bunch of other LLMs with pure httr2 calls). I don't know how hard it would be to extract that logic out of paws into an exported function, but that would certainly make life easier for me.

@DyfanJones
Member

When I am back from holiday I am happy to expose paws's AWS SigV4 (it was on my todo list). That should make it simpler for you.

As far as I know, you might need to convert parts of the raw response into int8, int16, int32, int64 and uint8, uint16, uint32, uint64 to extract some key information before parsing the message. See botocore's implementation:

https://github.com/boto/botocore/blob/8e2e8fd7ab59f8c1337902acc32d2ee10cb184ad/botocore/eventstream.py
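
To illustrate why the integer conversions matter, here is a minimal sketch of reading the 12-byte prelude of an AWS event stream message: total length, headers length, and prelude CRC, each a big-endian unsigned 32-bit integer. The names read_uint32 and parse_prelude are hypothetical helpers for illustration, not part of paws or httr2, and the CRC bytes in the example are placeholders that are not validated.

```r
# Read one big-endian unsigned 32-bit integer from a 4-byte raw vector.
# Summing the bytes as doubles avoids R's signed 32-bit integer limits.
read_uint32 <- function(bytes) {
  stopifnot(is.raw(bytes), length(bytes) == 4)
  sum(as.integer(bytes) * 256^(3:0))
}

# Parse the 12-byte prelude at the start of an event stream message.
# (Hypothetical helper; CRC values are extracted but not checked.)
parse_prelude <- function(msg) {
  stopifnot(is.raw(msg), length(msg) >= 12)
  total_length <- read_uint32(msg[1:4])
  headers_length <- read_uint32(msg[5:8])
  list(
    total_length = total_length,
    headers_length = headers_length,
    prelude_crc = read_uint32(msg[9:12]),
    # 12 prelude bytes plus 4 trailing message-CRC bytes are framing overhead
    payload_length = total_length - headers_length - 16
  )
}

# Made-up prelude: 26 bytes in total, 6 bytes of headers, placeholder CRC
example <- as.raw(c(0, 0, 0, 0x1A, 0, 0, 0, 0x06, 0xDE, 0xAD, 0xBE, 0xEF))
prelude <- parse_prelude(example)
str(prelude)
```

A real parser would also verify both CRC values before trusting the lengths.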

@DyfanJones
Member

I have been playing around with some ideas for how to do this in R. The pkd package looks useful; however, it isn't on CRAN.

I have managed to implement a method in R:

big_endian <- function(vec, dtype) {
  switch(
    dtype,
    "int64" = c(
      vec[8:1], vec[16:9], vec[24:17], vec[32:25], vec[40:33], vec[48:41], vec[56:49], vec[64:57]
    ),
    "int32" = c(vec[8:1], vec[16:9], vec[24:17], vec[32:25]),
    "int16" = c(vec[8:1], vec[16:9]),
    "int8" = vec[8:1]
  )
}

int_to_uint <- function (x, adjustment=2^32) {
  if (sign(x) < 0) {
    return(x + adjustment)
  }
  return(x)
}

# Convert raw vector into integers with big-endian
int64 <- function(x) {
  bits <- as.integer(big_endian(rawToBits(x), "int64"))
  sum(bits[-1] * 2^(62:0)) - bits[[1]] * 2^63
}

int32 <- function(x) {
  bits <- as.integer(big_endian(rawToBits(x), "int32"))
  sum(bits[-1] * 2^(30:0)) - bits[[1]] * 2^31
}

int16 <- function(x) {
  bits <- as.integer(big_endian(rawToBits(x), "int16"))
  sum(bits[-1] * 2^(14:0)) - bits[[1]] * 2^15
}

int8 <- function(x) {
  bits <- as.integer(big_endian(rawToBits(x), "int8"))
  sum(bits[-1] * 2^(6:0)) - bits[[1]] * 2^7
}

# Converts raw vector into unsigned integers with big-endian
uint64 <- function(x) {
  int_to_uint(int64(x), 2^64)
}

uint32 <- function(x) {
  int_to_uint(readBin(x, "integer", n=length(x), size = 4, endian = "big"))
}

uint16 <- function(x) {
  readBin(x, "integer", n = length(x), size = 2, signed = FALSE, endian = "big")
}

uint8 <- function(x) {
  readBin(x, "integer", n = length(x), size = 1, signed = FALSE, endian = "big")
}

obj <- openssl::rand_bytes(8)

uint8(obj[1])
#> [1] 228
uint16(obj[1:2])
#> [1] 58508
uint32(obj[1:4])
#> [1] 3834393554
uint64(obj)
#> [1] 1.646859e+19


int8(obj[1])
#> [1] -28
int16(obj[1:2])
#> [1] -7028
int32(obj[1:4])
#> [1] -460573742
int64(obj)
#> [1] -1.978149e+18

pkd::uint8(obj[1])
#> <pkd_uint8[1]>
#> [1] 228
pkd::uint16(obj[1:2], endian = 0)
#> <pkd_uint16[1]>
#> [1] 58508
pkd::uint32(obj[1:4], endian = 0)
#> <pkd_uint32[1]>
#> [1] 3834393554
pkd::uint64(obj, endian = 0)
#> <pkd_uint64[1]>
#> [1] 1.646859e+19


pkd::int8(obj[1])
#> <pkd_int8[1]>
#> [1] -28
pkd::int16(obj[1:2], endian = 0)
#> <pkd_int16[1]>
#> [1] -7028
pkd::int32(obj[1:4], endian = 0)
#> <pkd_int32[1]>
#> [1] -460573742
pkd::int64(obj, endian = 0)
#> <pkd_int64[1]>
#> [1] -1.978149e+18

Created on 2024-10-09 with reprex v2.1.1

@hadley
Author

hadley commented Oct 9, 2024

I was hoping it would use server-sent events like every other API 😭

@DyfanJones
Member

AWS can be a bit of a pain at times :) If you manage to get a working prototype I would be really interested, as it should help with the implementation in paws. :)

@hadley
Author

hadley commented Oct 10, 2024

@jcheng5 discovered that curl actually has a native implementation: https://curl.se/libcurl/c/CURLOPT_AWS_SIGV4.html. So auth, at least, will be easier than expected.
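
As a rough sketch of what using that option could look like from R: libcurl documents the CURLOPT_AWS_SIGV4 value as provider1[:provider2[:region[:service]]], with the credentials supplied through CURLOPT_USERPWD. The helper aws_sigv4_options below is hypothetical. The lower-case option names assume the R curl package exposes them (this depends on the libcurl version it was built against), and "bedrock" as the signing service name is an assumption.

```r
# Build the libcurl options needed for native SigV4 signing.
# (Hypothetical helper; only assembles the option values, sends nothing.)
aws_sigv4_options <- function(region, service, access_key_id, secret_access_key) {
  list(
    # Format documented by libcurl: provider1:provider2:region:service
    aws_sigv4 = paste("aws", "amz", region, service, sep = ":"),
    # libcurl reads the key pair from the user:password option
    userpwd = paste0(access_key_id, ":", secret_access_key)
  )
}

# Placeholder credentials, for illustration only
opts <- aws_sigv4_options("us-east-1", "bedrock", "AKIDEXAMPLE", "secret")
str(opts)

# Assumed usage with the curl package (not run here):
# h <- curl::new_handle()
# curl::handle_setopt(h, .list = opts)
```

Temporary credentials would presumably still need the session token sent separately, for example as an x-amz-security-token header.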

@hadley
Author

hadley commented Oct 21, 2024

Some docs for the protocol at https://docs.aws.amazon.com/transcribe/latest/dg/streaming-setting-up.html#streaming-event-stream. I'm going to try and parse this in httr2 so you'll be able to use it if desired.
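
For a feel of what those docs describe, here is a minimal sketch of decoding the headers block of an event stream message. Each header is a 1-byte name length, the name, a 1-byte value type, and then the value. This sketch handles only string values (type 7), which carry a 2-byte big-endian length. Both parse_string_headers and encode_string_header are hypothetical helpers written for illustration, not httr2's implementation.

```r
# Decode a raw headers block into a named list (string headers only).
parse_string_headers <- function(bytes) {
  stopifnot(is.raw(bytes))
  headers <- list()
  pos <- 1
  while (pos <= length(bytes)) {
    name_len <- as.integer(bytes[pos])
    name <- rawToChar(bytes[pos + seq_len(name_len)])
    pos <- pos + 1 + name_len

    type <- as.integer(bytes[pos])
    if (type != 7) stop("Only string headers (type 7) are handled in this sketch")
    # The value length is a big-endian unsigned 16-bit integer
    value_len <- sum(as.integer(bytes[pos + 1:2]) * c(256, 1))
    value_start <- pos + 3
    headers[[name]] <- rawToChar(bytes[value_start - 1 + seq_len(value_len)])
    pos <- value_start + value_len
  }
  headers
}

# Build test input: encode a single string header the same way
encode_string_header <- function(name, value) {
  n <- nchar(value, "bytes")
  c(
    as.raw(nchar(name, "bytes")), charToRaw(name),
    as.raw(7),
    as.raw(c(n %/% 256, n %% 256)),
    charToRaw(value)
  )
}

bytes <- c(
  encode_string_header(":message-type", "event"),
  encode_string_header(":content-type", "application/json")
)
headers <- parse_string_headers(bytes)
str(headers)
```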

@hadley
Author

hadley commented Oct 23, 2024

And implemented in r-lib/httr2#571 😄

Here's an example of what streaming code looks like:

library(httr2)

creds <- paws.common::locate_credentials()
model_id <- "anthropic.claude-3-5-sonnet-20240620-v1:0"
req <- request("https://bedrock-runtime.us-east-1.amazonaws.com")
req <- req_url_path_append(req, "model", model_id, "converse-stream")
req <- req_body_json(req, list(
  messages = list(list(
    role = "user",
    content = list(list(text = "What's your name?"))
  ))
))
req <- req_auth_aws_v4(
  req,
  aws_access_key_id = creds$access_key_id,
  aws_secret_access_key = creds$secret_access_key,
  aws_session_token = creds$session_token
)

con <- req_perform_connection(req)
repeat{
  event <- resp_stream_aws(con)
  if (is.null(event)) {
    close(con)
    break
  }

  str(event)
}

@DyfanJones
Member

This is great! I will take a proper look at this once I get back from my holiday :)

@DyfanJones DyfanJones added the enhancement 💡 New feature or request label Oct 27, 2024
@DyfanJones
Member

🤔 From looking at this, I believe paws will need to expose the connection to allow for streaming.

@hadley
Author

hadley commented Nov 4, 2024

You could also provide a callback interface (like the older req_perform_stream()), but returning the connection object gives the user maximum flexibility.
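
One way to see why the connection is the more flexible primitive: a callback interface can be layered on top of any pull-style reader, but not the other way round. The sketch below uses hypothetical names (stream_with_callback, fake_reader) and a closure over a fixed list in place of a real connection.

```r
# Drive a pull-style reader with a callback. `next_event` is any function
# that returns the next event, or NULL once the stream is exhausted.
stream_with_callback <- function(next_event, callback) {
  n <- 0
  repeat {
    event <- next_event()
    if (is.null(event)) break
    n <- n + 1
    # Returning FALSE from the callback stops the stream early
    if (isFALSE(callback(event))) break
  }
  invisible(n)
}

# Stand-in for a real connection: a closure over a fixed list of events
fake_reader <- function(events) {
  i <- 0
  function() {
    if (i >= length(events)) return(NULL)
    i <<- i + 1
    events[[i]]
  }
}

seen <- character()
n <- stream_with_callback(
  fake_reader(list("Hello", ", ", "world")),
  function(event) seen <<- c(seen, event)
)
print(paste(seen, collapse = ""))
```

With a real connection, next_event could be something like function() resp_stream_aws(con), with the caller still responsible for closing con.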

@DyfanJones
Member

DyfanJones commented Nov 19, 2024

This is a lot more work than I initially thought. Here is what I need to do to get streaming into the paws SDK properly:

  • Update paws backend to httr2 from httr
  • Identify which methods need streaming handlers (AWS API JSONs)
  • Expose stream API into method operations (make paws)
  • Regenerate paws SDK with stream_api identifier
  • Allow connections to be passed to unmarshal methods
  • New StreamHandler
  • Error Handling for new StreamHandler
  • Documentation for StreamHandler

@DyfanJones
Member

Side note: I just noticed that the aws-sdk-js API JSONs have stopped being updated. In the short term I will switch to the botocore JSON files. Long term, paws will need to move to Smithy.

@DyfanJones
Member

DyfanJones commented Nov 21, 2024

Initial dev design:

library(paws)
library(httr2)

client <- bedrockruntime(region = "us-east-1")

model_id <- "amazon.titan-text-lite-v1:0"

resp <- client$converse_stream(
  modelId = model_id,
  messages = list(
    list(
      role = "user",
      content = list(list(text = "What's your name?"))
    ))
)

# Return the httr2 req_perform_connection() connection for full flexibility
con <- resp$stream(.connection = TRUE)

repeat{
  event <- resp_stream_aws(con)
  if (is.null(event)) {
    close(con)
    break
  }
  
  str(event)
}

# OR
# Utilise paws unmarshal methods to parse response
resp$stream(\(chunk) print(chunk$contentBlockDelta$delta$text))

I think this initial design should give the best of both worlds. I just need to do the plumbing for the paws unmarshal methods and capture stream errors. But it is looking promising :)

@DyfanJones
Member

Plus, we could always create a function similar to httr2::resp_stream_aws that returns the operation's expected output: https://www.paws-r-sdk.com/docs/bedrockruntime_converse_stream/

library(paws)

client <- bedrockruntime(region = "us-east-1")

model_id <- "amazon.titan-text-lite-v1:0"

resp <- client$converse_stream(
  modelId = model_id,
  messages = list(
    list(
      role = "user",
      content = list(list(text = "What's your name?"))
    ))
)

# Return the httr2 req_perform_connection() connection for full flexibility
con <- resp$stream(.connection = TRUE)

# paws_stream_parser() is the proposed helper; it returns NULL at end of stream
while (!is.null(event <- paws_stream_parser(con))) {
  print(event$contentBlockDelta$delta$text)
}
close(con)

I think these 3 options could give a lot of flexibility to the user.
