Is there any way of using two-way communication in AnyCable? #1

Open
zamananjum0 opened this issue Aug 15, 2023 · 9 comments

@zamananjum0

zamananjum0 commented Aug 15, 2023

I need help building a bidirectional chat system that uses Twilio, where an AI agent can communicate with the caller.

@junket

junket commented Jun 12, 2024

@zamananjum0 Did you figure this out? Looking at the AnyCable code, this demo utilizes a custom executor to handle incoming messages on the WS but doesn't appear to have a similar model in place for some custom handler to send messages back via WS. Am I wrong?

@junket

junket commented Jun 12, 2024

@palkan It would be amazing to get your input on if/how this can be accomplished. Thank you for the excellent demo!

@palkan
Member

palkan commented Jun 13, 2024

Hey there,

Sorry, I didn't have notifications turned on for this repo, so I missed this one.

There are two ways you can implement bi-directional communication:

  1. Move it to the Go app. This would be more performant but would require more work and maintenance on the Go app side. The entry point for adding responses is somewhere around here:

    st.OnResponse(func(response *streamer.Response) {
    (and you can use session.Send(msg) to send data back to the session).

  2. Use AnyCable pub/sub capabilities. You can subscribe your session to some named stream here (e.g., stream_from "twilio/#{call_sid}") and then use ActionCable.server.broadcast (or AnyCable.broadcast) to send data back to this stream. There is a bit of overhead for pub/sub, but all the logic lives in your Ruby/Rails app.
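
A minimal Ruby sketch of option 2, for illustration only. It assumes the session is subscribed to the `twilio/#{call_sid}` stream mentioned above. `TwilioStream`, `send_audio`, and the `broadcaster` argument are hypothetical names, not part of AnyCable or the demo. In a Rails app the broadcaster would be `ActionCable.server`; here any object responding to `broadcast(stream, data)` works, which keeps the sketch runnable without Rails. The message shape follows Twilio's `media` message for bidirectional streams; whether the Go encoder expects exactly this shape is an assumption.

```ruby
require "base64"

# Hypothetical helper; names are illustrative, not part of AnyCable or the demo.
module TwilioStream
  module_function

  # Stream name matching the stream_from example above.
  def stream_name(call_sid)
    "twilio/#{call_sid}"
  end

  # Twilio "media" message: base64-encoded audio plus the stream SID.
  def media_message(stream_sid, audio_bytes)
    {
      event: "media",
      streamSid: stream_sid,
      media: {payload: Base64.strict_encode64(audio_bytes)}
    }
  end

  # broadcaster: ActionCable.server in Rails, or any object with #broadcast.
  def send_audio(broadcaster, call_sid, stream_sid, audio_bytes)
    broadcaster.broadcast(stream_name(call_sid), media_message(stream_sid, audio_bytes))
  end
end
```

In Rails this might be called as `TwilioStream.send_audio(ActionCable.server, call_sid, stream_sid, chunk)`, with the SIDs taken from the incoming Twilio `start` message.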

@junket

junket commented Jun 15, 2024

That's great, @palkan. Thank you!

For my project, I'm inclined to keep the business logic in Rails as long as possible, so option #2 here would be great. But I found that if I simply broadcast my data from Rails in the form of a media_event message (the JSON format Twilio expects), AnyCable does not appear to publish that message to the Twilio media stream (i.e., the client in this case).

The logs suggest that the broadcast was scheduled, but no outbound audio is played in the stream and I'm not sure how to "see" what the message relayed by AnyCable looked like. I get:

2024-06-15 12:09:40.167 DBG handle broadcast message nodeid=oDo8m8 context=node payload.stream=call_CAXXXXXXX payload.data="\"{\\\"event\\\":\\\"media\\\",\\\"sequenceNumber\\\":\\\"149\\\",\\\"media\\\":{\\\"payload\\\": \\\"dHd5en1//v39/fz8/f3+////f...(204)"
2024-06-15 12:09:40.167 DBG incoming broadcast message nodeid=oDo8m8 context=node payload.stream=call_CAXXXXXXX payload.data="\"{\\\"event\\\":\\\"media\\\",\\\"sequenceNumber\\\":\\\"149\\\",\\\"media\\\":{\\\"payload\\\": \\\"dHd5en1//v39/fz8/f3+////f...(204)"
2024-06-15 12:09:40.167 DBG schedule broadcast nodeid=oDo8m8 context=node component=hub gate=3 stream=call_CAXXXXXXX message.stream=call_CAXXXXXXX message.data="\"{\\\"event\\\":\\\"media\\\",\\\"sequenceNumber\\\":\\\"149\\\",\\\"media\\\":{\\\"payload\\\": \\\"dHd5en1//v39/fz8/f3+////f...(204)"

I wonder if this is because we need our custom encoder to do something with the message payload from Rails? Or perhaps we need to specify a BroadcastType to control how it is delivered to the socket? Any hints--even just a pointer on how to log what AnyCable is broadcasting to the client--would help tremendously 🙏
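
One possible reading of the heavily escaped `payload.data` in the logs above is that the message was serialized to JSON twice: once by hand and once more when it was broadcast. This is an inference from the log formatting, not something confirmed in the thread. The sketch below only shows what double encoding looks like in plain Ruby; the `message` contents are placeholder values.

```ruby
require "json"

# Placeholder message; the payload is a shortened stand-in.
message = {event: "media", media: {payload: "dHd5"}}

# Encoding a Hash once yields a JSON object.
once = JSON.generate(message)

# Encoding the already-serialized String again yields a JSON *string*
# whose contents are escaped, similar to the payload.data in the logs.
twice = JSON.generate(once)

puts once
puts twice

# A receiver has to parse twice to get the object back.
decoded = JSON.parse(JSON.parse(twice))
puts decoded["event"]
```

If this is what happened, passing a Hash rather than a pre-serialized String to the broadcast call would avoid the extra layer.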

@junket

junket commented Jun 17, 2024

Let me answer my own question: This example app actually contains everything you need for bi-directional communication with Twilio media streams. I just needed to learn some Go. Thank you, @palkan and @irinanazarova!

The encoder's Encode method is called on the reply, per the in-code comment "Encoder converts messages from/to Twilio format to AnyCable format." It casts the message sent back to the socket connection as one of the Twilio formats according to the Type, which I believe can be set by adding the metadata field BroadcastType to the broadcast from our ActionCable server.

Although I have not yet figured out how to include this metadata in my default Rails ActionCable broadcasts (tips appreciated!) I can see that by tweaking the Go code to assume the Type is a MediaEvent, I can coax AnyCable into passing back my audio data to the stream, completing the bi-directional stream. 👍👍

@palkan
Member

palkan commented Jun 17, 2024

@junket Thanks for sharing your insights!

> how to include this metadata in my default Rails ActionCable broadcasts (tips appreciated!)

Something like this should work:

AnyCable::Rails.with_broadcast_options(broadcast_type: "...") do
   code_that_performs_broadcasts
end

Or you can directly use AnyCable: AnyCable.broadcast(stream, data, {broadcast_type: "..."}).

@junket

junket commented Jun 18, 2024

One thing I'm exploring (and would love to hear your take @palkan) is the ideal audio chunk size when streaming from ActionCable to AnyCable-Go.

My first successful stream used 20ms (160 byte) chunks of 8 kHz (8000 Hz) mu-law and came through cleanly in local development. However, on all subsequent tries, ActionCable seemed unable to send the chunks quickly enough, leading to unusably choppy audio. Is that an ActionCable performance bottleneck? It's weird that the first attempt performed better. My code is a simple test that chunks a small file:

chunk_size = 160 # bytes per 20ms chunk
File.open(file_path, 'rb') do |file|
  while (chunk = file.read(chunk_size))
    ActionCable.server.broadcast(stream, chunk)
  end
end

If I up the chunk size to 200ms, ActionCable can keep up and the audio is better. I assume there are trade-offs here but I am too much of a newb to really know what they are 😊
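
For reference, the chunk sizes above follow from the audio format: 8 kHz mu-law stores one byte per sample, so 20ms is 160 bytes and 200ms is 1600 bytes. A small sketch of that arithmetic, where `chunk_bytes` and `chunks` are hypothetical helper names and the audio is placeholder data rather than a real recording:

```ruby
require "stringio"

SAMPLE_RATE_HZ = 8_000 # 8 kHz mu-law, as used in this thread
BYTES_PER_SAMPLE = 1   # mu-law stores one byte per sample

# Bytes needed to hold duration_ms of audio.
def chunk_bytes(duration_ms)
  SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * duration_ms / 1000
end

# Splits an IO-like object into chunks of the given duration.
def chunks(io, duration_ms)
  size = chunk_bytes(duration_ms)
  result = []
  while (chunk = io.read(size))
    result << chunk
  end
  result
end

audio = StringIO.new("\x7F".b * 3_200) # 400ms of placeholder audio

puts chunk_bytes(20)            # 160
puts chunk_bytes(200)           # 1600
puts chunks(audio, 200).length  # 2
```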

@palkan
Member

palkan commented Jun 18, 2024

I think the reason is the added network latency: 20ms of audio finishes playing before the next 20ms arrives. When you send larger chunks, the network latency is about the same, but the buffer is still full of audio from the previous chunk.
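
A toy model of this effect, as a sketch under stated assumptions: chunks are broadcast one after another, and each broadcast takes a fixed `latency_ms` to reach the player. The 50ms latency is a made-up illustrative figure, not a measurement from this thread, and `total_silence_ms` is a hypothetical helper.

```ruby
# Toy model: chunks are broadcast sequentially and each broadcast takes
# latency_ms to reach the player (an assumption, not measured data).
# Returns the total silence in ms heard between chunks.
def total_silence_ms(chunk_ms:, latency_ms:, chunks:)
  playback_end = nil
  silence = 0

  chunks.times do |i|
    arrival = (i + 1) * latency_ms
    if playback_end && arrival > playback_end
      # The buffer ran dry before this chunk arrived.
      silence += arrival - playback_end
    end
    start = playback_end ? [arrival, playback_end].max : arrival
    playback_end = start + chunk_ms
  end

  silence
end

puts total_silence_ms(chunk_ms: 20, latency_ms: 50, chunks: 10)  # 270
puts total_silence_ms(chunk_ms: 200, latency_ms: 50, chunks: 10) # 0
```

With 20ms chunks, every chunk after the first is preceded by 30ms of silence in this model. With 200ms chunks, the next chunk always arrives while the previous one is still playing.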

@junket

junket commented Jun 18, 2024

Yep, this has to be it. I switched my broadcast URL from ngrok to localhost and the latency was virtually gone. I'll experiment with ideal chunk size. I assume there is an upper limit, but I don't know.


3 participants