Is there any explanation behind 20ms opus page duration #2609
I've been experimenting with playing an audio file over WebRTC, and I found that the .ogg file needs a 20 ms page duration (as in the example). When I encode the file with a different page duration, e.g. 40 ms, the audio on the client is not clear; it sounds robotic. Is there a reason behind the 20 ms page duration? RFC 7587 says it should be possible to encode 2.5 to 60 ms of audio data.
Replies: 1 comment
After reading a bit about the Opus format, I found this statement on https://en.wikipedia.org/wiki/Opus_(audio_format):

The format has three different modes: speech, hybrid, and CELT. When compressing speech, SILK is used for audio frequencies up to 8 kHz. If wider bandwidth is desired, a hybrid mode uses CELT to encode the frequency range above 8 kHz. The third mode is pure-CELT, designed for general audio. SILK is inherently VBR and cannot hit a bitrate target, while CELT can always be encoded to any specific number of bytes, enabling hybrid and CELT mode when CBR is required.

SILK supports frame sizes of 10, 20, 40 and 60 ms. CELT supports frame sizes of 2.5, 5, 10 and 20 ms. Thus, hybrid mode only supports frame sizes of 10 and 20 ms; frames shorter than 10 ms will always use CELT mode. A typical Opus packet contains a single frame, but packets of up to 120 ms are produced by combining multiple frames per packet. Opus can transparently switch between modes, frame sizes, bandwidths, and channel counts on a per-packet basis, although specific applications may choose to limit this.
I also tried encoding to .ogg with a page_duration lower than 20 ms using ffmpeg, but it didn't work: the output still had an average page duration of 20 ms.

Solution: Exploring further, there is another parameter for encoding an audio file to Ogg, frame_duration. It is listed with the other libopus encoding parameters at https://ffmpeg.org/ffmpeg-codecs.html#libopus-1. The rule is that page_duration must be greater than or equal to frame_duration, but to make it work with WebRTC I need to set page_duration and frame_duration to the same value. So to create an Opus file with a larger page duration, I also need to raise the frame duration to match it, for example 40 ms for both.
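As a rough sketch of what such a command can look like (not necessarily the exact command used originally): the helper below only prints an ffmpeg invocation rather than running it. The function name `opus_ogg_cmd` and the file names are made up for illustration. It assumes that libopus's `-frame_duration` is given in milliseconds while the Ogg muxer's `-page_duration` is given in microseconds, as described in the ffmpeg documentation, and it only handles whole-millisecond durations.

```shell
# Print (do not run) an ffmpeg command that encodes Opus-in-Ogg with the
# frame duration and page duration set to the same length.
# Arguments: duration in whole ms, input file, output file (all hypothetical).
opus_ogg_cmd() {
  ms=$1
  src=$2
  dst=$3
  # -frame_duration (libopus encoder option) is in milliseconds;
  # -page_duration (Ogg muxer option) is in microseconds.
  us=$((ms * 1000))
  echo "ffmpeg -i $src -c:a libopus -frame_duration $ms -page_duration $us -vn $dst"
}

opus_ogg_cmd 40 input.wav output-40ms.ogg
opus_ogg_cmd 10 input.wav output-10ms.ogg
```

Copying one of the printed lines into a terminal runs the actual encode; keeping both values derived from a single argument avoids the mismatch that caused the robotic audio.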
We can verify the Opus page duration and frame duration with the opusinfo tool.
The output looks like this
The same applies if we want an Opus file with a lower page duration, e.g. 10 ms.
With this flexibility, I can adjust the audio write interval to suit my use case.
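To make the link between page duration and write interval concrete, here is a small arithmetic sketch. It assumes the 48 kHz granule clock that Ogg Opus uses, so one millisecond of audio is 48 samples. The function names `page_samples` and `write_interval_ms` are hypothetical, and the integer math only covers whole-millisecond durations.

```shell
# Samples covered by one page of the given duration (ms), at 48 samples per ms.
page_samples() { echo $(( $1 * 48 )); }

# Write interval (ms) for a page that covers the given number of 48 kHz samples,
# e.g. the difference between two consecutive granule positions.
write_interval_ms() { echo $(( $1 / 48 )); }

page_samples 20        # prints 960
page_samples 40        # prints 1920
write_interval_ms 480  # prints 10
```

A sender that reads one page per tick would then tick every 10, 20 or 40 ms depending on how the file was encoded.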