VoNR calls not completing properly #382

Open
Rashed97 opened this issue Oct 17, 2024 · 37 comments

Comments

@Rashed97

Hi,

First I wanted to thank you for the wonderful setup and guides. Using this, I've successfully set up a 5G SA network with a B210 radio. Using the experimental 5G IMS PyHSS branch, I've been able to get several Android UEs to register successfully with P-CSCF/S-CSCF. The UEs show a successful registration, and SMSoIP works fine. Calls however, do not complete; the devices never even get a dial tone. P-CSCF logs show that the N5 routes seem to be working fine, so I'm not really sure where to start with this.

A note on these UEs: when I start the B210 as an eNB instead of a gNB, they can register on the LTE network and VoLTE works fine. There appears to be an issue with video calling over IMS, but I have not looked at it enough to have any idea whether that's a UE issue or IMS issue.

Please let me know what logs you'd like me to provide and I can provide those in a follow up. I'm happy to help debug this however you'd like.

Thanks!

@herlesupreeth
Owner

@Rashed97

I've successfully set up a 5G SA network with a B210 radio.

In general, VoNR is not working as expected, mainly due to a failure in setting up a dedicated bearer for the call. This is still largely a work in progress. I will let you know once it's ready to be tested again.

A note on these UEs: when I start the B210 as an eNB instead of a gNB, they can register on the LTE network and VoLTE works fine. There appears to be an issue with video calling over IMS, but I have not looked at it enough to have any idea whether that's a UE issue or IMS issue.

Regarding the issue with video calls, please attach a pcap taken for the scenario below:

  1. Start pcap
  2. Register two UEs
  3. Attempt video call
  4. Wait for it to fail
  5. Stop pcap and attach here

@Rashed97
Author

@herlesupreeth Thanks for the clarification.

In general, VoNR is not working as expected, mainly due to a failure in setting up a dedicated bearer for the call. This is still largely a work in progress. I will let you know once it's ready to be tested again.

Are there any workarounds for this right now, or is this pretty much entirely broken in srsRAN? I’ve seen some of the discussion threads around this, and it looks like it potentially works sometimes, or am I misunderstanding?

Regarding the issue with video calls, please attach a pcap taken for the scenario below:

  1. Start pcap
  2. Register two UEs
  3. Attempt video call
  4. Wait for it to fail
  5. Stop pcap and attach here

For this, should it just be a PCAP on the host machine with -i any ?

@herlesupreeth
Owner

Are there any workarounds for this right now, or is this pretty much entirely broken in srsRAN? I’ve seen some of the discussion threads around this, and it looks like it potentially works sometimes, or am I misunderstanding?

So far we haven't had any luck making a VoNR call work when tested between two devices (as per the main contributor, it worked when calling a device from an application server). I don't think there is a workaround for it. I will try to see whether this issue can be fixed easily or not.

For this, should it just be a PCAP on the host machine with -i any ?

Yes, that's right.

@Rashed97
Author

So far we haven't had any luck making a VoNR call work when tested between two devices (as per the main contributor, it worked when calling a device from an application server). I don't think there is a workaround for it. I will try to see whether this issue can be fixed easily or not.

OK, great thanks. Let me know if there's anything I can help with.

For this, should it just be a PCAP on the host machine with -i any ?

Yes, that's right.

Great, appreciate it, thanks. I'll grab this at my first opportunity.

@NUCLEAR-WAR

NUCLEAR-WAR commented Oct 24, 2024

I'm also interested in the PCAPs so I can improve the logic further.
srsRAN just dropped a fix for the modify request in the test branch of their repo; maybe give it a shot and provide the traces.

I just tested the new fix in my setup but, as @herlesupreeth mentioned, I use a different approach to test MOC and MTC, as I'm still fighting with the iPhone as a second device, without any success attaching it to the network.

Here is my test for an originating call with the new fix:
gnb_DockerHost_newFix.zip

@Rashed97
Author

I'm also interested in the PCAPs so I can improve the logic further. srsRAN just dropped a fix for the modify request in the test branch of their repo; maybe give it a shot and provide the traces.

Will do, may not get to it until tomorrow evening or Saturday (I’m currently on work travel) but will do as soon as I can get to my workstation.

I just tested the new fix in my setup but, as @herlesupreeth mentioned, I use a different approach to test MOC and MTC, as I'm still fighting with the iPhone as a second device, without any success attaching it to the network.

I’ve found Android devices to be more flexible with this setup, and can provide more detailed logging. Do you have access to any rooted Android devices or ones running custom ROMs? If not, send me an email and I can arrange that if you’d like.

@NUCLEAR-WAR

NUCLEAR-WAR commented Oct 29, 2024

@Rashed97 it's more of a vendor issue: they lock down the devices so that they are usable only with normal operators, not with lab/private networks. Even if the devices attach, they don't activate voice, just data (e.g. iPhones). I tried to reach Samsung, for example, over an internal channel, but didn't get any answer. I'm not sure what the sense or reason behind locking it is.

Anyway, I just bought a Xiaomi Redmi Note 12 5G and it attached perfectly to the network, with both registrations and VoNR QoS working directly from the experimental branch without any issue. What I noticed, though, is that there is no N5 request upon SIP UPDATE, but everything just worked fine.

Here are pcaps from my tests with and without precondition:

e2e_with_and_withoutPrecond.zip

@herlesupreeth, can you take a look and see if there is something to be improved?

EDIT: just to add, this test was done with two devices using the experimental 5G branch and the new srsRAN test branch. I repeated it many times without any issues.

@herlesupreeth
Owner

herlesupreeth commented Nov 3, 2024

@NUCLEAR-WAR Awesome work, I checked the pcap with precondition and it seems to work well.

What I noticed, though, is that there is no N5 request upon SIP UPDATE, but everything just worked fine.

This could be because the packet filters/flow descriptions didn't change (because the SDP ports didn't change).

So basically I should update the srsRAN Dockerfile to use the latest commits, right?

@Rashed97
Author

Rashed97 commented Nov 3, 2024

Thank you both. I just got home after a few weeks of travel so I can resume testing as well. I’ll build an srsRAN container with the test branch shortly.

@NUCLEAR-WAR have you seen the test2 branch as well? Looks like it might have additional fixes but I haven’t gotten a chance to look at it too deeply yet.

@NUCLEAR-WAR

@herlesupreeth

This could be because the packet filters/flow descriptions didn't change (because the SDP ports didn't change).

Actually, it's because I did not add the in-dialog UPDATE to the logic, which should be easy to add. This will not affect the case where the docker setup is used directly, but it could affect advanced use cases where the UPDATE is used for ringtone/announcement injection (mostly cases using an external IMS or an AS that does some early media).
I think I will add a condition so that N5 requests are only triggered when the SDP media/port/IP changes, to prevent spamming N5 requests. I will add it to the fine-tuning to-do list.
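
The condition described above could look roughly like the following Python sketch. It is not the Kamailio routing logic from the repository: `parse_sdp` and `needs_n5_update` are hypothetical helper names, and the comparison only looks at the first `c=` address and the media type and port of each `m=` line. That is a simplification (media-level `c=` lines and `a=rtcp` attributes are ignored).

```python
from typing import NamedTuple, Optional, Tuple


class MediaEndpoint(NamedTuple):
    """First c= connection address plus (media type, port) of every m= line."""
    address: Optional[str]
    media: Tuple[Tuple[str, int], ...]


def parse_sdp(sdp: str) -> MediaEndpoint:
    """Extract the fields that would affect the flow descriptions."""
    address = None
    media = []
    for raw_line in sdp.splitlines():
        line = raw_line.strip()
        if line.startswith("c=") and address is None:
            # c=<nettype> <addrtype> <connection-address>
            address = line[2:].split()[-1]
        elif line.startswith("m="):
            # m=<media> <port>[/<number of ports>] <proto> <fmt> ...
            fields = line[2:].split()
            port = int(fields[1].split("/")[0])
            media.append((fields[0], port))
    return MediaEndpoint(address, tuple(media))


def needs_n5_update(previous_sdp: str, new_sdp: str) -> bool:
    """Return True only if the media type, port, or IP address changed."""
    return parse_sdp(previous_sdp) != parse_sdp(new_sdp)
```

With a check like this, an UPDATE that only changes attributes such as `a=sendrecv` would not trigger a new N5 request, while a changed RTP port or connection address would.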

So basically I should update the srsRAN Dockerfile to use the latest commits, right?

I'm not sure if the test branch has been merged into the main/master branch. In any case, note that the settings in the .yml files have changed and the ones from the docker setup will not work (at least this was my experience). Here are mine:

gnb.yml :

cu_cp:
  amf:
    addr: AMF
    bind_addr: GNB_IP
    supported_tracking_areas:
      - tac: 1
        plmn_list:
          - plmn: "PLMN"
            tai_slice_support_list:
              - sst: 1

ru_sdr:
  device_driver: uhd                                            # The RF driver name.
  device_args: type=b200,num_recv_frames=64,num_send_frames=64  # Optionally pass arguments to the selected RF driver.
  clock: external                                               # Specify the clock source used by the RF.
  sync:                                                         # Specify the sync source used by the RF.
  srate: 23.04                                                  # RF sample rate might need to be adjusted according to selected bandwidth.
  otw_format: sc12
  tx_gain: 80                                                   # Transmit gain of the RF might need to be adjusted to the given situation.
  rx_gain: 40                                                   # Receive gain of the RF might need to be adjusted to the given situation.

cell_cfg:
  dl_arfcn: 632628                                                # ARFCN of the downlink carrier (center frequency).
  band: 78                                                        # The NR band.
  channel_bandwidth_MHz: 20                                       # Bandwidth in MHz. Number of PRBs will be automatically derived.
  common_scs: 30                                                  # Subcarrier spacing in kHz used for data.
  plmn: "PLMN"                                                    # PLMN broadcasted by the gNB.
  tac: 1                                                          # Tracking area code (needs to match the core configuration).
  pci: 1                                                          # Physical cell ID.

log:
  filename: gnb.log                                   # Path of the log file.
  all_level: info                                              # Logging level applied to all layers.

pcap:
  mac_enable: true                                               # Set to true to enable MAC-layer PCAPs.
  mac_filename: gnb_mac.pcap                          # Path where the MAC PCAP is stored.
  ngap_enable: true                                              # Set to true to enable NGAP PCAPs.
  ngap_filename: gnb_ngap.pcap                        # Path where the NGAP PCAP is stored.
  f1ap_enable: true                                              # Set to true to enable F1AP PCAPs.
  f1ap_filename: gnb_f1ap.pcap                        # Path where the F1AP PCAP is stored.
  e1ap_enable: true                                              # Set to true to enable E1AP PCAPs.
  e1ap_filename: gnb_e1ap.pcap                        # Path where the E1AP PCAP is stored.
  e2ap_enable: true                                              # Set to true to enable E2AP PCAPs.
  e2ap_filename: gnb_e2ap.pcap                        # Path where the E2AP PCAP is stored.

qos.yml :

# Quality of Service (QoS) example configurations for 5QI 1, 2, 5, 7 and 9
# Based on 5QI characteristics in TS 23.501 table 5.7.4-1 

# This is a supplementary configuration to modify the RLC and PDCP radio bearers on 
# a per 5QI basis.

qos:
  -
    five_qi: 1 # E.g. Conversational Voice
    rlc:
      mode: um-bidir
      um-bidir:
        tx:
          sn: 12
          queue-size: 16384
          queue-bytes: 6172672
        rx:
          sn: 12
          t-reassembly: 50
    pdcp:
      integrity_required: false
      tx:
        sn: 12
        discard_timer: -1
        status_report_required: false
      rx:
        sn: 12
        t_reordering: 80
        out_of_order_delivery: false
    f1u_du:
      backoff_timer: 10
    f1u_cu_up:
      backoff_timer: 10
    mac:
      lc_priority: 4
      lc_group_id: 1
      bucket_size_duration_ms: 5
      prioritized_bit_rate_kBps: 65537
  -
    five_qi: 2 # E.g. Conversational Video
    rlc:
      mode: um-bidir
      um-bidir:
        tx:
          sn: 12
          queue-size: 16384
          queue-bytes: 6172672
        rx:
          sn: 12
          t-reassembly: 50
    pdcp:
      integrity_required: false
      tx:
        sn: 12
        discard_timer: -1
        status_report_required: false
      rx:
        sn: 12
        t_reordering: 80
        out_of_order_delivery: false
    f1u_du:
      backoff_timer: 10
    f1u_cu_up:
      backoff_timer: 10
    mac:
      lc_priority: 4
      lc_group_id: 1
      bucket_size_duration_ms: 5
      prioritized_bit_rate_kBps: 65537
  -
    five_qi: 5 # E.g. IMS signaling
    rlc:
      mode: am
      am:
        tx:
          sn: 12
          t-poll-retransmit: 80
          max-retx-threshold: 4
          poll-pdu: 64
          poll-byte: 125
          queue-size: 16384
          queue-bytes: 6172672
        rx:
          sn: 12
          t-reassembly: 80
          t-status-prohibit: 10
    pdcp:
      integrity_required: false
      tx:
        sn: 12
        discard_timer: -1
        status_report_required: false
      rx:
        sn: 12
        t_reordering: 80
        out_of_order_delivery: false
    f1u_du:
      backoff_timer: 10
    f1u_cu_up:
      backoff_timer: 10
    mac:
      lc_priority: 5
      lc_group_id: 2
      bucket_size_duration_ms: 5
      prioritized_bit_rate_kBps: 65537
  -
    five_qi: 7 # E.g. Voice, Video (live streaming)
    rlc:
      mode: um-bidir
      um-bidir:
        tx:
          sn: 12
          queue-size: 16384
          queue-bytes: 6172672
        rx:
          sn: 12
          t-reassembly: 50
    pdcp:
      integrity_required: false
      tx:
        sn: 12
        discard_timer: -1
        status_report_required: false
      rx:
        sn: 12
        t_reordering: 80
        out_of_order_delivery: false
    f1u_du:
      backoff_timer: 10
    f1u_cu_up:
      backoff_timer: 10
    mac:
      lc_priority: 4
      lc_group_id: 1
      bucket_size_duration_ms: 5
      prioritized_bit_rate_kBps: 65537
  -
    five_qi: 9 # E.g. Buffered video streaming, TCP-based traffic
    rlc:
      mode: am
      am:
        tx:
          sn: 12
          t-poll-retransmit: 80
          max-retx-threshold: 4
          poll-pdu: 64
          poll-byte: 125
          queue-size: 16384
          queue-bytes: 6172672
        rx:
          sn: 12
          t-reassembly: 80
          t-status-prohibit: 10
    pdcp:
      integrity_required: false
      tx:
        sn: 12
        discard_timer: -1
        status_report_required: false
      rx:
        sn: 12
        t_reordering: 80
        out_of_order_delivery: false
    f1u_du:
      backoff_timer: 10
    f1u_cu_up:
      backoff_timer: 10
    mac:
      lc_priority: 5
      lc_group_id: 2
      bucket_size_duration_ms: 5
      prioritized_bit_rate_kBps: 65537
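
As a sanity check on the gnb.yml values above, the following Python sketch converts `dl_arfcn` to a carrier frequency using the NR global frequency raster from 3GPP TS 38.104, and shows that `srate: 23.04` is consistent with a 768-point FFT at 30 kHz subcarrier spacing. The function names are made up for this example, and the FFT size is an assumption used for illustration; it is not a statement about how srsRAN derives the rate internally.

```python
def nr_arfcn_to_khz(arfcn: int) -> int:
    """Map an NR-ARFCN to its RF reference frequency in kHz (TS 38.104 global raster)."""
    if 0 <= arfcn < 600000:
        return 5 * arfcn                             # 0 to 3000 MHz, 5 kHz steps
    if 600000 <= arfcn < 2016667:
        return 3_000_000 + 15 * (arfcn - 600000)     # 3000 to 24250 MHz, 15 kHz steps
    if 2016667 <= arfcn <= 3279165:
        return 24_250_080 + 60 * (arfcn - 2016667)   # 24250 to 100000 MHz, 60 kHz steps
    raise ValueError(f"NR-ARFCN out of range: {arfcn}")


def sample_rate_khz(fft_size: int, scs_khz: int) -> int:
    """Sample rate implied by an FFT size and a subcarrier spacing."""
    return fft_size * scs_khz


if __name__ == "__main__":
    # dl_arfcn 632628 -> 3489.42 MHz, which lies inside band n78 (3300 to 3800 MHz)
    print(nr_arfcn_to_khz(632628) / 1000, "MHz")
    # 768-point FFT (assumed) at 30 kHz SCS -> 23.04 MHz
    print(sample_rate_khz(768, 30) / 1000, "MHz")
```

A 20 MHz carrier at 30 kHz spacing uses 51 PRBs, i.e. 612 subcarriers, which fits inside the 768 bins.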

@Rashed97

I’ll build an srsRAN container with the test branch shortly.

If you are running the gNB on a separate host, I would recommend that you compile it and run it directly.

@NUCLEAR-WAR have you seen the test2 branch as well? Looks like it might have additional fixes but I haven’t gotten a chance to look at it too deeply yet.

No, this is new to me. My test was with the "test" branch; I'm not sure what changes "test2" has, as I haven't had a look at it.

@Rashed97
Author

Rashed97 commented Nov 4, 2024

I updated to the latest srsRAN test2 branch and used @NUCLEAR-WAR's qos.yml, but I'm still not seeing the calls complete. Here is the latest pcap collection.
pcaps-241104-065617.zip

Update: also tested with the normal test branch. Same behaviour: the call never gets a dial tone, but doesn't hang up either. Here are the pcaps for that as well:
pcaps-241104-071514.zip

@NUCLEAR-WAR

@Rashed97
Can you take the trace on the Docker host where Open5GS and the IMS are running? I need to check the N5 flow to see if anything is wrong in it.

@Rashed97
Author

Rashed97 commented Nov 5, 2024

@NUCLEAR-WAR by full trace, do you mean a pcap on all interfaces? If so, here you go 😄 :
full-241105-040331.zip

Edit: also here is the log from the attached docker containers for the network core and IMS: https://gist.github.com/Rashed97/194d8ed2db912d72806949356bfc90ab

Edit: I ran tcpdump on interface "any" and filtered out port 22 since I'm SSH'ed into the machine from my laptop, but everything else is in the pcap, so you'll see a bunch of irrelevant packets as well.
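
For anyone reproducing this capture, here is a small Python sketch that builds the tcpdump invocation described above (capture on `any`, exclude SSH on port 22). The helper name and the output file name are placeholders chosen for this example; `-i`, `-w`, and the `not port` filter expression are standard tcpdump usage.

```python
import shlex
from typing import List, Sequence


def build_tcpdump_command(
    output_path: str,
    interface: str = "any",
    excluded_ports: Sequence[int] = (22,),
) -> List[str]:
    """Build a tcpdump argv that captures everything except the given ports."""
    cmd = ["tcpdump", "-i", interface, "-w", output_path]
    if excluded_ports:
        # BPF filter passed as a single argument, e.g. "not port 22"
        cmd.append(" and ".join(f"not port {port}" for port in excluded_ports))
    return cmd


if __name__ == "__main__":
    # Prints: tcpdump -i any -w vonr-call.pcap 'not port 22'
    print(shlex.join(build_tcpdump_command("vonr-call.pcap")))
```

The resulting list can be passed to `subprocess.run` (typically as root); stop the capture after the call attempt has failed.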

@NUCLEAR-WAR

@Rashed97 I am still looking into the trace, but I have some questions:

Is the device you are using a dual-SIM device with both SIM cards in it? If so, try to use separate devices.

Does the PC running the gNB have enough resources? Not having enough resources could lead to the device disconnecting from the network. Can you describe your setup?
Is there any reason why the device is losing connectivity to the network? Each user has multiple contacts, and the INVITE is sent to all of them. Although the current N5 logic does not support forking, in the end only the working IP gets an AppSession ID, and it is also modified without any complaint from the PCF, so to me it looks like the issue is on the RAN side, though I could be wrong.

Best Regards

@Rashed97
Author

Rashed97 commented Nov 5, 2024

@NUCLEAR-WAR see answers below:

Is the device you are using a dual-SIM device with both SIM cards in it? If so, try to use separate devices.

No, the SIM cards are inserted into 2 separate devices, but they’re the same model, running the same Android build.

Does the PC running the gNB have enough resources? Not having enough resources could lead to the device disconnecting from the network. Can you describe your setup? Is there any reason why the device is losing connectivity to the network? Each user has multiple contacts, and the INVITE is sent to all of them. Although the current N5 logic does not support forking, in the end only the working IP gets an AppSession ID, and it is also modified without any complaint from the PCF, so to me it looks like the issue is on the RAN side, though I could be wrong.

Yeah sure. I’m running the entire stack on a machine with Ubuntu 24.04, an i9-9900K, and 96GB RAM. The radio is a modified B210 from Aliexpress (they’re calling it a B220 since it has an upgraded FPGA), and I’m running it with QoS, MIMO, and QAM256. The radio is a bit far from the devices, so I can move them closer, but I don’t think it’s a signal issue.

@Rashed97
Author

Rashed97 commented Nov 7, 2024

@NUCLEAR-WAR is there anything that you think is worth trying to fix the multiple contact entries for the devices, or do you not believe that's what's causing issues right now? I don't see any issues on the RAN but maybe I'm missing something.

Edit: regarding the device disconnections, this appears to be the UE. The Qualcomm modem on these devices tries to fall back to LTE then 3G if the VoNR call fails (in this example, the SIP session is never initiated properly and the device times out waiting for that). Since I don't have an LTE eNB stood up yet, there's no LTE to fall back on, so then it just drops off the network entirely. The phone comes back online on the NR gNB as soon as the call is terminated (either by the modem or I hang up).

@NUCLEAR-WAR

@Rashed97 can you also send the logs from the log folder?
If you used the gnb.yml from my comment above, just make sure that the log setting uses the correct log path:

log:
  filename: /mnt/srsran/gnb.log  

@Rashed97
Author

Rashed97 commented Nov 9, 2024

@NUCLEAR-WAR Sure, here you go: https://gist.github.com/Rashed97/539698328fc909b6fbb6aad0f4c89687

I switched off MIMO as I was having a lot of late commands and underruns with that enabled, and the UHD benchmark showed a lot of dropped samples. After I switched back to SISO, I ran the UHD benchmark again with a 23.04 MHz sampling rate and got no dropped samples, late commands, or other errors, so I'm not really sure why I'm still having lates and underflows in the gNB log.

@NUCLEAR-WAR

@Rashed97, please follow the comments from @herlesupreeth in this issue starting from comment 1874053020.

This usually happens if the PC where srsRAN is running has poor performance. Also, for 5G it's strongly recommended to attach a GPSDO to the SDR to prevent any signal instability.

@Rashed97
Author

Rashed97 commented Nov 9, 2024

@NUCLEAR-WAR I had already done some of that but went ahead and made some additional adjustments. CPU is now running close to its max 5 GHz 100% of the time, but I'm still getting lates and underflows, tho significantly fewer. The calls are still not completing, however.

Here is the latest gnb.log and output from the core and IMS: https://gist.github.com/Rashed97/868106e0d4e68705c10819086afd74fc

The gNB log doesn't look to have any radio failures anymore, but calls are still failing the same way. I've also attached the pcaps from srsRAN and the full pcap of the entire system.
pcaps-241109-161605.zip

@NUCLEAR-WAR

@Rashed97, I will add some optimizations to the N5 request handling to deal with scenarios where there are multiple contacts due to disconnects without de-registration. That will prevent overwriting the AppSession (I saw that happen in your last log) and will send an error response on the transaction where no QoS is available.
I hope this will at least improve the call setup, but it will not solve your srsRAN issue.

I will do some modifications and some tests to verify the changes.

@Rashed97
Author

@NUCLEAR-WAR thanks, let me know what else I can help with.

I'll take a look at why I'm having srsRAN issues. i9-9900K running at 100% C0 shouldn't have any issues with this, so need to investigate a bit. I've got a GPSDO arriving today as well tho.

@Rashed97
Author

Regarding issue with video calls, please attach a pcap taken for below scenario

@herlesupreeth I've gotten a chance to look at this finally, it's a problem with the UEs - a few bugs with Qualcomm's VT components on this version of their Android BSP. Working to fix these, then will try again, just wanted to keep you updated.

Also, do you have any ideas on why I'm still experiencing late commands and packet underflow/overflow on my setup? I followed your instructions that @NUCLEAR-WAR linked, which significantly reduced them, and also set up an external clock with a GPSDO, but I'm still seeing some issues.

@NUCLEAR-WAR

@Rashed97, first, to prevent multiple unnecessary contacts, go to the scscf folder, open the file kamailio_scscf.cfg, and change the following settings:

modparam("ims_usrloc_scscf", "maxcontact", 1)
modparam("ims_usrloc_scscf", "maxcontact_3gpp", 1)

This sets the maximum number of contacts to one. The max contact behavior is set to "2", which means a new contact overwrites the old one, so if the device loses its RAN link and registers again, this prevents multiple contacts from being created and keeps only the last known one.
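
To illustrate why this matters for call setup, here is a toy Python model of the behaviour described above. It is only a sketch of the idea (a per-user contact list with a cap, where a new registration evicts the oldest entry); it is not how Kamailio implements `ims_usrloc_scscf`, and the class name, user identity, and addresses are invented for the example.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class ContactStore:
    """Toy registrar: keeps at most `max_contacts` contacts per user, newest last."""

    def __init__(self, max_contacts: Optional[int] = None) -> None:
        # None means unlimited in this toy model
        self._max = max_contacts
        self._contacts: Dict[str, Deque[str]] = {}

    def register(self, user: str, contact: str) -> None:
        contacts = self._contacts.setdefault(user, deque(maxlen=self._max))
        if contact in contacts:
            contacts.remove(contact)   # refresh an existing binding
        contacts.append(contact)       # a full deque drops its oldest entry

    def invite_targets(self, user: str) -> List[str]:
        """Contacts an INVITE for this user would be sent to."""
        return list(self._contacts.get(user, ()))


if __name__ == "__main__":
    unlimited = ContactStore()
    capped = ContactStore(max_contacts=1)
    for store in (unlimited, capped):
        store.register("user-a", "192.0.2.10:5060")  # before the RAN link dropped
        store.register("user-a", "192.0.2.11:5060")  # re-registration, new address
    print(unlimited.invite_targets("user-a"))  # both contacts, one of them dead
    print(capped.invite_targets("user-a"))     # only the latest contact
```

In the unlimited case the INVITE also goes to the stale address, which is the "INVITEs to dead contacts" situation seen in the traces.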

I made some modifications to the N5 routing logic that you could try. Replace mt.cfg, mo.cfg, and register.cfg in the pcscf/route folder with these:
route.zip

I ran a few tests and they worked fine with the settings I suggested for the max contact.
Give it a try and let me know.

I will plan a PR later so they can be reviewed.

@Rashed97
Author

Rashed97 commented Nov 14, 2024

@NUCLEAR-WAR Just gave it a whirl and it's behaving the same, if not worse. The devices fall off the network as soon as I try to make a call most of the time, and the rest of the time they hang in the same way as before.

Here are the logs from the core and IMS and gNB: https://gist.github.com/Rashed97/b0c388507481f291893b5de0ae4ddbc9

Here are the pcaps from the system and gNB: pcaps-241113-194629.zip

Question about the overall setup: should I have defined the IMS QoS parameters in Open5GS under the IMS APN for every subscriber? I have that set up for every subscriber, but I'm not sure if there's something potentially conflicting.

Edit: I've cleaned up the P-CSCF config files a bit (spacing, tabs, making sure the N5 patching is properly ifdef-ed, etc). Here are the configs in case you want to see them: https://github.com/Rashed97/docker_open5gs/tree/exp_5g_ims_pyhss/pcscf/route

Edit 2: that branch I linked above has my current working dir. I rebased the existing exp_5g_ims_pyhss on top of master so I could track the changes made for it properly. It also has various cleanups, like adding WITH_N5 guards and fixing IP address hardcoding in the P-CSCF configs.

@herlesupreeth
Owner

herlesupreeth commented Nov 14, 2024

Question about the overall setup: should I have defined the IMS QoS parameters in Open5GS under the IMS APN for every subscriber? I have that set up for every subscriber, but I'm not sure if there's something potentially conflicting.

Yes, it should be set up for all subscribers you use for calling.

Edit:

Also, do you have any ideas on why I'm still experiencing late commands and packet underflow/overflow on my setup? I followed your instructions that @NUCLEAR-WAR linked, which significantly reduced them, and also set up an external clock with a GPSDO, but I'm still seeing some issues.

It's okay if you receive underflows/overflows occasionally, but they shouldn't be too frequent or too many. Btw, did you set the clock to external in the configuration file as well?

@NUCLEAR-WAR

@Rashed97 you are using the wrong deploy file. Please use sa-vonr-deploy.yaml: this tells the Docker stack that the deployment is in 5G mode and activates all the ifdef conditions/IP addresses/Docker env variables, so you don't need to edit the files. It also loads only the 5G core; the 4G core is not needed in this setup.

I see that with the new config/settings there are no more INVITEs to dead contacts, and therefore less trouble creating the N5 session. You need to fix the RAN instability though, as the device is losing its RAN link; otherwise nothing will help. Maybe it's a bad SDR/SDR driver, or something is eating the resources. The issues in the gNodeB logs occur many times within the same second, which from my point of view is pretty bad.
My gNodeB is running pretty quietly with a load average of "0.29, 0.24, 0.11". Here is an htop screenshot while two devices are attached and making calls:

[Screenshot: htop output on the gNodeB host]

@Rashed97
Author

@herlesupreeth

yes, it should be setup for all subscribers you use for calling

OK, that’s what I thought, thanks for confirming.

It's okay if you receive underflows/overflows occasionally, but they shouldn't be too frequent or too many. Btw, did you set the clock to external in the configuration file as well?

I did set it to external, yes, tho when starting the gNB the UHD driver says “setting clock to automatic” or something like that (on my phone at the moment but can get the actual line in a few hours).

After I made some changes to the config for my CPU (assigning cores to L1, L2, etc) I think I’ve gotten them minimized, as reflected by the latest gnb.log I linked in the last comment. I don’t see any PRACH or PUxCH or similar errors in this log.

@NUCLEAR-WAR

@Rashed97 you are using the wrong deploy file. Please use sa-vonr-deploy.yaml: this tells the Docker stack that the deployment is in 5G mode and activates all the ifdef conditions/IP addresses/Docker env variables, so you don't need to edit the files. It also loads only the 5G core; the 4G core is not needed in this setup.

I have DEPLOY_MODE force set to 5G in my tree for that. I have the exact same behavior with sa-vonr-deploy.yaml. The reason I'm using deploy-all.yaml is that I’m looking at how to enable both N5 and Rx for VoNR and VoLTE coexistence, but obviously I need VoNR working first. I have 2 radios, so my plan is to connect one as a gNB and one as an eNB eventually.

I see that with the new config/settings there are no more INVITEs to dead contacts, and therefore less trouble creating the N5 session. You need to fix the RAN instability though, as the device is losing its RAN link; otherwise nothing will help. Maybe it's a bad SDR/SDR driver, or something is eating the resources. The issues in the gNodeB logs occur many times within the same second, which from my point of view is pretty bad. My gNodeB is running pretty quietly with a load average of "0.29, 0.24, 0.11". Here is an htop screenshot while two devices are attached and making calls:

OK, that’s what I thought I was seeing regarding the invite so glad you can confirm. Regarding the RAN, given the latest log showing far fewer late commands and underflows/overflows, do you think that’s still the issue? My htop looks very close to that too when it’s running, none of the cores gets above 20% utilization. I’ve tried swapping USB cables, USB ports, everything to no avail.

Lastly could you confirm the SHA1 you used for srsRAN? I just want to make sure I’ve got the right setup compiled.

@NUCLEAR-WAR

NUCLEAR-WAR commented Nov 14, 2024

@Rashed97

I have DEPLOY_MODE force set to 5G in my tree for that. I have the exact same behavior with sa-vonr-deploy.yaml. The reason I'm using deploy-all.yaml is that I’m looking at how to enable both N5 and Rx for VoNR and VoLTE coexistence, but obviously I need VoNR working first. I have 2 radios, so my plan is to connect one as a gNB and one as an eNB eventually.

That would be interesting to see. srsRAN added support for QoS modify, which would be the first step; I'm not sure whether Open5GS supports it, though, as VoNR is still experimental, as you can see. For now I suggest running VoNR only until it gets stable, before starting the next step.

Lastly could you confirm the SHA1 you used for srsRAN? I just want to make sure I’ve got the right setup compiled.

45b07b516bf15fdd606d48b0c62ff15459b54712 gnb
I recommend compiling it and running it directly on the machine; that could also improve performance.

I fixed the RTP/RTCP port store/retrieval in the mt.cfg file; perhaps you can give it a try:
mt.zip

@Rashed97
Author

@NUCLEAR-WAR

That would be interesting to see. srsRAN added support for QoS modify, which would be the first step; I'm not sure whether Open5GS supports it, though, as VoNR is still experimental, as you can see.

Hypothetically it should work fine with the current srsRAN and srsRAN_4G setup (once I can get VoNR to work properly), at least for what I'm trying to do. My goal is to have 1 UE on NR and 1 UE on LTE with a call between them. The changes needed are mostly in the P-CSCF configs right now, since there's no reliable way to determine whether the initial registration message coming from a UE needs to be routed to the PCF over N5 or to the PCRF over Rx. I'm investigating whether adding an endpoint to Open5GS to query the "current attached network" is the best option for this or if other options exist.

45b07b516bf15fdd606d48b0c62ff15459b54712 gnb

I fixed the RTP/RTCP port store/retrieval in the mt.cfg file; perhaps you can give it a try: mt.zip

I'm not seeing that SHA1 in the srsRAN_Project repo at all, can you check and see if you can? I'm at 366744f14425b00008f4fea379d8d0c9c3fa43d5 since I've been pulling in the latest test branch every few days.

And sure I'll give the new mt.cfg a try later this evening.

@Rashed97
Author

Rashed97 commented Nov 15, 2024

@NUCLEAR-WAR tried the new mt.cfg, same issues. I'm noticing this in the logs tho:

pcscf       | 94(142) INFO: <script>: Sending the request to PCF
pcf         | 11/14 19:29:01.799: [pcf] ERROR: Not found [/npcf-policyauthorization/v1/app-sessions/0] (../src/pcf/pcf-sm.c:320)
pcscf       | 94(142) ERROR: <script>: N5 QoS Session modification faild - reason code: 404

Looks like the QoS session modification is failing on PCF? This would result in the RAN not even attempting the PDU session modification, correct?

@NUCLEAR-WAR

@Rashed97 this could have many reasons; can you provide the logs and the pcaps?

@Rashed97
Author

@Rashed97 this could have many reasons; can you provide the logs and the pcaps?

@NUCLEAR-WAR sorry for the delay. I'll get more logs and PCAPs tomorrow, but it looks like the AppSession parsing from the original INVITE request is producing that AppSession 0:

pcscf       | 98(146) INFO: <script>: received early answer in 18x, Patching N5 session in PCF...
pcscf       | 98(146) INFO: <script>: N5 PATCH: About to test if this is a retransmitted reply which is still currently suspended
pcscf       | 98(146) INFO: <script>: N5_PATCH_MT_REQ, building N5 PATCH Request
pcscf       | 98(146) INFO: <script>: SDP IP for UE with MSISDN 11012026330 Call-ID [email protected] is: 172.22.0.16
pcscf       | 98(146) INFO: <script>: SDP RTP Port for UE with MSISDN 11012026330 Call-ID [email protected] is: 49028
pcscf       | 98(146) INFO: <script>: SDP RTCP Port for UE with MSISDN 11012026330 Call-ID [email protected] is: 49029
pcscf       | 98(146) ALERT: <script>: SDP Answer connection Info is: 192.168.101.3, RTP port 50048, RTCP Port 50049 and codec is EVS/16000/1
pcscf       | 98(146) INFO: <script>: Stored MTC AppSession for user 11012026331: 0
pcscf       | 98(146) ALERT: <script>: DEBUG: Preparing PATCH N5 Message for SDP Answer
pcscf       | 98(146) ALERT: <script>: DEBUG: Initialize empty arrays and objects
pcscf       | 98(146) INFO: <script>: Preparing PATCH N5 Message for SDP Answer
pcscf       | 98(146) INFO: <script>: Initialize empty arrays and objects
pcscf       | 98(146) INFO: <script>: DEBUG: Set evSubsc
pcscf       | 98(146) INFO: <script>: DEBUG: Set headers for the HTTP2 Request
pcscf       | 98(146) INFO: <script>: Today is Mon, 18 Nov 2024 05:02:25 EST
pcscf       | 98(146) INFO: <script>: Sending the request to PCF
pcf         | 11/18 05:02:25.406: [pcf] ERROR: Not found [/npcf-policyauthorization/v1/app-sessions/0] (../src/pcf/pcf-sm.c:320)
pcscf       | 98(146) ERROR: <script>: N5 QoS Session modification faild - reason code: 404
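The 404 above is what you would expect when the PATCH goes to `app-sessions/0`, i.e. when no real AppSession ID was stored. A minimal sketch of the kind of guard that would catch this earlier, assuming the ID is the last path segment of the `Location` URL the PCF returns when the session is created. The path prefix is taken from the PCF log; the function names are hypothetical and this is not the actual P-CSCF script logic.

```python
from typing import Optional
from urllib.parse import urlparse

# Resource path as it appears in the PCF error log above.
APP_SESSIONS_PREFIX = "/npcf-policyauthorization/v1/app-sessions/"


def extract_app_session_id(location: str) -> str:
    """Return the AppSession ID from a Location URL or path."""
    path = urlparse(location).path
    if not path.startswith(APP_SESSIONS_PREFIX):
        raise ValueError(f"unexpected Location path: {path!r}")
    session_id = path[len(APP_SESSIONS_PREFIX):].strip("/")
    if not session_id:
        raise ValueError("Location header carries no AppSession ID")
    return session_id


def build_patch_path(stored_id: Optional[str]) -> str:
    """Build the PATCH path, refusing to fall back to a default ID."""
    if not stored_id:
        # A missing value should be an error, not silently become "0".
        raise ValueError("no AppSession ID stored for this call")
    return APP_SESSIONS_PREFIX + stored_id
```

Failing loudly when the stored value is missing makes a store/retrieve mismatch show up in the P-CSCF log rather than as a 404 from the PCF.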

Additionally, I spent the weekend upgrading my computer, so I've got a 9950X now, and I have 0 messages in gnb.log. No late commands or under/overflows, so hopefully my RAN issues are resolved.

@NUCLEAR-WAR

@Rashed97 it would be interesting to know why it's failing in your case; I'd be happy to look at the logs.

In my test, the session is stored and retrieved correctly:

101(145) INFO: <script>: received early answer in 18x, Patching N5 session in PCF...
101(145) INFO: <script>: N5 PATCH: About to test if this is a retransmitted reply which is still currently suspended
101(145) INFO: <script>: N5_PATCH_MT_REQ, building N5 PATCH Request
101(145) INFO: <script>: SDP IP for UE with MSISDN 0912584711 Call-ID [email protected] is: 172.22.0.16
101(145) INFO: <script>: SDP RTP Port for UE with MSISDN 0912584711 Call-ID [email protected] is: 49000
101(145) INFO: <script>: SDP RTCP Port for UE with MSISDN 0912584711 Call-ID [email protected] is: 49001
101(145) INFO: <script>: SDP Answer connection Info is: 192.168.101.3, RTP port 50042, RTCP Port 50043 and codec is EVS/16000/1
101(145) INFO: <script>: Stored MTC AppSession for user 0912584710: 4
101(145) INFO: <script>: Preparing PATCH N5 Message for SDP Answer
101(145) INFO: <script>: Initialize empty arrays and objects
101(145) INFO: <script>: DEBUG: Set evSubsc
101(145) INFO: <script>: DEBUG: Set headers for the HTTP2 Request
101(145) INFO: <script>: Today is Mon, 18 Nov 2024 12:59:13 CET
101(145) INFO: <script>: Sending the request to PCF
101(145) INFO: <script>: N5 QoS Session modification success - reason code: 200
101(145) INFO: <script>: HTTP results: {"ascReqData":{"afAppId":"+g.3gpp.icsi-ref=\"urn%3Aurn-7%3A3gpp-service.ims.icsi.mmtel\"","evSubsc":{"events":[{"event":"QOS_NOTIF","notifMethod":"PERIODIC"},{"event":"ANI_REPORT","notifMethod":"ONE_TIME"}]},"medComponents":{"0":{"qosReference":"qosVoNR","codecs":["downlink\nEVS/16000/1\n","uplink\nEVS/16000/1\n"],"medCompN":1,"medSubComps":{"0":{"fNum":1,"fDescs":["permit out 17 from 172.22.0.16 49000 to 192.168.101.3 50042","permit in 17 from 192.168.101.3 50042 to 172.22.0.16 49000"],"fStatus":"ENABLED","marBwDl":"5000 Kbps","marBwUl":"3000 Kbps","flowUsage":"NO_INFO"},"1":{"fNum":2,"fDescs":["permit out 17 from 172.22.0.16 49001 to 192.168.101.3 50043","permit in 17 from 192.168.101.3 50043 to 172.22.0.16 49001"],"fStatus":"ENABLED","marBwDl":"6000 Kbps","marBwUl":"5000 Kbps","flowUsage":"RTCP"}},"medType":"AUDIO"}},"sponStatus":"SPONSOR_DISABLED"}}
101(145) INFO: <script>: HTTP response: 1
101(145) INFO: <script>: cURL Response: No error
101(145) INFO: <script>: Location Header header: <null>
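For reference, the `fDescs` entries in the payload above follow a simple pattern built from the two SDP endpoints. A minimal sketch that reproduces the strings from this log; the function and parameter names are hypothetical and not taken from the P-CSCF config, and 17 is the IP protocol number for UDP.

```python
from typing import List


def build_flow_descriptions(sdp_ip: str, sdp_port: int,
                            answer_ip: str, answer_port: int,
                            protocol: int = 17) -> List[str]:
    """Build the out/in flow description pair for one media sub-component.

    `sdp_ip`/`sdp_port` correspond to the "SDP IP" and "SDP RTP/RTCP Port"
    log lines; `answer_ip`/`answer_port` correspond to the
    "SDP Answer connection Info" line.
    """
    return [
        f"permit out {protocol} from {sdp_ip} {sdp_port} to {answer_ip} {answer_port}",
        f"permit in {protocol} from {answer_ip} {answer_port} to {sdp_ip} {sdp_port}",
    ]


# Values from the log above: sub-component 0 is RTP, sub-component 1 is RTCP.
rtp_flows = build_flow_descriptions("172.22.0.16", 49000, "192.168.101.3", 50042)
rtcp_flows = build_flow_descriptions("172.22.0.16", 49001, "192.168.101.3", 50043)
```

If the stored RTP/RTCP ports are wrong or empty, these strings are where it shows up first, so they are a quick thing to compare against the SDP in a pcap.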

@Rashed97
Author

@NUCLEAR-WAR Thanks, I just resolved it actually, I had a typo in my config file so it wasn't storing the AppSession ID.

I'm observing some weird UE behaviour with the devices disconnecting from the network, so I'm taking a look at that on the UE side now.

@NUCLEAR-WAR

@Rashed97 thanks for the info. Next time, please add a note that you edited the files, so I know it's a custom file and not the one provided by this repo.
By the way, unless you need to add a new case, you don't actually need to edit those files, as they work fine. Are the changes you made in the repo you linked above?

@Rashed97
Author

Rashed97 commented Nov 18, 2024

@NUCLEAR-WAR yes, my configs are in the repo I linked above. I have modified versions of the configs that just have formatting fixes (tabs, whitespace, etc.) and cleaned-up comments, to make them easier to read. I also have PCF_IP and PCSCF_IP in them, along with an added ifdef for WITH_N5. I’ll upload the latest copies shortly.
