LTE KPI Optimization

This is the first part of the LTE KPI Optimization series, and more related articles will be published soon. So, let us get started.

When the UE wants to attach or connect to the network, it needs to set up an RRC connection, as explained in my article LTE Network Entry Steps. But before that, it needs to get synchronized in the uplink. This is done by sending a RACH preamble (Msg1) to the eNB, and the eNB responds with a Random Access Response (RAR, aka Msg2). The UE then sends Msg3, also known as the RRC Connection Request, and this marks the attempt for the RRC Success Rate KPI. This message contains the establishment cause of the connection, and based on that cause the attempts are subdivided into the following major categories:

  • Mo-data : Usually used for UE coming back from idle mode if it has data to send or if it has to make a call
  • Mo-signaling : Most commonly observed for TAUs and Attach messages
  • Mt-access : Idle UE responds to a paging message
  • Emergency
  • High Priority Access

It also contains a UE identity which can be a TMSI value if the UE was already previously attached to LTE and had a TMSI allocation or it can be a random value indicating that the UE does not know about its TMSI or it might be coming from another RAT.

Based on this request, the eNB sends an RRC Connection Setup message, which contains the SRB configuration and some basic radio parameters such as power control and SRI and CQI periodicity.

Once the UE gets the RRC Connection Setup, it applies the configuration in the message and then responds with an RRC Connection Setup Complete message. This message also carries the NAS message if the UE intends to send one.

The eNB pegs the RRC attempt counter when it receives the RRC Connection Request, and the procedure is deemed successful on receipt of the RRC Connection Setup Complete message.
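As a minimal sketch, the KPI is simply successes over attempts. The function and argument names here are my own for illustration, since counter names differ from vendor to vendor, and the sample values are made up.

```python
from typing import Optional


def rrc_setup_success_rate(attempts: int, successes: int) -> Optional[float]:
    """Return the RRC setup success rate in percent.

    attempts  -> number of RRC Connection Requests received by the eNB
    successes -> number of RRC Connection Setup Complete messages received
    Returns None when there were no attempts in the interval.
    """
    if attempts <= 0:
        return None
    return 100.0 * successes / attempts


# Illustrative counter values for one reporting interval (not real data).
print(rrc_setup_success_rate(attempts=2000, successes=1970))
```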

Common Failures In RRC Setup Phase

In order to maintain and optimize the RRC KPI, one should know the major issues that can cause a RRC setup failure.

RRC Setup Failure Due To No Response

This is the most common RRC failure and it is present in every network. Most of the failures in the RRC stage are due to no response from the UE. This means that the eNB receives the RRC Connection Request message from the UE and responds with an RRC Connection Setup message, but does not receive, or is unable to decode, the RRC Connection Setup Complete message.

Now let’s understand why this happens. The RRC Connection Request message is usually around 7 bytes in length, while the RRC Connection Setup Complete message may contain the whole NAS message (like a TAU or Attach Request), and its size can vary from as small as 8 bytes to over 100 bytes. Consider a UE near the cell edge with limited power that sends an RRC Connection Request. Since it is only around 7 bytes, it needs a small number of RBs, so the power per subcarrier will be high. But when it needs to send an RRC Connection Setup Complete of around 100 bytes, it will need a larger number of resources even if the message is fragmented. So, the UE's limited transmit power is spread over more subcarriers and the average power per subcarrier is reduced, leading to a higher probability that the message cannot be decoded at the eNB.
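The effect can be put into numbers with a short sketch. It assumes a UE that is already transmitting at its maximum power of 23 dBm (the usual LTE power class 3 value) and splits that power equally over the allocated RBs. The RB counts are only illustrative.

```python
import math


def power_per_rb_dbm(total_tx_power_dbm: float, num_rbs: int) -> float:
    """Power available per RB when the total power is shared equally."""
    return total_tx_power_dbm - 10.0 * math.log10(num_rbs)


UE_MAX_POWER_DBM = 23.0  # assumption: power-limited cell-edge UE at max power

# Illustrative allocations: a small Msg3 versus a larger Setup Complete.
for num_rbs in (1, 2, 4, 8):
    per_rb = power_per_rb_dbm(UE_MAX_POWER_DBM, num_rbs)
    print(f"{num_rbs} RB(s): {per_rb:.1f} dBm per RB")
```

Every doubling of the allocation costs about 3 dB per RB, which is why the bigger message is harder to decode for a power-limited UE.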

This can also happen if there is interference on the cell, as it makes it even more difficult for the eNB to decode the message. It can also happen if the UE fails to decode the RRC Connection Setup message, in which case it will never send the RRC Connection Setup Complete message.

RRC Rejections

This is the second issue that can happen, but it is usually observed much less in commercial networks than the failures due to no response. In these cases, the eNB rejects the incoming RRC Connection Request by sending an RRC Connection Reject message. This is mostly observed when the eNB experiences congestion and there are not enough resources left to assign to a new user requesting an RRC connection.

If the PUCCH is congested, the RRC connection can be rejected. PUCCH carries HARQ ACK/NACKs, CQI and SRIs. If the PUCCH resources are not available, users will not be able to send CQIs and the eNB cannot schedule without CQI information. Usually vendors implement PUCCH in a way that when PUCCH utilization is increased, the CQI interval is increased. For example, users sending CQI with an interval of 10ms will be shifted towards 40ms in order to increase the capacity of the PUCCH.
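A rough sketch of why a longer CQI interval adds capacity: in FDD with 1 ms subframes, one periodic CQI resource can be time-shared by as many users as there are subframes in the reporting period. The function name and the figure of 12 CQI resources per subframe are assumptions for illustration, not vendor values.

```python
def cqi_reporting_capacity(cqi_resources_per_subframe: int, cqi_period_ms: int) -> int:
    """Users that can be given a periodic CQI slot.

    Assumes FDD (1 ms subframes) and one report per user per period,
    so each resource is reused by a different user in every subframe
    of the period.
    """
    return cqi_resources_per_subframe * cqi_period_ms


RESOURCES = 12  # assumption: illustrative number of CQI resources per subframe

print(cqi_reporting_capacity(RESOURCES, cqi_period_ms=10))  # shorter interval
print(cqi_reporting_capacity(RESOURCES, cqi_period_ms=40))  # longer interval
```

Moving from 10 ms to 40 ms gives four times as many reporting slots on the same PUCCH resources, at the cost of less frequent channel feedback.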

But when no further capacity is available, the eNB needs to put a limit on new incoming connections, resulting in RRC Rejections. Similarly, RRC Rejections can be seen if the active UE count increases beyond the board limit or if the CAPS exceed the limit. The details related to troubleshooting and optimization for such issues are given below.

Optimization For RRC Success Rate KPI

The following procedures are usually used based on different scenarios

Conventional Method : Physical Optimization

The easiest and most conventional method is physical optimization. For instance, down-tilting a cell will reduce the coverage and remove the far-away users. This will reduce the probability of RRC failure due to no response. However, there will be issues that cannot be resolved by the conventional approach, so I have listed other methods that might come in handy.

Relevant Timers

There are two relevant timers for RRC Success Rate KPI.

The first timer is maintained on the UE and it is the famous T300. The UE starts it after sending the RRC Connection Request and stops it on receipt of the RRC Connection Setup or Reject message. If this timer is too small, the UE will stop waiting for the RRC Connection Setup message and the RRC procedure will fail. So, increasing this timer can help in this phase.

Secondly, the eNB has an internal timer (different vendors have different names for it) which the eNB starts after sending the RRC Connection Setup message. It stops this timer after successfully receiving the RRC Connection Setup Complete message. So, if this timer is small and the UE is still trying to deliver the RRC Connection Setup Complete with retransmissions, the eNB will consider it a failure as soon as the timer expires. So, increasing this timer might also help in certain scenarios.
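To see how retransmissions can eat into the eNB-side timer, here is a deliberately simplified toy model. It assumes FDD uplink HARQ with an 8 ms round trip, fragments sent one after another, and the same number of transmissions for every fragment. The function names, the grant delay and all the values are hypothetical, and real schedulers behave in more complex ways.

```python
HARQ_RTT_MS = 8  # FDD uplink HARQ round-trip time


def delivery_time_ms(fragments: int, transmissions_per_fragment: int,
                     grant_delay_ms: int = 0) -> int:
    """Rough time to deliver the Setup Complete (toy model, see assumptions)."""
    return grant_delay_ms + fragments * transmissions_per_fragment * HARQ_RTT_MS


def setup_times_out(timer_ms: int, fragments: int,
                    transmissions_per_fragment: int,
                    grant_delay_ms: int = 0) -> bool:
    """True if the eNB timer would expire before delivery completes."""
    needed = delivery_time_ms(fragments, transmissions_per_fragment, grant_delay_ms)
    return needed > timer_ms


# Hypothetical cell-edge UE: 3 fragments, 4 HARQ transmissions each,
# plus 20 ms waiting for uplink grants.
print(delivery_time_ms(3, 4, grant_delay_ms=20))
print(setup_times_out(100, 3, 4, grant_delay_ms=20))
print(setup_times_out(200, 3, 4, grant_delay_ms=20))
```

In this example the same UE fails with a 100 ms timer but succeeds with a 200 ms one, which is the kind of case where extending the timer helps.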

Coverage Enhancement & Power Control

RRC failures due to lack of response from the UE can also occur if the power control on the PUSCH is not correct or is too conservative. For instance, the power control on PUSCH depends on the P0 Nominal value as well as the Alpha factor. Different vendors use different settings here, such as a low P0 Nominal value (for example -100 dBm) with a higher Alpha factor of around 0.9 or 1, or a high P0 Nominal value (for example -70 dBm) with a smaller Alpha factor of 0.7 or 0.8. But if both the P0 Nominal and the Alpha factor are low, the UE will use a lower power to send the RRC Connection Setup Complete and therefore, the chances are that it will not be decoded correctly.
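The trade-off can be checked with the open-loop part of the PUSCH power control formula from 3GPP TS 36.213. This sketch leaves out the closed-loop correction and the MCS-dependent offset, and assumes the UE-specific P0 offset is zero. The path loss of 120 dB and the 4 RB allocation are illustrative values.

```python
import math

P_CMAX_DBM = 23.0  # maximum UE transmit power (power class 3)


def pusch_open_loop_power_dbm(p0_nominal_dbm: float, alpha: float,
                              path_loss_db: float, num_rbs: int) -> float:
    """Open-loop PUSCH power: min(Pcmax, 10*log10(M) + P0 + alpha * PL)."""
    wanted = 10.0 * math.log10(num_rbs) + p0_nominal_dbm + alpha * path_loss_db
    return min(P_CMAX_DBM, wanted)


PATH_LOSS_DB = 120.0  # assumption: cell-edge UE
NUM_RBS = 4           # assumption: allocation for the Setup Complete

for p0, alpha in ((-100.0, 1.0), (-70.0, 0.7), (-100.0, 0.7)):
    power = pusch_open_loop_power_dbm(p0, alpha, PATH_LOSS_DB, NUM_RBS)
    print(f"P0={p0} dBm, alpha={alpha}: {power:.1f} dBm")
```

With these inputs the first two settings keep the UE at or near its maximum power, while the low P0 with low Alpha combination leaves it around 30 dB lower.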

In case there is interference on the cell, then features which mitigate interference should be enabled. For instance, enabling Interference Rejection Combining can provide good gains in such scenarios.

Mobile Originating Signalling RRC Success Rate

Usually the Mo-Signalling RRC Success Rate is lower than the others. The reason is once again linked to the size of Msg5 (RRC Connection Setup Complete). For a normal Mo-data or Mt-access request, the size of the RRC Connection Setup Complete message is around 8 to 10 bytes, but for Mo-signalling it can vary and is usually above 50 bytes. This is because a Mo-signalling RRC Request is usually used for NAS signalling messages like Attach Request or Tracking Area Update Request. These messages are big and are carried inside the RRC Connection Setup Complete message as the NAS payload. So, this reduces the RRC Success Rate of Mo-Signalling compared to other RRC Request types.

This means that if the network has a higher ratio of RRC Mo-Signalling requests then it will have a lower RRC Success Rate. Usually, Mo-Signalling is around 20 to 25% while Mo-data has the highest percentage. Still it can vary from network to network based on TAC planning and mobility strategy. However, if you have very high Mo-Signalling percentage then the chances are that RRC Success Rate will be relatively lower compared to another similar network with lower Mo-Signalling percentage.
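The impact of the cause mix is just a weighted average, as the following sketch shows. The shares and per-cause success rates below are invented to illustrate the point and are not measurements from any network.

```python
from typing import Dict, Tuple


def overall_success_rate(mix: Dict[str, Tuple[float, float]]) -> float:
    """Weighted RRC success rate in percent.

    mix maps an establishment cause to (share_of_attempts, success_rate_percent).
    """
    total_share = sum(share for share, _ in mix.values())
    weighted = sum(share * rate for share, rate in mix.values())
    return weighted / total_share


# Hypothetical networks with identical per-cause performance
# but a different share of Mo-Signalling attempts.
network_a = {"mo-data": (0.60, 99.5), "mt-access": (0.18, 99.5), "mo-signalling": (0.22, 98.0)}
network_b = {"mo-data": (0.45, 99.5), "mt-access": (0.15, 99.5), "mo-signalling": (0.40, 98.0)}

print(round(overall_success_rate(network_a), 2))
print(round(overall_success_rate(network_b), 2))
```

Network B ends up with the lower overall KPI purely because of its cause mix, so it helps to compare networks per cause rather than only on the aggregate.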

Incompatible UEs

It has been seen that sometimes there are UEs that are not compatible with the configuration of the network. Once they receive the RRC Connection Setup message and find that they are not compatible with the configuration provided, they do not respond with an RRC Connection Setup Complete message, resulting in an RRC failure on the eNB. However, such UEs keep trying again and again, impacting the KPI. This kind of issue can be seen in the traces or CHRs, which can verify that it is a single user. It might be inferred from the RRC counters as well, since the number of failures is roughly the same in consecutive intervals. However, such cases usually go unsolved as it is not a network issue but an abnormal UE problem.
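If the trace or CHR records can be exported, a quick check for a single misbehaving UE might look like the sketch below. The record layout (`ue_id`, `result`), the `no_reply` label and the 50% threshold are all hypothetical, since real trace formats are vendor specific.

```python
from collections import Counter
from typing import Dict, List, Optional


def dominant_failing_ue(records: List[Dict[str, str]],
                        min_share: float = 0.5) -> Optional[str]:
    """Return the UE identity behind most no-reply failures, if any.

    A UE is reported only when it accounts for at least min_share
    of all no-reply failures in the records.
    """
    failures = Counter(r["ue_id"] for r in records if r["result"] == "no_reply")
    total = sum(failures.values())
    if total == 0:
        return None
    ue_id, count = failures.most_common(1)[0]
    return ue_id if count / total >= min_share else None


# Hypothetical records exported from a trace.
sample = [
    {"ue_id": "tmsi-A", "result": "no_reply"},
    {"ue_id": "tmsi-A", "result": "no_reply"},
    {"ue_id": "tmsi-A", "result": "no_reply"},
    {"ue_id": "tmsi-B", "result": "no_reply"},
    {"ue_id": "tmsi-C", "result": "success"},
]
print(dominant_failing_ue(sample))
```

This only works when the UE reports the same identity on each attempt. If the requests carry a random value instead of a TMSI, other hints are needed to tie the attempts together.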

PUCCH Based RRC Rejections

RRC Rejections due to PUCCH congestion can be solved by simply increasing the PUCCH Resource Blocks. Vendors have parameters for PUCCH allocation, and the minimum PUCCH Resource Block allocation is 4 per subframe. This is because each slot has PUCCH RB allocation on both ends of the band, which means that each slot will have at least 2 Resource Blocks for PUCCH: one at the top of the frequency band and the other at the bottom. Since each subframe has two slots, the subframe will have at least 4 PUCCH Resource Blocks.

When 4 PUCCH RBs are not enough, they can be expanded to a higher value using a parameter or, in some implementations, an adaptive approach can be used where the eNB changes the PUCCH RB count dynamically based on the load. This approach usually resolves the issue.

User Count Based or Flow Control Based RRC Rejections

Different baseband boards and vendors have different limits on active user count and CAPS (Call Attempts Per Second). When such a limit is reached, incoming RRC Connection Requests are rejected by the eNB due to flow control or lack of resources. In such cases, the following basic steps can be taken:

  • Decrease the UE Inactivity Timer to a smaller value. This will release users earlier, and the load due to user count will be reduced. However, this can increase the signalling load, as idle users will try to come back to the network more frequently, which can increase the CPU usage of the eNB. So, only use this if the issue is related to the user limit while CPU usage is fine.
  • T302 should be increased to limit the RRC signalling load. When a UE gets an RRC Connection Reject from the eNB, it has to wait for T302 seconds before sending another RRC Connection Request. So, increasing T302 will increase the interval between such RRC Connection Requests and therefore reduce the signalling load on the eNB.
  • Mobility Load Balancing is another feature that can help in such a scenario by moving users away from the congested carrier to another less utilized carrier.
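For the T302 step above, a back-of-the-envelope sketch shows how the wait time bounds the retry load. It assumes every rejected UE retries as soon as T302 expires, which is the worst case. The UE count is made up, and the T302 values are within the 1 to 16 second range that the reject message can signal.

```python
def max_retry_rate_per_second(rejected_ues: int, t302_seconds: int) -> float:
    """Upper bound on repeated RRC Connection Requests per second.

    Each rejected UE may retry at most once every T302 seconds.
    """
    return rejected_ues / t302_seconds


REJECTED_UES = 200  # assumption: UEs currently being rejected by the cell

for t302 in (1, 4, 16):
    rate = max_retry_rate_per_second(REJECTED_UES, t302)
    print(f"T302={t302} s: up to {rate} retries per second")
```

The downside is that a rejected user also waits longer before getting service, so a larger T302 trades access delay for lower signalling load.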

If you have any questions or feedback regarding this article, simply drop a comment below. I will respond accordingly and also intend to write more about KPI Optimization soon.

Ali Khalid

Head Of Technical Strategy (MVNO - 4G, 3G, VoLTE, 5G)
Ali Khalid is a Senior LTE & VoLTE RNPO Expert and has successfully delivered a number of LTE RNPO Projects in different regions across the globe including Pakistan, Bahrain, UAE, Qatar, Oman, Nigeria and Turkey. In case of any questions or feedback, please feel free to drop a comment below or connect with him on LinkedIn.

93 thoughts on “LTE KPI Optimization Deep Dive: RRC Success Rate”

  1. Excellent information and very useful with great explanation which covers whole subject. Thanks Ali Bai. Looking forward for more.

        1. Such cells are difficult to tackle as they have pure coverage constraints. Sometimes, making aggressive changes to the cell selection criteria and making it easier to reselect can help the KPIs.

  2. Thanks a lot for very informative post which is very easy to understand !!! Request you to keep posting on another LTE KPIs as well. 🙂

  3. Thanks for sharing great info !!
    For vendor specific part if rejection due to number of users can we increase maxnumrrc parameter instead of changing inactivity timer

  4. Hi Ali,

    I’m just curious about this part “eNB has an internal timer (different vendors have different names for it)”
    Such as what parameter name is that? Would you like let me know from part vendor Ericsson, Nokia or Huawei?
    Thank you 🙂

  5. “it can be a random value indicating that the UE does not know about its TMSI or it might be coming from another RAT.”

    Are you referring to the random value which appears in CHR logs when analyzed via eFMA.
    Huawei Chinese expert said is likely that it is single UE issue and this UE is accessing the network for the first time

    1. No, when a UE tries to access the LTE network, it has to add an identity in the RRC Request. This is usually TMSI but when the UE is not active for a long time or the UE moves to another RAT and then comes back again after some time, the UE sends a random value in the RRC Request. Then the EPC assigns a new TMSI to this UE. This does not have to be a UE that accesses the network for the first time. Usually you can see this in the Uu or RRC layer logs.

  6. So all possibilities are there
    So “Random Value” under TMSI header in CHR logs analysis via eFMA corresponds to the same “Random Value” you mentioned in your article and it is likely it is single UE issue

    1. Whenever i checked CHR logs , either there is one TMSI or Random Value under which all RRC Setup No Reply (Mo-SIg) Failures are pegging
      So i just need to confirm it

        1. If it is the same TMSI then it is definitely a single UE issue. But if you do not have a TMSI and the logs show a Random Value then it may or may not be the same UE. You may need to look at the TA information or FGI if that is available.

          1. So if for example TA Index 3 samples are increased from the same time (RRC Failures) then it will point to single UE issue?
            And reason would be ?

            Secondly FGI ?

          1. Another query i have regarding TMSI
            For example following TMSI are appearing
            FF 3G 81 D7
            FF 5K 81 D7
            FF 6O 81 D7
            Does it correspond to same UE
            Plus do you have anything how TMSI is created , any common thing which may hint towards Single UE TMSI

          2. Yes , vendor is Huawei
            I have witnessed cases in CHR logs where TMSI is similar to e.g FF XX 81 D7 (where XX is random) and increase in TA Index Y has increased mapping with KPI degradation

  7. Nice Explanation
    for the T300 timer, what is the expected disadvantage for increasing it?
    also what is the name of eNB internal timer for Huawei?

    1. FF4FFE9A FGI is observed to be Faulty UE but can it be of any other handset
      Can we say FGI is unique wrt one specific handset

  8. Hello Ali , Nice Article.

    When we increased T300 from 200 ms to 800 ms, we observed that RRC SR improved (as expected), but there was also degradation in RA SR. Do you have any clue?

  9. i need a solution to improve the Erab Failures in LTE , what parameter retune will improve it in Ericsson system pls help on it

  10. Nice article.
    Could you please share troubleshooting steps for 3G hsupa and hsdpa alarms and RRCSR Issues in nokia tool. Like from BTS manager.

  11. Nice article.
    Could you please share troubleshooting steps for 3G hsupa and hsdpa alarms and RRCSR Issues in nokia flexi Bts .

  12. Excellent notes.. Sir.. Really helping me to know about kpi troubleshooting.. Thanks sir..
    Sir if possible could you Plz share some knowledge on handover in lte..

  13. Thank you sir for your valuable information, which gave me good knowledge in lte kpi troubleshooting… I would like to know more on CSFB, SRVCC, volte KPI troubleshooting tips if possible.

  14. Hi Ali,
    Can you explain more about T302?
    Because in my network T300 is the maximum. Therefore only T302 can optimize. How many seconds we can put for maximum for T302? Is there any side effects after changing? Thanks very much and looking forward to your reply.

  15. Assalamaulaikum wa rahamthullahi wa barkathuhu brother ALI masha allah Excellent articles,Easy to understand

    May allah(swt) increase your knowledge in deen and duniya. AAMEEN

    jazakallahukhairan

    *Mubeen Hussain Mohammad* From INDIA(Hyderabad)
