Author's Quick Intro:
Praveen Batta has 16 years of experience across technology stacks including Kubernetes, VMware cloud with SDN integrations (VMware NSX, Juniper Contrail), Linux, storage, and OpenStack cloud. Praveen has been serving the telecom domain for the last 7 years.
This article gives an overall idea of the considerations to be taken into account to improve the performance of an NFVi infrastructure.
It assumes the reader is well versed in NFVi architecture, and it should complement the blog below.
Telco workloads are sensitive in terms of latency, jitter, throughput, and packet drops. This demands high bandwidth, QoS, performance enablers, and tuning in every area the packet traverses.
To make your environment ready for NFVi traffic, be prepared with the required tools and plans.
First and foremost, study and understand the traffic in the environment. Depending on the applications running in the infrastructure, different types of traffic will evolve, and the infrastructure resources need to be catered accordingly.
Signaling traffic – control traffic that is generated when communication starts from the user end device, and again when the communication terminates.
Signal-processing VNFs are highly CPU intensive.
Data traffic – the traffic in which the actual packets carrying user data, such as voice and files, are generated and travel.
Data-plane workloads are memory intensive.
Accordingly, we need to plan our infrastructure to cater to both types of traffic.
Note: This article does not cover network tuning parameters; it covers the compute (hypervisor) area.
Be prepared to tune the infrastructure. Below are some of the areas that need to be understood to design any Telco infrastructure.
There are two ways to prioritize CPU assignment to VNFs: first, at the hypervisor level with CPU reservation; second, by allocating dedicated CPU cores to a service or a VM (VNF).
CPU reservation: a parameter that can be set on the hypervisor to reserve CPU capacity for the respective VNF.
CPU pinning: a core concept linked with the base OS CPU scheduling. With this feature, we dedicate CPU cores to a service (in our scenario, a VNF) running on the hypervisor OS.
Analyze the VNF functionality and its traffic. If you are a VNF vendor, test and validate CPU-intensive VNF traffic with both CPU pinning and CPU reservation, and then take a decision.
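As a sketch of what these two options look like in practice on an OpenStack/KVM based NFVi (the flavor name, guest name, and core numbers below are placeholders, not vendor recommendations):

```shell
# Hypothetical flavor for a CPU-intensive signaling VNF.
# hw:cpu_policy=dedicated pins each vCPU to its own host core;
# hw:cpu_thread_policy=isolate keeps SMT siblings away from the guest.
openstack flavor set signaling-vnf \
  --property hw:cpu_policy=dedicated \
  --property hw:cpu_thread_policy=isolate

# KVM/libvirt equivalent: pin vCPU 0 of a running guest to host core 4.
virsh vcpupin my-vnf 0 4
```

On a VMware-based NFVi, the analogous knob is the per-VM CPU reservation configured in vSphere.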
Memory reservation: This guarantees the memory allocation to the VM (VNF) and helps the hypervisor prioritize the VMs when it is allocating memory.
We all know that the CPU needs memory to store data temporarily and process it. When it stores the data, performance mainly depends on the location of the memory and the bus between CPU and memory.
Using centralized memory with a single shared bus between CPUs may not be a good option for optimum performance.
When CPU, RAM, and bus deliver performance together, the VNF (VM) gets the better advantage. Hence the concept called NUMA – Non-Uniform Memory Access. With a NUMA configuration, the CPU uses the memory that is local to it, rather than memory in a central location or on a different socket. This boosts the performance of the VNF, which in turn complements the Telco deliverables.
See the picture below for understanding:
This also boils down to the entire set of resources for a VNF, i.e., allocate the networks from the same NUMA node as well to get optimum performance.
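To illustrate, assuming a Linux hypervisor with `numactl` installed, the host topology can be inspected and a VNF constrained to a single NUMA node roughly as follows (the flavor name is a placeholder):

```shell
# Show how many NUMA nodes the host has and which CPUs and memory
# ranges belong to each node.
numactl --hardware

# Hypothetical OpenStack flavor property asking the scheduler to place
# all of the guest's vCPUs and RAM on a single NUMA node.
openstack flavor set dataplane-vnf --property hw:numa_nodes=1
```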
Huge pages: This performance enabler helps memory-centric applications that send huge amounts of data. The default memory page size is 4K; configuring a 1G page size will help memory-intensive VNFs.
Also remember that configuring huge pages on a system that doesn't need them may impact performance negatively.
Plan in such a way that memory-hungry VNFs are placed on hypervisors configured with huge pages.
Proper testing is recommended before deciding.
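As a minimal sketch on an OpenStack/KVM hypervisor (the page counts and flavor name are illustrative assumptions):

```shell
# Reserve 1G huge pages at boot via the kernel command line, e.g.:
#   default_hugepagesz=1G hugepagesz=1G hugepages=64
# Then verify what the host currently provides:
grep Huge /proc/meminfo

# Hypothetical flavor property asking Nova to back the guest RAM with 1G pages.
openstack flavor set dataplane-vnf --property hw:mem_page_size=1GB
```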
DPDK – Data Plane Development Kit:
This performance enabler makes transaction-intensive VNFs/CNFs more efficient through a NIC-level performance improvement.
In a traditional operating system, network drivers and libraries are managed in kernel space, and applications running in user space have to depend on kernel space.
With DPDK, VNF/CNF traffic is sent directly to the network cards from user space. This removes the burden from kernel space, which takes care of the other OS processes.
Examples: DU node, NSX Edge node, Ericsson vEPG.
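As a sketch, handing a NIC over to a user-space poll-mode driver typically looks like this (the PCI address is a placeholder for your data-plane NIC):

```shell
# Load the vfio-pci driver and bind the NIC to it so a DPDK application
# can own the device from user space, bypassing the kernel network stack.
modprobe vfio-pci
dpdk-devbind.py --bind=vfio-pci 0000:03:00.0

# Confirm the NIC now shows up under the DPDK-compatible drivers.
dpdk-devbind.py --status
```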
SR-IOV / Passthrough: For latency-sensitive applications, configure SR-IOV or PCI passthrough to get optimum performance. SR-IOV stands for Single Root I/O Virtualization. It helps provide guaranteed bandwidth, which is needed for some VNFs and CNFs.
The physical NIC, technically called the PNIC, can be divided into virtual functions (VFs), and the bandwidth is divided among the number of VFs.
With this, we bypass the virtual switch and share the bandwidth only with the desired NFs, with guaranteed bandwidth.
Use cases for SR-IOV:
- Mavenir uses SR-IOV and passthrough for DU and CU in cell sites with MTU of 9000.
- Altiostar CU uses SR-IOV; Altiostar DU uses PCI passthrough for PTP.
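For illustration, on a Linux hypervisor the virtual functions are carved out of the physical NIC through sysfs (the interface name and VF count are example values):

```shell
# Create 8 virtual functions on the physical function ens1f0.
echo 8 > /sys/class/net/ens1f0/device/sriov_numvfs

# The VFs now appear as separate PCI devices and are listed against the PF.
ip link show ens1f0
```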
Also consider jumbo frames and MTU configuration when you are planning for Telco traffic in an NFVi environment.
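A quick sketch of enabling and verifying jumbo frames (the interface name and target IP are placeholders; every hop in the path must support the MTU):

```shell
# Raise the interface MTU to 9000 bytes.
ip link set dev ens1f0 mtu 9000

# Verify the path end to end: 8972 bytes of payload + 8 ICMP + 20 IP = 9000,
# and -M do forbids fragmentation, so the ping fails if any hop is smaller.
ping -c 3 -M do -s 8972 198.51.100.10
```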
Some VNFs/CNFs may demand greater IOPS. Make sure the required IOPS can be served by the provided storage.
For example, the Nokia AAA VNF demands a minimum 7200 RPM SSD disk type, and it works with concurrent IOPS rather than sequential IOPS.
Understand the storage requirements of the NFs and plan the disk types for the storage accordingly. Flash drives give much better IOPS than SATA drives, though they are a bit more costly; plan the disk groups and place the NFs accordingly.
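Before placing an NF, the concurrent (random) versus sequential IOPS of a candidate datastore can be measured with a tool such as `fio`; a sketch under arbitrary sizes and runtimes, to be pointed at a scratch file, never at live VNF data:

```shell
# Random 4K reads approximate concurrent-IOPS behaviour.
fio --name=randread --rw=randread --bs=4k --size=1G \
    --ioengine=libaio --iodepth=32 --direct=1 --runtime=30 --time_based

# Sequential 4K reads for comparison.
fio --name=seqread --rw=read --bs=4k --size=1G \
    --ioengine=libaio --iodepth=32 --direct=1 --runtime=30 --time_based
```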
Hope this article gives an overall idea of the key performance areas for an NFVi infrastructure.