Last month, Google, together with the Linux Foundation, launched a new open source community called Nephio. The purpose of this project is to provide cloud-native automation driven by Kubernetes.
The announcement blog post stated that network operations still lack the true cloud-native automation needed for scale, efficiency, and high reliability. For the last few years, simply containerizing network functions and workloads and running them on cloud and edge cloud has not helped. In fact, shifting workloads from virtualized environments to containers can actually increase costs due to additional infrastructure, new operating models, and a more fragmented ecosystem.
In this post, let's find out where things have gone wrong in adopting cloud-native technologies for large-scale enterprise and telecom infrastructure.
Four years ago, I wrote about how containers can be used to host VNFs and enable cloud-native networks. Things are now progressing rapidly, and containerization of network functions, along with orchestration using Kubernetes, has become a key focus. Telco operators like Dish and Rakuten have successfully architected, designed, built, and delivered cloud-native 5G networks with end-to-end orchestration and network deployment automation.
In adopting cloud-native methodologies, telco operators are looking to decrease the time to launch new services, improve customer service, and keep the OPEX of delivering the network optimized. Additionally, telcos are looking to increase operational agility by automating network deployments along with cloud infrastructure.
But Google's post points to some key pitfalls that keep cloud-native adoption from achieving its promised goals, especially for telecom operators and large-scale enterprise infrastructures. Let us understand where things are going wrong.
Problems with cloud-native adoption
Cost: When evaluating how to build a programmable, high-performing 5G network, telco operators never lose sight of cost control. To provide the best network services to subscribers, they are already evaluating moving their workloads to hyperscalers and leveraging the hyperscalers' existing solutions. But going cloud-native requires a complete transformation in how network functions and services are built. Operators are unsure about the CAPEX and OPEX required to achieve scalability and flexibility, keeping ever-growing usage in mind.
Skills: If we are honest and look around, we are struggling to educate existing staff and develop skills grounded in the basics of cloud-native architecture: how containers can be implemented along with microservices, how Kubernetes can be used to handle multiple clusters, why service meshes need to be implemented, how CNFs can be developed to run in any environment, and so on. There is a definite lack of shared understanding of how cloud-native automation can be achieved.
Security: Adopting cloud-native methodologies and using containers to host network functions puts the emphasis on securing the software-based core and programmable network deployment. Telcos are unsure about which security practices to follow. The reason is that cloud-native deployments in 5G networks can be susceptible to attacks at several points. For example, malicious applications might be introduced, virtual networking components or core clouds might be misconfigured, or the administrative portal might be attacked directly by exploiting vulnerabilities. Telcos with a cloud-native architecture may find it difficult to hunt down attackers because of the sheer number of devices generating huge amounts of data. Also, a cloud-native telco core may have a huge number of Kubernetes clusters hosting 10,000 containers for network functions and applications. Implementing monitoring and detecting glitches in real time is challenging in complex telco networks.
CNF-related issues
- CNFs are the core of any modern network that is transforming. A CNF is decomposed into multiple containers in a microservices architecture. But at this point, CNF components do not yet comply with standards that enable interoperability between CNFs from different vendors. This interoperability is important because it allows infrastructure to be built from a mix-and-match set of CNFs rather than from a single vendor's, and, importantly, it supports API calls for interacting with the underlying infrastructure.
- Telco infrastructure vendors have still not completed the transformation from VNFs to CNFs. Automation tools used in CI/CD pipelines still work on legacy VNFs. This is putting the brakes on cloud-native automation.
- Other issues with CNFs are the lack of declarative configurations from individual vendors and open questions about the security principles followed while designing CNFs.
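To make the point about declarative configuration concrete, here is a minimal sketch of what it buys an operator: the desired state is plain data, and a generic routine works out the drift and converges on it. The field names (replicas, upf, mtu, dpdk, node) are hypothetical and not taken from any vendor's CNF or from Nephio.

```python
import copy
from typing import Any, Dict, List, Tuple

Config = Dict[str, Any]
Path = Tuple[str, ...]


def diff_config(desired: Config, observed: Config, prefix: Path = ()) -> List[Tuple[Path, Any, Any]]:
    """Return (path, observed_value, desired_value) for every drifting field."""
    drift: List[Tuple[Path, Any, Any]] = []
    for key, want in desired.items():
        path = prefix + (key,)
        have = observed.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            # Recurse so that only the leaf fields that differ are reported.
            drift.extend(diff_config(want, have, path))
        elif have != want:
            drift.append((path, have, want))
    return drift


def reconcile(desired: Config, observed: Config) -> Config:
    """Return a copy of the observed state with every drifting field corrected."""
    result = copy.deepcopy(observed)
    for path, _have, want in diff_config(desired, observed):
        node = result
        for key in path[:-1]:
            node = node[key]
        node[path[-1]] = copy.deepcopy(want)
    return result


# Hypothetical desired state for a user plane function, and what is running.
desired = {"replicas": 3, "upf": {"mtu": 9000, "dpdk": True}}
observed = {"replicas": 2, "upf": {"mtu": 1500, "dpdk": True}, "node": "edge-1"}

converged = reconcile(desired, observed)
```

Because the operator only states the end result, running the same reconcile step again on the converged state finds nothing left to change. An imperative, vendor-specific script gives no such guarantee, which is why missing declarative configurations hold back automation.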
Kubernetes-related issues
Kubernetes hosts telecom network functions within containers and handles the compute, networking, and storage requirements of containers and pods. But CNFs need more than this: pod extensions such as Multus, SR-IOV support, and DPDK, and node configuration such as VLAN membership, CPU isolation, huge pages, and an RT kernel. Also, network functions and cloud infrastructure components have more complex lifecycle management (LCM) requirements that Kubernetes falls short of meeting.
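As a rough illustration of those extra requirements, the sketch below builds a pod manifest as a plain Python dictionary. The Multus annotation key and the hugepages-1Gi resource name are the real ones; the image, network names, SR-IOV resource name (these are defined by whoever configures the device plugin), and node labels are all hypothetical. It also assumes one SR-IOV virtual function per attached network, which is a simplification.

```python
from typing import Any, Dict, List

# Annotation key that Multus reads to attach secondary networks to a pod.
MULTUS_NETWORKS_ANNOTATION = "k8s.v1.cni.cncf.io/networks"


def build_cnf_pod(name: str, image: str, networks: List[str], sriov_resource: str,
                  cpus: int, memory_gi: int, hugepages_gi: int) -> Dict[str, Any]:
    """Build a pod manifest carrying the extras a data-plane CNF typically needs."""
    resources = {
        "cpu": str(cpus),                       # whole CPUs, needed for exclusive pinning
        "memory": f"{memory_gi}Gi",
        "hugepages-1Gi": f"{hugepages_gi}Gi",   # pre-allocated huge pages, e.g. for DPDK
        sriov_resource: str(len(networks)),     # assumption: one VF per attached network
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "annotations": {MULTUS_NETWORKS_ANNOTATION: ",".join(networks)},
        },
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # Requests equal to limits puts the pod in the Guaranteed QoS class;
                # Kubernetes also requires this equality for huge pages.
                "resources": {"requests": dict(resources), "limits": dict(resources)},
            }],
        },
    }


def missing_node_features(node_labels: Dict[str, str], required: Dict[str, str]) -> List[str]:
    """List the required node labels (hypothetical names) that a node does not satisfy."""
    return sorted(key for key, value in required.items() if node_labels.get(key) != value)


pod = build_cnf_pod("upf", "registry.example.com/upf:1.0", ["n3-net", "n6-net"],
                    "example.com/sriov_vf", cpus=4, memory_gi=8, hugepages_gi=2)
gaps = missing_node_features(
    {"example.com/rt-kernel": "true"},
    {"example.com/rt-kernel": "true", "example.com/vlan-100": "true"},
)
```

Notice that none of the node-level items (VLAN membership, RT kernel, CPU isolation) can be expressed in the pod itself. The manifest can only request what the node already offers, so something outside plain Kubernetes has to prepare the nodes and track which ones qualify.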
The basic goals of digital transformation, namely agility in delivering services and an improved customer experience while keeping costs under control, have not yet been achieved by implementing cloud-native principles. But with collaborative efforts and managed carrier-grade layers, cloud-native is going to become mainstream for telecom operations. We will see intent-based, zero-touch network automation in which a cloud-native technology stack boosts innovation. I hope to see widespread discussion of these issues and their solutions at an upcoming ContainerDays event.