Modern Analytics: How to Make Sense of Mountains of Data

Unlike previous generations of mobile technologies, 5G uses a service-oriented architecture built with cloud native technologies and microservices methodologies, delivering an open infrastructure that runs on standard compute platforms to scale efficiently and readily embrace new features. Hosted on private clouds, public clouds or a hybrid of both, 5G networks are horizontally integrated and scaled, comprising multiple independent but interoperating network functions, each serving a specific purpose. These functions are then deployed on a decoupled multi-layer hardware and software stack in which each layer has its own event and alarm subsystem. Combine these, and it becomes clear why telcos will increasingly struggle to rapidly isolate the source of outages or identify and diagnose conditions that may adversely affect the overall customer experience.

While distinct network analytics toolsets and the various applications monitoring other layers of the underlying platform can detect specific issues, correlating this massive amount of information to determine cause and effect demands a new approach. With the power of modern cloud services, we can redefine the role of network analysis in network operator infrastructures. First, however, we must collect and store this data.

From user devices to core infrastructure and applications, network operators generate massive amounts of data daily, adding complexity to the task of swiftly processing, economically storing and efficiently analyzing information to garner the required insights.

Public clouds provide the solution

Leveraging public cloud offerings is crucial for overcoming these challenges, as they streamline data aggregation and analysis while enabling operators to promptly detect and respond to irregularities or opportunities. Azure stands out in this domain, offering a comprehensive suite of services, including storage solutions, machine learning capabilities, business intelligence tools and automation resources.

For instance, Microsoft Azure Data Lake is well suited to storing the very high volumes of diverse logs, events, alarms and telemetry data produced by the platforms and network functions that make up both traditional and modern communication services. Supporting structured, semi-structured and unstructured data, Azure Data Lake works with lakehouse implementations like Azure Databricks, which serves as a mediation layer that provides reliable, high-performance handling of both batch and streaming information. Microsoft Fabric is a platform that can provide a complete solution, integrating data lake, data engineering and data integration capabilities from Power BI, Azure Synapse and Azure Data Factory into a single SaaS offering.

Gaining new network insights

Once the data has been acquired, the next step is to adopt a platform flexible and powerful enough to correlate and analyze it, proactively identifying potential issues and rapidly pinpointing the root cause of failures. Microsoft Azure Operator Insights (AOI) is an example of such a platform, enabling not only error detection but also the extraction of actionable business intelligence. This can result in more efficient and cost-effective operation of network infrastructures, coupled with the ability to tune marketing strategies towards reducing churn or increasing monetization opportunities through the introduction of new products and services.

As communications service providers are huge entities comprising numerous different areas of responsibility, AOI is built on the foundation of a data mesh that decentralizes ownership of information based on the domain where it was generated. The next step is to normalize the data through a transformation process that includes the application of ingestion agents, validation engines and data processing. Once complete, the output data is ready for analysis and visualization. Azure Data Explorer (ADX) and Microsoft Power BI are two examples of services that can rapidly interrogate numerous data repositories simultaneously across multiple dimensions to present easily digestible comparative tables, charts and graphs.
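To make the normalization step more concrete, the following is a minimal sketch of validating raw records from different domains and mapping them to one common schema. It does not represent AOI's actual ingestion agents or data formats: the `NormalizedEvent` schema, the field names (`severity`, `ts`, `source`, `msg`) and the severity levels are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical severity levels; real feeds will define their own.
SEVERITIES = ("info", "minor", "major", "critical")


@dataclass(frozen=True)
class NormalizedEvent:
    domain: str          # owning domain in the data mesh, e.g. "ran" or "core"
    source: str          # network function or platform that emitted the record
    severity: str        # one of SEVERITIES
    timestamp: datetime  # always stored in UTC
    message: str


def normalize(domain: str, raw: dict) -> Optional[NormalizedEvent]:
    """Validate one raw record and map it to the common schema.

    Returns None when the record fails validation, so the caller can
    count or quarantine rejected records.
    """
    try:
        severity = str(raw["severity"]).lower()
        if severity not in SEVERITIES:
            return None
        # Assumption: timestamps arrive as epoch seconds.
        ts = datetime.fromtimestamp(float(raw["ts"]), tz=timezone.utc)
        return NormalizedEvent(
            domain=domain,
            source=str(raw["source"]),
            severity=severity,
            timestamp=ts,
            message=str(raw.get("msg", "")),
        )
    except (KeyError, TypeError, ValueError, OverflowError, OSError):
        return None
```

Keeping the domain as an explicit field preserves the data mesh idea of domain ownership after records from different teams land in the same store.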

Not only can these tools be employed to reactively diagnose problems, but they can also be coupled with Azure artificial intelligence (AI) and machine learning (ML) services to proactively identify minor issues long before they become major outages. By training models on vast amounts of historical network event data, operators can be alerted to small anomalies that might indicate a larger problem, including the imminent possibility of a catastrophic outage. Once trained, rules-based ML (referred to as narrow AI for its limited focus) can be applied to real-time data to provide a definitive indication of known conditions. Alternatively, generative AI (GenAI) using a large language model (LLM) like the generative pretrained transformer (GPT) could surmise issues based on a set of distinct factors that had not been previously witnessed or foreseen.
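As a rough illustration of flagging small anomalies against historical data, the sketch below learns a baseline from past alarm counts and tests new values against it. It uses a plain z-score as a stand-in for a trained model and is not an Azure AI or ML service; the function names, the sample counts and the threshold of 3.0 are assumptions chosen for the example.

```python
from statistics import mean, stdev


def fit_baseline(history):
    """Learn the mean and sample standard deviation of historical counts."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that sits more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any change at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical alarm counts per five-minute interval for one network function.
history = [10, 12, 11, 9, 10, 11, 12, 10]
baseline = fit_baseline(history)

print(is_anomalous(11, baseline))  # within the normal range
print(is_anomalous(30, baseline))  # far outside the normal range
```

A production system would account for seasonality and correlate many signals at once, but the principle is the same: learn what normal looks like from history, then score live data against it.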

Three UK embraces AOI

Three UK has approximately 10 million customers and carries about 30% of the UK’s network traffic. It is just one example of an operator that has embraced the benefits of Microsoft’s Azure Operator Insights. Three UK recognized that drawing insights from the massive amounts of data generated by its network was critical to improving network operation and customer experience.

In particular, the operator wanted to support gamers by giving them optimal network performance. The company deployed 5G throughout its network with the goal of delivering high speeds and low-latency performance to its gaming customers. To do so, it needed to analyze user behavior on its network so it could identify pain points and make improvements quickly and effortlessly.

According to Julie Bushell, head of service quality experience at Three UK, the company decided to work with Microsoft and use AOI to streamline its data collection process and analysis. By working with Microsoft, Three UK was able to remove its data silos and simplify the analysis of its network architecture. The result was better data collection, ingestion and processing. “It enables us to have end-to-end visibility across the RAN network, the transport network, and the core network to break silos and achieve a single pane of glass,” said Ankush Saikia, senior manager of network strategy and architecture at Three UK.

Three UK wanted a single customer view into its network, not only to respond to the individual user experience but also to gain a broader perspective on how all users engage with the network. This type of insight allows the operator to allocate network resources efficiently and make targeted improvements when needed. And the operator isn’t stopping there: Three UK said its next step is to bring its IT operations into the platform so that its engineering and operations teams can also have a complete view of the customer experience.

Closing the loop

Ultimately, the goal of every operator is to completely automate the deployment of individual network functions and the provisioning of the end-to-end services that use them. But these networks and services also need to adapt constantly to changes caused by events such as subscriber growth, application adoption, errors and unexpected outages. As such, there are many dynamic factors that must be considered when scaling infrastructures, working around faults or identifying problems. Both narrow and generative AI have a role to play in reducing the manual processes required to recognize and resolve complex issues.

For example, GenAI can collapse the classic root cause analysis cycle, where one or more unclassified alarms or log messages must first be recognized as consequential by an operator. From there, the operator surmises any contributing factors, considers remedies and determines the specific action required to rectify the problem. By contrast, an application employing a trained LLM can bypass these time-consuming and labor-intensive steps and skip straight to presenting the commands required to resolve an issue. Add narrow AI to strictly police automated actions, and in some cases even that last manual step can be avoided, with the application also issuing those commands.
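The idea of policing automated actions can be sketched as a simple guard that sits between the model's proposed commands and their execution. The example below is an illustration only: a static allow-list stands in for the narrow AI described above, and the command names (`nfctl`, `scale`) and their subcommands are hypothetical, not real tools.

```python
import shlex

# Hypothetical allow-list: command name -> permitted subcommands.
ALLOWED = {
    "nfctl": {"restart", "status"},
    "scale": {"out"},
}

# Characters that could chain or redirect commands if passed to a shell.
SHELL_METACHARACTERS = set(";|&$`<>\n")


def approve(command: str) -> bool:
    """Return True only if the proposed command matches the allow-list."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        parts = shlex.split(command)
    except ValueError:
        # Unbalanced quotes or similar malformed input.
        return False
    if len(parts) < 2:
        return False
    return parts[1] in ALLOWED.get(parts[0], set())


def triage(proposed):
    """Split proposed commands into auto-executable and needs-human-review."""
    auto, review = [], []
    for cmd in proposed:
        (auto if approve(cmd) else review).append(cmd)
    return auto, review


# Commands as they might be proposed by a model for a given fault.
auto, review = triage(["nfctl restart smf-1", "rm -rf /", "scale in upf"])
print("execute automatically:", auto)
print("send to an operator:", review)
```

Anything the guard does not recognize falls back to an operator, so automation widens only as quickly as the allow-list is deliberately extended.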

With vast amounts of data at their disposal, modern telcos need to be able to easily analyze and make sense of that data if they want to deliver the best possible customer experience. By embracing modern data analytics tools, such as Microsoft’s Azure Operator Insights, telcos can eliminate data silos and get a much more accurate view of their 5G network performance.

The editorial staff had no role in this post's creation.