How AI at the Edge of the Network Improves the User Experience

September 8, 2017 Mike Faden

Today, AI has matured to a point where it’s being applied to enhance an extraordinary variety of applications. So perhaps it’s not surprising that AI plays a critical role at the cloud edge, where Dyn uses AI—machine learning, to be more specific—to help determine the best path for connecting end users to Internet resources. Furthermore, that technology is being extended to provide key new capabilities for companies using the Oracle Cloud.

Why is AI so important at the cloud edge? Because determining the optimum access path to resources is crucial in shaping the user’s overall experience. Accessing a typical web page requires multiple, sometimes many, DNS lookups; adding even a few milliseconds to each lookup can add up to a significant delay, leading to a poor user experience and, potentially, changes in behavior such as abandoned e-commerce shopping carts.
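To make the arithmetic concrete, here is a minimal sketch of how small per-lookup delays accumulate. The function name, the lookup counts, and the batching model (lookups resolved in groups of `parallelism`, each group paying the extra delay once) are illustrative assumptions, not figures from Dyn.

```python
import math


def added_page_delay_ms(lookups: int, extra_ms_per_lookup: float, parallelism: int = 1) -> float:
    """Estimate the extra page delay caused by slower DNS lookups.

    Assumption: lookups are resolved in batches of `parallelism`, and each
    batch pays the extra per-lookup delay once. Real browsers overlap and
    cache lookups in more complicated ways.
    """
    if lookups < 0 or parallelism < 1:
        raise ValueError("lookups must be >= 0 and parallelism must be >= 1")
    batches = math.ceil(lookups / parallelism)
    return batches * extra_ms_per_lookup


# Hypothetical page: 20 lookups, each 5 ms slower than it could be.
sequential = added_page_delay_ms(20, 5)                  # 100 ms if resolved one at a time
overlapped = added_page_delay_ms(20, 5, parallelism=4)   # 25 ms if resolved four at a time
```

Even under the more forgiving overlapped model, the penalty scales with the number of lookups, which is why shaving milliseconds from each one matters.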

Determining the best path for server requests to navigate at any specific moment is a highly complex, continually changing problem that requires the analysis of vast amounts of unstructured data—which makes it a good fit for machine learning, notes Mike Kane, Senior Product Manager at Oracle. In a mobile world, a user’s location may change at any time, so their optimum path may change too. And the Internet itself is also constantly changing due to factors such as outages, security threats, and shifts in traffic. As Kane puts it: “What was a good path a few milliseconds ago may not be a good path any longer. To determine the right path at any given moment, we need a system that constantly learns, then applies that learning to make decisions in near real time.”

To achieve that goal, data collectors at strategic locations across the world collect vast amounts of traffic-related data from many different sources, including ISPs, mobile network operators, and individual devices, amounting to more than 240 billion data points each day. Initial processing and filtering are performed at the cloud edge; then the data is added to a data lake, where machine-learning algorithms search for patterns that the system can use to make smarter decisions. The results are pushed back to Dyn’s locations worldwide, where they are applied at the cloud edge to optimize every DNS lookup in near real time.
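The learn-then-apply loop described above can be illustrated with a deliberately simple stand-in. The sketch below is not Dyn’s algorithm: it swaps the machine-learning step for an exponentially weighted moving average (EWMA) of latency per path, and the class name, path names, and `alpha` value are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PathSelector:
    """Keeps a smoothed latency score per path and picks the lowest one."""

    alpha: float = 0.3  # weight given to the newest measurement (assumed value)
    scores: Dict[str, float] = field(default_factory=dict)

    def observe(self, path: str, latency_ms: float) -> None:
        """Fold a new latency measurement into the path's running score."""
        previous = self.scores.get(path)
        if previous is None:
            self.scores[path] = latency_ms
        else:
            self.scores[path] = self.alpha * latency_ms + (1 - self.alpha) * previous

    def best_path(self) -> str:
        """Return the path with the lowest smoothed latency."""
        if not self.scores:
            raise LookupError("no measurements yet")
        return min(self.scores, key=self.scores.get)


selector = PathSelector(alpha=0.5)
selector.observe("via-london", 40)
selector.observe("via-miami", 60)
selector.observe("via-london", 200)  # congestion: smoothed score becomes 120
chosen = selector.best_path()        # "via-miami"
```

A moving average only reacts to what it has already measured; the system described in the article goes further by learning patterns that let it anticipate problems before a path degrades.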

As the system identifies patterns in the data, such as how problems tend to spread across the Internet, it learns to predict what will happen next and adjust its path selection accordingly. “If a fiber-optic cable gets cut on the west coast, there is a cascading effect across the country,” Kane says. Understanding these patterns will help Dyn DNS servers make better decisions the next time there’s a similar outage. “You don’t want to keep choosing a network that you know will eventually be impacted.”

During the 2016 Rio Olympics, to cite one concrete example, it became apparent that Internet viewers around the world tended to access websites in their home countries to follow the progress of their favorite athletes. This generated unusual traffic patterns, quickly saturating the capacity of routes that weren’t designed to handle the load. As the traffic grew, the system adapted by finding new pathways: instead of routing users to Europe via London or Madrid, for example, Dyn’s servers directed traffic to less direct routes via Miami, where lower congestion gave users better performance.

AI also helped to mitigate the world’s worst-ever DDoS attack in October 2016, in which an estimated 100,000 malware-infected IoT devices flooded Dyn’s servers with an extraordinary volume of requests. Because Dyn’s software identified that the attack was progressively rolling across the country, the company, working closely with customers, was able to constantly move traffic to different points of presence to stay ahead of the wave. This had two benefits, Kane says. First, although the attack lasted for hours, the impact on customers was reduced to only a fraction of that time. Second, the lessons from that event were used to enhance Dyn’s algorithms so that the system could respond even faster in future. As a result, the many DDoS events that have since occurred on the Internet have not impacted Dyn’s network, he says.

Next steps for Dyn’s machine learning technology include integration with the Oracle Cloud: in addition to gathering information across the Internet, the system will collect data directly from servers in Oracle’s cloud data centers via APIs. This additional level of information will help companies using the Oracle Cloud further optimize traffic, Kane explains. For example, the system might transparently redirect end users from a heavily used data center to a less-utilized one, improving response times and the overall user experience.
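As a rough illustration of that kind of redirect, the sketch below picks a data center from reported utilization figures. It assumes utilization arrives as a fraction between 0 and 1; the function name, the data center labels, and the 0.8 threshold are hypothetical and do not reflect any Oracle or Dyn API.

```python
from typing import Mapping


def pick_data_center(utilization: Mapping[str, float], preferred: str, threshold: float = 0.8) -> str:
    """Keep the preferred data center unless it is too busy.

    If the preferred site's utilization exceeds `threshold` (an assumed
    cutoff), fall back to whichever site currently reports the lowest load.
    """
    if preferred not in utilization:
        raise KeyError(preferred)
    if utilization[preferred] <= threshold:
        return preferred
    return min(utilization, key=utilization.get)


# Hypothetical load figures reported by each site.
load = {"dc-a": 0.95, "dc-b": 0.40, "dc-c": 0.60}
target = pick_data_center(load, preferred="dc-a")  # "dc-b"
```

A production system would also weigh network distance and path quality alongside server load, since the least busy site is not always the fastest one to reach.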
