
AI 'elephant flows' challenge networks, Backblaze data shows

By Mitch Wagner Apr 28, 2026 9:00am

Neocloud and hyperscaler traffic dropped sharply over winter but rebounded in March, suggesting AI workloads may follow a seasonal business cycle, according to Backblaze
AI training bursts generate massive, concentrated data flows — "elephant flows" — pushing carriers to rapidly scale up port capacity well beyond historical norms
Co-packaged optics and 1.6T ports are entering the market as vendors race to meet AI's bursty bandwidth barrage

AI is generating new kinds of internet traffic: sudden, massive spikes that don't fit the predictable growth curves network operators are used to planning around, according to Brent Nowak, Backblaze manager of network engineering.

"We were seeing very bursty, spiky elements in our normal pattern," Nowak said. To better understand network traffic and how it affects Backblaze's infrastructure requirements, the cloud storage provider launched a series of studies. "We needed to figure out where those spikes were coming from and how they impact our business."

The winter freeze and spring rebound

The defining theme of the Q1 2026 Backblaze network stats report, released Tuesday, is a "winter freeze," Nowak said. Neocloud and hyperscaler traffic combined fell from 36.4% of Backblaze's total volume in Q4 2025 to 25.5% in Q1 2026. Over the same period, CDN traffic rose from about 20% to 32% of total volume, and regional ISP traffic climbed from 21.5% to 27.8%.

Then, in March, AI-related traffic turned upward again.

[Chart: Breakdown of quarterly network traffic. Monthly view of bits transferred to each network type, from May 2025. Source: Backblaze]

Why does the traffic slump and then bounce back? One possible explanation is that AI workloads are seasonal, slowing during winter holidays when teams step back from active projects. Another is structural: once a large dataset has been ingested, models may be refined for months before a large new data transfer is needed, followed by a sudden surge. Backblaze is uncertain of the cause, and of whether the trend is long-term, because it has only two full quarters of data.

What 'elephant flows' mean for carriers

The AI-driven spikes take the form of "elephant flows": high-volume transfers involving a small number of IP addresses. That's the opposite of traditional CDN traffic, in which a file fans out to hundreds of destinations.
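To make the contrast concrete, here is a minimal sketch of how an operator might measure traffic concentration in flow records. The record layout, the sample addresses, the `top_n` value and the 50% threshold are all illustrative assumptions, not Backblaze's method.

```python
from collections import defaultdict

def top_talker_share(flows, top_n=2):
    """Return the fraction of total bytes sent to the top_n destination IPs.

    `flows` is an iterable of (destination_ip, byte_count) pairs,
    a hypothetical simplification of real flow records.
    """
    bytes_by_dst = defaultdict(int)
    for dst_ip, byte_count in flows:
        bytes_by_dst[dst_ip] += byte_count
    total = sum(bytes_by_dst.values())
    if total == 0:
        return 0.0
    largest = sorted(bytes_by_dst.values(), reverse=True)[:top_n]
    return sum(largest) / total

def looks_like_elephant_flow(flows, top_n=2, share_threshold=0.5):
    """Flag traffic where a few destinations carry most of the bytes."""
    return top_talker_share(flows, top_n) >= share_threshold

# Hypothetical samples using documentation-range addresses.
# CDN-like: one file fanning out evenly to 200 destinations.
cdn_like = [(f"198.51.100.{i}", 1_000_000) for i in range(1, 201)]
# AI-like: two destinations pulling nearly all of the bytes.
ai_like = [
    ("203.0.113.10", 900_000_000),
    ("203.0.113.11", 850_000_000),
    ("198.51.100.7", 5_000_000),
]

print(looks_like_elephant_flow(cdn_like))  # fan-out traffic
print(looks_like_elephant_flow(ai_like))   # concentrated traffic
```

In this sketch the fan-out sample spreads its bytes thinly across many destinations, while the concentrated sample sends almost everything to two addresses, which is the pattern the article describes.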

Why is the traffic bursty? GPU cluster time is scarce and time-limited — a company might get four hours or a single day — and the entire workflow of loading data, running workloads and moving results out must complete before the window closes. "It's very price-driven, where you need to do your work and then your time is over, and then someone else gets time on that GPU cluster," Nowak said.

A single GPU pulling data runs at 8 to 10 gigabits per second. A customer operating 30 or 50 GPUs simultaneously runs those streams in parallel, quickly reaching 100G or 400G aggregate throughput. And the scale of AI training data — Nowak cited datasets of 10 petabytes or more — means the flows are unlike anything traditional network planning models anticipated.
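The arithmetic behind those figures can be sketched in a few lines. This is a back-of-the-envelope calculation that assumes decimal petabytes and a fully utilized link with no protocol overhead; the function names are illustrative.

```python
def aggregate_gbps(gpu_count, per_gpu_gbps):
    """Total demand when every GPU pulls data in parallel."""
    return gpu_count * per_gpu_gbps

def transfer_hours(dataset_petabytes, link_gbps):
    """Hours to move a dataset over a link running at full rate.

    Assumes 1 PB = 10**15 bytes and ignores protocol overhead.
    """
    bits = dataset_petabytes * 1e15 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600

# Low and high ends of the ranges Nowak cited.
low = aggregate_gbps(30, 8)     # 30 GPUs at 8 Gbps each
high = aggregate_gbps(50, 10)   # 50 GPUs at 10 Gbps each
print(f"Aggregate demand: {low} to {high} Gbps")

# A 10 PB dataset over a single 100G or 400G port.
for link in (100, 400):
    print(f"10 PB at {link}G: about {transfer_hours(10, link):.0f} hours")
```

Under these assumptions, 30 to 50 GPUs demand roughly 240 to 500 Gbps, and a 10-petabyte dataset would take more than two days to cross a single 400G port even at full line rate.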

The multimodal shift matters here too. While text has historically dominated AI training data, model releases from the major AI labs increasingly expect image, audio and video input, formats that take up far more storage and bandwidth than text.

Where AI traffic concentrates

AI-driven neocloud and hyperscaler traffic is heavily concentrated in the United States, particularly California and Virginia, including the Ashburn and Reston corridor, according to Backblaze data. Outside the U.S., notable concentrations appear in the Netherlands (via the Amsterdam Internet Exchange), Singapore, Finland, Germany and Canada.

Texas and other emerging data center markets are absent from the top tier, despite a wave of high-profile buildout announcements. Established network paths are still carrying the AI traffic load, Nowak said.

That geographic concentration is driving long-haul fiber investment across the industry. Demand for fiber is putting hyperscalers in competition with telcos. And Zayo is seeing strong demand in its long-haul business, fueled by AI. The shift is so pronounced that the industry's flagship optical conference has effectively transformed from a telecom show into an AI show.

Hardware market shifts gears

Backblaze is transitioning its infrastructure to manage changing demand. The company moved from 100G as its standard external port capacity to 400G, and for its most active AI clients is deploying two or four 400G links to approach terabit-range throughput. "This was something that three or four years ago was not on the horizon for our business," Nowak said.
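As a rough illustration of why the extra links matter, the sketch below estimates how much data fits through one, two or four 400G links during a four-hour GPU window (the window length mentioned earlier in the article). It assumes links run at full line rate with no overhead and uses decimal petabytes; the function name is illustrative.

```python
def petabytes_in_window(link_count, link_gbps, window_hours):
    """Upper bound on data moved in a time window, in decimal petabytes.

    Assumes every link runs at full rate for the whole window.
    """
    bits = link_count * link_gbps * 1e9 * window_hours * 3600
    return bits / 8 / 1e15

for links in (1, 2, 4):
    total_gbps = links * 400
    moved = petabytes_in_window(links, 400, 4)
    print(f"{links} x 400G = {total_gbps} Gbps: up to {moved:.2f} PB in 4 hours")
```

Even four 400G links, 1.6 Tbps in aggregate, move less than 3 PB in four hours under these idealized assumptions, which helps explain the push toward still faster ports.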

In the wider market, 800G ports are entering enterprise deployments and 1.6 terabit ports are arriving in available hardware, Nowak said.

A structural shift is also underway at the switch level. Co-packaged optics — where the fiber plugs directly into the switch chassis rather than through a separate pluggable module — are moving toward mainstream adoption, with Arista, Juniper and Cisco all advancing the technology, Nowak said.

The benefits are less heat, greater port density and reduced power per bit. The tradeoff is a larger failure domain — a problem with a co-packaged switch could affect all its ports rather than isolating the failure to a single module. But the industry is moving in that direction regardless, Nowak said.
