The 3 big things you need to know from Red Hat Summit

  • Red Hat Summit is in full swing this week
  • The company leaned into inference, debuting several new inference-focused products and tools
  • It also refreshed its classic Linux platform

It’s Red Hat Summit week, and the vendor came out swinging with a slew of announcements designed to show it’s poised to help enterprises navigate the age of AI. From an inference server to an open source inferencing project to Red Hat Enterprise Linux 10, we've summarized the key news below.

1) Red Hat AI Inference Server

Leading the pack of announcements was the debut of Red Hat AI Inference Server. In a nutshell, it’s a platform that serves up a containerized version of vLLM and supports a wide range of both accelerators and models.

For those out of the loop, vLLM is an open source library for LLM inferencing and serving. Red Hat has said vLLM can help accelerate generative AI output by making better use of GPU memory.
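To give a feel for what that looks like in practice, here's a minimal sketch of vLLM's offline Python API. The model name is just a placeholder; check vLLM's documentation for the models and accelerators your hardware actually supports.

```python
# Minimal vLLM sketch: load a model and generate text locally.
# The model name below is a placeholder, not a Red Hat recommendation.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.3")  # vLLM manages GPU memory with its PagedAttention scheme
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain what an inference server does."], params)
print(outputs[0].outputs[0].text)
```

Red Hat AI Inference Server wraps this kind of vLLM serving stack in a supported container image, so the same workloads can run across the accelerators listed below.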

On the hardware side, Red Hat AI Inference Server is compatible with chips from Nvidia, AMD, Google, AWS, Intel and IBM. On the model side, it supports pre-optimized options from the likes of Google, Mistral, Microsoft, Qwen, IBM, DeepSeek and more.

“It just reminds us exactly where we were with Linux many years ago where Linux was key as a unified platform to enable enterprise hardware all on a single platform serving apps,” Red Hat’s SVP and AI CTO Brian Stevens said on a call with media.

Notably, Red Hat has chosen to include support for the Model Context Protocol (MCP) APIs, which make it easier for developers to access the data they need for AI apps. Red Hat VP and GM for AI Joe Fernandes said on the call that the company is also working with Google to support the latter’s new Agent2Agent protocol in the future.
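For readers unfamiliar with MCP, the idea is that developers expose data and actions as "tools" that any MCP-capable AI app can call. Below is a rough sketch using the open source MCP Python SDK's FastMCP helper; the server name and the fetch_ticket tool are hypothetical examples, not anything Red Hat ships.

```python
# Sketch of a tiny MCP server exposing one hypothetical data tool.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ticket-data")  # hypothetical server name

@mcp.tool()
def fetch_ticket(ticket_id: str) -> str:
    """Return details for a support ticket so an AI app can reason over it."""
    # A real server would query an internal system of record here.
    return f"Ticket {ticket_id}: status=open, priority=high"

if __name__ == "__main__":
    mcp.run()  # serves the tool over MCP's default stdio transport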

2) Open sourcing inferencing

In case the previous bit didn’t hammer the point home about its focus on inference, Red Hat also announced the launch of LLM-D, an open source project designed to enable inferencing at scale using Kubernetes-based orchestration and AI-aware network routing techniques. The idea is to use advanced serving tools to deliver the varying service level objectives inferencing use cases require.

CoreWeave, Google, IBM Research and NVIDIA are all contributing code to the project, with AMD, Cisco, Intel, Lambda, Meta and Mistral AI all also onboard.

LLM-D “builds on the vLLM and Red Hat AI Inference server and builds it into a distributed cluster,” Stevens said on a call with media.

Stevens explained that scaling LLMs requires more than just auto scaling and adding more inference servers to a cluster.

“The state of the art is actually moving to distributed inference. What that means is your first tokens of your inference may be handled by specialized nodes and your second and ongoing tokens could be handled by different nodes optimized for serving those up.”
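In other words, the compute-heavy "prefill" of the prompt and the memory-bound "decode" of the following tokens can land on different machines. The sketch below is purely illustrative of that split; the node classes and routing logic are hypothetical and are not LLM-D's actual API.

```python
# Purely illustrative prefill/decode split; not LLM-D's real interfaces.
from dataclasses import dataclass

@dataclass
class PrefillNode:
    """Compute-heavy node: processes the whole prompt, emits the first token plus cached attention state."""
    name: str

    def prefill(self, prompt: str):
        kv_cache = f"<kv-cache for {len(prompt)} chars>"  # stand-in for the real KV cache
        return "The", kv_cache

@dataclass
class DecodeNode:
    """Bandwidth-heavy node: streams the remaining tokens from the cached state."""
    name: str

    def decode(self, kv_cache: str, max_tokens: int = 4):
        return ["answer", "is", "42", "."][:max_tokens]

def serve(prompt: str) -> str:
    # An AI-aware router would choose nodes based on load, cache locality, etc.
    first, cache = PrefillNode("prefill-0").prefill(prompt)
    rest = DecodeNode("decode-0").decode(cache)
    return " ".join([first] + rest)

print(serve("Why split prefill and decode across nodes?"))
```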

3) RHEL 10 with Lightspeed

Red Hat also unleashed Red Hat Enterprise Linux 10, an updated version of its classic platform. Notable changes include post-quantum cryptography protections; a container-based image mode; and Red Hat Enterprise Linux Lightspeed, a generative AI tool that provides task assistance through a natural language interface.

The latter is potentially a big deal. Why? “It helps to address the skill gap that we and our customers face with respect to the Linux skill set,” Raj Das, Red Hat Senior Director of Product Management, said on the call.

“What we are effectively doing is leveraging the AI playbook,” he continued. “Basically, we have a retrieval augmented generative app in the command line and you can effectively type in commands in plain English” to troubleshoot issues.