
a16z and Lightspeed back vLLM team’s $150M AI inference startup

Inferact, an AI startup founded by a group of AI researchers and open-source software developers, has raised $150 million in a seed funding round, valuing the company at $800 million.

The round was led by Andreessen Horowitz and Lightspeed, with participation from Databricks’ venture arm, the UC Berkeley Chancellor’s Fund, and other backers.

Founded by the core team behind vLLM

Inferact is built by the core team behind vLLM – Simon Mo, Woosuk Kwon, Kaichao You, and Roger Wang.

vLLM, which stands for virtual large language model, is an open-source library for LLM inference and serving, maintained by the vLLM community.

vLLM focuses on inference, the stage where trained AI models generate responses in real-world applications. As AI models become more reliable and capable, inference is emerging as the new bottleneck. Applications now require models to run longer, handle more tokens, and serve thousands of users simultaneously. That puts pressure on memory, hardware, and performance.

vLLM addresses this by optimising how models use memory and compute.

One of its core features, PagedAttention, reduces memory waste by storing key-value cache data in fixed-size blocks of GPU memory rather than one contiguous allocation, much as an operating system pages virtual memory. The software also supports techniques such as quantisation, which shrinks model size, and speculative decoding, which lets a model generate multiple tokens per step, speeding up response times.
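To make this concrete, here is a minimal sketch of vLLM’s offline Python API; the model name, prompts, and sampling settings are illustrative placeholders, not details from the article:

```python
# A minimal sketch of vLLM's offline inference API. The model name,
# prompts, and sampling settings below are illustrative placeholders.
from vllm import LLM, SamplingParams

# The engine allocates the key-value cache in paged blocks internally
# (PagedAttention); callers only submit prompts and read completions.
# A quantised checkpoint can be loaded by additionally passing e.g.
# quantization="awq" to the LLM constructor.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

sampling = SamplingParams(temperature=0.7, max_tokens=128)

prompts = [
    "Explain what inference means for a language model.",
    "Why does the key-value cache use so much memory?",
]

# All prompts are served in one continuously batched pass.
for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```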

vLLM has more than 2,000 contributors and over 50 core developers, and it is used by large companies like Meta and Google.

Making inference cheaper and faster

“Our mission is to grow vLLM as the world’s AI inference engine and accelerate AI progress by making inference cheaper and faster,” the company says in a blog post.

Inferact has two main goals. First, it aims to support the vLLM project by providing financial and developer resources to help it grow, especially as new model architectures, hardware, and larger models emerge.

Second, Inferact plans to develop a next-generation commercial inference engine, refining what it calls the “universal inference layer”, while collaborating with existing providers rather than competing with them.

In a blog post, co-founder Woosuk Kwon said the goal is to make AI serving simple, so teams no longer need large infrastructure groups to deploy models at scale.

The company is expected to offer a serverless version of vLLM and add features such as observability, troubleshooting, and disaster recovery, likely running on Kubernetes.
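For context on what simple serving looks like today, here is a sketch of querying vLLM’s existing OpenAI-compatible server with the standard openai client; the endpoint, key, and model name are placeholders, and this illustrates the current open-source project rather than Inferact’s planned offering:

```python
# Sketch: talking to a vLLM OpenAI-compatible server (for example one
# started with `vllm serve <model>`, which listens on port 8000 by
# default). Endpoint, key, and model name are placeholders; this shows
# today's open-source vLLM, not Inferact's planned serverless product.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model the server loaded
    messages=[{"role": "user", "content": "Summarise what vLLM does."}],
)
print(response.choices[0].message.content)
```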

At the same time, Inferact says it will keep improving the open-source vLLM project. That includes adding support for new model architectures, more hardware platforms, and larger, multi-node deployments.
