We’re asserting the general public preview of Fireworks AI on Microsoft Foundry, bringing excessive‑efficiency open mannequin inference into Azure. This integration displays Microsoft Foundry’s broader course: offering a single place the place builders cannot solely run open fashions effectively but additionally customise and operationalize them as a part of an entire enterprise‑prepared AI lifecycle.
Throughout industries, organizations are more and more standardizing on open fashions to achieve better management over efficiency, price, customization, and the safety and compliance required for enterprise deployment. Open fashions give groups the pliability to decide on the correct structure for every workload and keep away from lock‑in to a single mannequin supplier as their wants evolve.
As adoption grows, nonetheless, efficiency alone is not sufficient. Groups want a constant technique to consider fashions shortly, function them safely in manufacturing, and enhance them over time with out rebuilding infrastructure or fragmenting their tooling. Too typically, organizations are compelled to assemble bespoke serving stacks, slowing innovation and making it more durable to scale and compound progress.
Microsoft Foundry is designed to handle this problem. It serves as a unified system of report and enterprise management aircraft for AI, bringing collectively fashions, brokers, analysis, deployment, and governance right into a single expertise. With Microsoft Foundry, groups can transfer from experimentation to manufacturing with confidence, utilizing the fashions and frameworks that finest match their necessities, whereas counting on a constant operational basis.
At this time, we’re asserting the general public preview of Fireworks AI on Microsoft Foundry, bringing excessive‑efficiency open mannequin inference into Azure. This integration displays Microsoft Foundry’s broader course: offering a single place the place builders cannot solely run open fashions effectively but additionally customise and operationalize them as a part of an entire enterprise‑prepared AI lifecycle.
Fireworks AI fashions on Microsoft Foundry: A single place for open fashions
Fireworks AI delivers industry-leading inference for open fashions, and Microsoft Foundry is what makes that efficiency usable at enterprise scale. Accessing Fireworks AI by means of Microsoft Foundry offers groups a single, trusted management aircraft to guage, deploy, customise, and function open fashions alongside the remainder of their AI stack.
As open fashions mature, customization more and more extends past coaching. Groups want constant methods to configure, deploy, optimize, govern, and iterate on fashions in manufacturing with out fragmenting instruments or infrastructure. Microsoft Foundry supplies the surroundings the place these customization and operational workflows are standardized, whereas Fireworks AI provides the efficiency and effectivity wanted to run open fashions at scale. This implies groups can transfer from experimentation to manufacturing utilizing open fashions with out stitching collectively separate instruments, contracts, and deployment paths.
Collectively, Fireworks AI and Microsoft Foundry allow a extra full and sustainable method to working with open fashions combining quick, environment friendly inference with a platform designed to assist enterprise open mannequin operations over time.
With Fireworks AI on Foundry, builders can get entry to best-in-class inferencing for open fashions, together with optimized deployments for customized weight fashions. Fireworks AI is a market chief for prime efficiency inference for open fashions. Its engine already runs at web scale processing over 13T tokens each day, sustaining about 180 thousand requests per second, and producing over 1,000 tokens per second on giant fashions, substantiated by main benchmark efficiency on Synthetic Evaluation. This efficiency is now out there on Foundry.
Builders can log into Foundry and entry these open fashions with Fireworks AI right this moment:
- DeepSeek V3.2
- OpenAI gpt-oss-120b
- Kimi K2.5
- MiniMax M2.5 (new)
This brings a brand new open mannequin (MiniMax M2.5) to Foundry with serverless assist and provides optimized inference for already in style open fashions.

With Fireworks AI in Microsoft Foundry, builders can:
- Consider fashions sooner with day‑zero entry and assist: Begin constructing instantly with entry to state-of-the-art open fashions from Fireworks AI by means of a single Azure endpoint by way of Foundry.
- Optimize inference: Requests to open fashions are served by Fireworks’ excessive‑throughput inference stack for quick efficiency with Azure‑grade governance.
- Run the fashions you already belief: With bring-your-own-weights (BYOW), you may add and register quantized or tremendous‑tuned weights skilled elsewhere with out altering the serving stack.

- Select the correct pricing mannequin to your workload: Use serverless, pay-per‑token inference to experiment securely and shortly with Information Zone Commonplace or select provisioned throughput items (PTUs) for predictable, steady-state efficiency with base or customized fashions. Whether or not you’re optimizing for agility or effectivity, you get flexibility with out managing infrastructure.
- Function with enterprise belief and scale: We’re dedicated to enabling clients to construct production-ready AI functions shortly, whereas sustaining the very best ranges of security and safety. Foundry supplies an end-to-end workspace for agent improvement, analysis, and deployment, together with unified governance, observability, and agent-ready tooling.
The way forward for Fireworks and AI use circumstances
Microsoft Foundry is evolving to assist the complete lifecycle of open fashions—from early analysis by means of manufacturing operation and ongoing optimization. As groups scale their use of open fashions, having a constant, enterprise‑prepared basis turns into more and more essential.
By integrating Fireworks AI into Microsoft Foundry, builders achieve entry to excessive‑efficiency inference right this moment whereas constructing on a platform designed to assist deeper customization and enterprise operations over time. This method offers groups the boldness to undertake open fashions not only for what they will do now, however for a way they will develop, adapt, and function reliably as their AI ambitions broaden. We’re trying ahead to seeing how builders and enterprises use Fireworks AI on Microsoft Foundry to energy the following era of clever functions.
To get began:
- Go to Microsoft Foundry fashions and choose Fireworks AI open fashions within the mannequin catalog assortment.
- Choose the open mannequin hosted by Fireworks.
- View the mannequin card.
- Choose your deployment possibility—serverless or PTU—and deploy.
Be taught extra about Fireworks on Microsoft Foundry
