
    Nvidia cuts AI inference costs by up to tenfold with Blackwell architecture and open-source models

    By Mikolaj Laszkiewicz, February 13, 2026, 2 min read

    In a blog post, Nvidia highlights that leading inference service providers, including Baseten, DeepInfra, Fireworks AI and Together AI, have cut the unit cost of processing a single token by as much as 10× compared with previous hardware generations such as the Hopper platform, by combining Blackwell with optimized software stacks and open-source models.

    The Blackwell platform, based on Nvidia’s newly designed microarchitecture, was built specifically to handle AI workloads while increasing both throughput and energy efficiency. As a result, a higher number of tokens can be processed using the same amount of infrastructure. It is this increase in throughput that directly drives down the operational cost per token.
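The relationship between throughput and cost per token described above can be sketched with simple arithmetic. The figures below are illustrative assumptions, not Nvidia's published numbers:

```python
# Hypothetical sketch: operational cost per million tokens as a function of
# hardware rental cost and sustained generation throughput.
# All concrete values are made-up assumptions for illustration only.

def cost_per_million_tokens(gpu_hourly_cost_usd: float, tokens_per_second: float) -> float:
    """USD cost to generate one million tokens at a given throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_cost_usd / tokens_per_hour * 1_000_000

# Same hourly hardware cost, ten times the throughput:
baseline = cost_per_million_tokens(gpu_hourly_cost_usd=4.0, tokens_per_second=1_000)
faster = cost_per_million_tokens(gpu_hourly_cost_usd=4.0, tokens_per_second=10_000)

# Cost per token falls in exact proportion to the throughput gain.
print(f"baseline: ${baseline:.2f}/M tokens, faster: ${faster:.2f}/M tokens")
```

With identical infrastructure cost, a 10× throughput increase yields exactly a 10× reduction in cost per token, which is the mechanism the article describes.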

    Deployment examples show the broad economic impact of this approach. In healthcare, Sully.ai — using Blackwell together with open-source models — achieved a 90% reduction in inference costs while also shortening response times, improving the viability of automating tasks such as medical coding and clinical documentation workflows.

    Other use cases include gaming platforms and customer-support tools, where companies reported token-cost reductions of between 4× and 10× when running Blackwell with low-precision formats (such as NVFP4) and open-source models instead of relying on expensive proprietary API providers.
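The 4×–10× savings reported above can be made concrete with a back-of-the-envelope comparison between a per-token proprietary API and self-hosted open-source inference. All prices here are hypothetical placeholders, not actual vendor quotes:

```python
# Illustrative monthly cost comparison at a fixed workload volume.
# Both per-million-token prices below are invented for the example.

def monthly_cost(price_per_million_tokens_usd: float, monthly_volume_millions: float) -> float:
    """Total monthly spend in USD for a given token volume."""
    return price_per_million_tokens_usd * monthly_volume_millions

volume = 500  # millions of tokens per month (assumed)
proprietary_api = monthly_cost(10.0, volume)   # hypothetical API price
self_hosted = monthly_cost(1.5, volume)        # hypothetical Blackwell + open model

ratio = proprietary_api / self_hosted
print(f"proprietary: ${proprietary_api:,.0f}, self-hosted: ${self_hosted:,.0f}, ratio: {ratio:.1f}x")
```

At these assumed prices the self-hosted option comes out roughly 6.7× cheaper, which sits inside the 4×–10× band the companies reported.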

    This shift in the cost model is important not only for cloud providers, but also for enterprises and startups that want to scale AI-based applications without massive financial outlays. A substantial drop in cost per token could make AI less exclusive to the largest players and significantly more accessible to smaller organizations.

    Industry analyses indicate that the cost reduction is driven not only by the hardware itself, but by the tight integration of hardware and software — optimized drivers, algorithms and open-source models run more efficiently on the Blackwell platform, maximizing utilization of compute resources.

    The new inference cost structure could have a meaningful impact on the pace of commercialization of AI solutions in sectors such as healthcare, services and entertainment — especially in use cases where every processed token translates directly into operating expenses. Lower costs may also reduce barriers to entry for companies building products on top of large language models.
