TL;DR
Nvidia announced a software update at CES that it says yields an average 2.5x performance uplift for DGX Spark and other GB10 systems, mainly in the compute-heavy stages of generative AI workloads. The update arrives alongside expanded software, including AI Enterprise access, on-device Nsight tooling, RTX Remix support and a Hugging Face Reachy robotics guide, while questions about distro support and independent verification remain.
What happened
At CES, Nvidia rolled out a software update and new integrations for its DGX Spark and other GB10-based systems. The company claims the release boosts performance by roughly 2.5x on average across multiple libraries and frameworks, with most gains in the compute-heavy stages of the generative AI pipeline (notably prefill work) rather than the bandwidth-limited token decode phase. The update spans software including Nvidia's TensorRT-LLM inference engine as well as llama.cpp and PyTorch. Nvidia also said it will offer its AI Enterprise suite on the Spark as a subscription later this month, and plans to ship an on-device version of the Nsight CUDA assistant in the spring. Additional integrations include RTX Remix support for modding workloads and a developer guide pairing the Spark with Hugging Face's Reachy robot. Nvidia currently supports clusters of up to two Sparks over the ConnectX-7 NIC and is exploring support for larger configurations. Independent verification of the performance claims has not been provided.
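The prefill/decode distinction the claims hinge on can be made concrete with a back-of-the-envelope arithmetic-intensity sketch. The numbers below are illustrative, not Spark measurements: prefill processes all prompt tokens through a layer's weights in one batched matrix multiply, while decode must read the full weight matrix again for every single generated token.

```python
# Rough arithmetic-intensity sketch: why prefill tends to be compute-bound
# while decode is memory-bandwidth-bound. Illustrative numbers, not Spark specs.

def arithmetic_intensity(tokens: int, d_model: int) -> float:
    """FLOPs per byte of weights read for one d_model x d_model layer.

    A (tokens x d) @ (d x d) matmul costs 2 * tokens * d * d FLOPs, but the
    d * d weight matrix (2 bytes/param in fp16) only needs to be read once.
    """
    flops = 2 * tokens * d_model * d_model
    weight_bytes = 2 * d_model * d_model  # fp16 weights
    return flops / weight_bytes

d = 4096
prefill = arithmetic_intensity(2048, d)  # whole 2048-token prompt in one batch
decode = arithmetic_intensity(1, d)      # one generated token at a time

print(f"prefill: {prefill:.0f} FLOPs per weight byte")
print(f"decode:  {decode:.0f} FLOP per weight byte")
```

At roughly one FLOP per byte of weights read, decode throughput tracks memory bandwidth almost directly, which is why compute-side software gains show up mainly in prefill rather than in tokens-per-second output.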
Why it matters
- Improved prefill and compute performance can shorten turnaround for local model testing, fine-tuning and generation tasks.
- Access to AI Enterprise and local Nsight tools expands the Spark's software ecosystem, potentially making it more useful for privacy-sensitive development.
- Because the decode phase remains bandwidth-bound, observable token-generation speedups may be modest despite the overall compute improvements.
- Unresolved distro and long-term support questions could affect longevity for organizations relying on the Spark as an on-premise development platform.
Key facts
- Nvidia claims an average 2.5x performance improvement across several libraries and frameworks since the Spark's October launch.
- The update improves compute-intensive parts of genAI pipelines (prefill), while decode/token generation remains bandwidth-limited.
- Spark hardware is GB10-based with 128 GB of unified memory and roughly the computational equivalent of an RTX 5070.
- Nvidia plans to make its AI Enterprise suite available on the Spark as a subscription later this month; special Spark pricing has been hinted but not finalized.
- AI Enterprise normally lists at $4,500/year per GPU or about $1/hour per GPU in cloud contexts; developer access is free but production use requires payment.
- Nvidia released a kernel update with security patches and says it is committed to supporting DGX OS on Spark; wider third-party distro support is not official.
- An on-device Nsight CUDA coding assistant for the Spark is expected later this spring.
- RTX Remix integration and a developer guide for Hugging Face's Reachy robot were announced for Spark workflows.
- Sparks include a ConnectX-7 NIC with dual QSFP+ ports offering 200 Gbps; Nvidia currently supports linking up to two Sparks and is exploring larger cluster support.
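The two AI Enterprise list prices above imply a simple break-even point. This is a sketch using only the article's figures ($4,500/year per GPU vs. roughly $1/hour per GPU in the cloud); actual Spark-specific pricing has not been announced.

```python
# Break-even sketch for AI Enterprise pricing, using the list figures
# quoted in the article: $4,500/year per GPU vs ~$1/hour per GPU in cloud.
ANNUAL_SUBSCRIPTION = 4500   # USD per GPU per year
CLOUD_RATE = 1.0             # USD per GPU-hour
HOURS_PER_YEAR = 365 * 24    # 8760 hours

breakeven_hours = ANNUAL_SUBSCRIPTION / CLOUD_RATE
utilization = breakeven_hours / HOURS_PER_YEAR

print(f"break-even: {breakeven_hours:.0f} GPU-hours per year")
print(f"that is ~{utilization:.0%} utilization of one GPU")
```

In other words, the annual subscription only beats hourly cloud pricing above roughly 4,500 GPU-hours per year, a bit over half of continuous utilization.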
What to watch next
- Official rollout date and exact pricing details for AI Enterprise on the Spark later this month.
- Arrival and capabilities of the on-device Nsight CUDA assistant, expected in spring.
- Whether Nvidia will base DGX OS for Spark on Ubuntu 26.04 or offer formal support for third-party Linux distributions — not confirmed in the source.
- Any formal announcement enabling clusters larger than two Sparks and related software/firmware support for multi-node setups.
Quick glossary
- DGX Spark: A compact, GB10-based AI workstation from Nvidia designed for local AI development, prototyping and inference.
- Prefill: The stage in LLM inference that processes input prompts and prepares internal states before token generation begins.
- Decode (token generation): The phase of model inference where output tokens are produced; often limited by memory bandwidth rather than compute.
- AI Enterprise: Nvidia’s commercial suite of enterprise applications, frameworks, models and microservices for building and deploying AI applications.
- ConnectX-7 / QSFP+: High-speed networking hardware: ConnectX-7 is Nvidia's NIC platform, and QSFP+ refers to the pluggable transceiver ports it exposes for high-speed links (here, 200 Gbps per port).
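For a sense of scale on the clustering link, here is a back-of-the-envelope sketch from the figures above. It assumes a single fully saturated 200 Gbps port and ignores protocol overhead, so real transfers would be somewhat slower.

```python
# Transfer-time sketch for the link figures quoted in the key facts:
# moving the Spark's full 128 GB of unified memory over one 200 Gbps port.
# Assumes a saturated link and ignores protocol overhead.
MEMORY_GB = 128          # Spark unified memory, gigabytes
LINK_GBPS = 200          # per-port line rate, gigabits per second

gigabits = MEMORY_GB * 8         # convert gigabytes to gigabits
seconds = gigabits / LINK_GBPS

print(f"{MEMORY_GB} GB over {LINK_GBPS} Gbps ~= {seconds:.2f} s")  # 5.12 s
```

A full-memory sync on the order of five seconds is workable for occasional checkpointing between two Sparks, but it hints at why scaling to larger clusters is still described as exploratory.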
Reader FAQ
Has the DGX Spark actually doubled its performance?
Nvidia reports an average 2.5x performance boost across multiple libraries since launch; independent verification was not provided in the source.
Will the Spark generate tokens twice as fast now?
No. The company says decode/token generation is bandwidth-limited, so most gains apply to prefill and other compute-heavy operations.
Is AI Enterprise included with the Spark?
Nvidia plans to offer the AI Enterprise suite on Spark as a subscription later this month; special Spark pricing was mentioned but not officially confirmed.
Will the Spark support other Linux distributions like RHEL?
Nvidia is focusing on DGX OS for now and has released kernel security patches; formal third-party distro support is not confirmed in the source.

Source article: "Nvidia says it's more than doubled the DGX Spark's performance since launch. Just maybe not in the way you're thinking," by Tobias Mann, Mon 5 Jan 2026, 23:00 UTC.
Sources
- Nvidia says it's more than doubled the DGX Spark’s performance since launch
- Nvidia says DGX Spark is now 2.5x faster than at launch
- DGX Spark Release Updates?
- Dell's version of the DGX Spark fixes pain points