Deployment guide

Cheapest way to deploy Devstral-Small-2-24B-Instruct-2512 in 2026

10 providers compared. API token-pricing, dedicated capacity, and rented GPU costs side-by-side, normalized to monthly cost.

Cheapest API: $0.100 / 1M input tokens at Mistral AI

Cheapest GPU rental: $0.28 / hour at Verda on V100

10 providers compared

Provider     Region             Quantization   Input / 1M tokens   Output / 1M tokens
Mistral AI   France             not listed     $0.100              $0.300
AWS          Multiple regions   not listed     $0.400              $2.00

API vs. GPU rental

The crossover point at which renting a GPU full-time becomes cheaper than paying per token: ~34M tokens/day.

Figure: Monthly cost vs. daily token volume, Mistral AI API vs. Verda GPU rental for Devstral-Small-2-24B-Instruct-2512. API cost (blue line) scales linearly with token volume; GPU rental (green line) is a flat ~$202 per month. They cross at roughly 34M tokens/day, the break-even point. Axes: tokens per day (log scale, 10k to 1B) against monthly cost (USD, $0 to $6.0k).
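The crossover in the chart can be approximated with a few lines of arithmetic. The sketch below is not nfer's actual calculation: it assumes a 50/50 input/output split and a GPU rented 24 hours a day for 30 days, which happens to land close to the ~34M tokens/day shown above. The function names are illustrative, and the sketch does not check whether a single GPU can actually serve that throughput.

```python
def blended_price_per_million(input_price, output_price, input_share):
    """Blended USD price per 1M tokens for a given input share (0.0 to 1.0)."""
    return input_share * input_price + (1.0 - input_share) * output_price


def gpu_monthly_cost(hourly_rate, hours_per_day=24, days_per_month=30):
    """Flat monthly rental cost in USD, independent of token volume."""
    return hourly_rate * hours_per_day * days_per_month


def break_even_tokens_per_day(hourly_rate, input_price, output_price,
                              input_share=0.5, days_per_month=30):
    """Daily token volume at which API spend equals full-time GPU rental."""
    daily_gpu = gpu_monthly_cost(hourly_rate, 24, days_per_month) / days_per_month
    price = blended_price_per_million(input_price, output_price, input_share)
    return daily_gpu / price * 1_000_000


if __name__ == "__main__":
    # Prices from this page: Verda V100 at $0.28/hour,
    # Mistral AI at $0.10 input and $0.30 output per 1M tokens.
    tokens = break_even_tokens_per_day(0.28, 0.10, 0.30)
    print(f"GPU monthly: ${gpu_monthly_cost(0.28):.2f}")
    print(f"Break-even: {tokens / 1e6:.1f}M tokens/day")
```

A more input-heavy mix lowers the blended API price and pushes the break-even volume higher; a more output-heavy mix does the opposite.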

Top 5 cheapest for your workload

Adjust the assumptions below — token volume, input/output ratio, days and hours of usage — to see how the cheapest options shift.

Your workload

Tokens vol.: 1M to 1B
Input / Output: 100% input to 100% output
Active days / mo: 1 to 30 days
Active hours / day: 1 to 24 hrs
Cache hit rate: 0% to 100%
Batch APIs (toggle)

Rank   Provider     Pricing           Hardware       Monthly
#1     Verda        GPU · On-demand   V100           $202
#2     Verda        GPU · On-demand   RTX A6000      $353
#3     Mistral AI   API               n/a            $390
#4     Azure        GPU · On-demand   RTX PRO 6000   $396
#5     Azure        GPU · On-demand   K80            $570


Devstral-Small-2-24B-Instruct-2512 at a glance

VRAM (native precision): 25 GB
Parameters: 24.0114B
Native precision: fp8
Context length: not specified
License: apache-2.0
Knowledge cutoff: not specified
Modalities: not specified
Access type: Open source
EU developed: Yes
Origin country: FR

A second opinion on the data

Hardware footprint

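Devstral-Small-2-24B-Instruct-2512 is a 24.0114B-parameter model that needs 25 GB of VRAM when self-hosted at its native fp8 precision, which fits a single A100-40GB or L40S. Because the native format is already 8-bit, int8 quantization leaves the weight footprint roughly unchanged; int4 roughly halves it (about 12 to 13 GB), at modest accuracy cost. The usual rule of thumb that int8 halves and int4 quarters the VRAM requirement applies relative to a 16-bit baseline, not to fp8. 9 GPU rental providers in nfer's index currently offer hardware that fits this model at native precision.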
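As a rough check on the footprint, weights-only memory can be estimated from the parameter count and the bits stored per parameter. The helper below is a hypothetical sketch, not nfer's method: it counts weights only and ignores KV cache, activations, and runtime overhead, which is presumably why the 25 GB quoted above sits slightly over the 24 GB the weights alone would need at 8 bits.

```python
def weight_memory_gb(params_billions, bits_per_param):
    """Memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    # params_billions * 1e9 params * (bits / 8) bytes / 1e9 bytes per GB
    return params_billions * bits_per_param / 8


if __name__ == "__main__":
    params = 24.0114  # billions of parameters, from this page
    for name, bits in [("fp16", 16), ("fp8", 8), ("int8", 8), ("int4", 4)]:
        print(f"{name}: {weight_memory_gb(params, bits):.1f} GB")
```

Any serving stack adds memory on top of this, and the amount grows with context length and batch size, so the estimate is a lower bound rather than a sizing guarantee.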

Cheapest path today

The cheapest API offering for Devstral-Small-2-24B-Instruct-2512 is Mistral AI at $0.10/1M input + $0.30/1M output tokens. The cheapest GPU rental that fits the model is Verda on V100 at $0.28/hour. Which is cheaper depends on your daily volume: the chart above puts the break-even at roughly 34M tokens/day.

Licensing and fit

Devstral-Small-2-24B-Instruct-2512 is released under the apache-2.0 license, and its open-source weights are publicly available. The context length is not specified here. It is a European model (origin: FR), which is relevant for EU-sovereignty filtering.

Common questions

  • What's the cheapest way to host Devstral-Small-2-24B-Instruct-2512?
    The cheapest API option for Devstral-Small-2-24B-Instruct-2512 in nfer's index is Mistral AI at $0.100/1M input + $0.300/1M output tokens. For self-hosted workloads, the cheapest GPU rental that fits is Verda on V100 at $0.28/hour. The right choice depends on your daily token volume — see the break-even chart on this page.
  • How much VRAM does Devstral-Small-2-24B-Instruct-2512 need?
    Devstral-Small-2-24B-Instruct-2512 needs 25 GB at native precision, which is fp8. Since fp8 is already 8-bit, int8 gives roughly the same footprint, and int4 roughly halves it to about 13 GB, at modest accuracy cost. The common rule of thumb (int8 halves, int4 quarters) is relative to 16-bit weights.
  • Can I use Devstral-Small-2-24B-Instruct-2512 commercially?
    Yes — released under the apache-2.0 license, which permits commercial use.
  • What's the difference between API and GPU rental for Devstral-Small-2-24B-Instruct-2512?
    Token-priced API providers (like Mistral AI) bill per million input/output tokens, which suits low or bursty volume. Renting a GPU (e.g. Verda at $0.28/hour) costs a flat ~$201.60/month when run around the clock ($0.28 × 24 hours × 30 days), regardless of usage, which gives better economics once you sustain enough tokens per day to justify the fixed cost. The break-even chart on this page puts the crossover at roughly 34M tokens/day.
  • Is Devstral-Small-2-24B-Instruct-2512 available with EU data residency?
    Devstral-Small-2-24B-Instruct-2512 is a European model (origin: FR). 4 EU-owned providers offer hosting in nfer's index — filter on EU sovereignty in the comparator to see them.

Prices last updated · 2026-04-30