Question 1

What is nfer?

Accepted Answer

nfer is an AI model deployment cost comparator. Once you've picked a model - Llama 3, Mistral, Qwen, Claude, or another - nfer shows you every place it can run (token-priced API, dedicated/reserved instance, or rented GPU) and tells you which is cheapest for your specific workload, across several providers worldwide.

Question 2

How does the comparison work?

Accepted Answer

You search for a model, set assumptions about your workload (monthly token volume, input/output ratio, cache hit rate, target utilization, quantization), and nfer normalizes every provider's pricing to a single monthly-cost figure. Results are sortable and filterable by sovereignty, certifications, region, license, and pricing type.
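
To make the normalization concrete, here is a minimal sketch of how a token-priced API could be reduced to one monthly figure from the workload assumptions above. It is not nfer's actual formula: the `Workload` and `TokenPricing` types, the `monthly_api_cost` function, and the example prices are all hypothetical, and it assumes the provider bills cached input tokens at a separate (usually lower) rate.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    monthly_tokens: float   # total tokens per month (input + output)
    input_share: float      # fraction of tokens that are input, 0..1
    cache_hit_rate: float   # fraction of input tokens served from cache, 0..1


@dataclass
class TokenPricing:
    input_per_m: float          # USD per million uncached input tokens
    cached_input_per_m: float   # USD per million cached input tokens
    output_per_m: float         # USD per million output tokens


def monthly_api_cost(w: Workload, p: TokenPricing) -> float:
    """Estimated monthly cost in USD for a per-token API (illustrative only)."""
    input_tokens = w.monthly_tokens * w.input_share
    output_tokens = w.monthly_tokens - input_tokens
    cached = input_tokens * w.cache_hit_rate
    uncached = input_tokens - cached
    total = (
        uncached * p.input_per_m
        + cached * p.cached_input_per_m
        + output_tokens * p.output_per_m
    )
    return total / 1_000_000


# Made-up example: 100M tokens/month, 75% input, half of the input cached.
workload = Workload(monthly_tokens=100_000_000, input_share=0.75, cache_hit_rate=0.5)
pricing = TokenPricing(input_per_m=1.0, cached_input_per_m=0.5, output_per_m=4.0)
print(f"{monthly_api_cost(workload, pricing):.2f}")  # prints 156.25
```

Raising the cache hit rate or shifting the ratio toward input lowers the figure, which is why these assumptions change the ranking between providers.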

Question 3

Which deployment options does nfer compare?

Accepted Answer

nfer compares the major deployment paths for AI models: API (per-token, rate set by the provider), dedicated/reserved capacity (a committed instance billed hourly with a fixed throughput ceiling - better economics at sustained volume), GPU rental (hourly hardware like an A100 or H100 that you operate yourself), co-location (your hardware in a third-party data centre), and self-hosting (your hardware on your own premises). Every option is normalized to a monthly cost for the same model, so they compare like-for-like.
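
As a rough illustration of how an hourly-billed option (dedicated capacity or a rented GPU) could be put on the same monthly footing as a per-token API, the sketch below sizes the number of instances from the throughput ceiling and then multiplies out the hours. The function names, the 730-hour month, and all numbers are assumptions for illustration, not nfer's published method.

```python
import math

HOURS_PER_MONTH = 730  # assumption: average month (8760 hours / 12)


def monthly_hourly_cost(
    hourly_rate: float,
    monthly_tokens: float,
    tokens_per_second: float,
    target_utilization: float,
) -> float:
    """Monthly USD cost of hourly-billed capacity sized for the workload."""
    # Tokens one instance can serve in a month at the target utilization.
    usable_tokens = tokens_per_second * 3600 * HOURS_PER_MONTH * target_utilization
    # At least one instance is billed even for a tiny workload.
    instances = max(1, math.ceil(monthly_tokens / usable_tokens))
    return instances * hourly_rate * HOURS_PER_MONTH


def cheapest(options: dict[str, float]) -> str:
    """Name of the option with the lowest monthly cost."""
    return min(options, key=options.get)


# Made-up example: a $2/hour instance with a 1,000 tokens/s ceiling at 50% utilization.
dedicated = monthly_hourly_cost(2.0, 2_000_000_000, 1000, 0.5)
print(dedicated)  # prints 2920.0 (two instances for a full month)
print(cheapest({"api": 3100.0, "dedicated": dedicated}))  # prints dedicated
```

Because capacity is bought in whole instances, the hourly options move in steps as volume grows, while per-token pricing scales linearly. That is where the crossover at sustained volume comes from.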

Question 4

Can I compare two models?

Accepted Answer

Yes - select more than one model in the search bar and the comparator shows results for each side by side. A dedicated A/B view that lines two models up directly is on the roadmap.

Question 5

How can a GPU have a price per million tokens?

Accepted Answer

A GPU's raw price is hourly, not per token. nfer estimates a blended per-million-token rate by combining the GPU's hourly rate with realistic throughput assumptions (model size, quantization, batch size, target utilization). The number is an estimate, not a guarantee - adjust the assumptions on the home page to see how the blended figure changes for your workload.
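
The conversion can be sketched in a few lines. This is a simplified illustration, not nfer's exact calculation: it assumes model size, quantization, and batch size have already been folded into a single sustained `tokens_per_second` figure, and the function name and example numbers are hypothetical.

```python
def blended_rate_per_million(
    hourly_rate: float,
    tokens_per_second: float,
    target_utilization: float,
) -> float:
    """Estimated USD per million tokens for hardware billed by the hour."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Tokens actually served per billed hour at the target utilization.
    tokens_per_hour = tokens_per_second * 3600 * target_utilization
    return hourly_rate / tokens_per_hour * 1_000_000


# Made-up example: a $3.60/hour GPU sustaining 2,000 tokens/s.
print(round(blended_rate_per_million(3.60, 2000, 0.5), 4))  # prints 1.0
print(round(blended_rate_per_million(3.60, 2000, 1.0), 4))  # prints 0.5
```

Halving utilization doubles the blended rate, since the hourly bill stays the same while fewer tokens are served. That sensitivity is why the figure should be read as an estimate tied to its assumptions.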

Question 6

Why might my preferred provider show a different price than their public site?

Accepted Answer

Two common reasons: (1) for GPU and reserved capacity, provider pages quote hourly rates while nfer translates them to a workload-blended per-token figure for like-for-like comparison; (2) providers frequently run promotional credits or volume discounts that aren't in their public price list. If a price looks wrong, please send the provider URL - we'll investigate as soon as possible.

Question 7

How often are prices updated?

Accepted Answer

Prices are synced from each provider's public pricing source. You can find the latest update time on every model card in the comparator.

Question 8

Where do prices come from?

Accepted Answer

Public, provider-published pricing pages and API price lists. We do not negotiate private pricing.

Question 9

How accurate are the prices? What can I rely on?

Accepted Answer

We pull prices directly from each provider's public pricing pages and aim to match them faithfully. Reserved/committed pricing is approximated when providers don't publish a clear per-hour equivalent - in those cases the page says so explicitly. Always verify the headline number against the provider's own page before signing a contract.

Question 10

Why is provider X or model Y missing? How can I add one?

Accepted Answer

Coverage expands continuously. If something you need is missing, email info@nfercost.com with the provider and model - most additions take under a day once we have a stable pricing source URL.

Question 11

Can I export or query the data?

Accepted Answer

Not yet via a public API. A paid price-tracker API is on the roadmap for teams that need programmatic access. Until it ships, email info@nfercost.com if you need a one-off export.

Question 12

Who built nfer? How can I contact you?

Accepted Answer

nfer is built by M3T, an AI company building distinctive products. Email info@nfercost.com or book a free 30-minute consultation via the link on the home page. We aim to reply to every message quickly.
