Chrome’s Prompt API vs Mozilla: What It Means for You

Mozilla just reiterated its opposition to Chrome's Prompt API. Here's what the API actually does, how to test it, and why Firefox won't ship it.

Two ways to read the news that just dropped: (1) Mozilla is being a sore loser about a Chrome feature that’s already shipping, or (2) Mozilla is the only browser vendor publicly worrying about whether your AI app will only work in Chrome. The second framing is more useful, even if you’re a Chrome developer – because it tells you exactly what to test for. If your app prompts Gemini Nano in a way that breaks on a different model, you’ve built a Chrome-only experience whether you meant to or not.

This piece is a tutorial, not a hot take. Mozilla’s opposition to Chrome’s Prompt API matters because it changes how you should write code – not whether you can. Below: what the API actually does, how to turn it on in 10 minutes, and the gotchas other tutorials skip.

What just happened (the 30-second version)

On April 30, 2026, Mozilla reiterated its opposition to the Prompt API on a GitHub thread. Jake Archibald, Mozilla’s web developer relations lead, articulated the org’s concerns, saying the API has “severe negative consequences to the interoperability, updatability, and neutrality of the web platform.”

Rick Byers, the Google Chrome engineer responsible for shipping the Prompt API, replied that he shares some of Mozilla’s concerns but prefers paths that “promote experimentation, learning from mistakes, and competition” over stalling innovation. Both are right. That’s why this is awkward.

The Prompt API in one paragraph

It’s a JavaScript API that gives web pages the ability to directly prompt a browser-provided language model – specifically Google’s Gemini Nano, downloaded for local inference through Chrome. No API key. No network round trip. No bill. The catch: it only exists in Chrome and Chromium-based browsers like Edge (as of mid-2026), and that’s the whole reason Mozilla is upset.

Why Mozilla won’t just ship it in Firefox

The shallow reading is “Mozilla doesn’t want to lose to Google.” The actual technical argument is more interesting. From Mozilla’s GitHub issue #1213: the system prompt becomes tailored to the model, other models have different quirks, and things added to the system prompt for one model may overcorrect on another.

Here’s what that means in practice. Spend a week tuning prompts against Gemini Nano – adjusting temperature hints, system-prompt phrasing, output format instructions – and those tweaks are essentially compiled against Gemini’s tokenizer and instruction-following patterns. Port that to Llama or Phi and you’re not starting from scratch, but you’re not done either. Mozilla’s Brian Grinstead made the downstream point explicit (via The Register, June 2025): web developers will create apps based on Gemini’s behavior, and since those apps may behave differently with a different model, those developers will simply recommend Chrome. That’s lock-in – not through malice, but through accumulated prompt-engineering debt.

Pro tip: Even if you’re shipping a Chrome-only feature today, write your prompts as if a different model will run them tomorrow. Avoid quirks that depend on Gemini Nano’s specific tokenizer or system-prompt format. Your future self (and any user on Firefox) will thank you.
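
One way to act on that tip, sketched under assumptions: state the output contract in plain words inside the prompt, then validate whatever comes back instead of trusting one model's formatting habits. The names `LABELS`, `buildSentimentPrompt`, and `parseSentiment` are hypothetical helpers for illustration, not part of the Prompt API.

```javascript
// Hypothetical helper names; nothing here is part of the Prompt API itself.
const LABELS = ['positive', 'negative', 'neutral'];

// Spell out the task and the output contract in plain language instead of
// relying on the formatting habits of one particular model.
function buildSentimentPrompt(text) {
  return [
    'Classify the sentiment of the text below.',
    `Answer with exactly one word: ${LABELS.join(', ')}.`,
    '',
    `Text: ${text}`,
  ].join('\n');
}

// Tolerate harmless variations (case, punctuation, extra words) and
// return null when no single known label can be recovered.
function parseSentiment(raw) {
  const words = String(raw).toLowerCase().match(/[a-z]+/g) ?? [];
  const found = LABELS.filter((label) => words.includes(label));
  return found.length === 1 ? found[0] : null;
}
```

A `null` result is the signal to retry or fall back, which keeps the calling code indifferent to whether Gemini Nano or some other model produced the text.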

Trying it yourself in 10 minutes

Before you start, check that your machine qualifies. According to Chrome’s official setup docs (mid-2026 – verify for updates), the Prompt API works on Windows 10 or 11, macOS 13+, Linux, or ChromeOS on Chromebook Plus devices, with at least 22 GB of free space on the volume containing your Chrome profile. Mobile Chrome is not supported. For hardware: GPU with strictly more than 4 GB of VRAM, or CPU with 16 GB of RAM and 4+ cores.
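
The requirements above boil down to a small rule, written out here as a pre-flight checklist. The function and field names are hypothetical, and page scripts cannot read these hardware numbers from the browser, so treat this as a way to reason about support tickets rather than something to run on user machines.

```javascript
// Thresholds copied from the requirements quoted above (mid-2026).
// Function and field names are hypothetical.
function meetsPromptApiRequirements({
  freeDiskGB,
  vramGB = 0,
  ramGB = 0,
  cpuCores = 0,
  mobile = false,
}) {
  if (mobile) return false;          // mobile Chrome is not supported
  if (freeDiskGB < 22) return false; // at least 22 GB free on the profile volume
  const gpuOk = vramGB > 4;          // strictly more than 4 GB of VRAM
  const cpuOk = ramGB >= 16 && cpuCores >= 4;
  return gpuOk || cpuOk;             // either path is enough
}
```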

  1. Install Chrome 138 or newer – verify the current minimum at developer.chrome.com/docs/ai/get-started, as this may change.
  2. Open chrome://flags/#optimization-guide-on-device-model and set it to Enabled BypassPerfRequirement.
  3. Open chrome://flags/#prompt-api-for-gemini-nano and set it to Enabled.
  4. Relaunch Chrome.
  5. Visit chrome://on-device-internals to confirm the model is downloading or ready.

Then, on any localhost page, drop this in the console:

const availability = await LanguageModel.availability();
console.log(availability); // 'available', 'downloadable', 'downloading', or 'unavailable'

const session = await LanguageModel.create({
  monitor(m) {
    m.addEventListener('downloadprogress', e => {
      console.log(`Downloading: ${Math.round(e.loaded * 100)}%`);
    });
  }
});

const response = await session.prompt('Summarize the plot of Hamlet in one sentence.');
console.log(response);

session.destroy(); // free GPU/RAM when done
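
Outside the console, a failed prompt would skip that last `destroy()` call. A minimal sketch of a wrapper that always cleans up is below; `promptOnce` is a hypothetical name, and passing the model object in as `lm` (rather than reading the `LanguageModel` global directly) is a choice made here so the function can be exercised with a fake.

```javascript
// `lm` is any object exposing create(); in Chrome that would be the
// LanguageModel global. Passing it in is a choice made for this sketch.
async function promptOnce(lm, text) {
  const session = await lm.create();
  try {
    return await session.prompt(text);
  } finally {
    session.destroy(); // runs even when prompt() throws
  }
}
```

In a page you would call it as `await promptOnce(LanguageModel, 'Summarize the plot of Hamlet in one sentence.')`.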

That’s it. Per Thinktecture Labs’ testing, inference for a small extraction task takes about 4 seconds on an M1 MacBook Pro – fast enough for non-blocking UI work, slow enough that you don’t want it on a hot keystroke loop.

The gotchas no one warns you about

Most tutorials show the happy path. Here’s what actually trips real apps.

  • The 10 GB ghost-deletion. Per Chrome’s official docs: if your free storage falls below 10 GB after the model is downloaded, the model is removed from your device and redownloaded once the requirements are met. Your app will silently flip from available back to downloadable. Re-check availability before creating each session, not just once at page load.
  • Audio input demands a GPU. CPU support for Gemini Nano rolled out in Chrome 140, but the Prompt API with audio input still requires a GPU (Chrome official docs, mid-2026). A user on a CPU-only laptop can text-prompt fine and then hit a wall the moment you add microphone input.
  • The context window is small. Per community testing (mid-2026): roughly 4K input tokens and 1K output. Truncate aggressively. Don’t paste a 50K-token JSON document and expect a useful answer.
  • Languages are limited. From Chrome 140: Gemini Nano supports English, Spanish, and Japanese for input and output text. Anything else returns a NotSupportedError. (Check official docs – language support may expand.)
  • No headless CI. You cannot run the on-device model in headless Chromium yet (as of mid-2026). Mock the LanguageModel global in tests and run the cloud fallback path in CI.
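
Three of those gotchas fit in one small sketch: re-check availability right before each session, trim input to a rough budget, and keep a fake model around for headless test runs. Everything named here is hypothetical. The 4-characters-per-token ratio is a crude heuristic for English text, and the 4,000-token budget is the community estimate quoted above, not an official limit.

```javascript
// Assumptions: ~4 characters per token (rough, English only) and a
// 4,000-token input budget taken from community testing.
const APPROX_CHARS_PER_TOKEN = 4;
const INPUT_TOKEN_BUDGET = 4000;

// Cut the input to a rough token budget, keeping the beginning.
function truncateToBudget(text, tokenBudget = INPUT_TOKEN_BUDGET) {
  const maxChars = tokenBudget * APPROX_CHARS_PER_TOKEN;
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

// Re-check availability right before creating a session, because the
// model can be evicted after page load. Returns null when the caller
// should take its fallback path instead.
async function promptIfAvailable(lm, text) {
  if (!lm || (await lm.availability()) !== 'available') return null;
  const session = await lm.create();
  try {
    return await session.prompt(truncateToBudget(text));
  } finally {
    session.destroy();
  }
}

// A fake exposing the same methods used above, for headless test runs.
function makeFakeLanguageModel(state = 'available') {
  return {
    availability: async () => state,
    create: async () => ({
      prompt: async (input) => `fake reply (${input.length} chars)`,
      destroy() {},
    }),
  };
}
```

In CI you would pass `makeFakeLanguageModel('unavailable')` to drive the fallback branch and `makeFakeLanguageModel()` to drive the on-device branch, without needing the real model.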

Prompt API vs the cloud: when each one wins

Short tasks – sentiment classification, name extraction, paragraph summarization – are where the Prompt API earns its keep. Zero cost, no data leaving the device, no network round trip. The moment you need multi-step reasoning or a 100K-token document processed, reach for a cloud API instead. There’s no tension between the two: use the on-device path opportunistically when it’s available, fall back to cloud when it isn’t.

| Concern | Chrome Prompt API | Cloud API (OpenAI, Anthropic, etc.) |
| --- | --- | --- |
| Cost per call | $0 | Per-token |
| Privacy | Local, no data leaves device | Sent to vendor |
| Context window | ~4K in / 1K out (mid-2026) | 128K-2M depending on model |
| Quality on reasoning | Limited (small model) | Strong |
| Availability | Chrome 138+ desktop only, 22 GB free | Anywhere with HTTPS |
| Cross-browser | Chrome/Edge today; Firefox opposes | Browser-agnostic |
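
Read as a routing rule, the comparison might look something like the sketch below. The option names and the decision order are illustrative assumptions, and the 4,000-token ceiling is the rough community figure from the comparison, so tune it against your own measurements.

```javascript
// Rough ceiling taken from the comparison above; option names and the
// decision rule itself are illustrative assumptions.
const ON_DEVICE_MAX_INPUT_TOKENS = 4000;

function chooseBackend({
  availability,
  estimatedInputTokens,
  needsMultiStepReasoning = false,
}) {
  if (availability !== 'available') return 'cloud'; // model missing or still downloading
  if (needsMultiStepReasoning) return 'cloud';      // small model, limited reasoning
  if (estimatedInputTokens > ON_DEVICE_MAX_INPUT_TOKENS) return 'cloud';
  return 'on-device';
}
```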

What this actually changes for your roadmap

Mozilla’s pushback won’t stop the API from shipping. Their official position – posted to Mastodon – is that Google imposing T&Cs on a web API sets a dangerous precedent, and the interoperability risk is too large. But Chrome and Edge already ship it, and that’s enough market share for sites to use it.

So the practical move is feature detection plus fallback. Check typeof LanguageModel !== 'undefined'. If yes, use it. If no, route to your cloud API. Don’t gate the whole feature on Chrome – that’s the exact pattern Mozilla is warning about, and it’s also just bad product.
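
Here is a minimal sketch of that detect-then-fall-back pattern. `summarize` is a hypothetical function, and `callCloudApi` is a stand-in for whatever client your app already uses to reach a hosted model; neither is part of any browser API.

```javascript
// `callCloudApi` is a hypothetical stand-in: an async function that
// takes a prompt string and resolves to the model's text reply.
async function summarize(text, callCloudApi) {
  const prompt = `Summarize in one sentence:\n${text}`;

  // Feature detection: the global is simply absent in browsers that
  // do not ship the Prompt API.
  if (typeof LanguageModel !== 'undefined') {
    try {
      if ((await LanguageModel.availability()) === 'available') {
        const session = await LanguageModel.create();
        try {
          return { source: 'on-device', text: await session.prompt(prompt) };
        } finally {
          session.destroy();
        }
      }
    } catch (err) {
      // Any on-device failure falls through to the cloud path.
      console.warn('On-device prompt failed, using fallback:', err);
    }
  }
  return { source: 'cloud', text: await callCloudApi(prompt) };
}
```

Returning the `source` alongside the text is a design choice for this sketch: it lets you log how often each path runs, which tells you how much of your traffic the on-device model actually serves.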

FAQ

Is the Prompt API safe to use in production today?

For Chrome Extensions, yes – it’s stable as of mid-2026. For web pages, you need an Origin Trial token, or accept that users without the flag enabled won’t get the feature. Always ship a fallback.

Why doesn’t Mozilla just ship the same API with a different model?

They could technically – but the interoperability problem described in the “Why Mozilla won’t just ship it” section above is exactly why they won’t. The API would be “the same” in name but produce different outputs per browser. Mozilla’s counter-proposal is purpose-built APIs (Translator, Summarizer) where the task is specific enough that model differences don’t surface in the output.

Can I use the Prompt API on mobile?

No. As of this writing, it isn’t supported in Chrome on Android or iOS, or on ChromeOS devices that aren’t Chromebook Plus.

Next step: Open chrome://on-device-internals right now and check whether the model is already on your machine. If it is, paste the snippet above into DevTools and run one prompt. You’ll know in 30 seconds whether this fits your project – and that’s faster than reading another opinion piece about who’s right.