How to Use Chrome’s New Built-in AI
Chrome’s new Prompt API represents a significant advancement in browser-based artificial intelligence capabilities. Developers can now implement AI features directly in Chrome without complex server-side setups. The Prompt API offers easy access to machine learning capabilities while maintaining control over performance and security standards.
Chrome is leading the way with state-of-the-art browser AI technology through its integration of Gemini Nano. The Prompt API for Gemini Nano provides developers with essential tools for natural language processing, content generation, and sentiment analysis directly in the browser. This piece shows you how to set up, implement, and use Chrome’s built-in AI capabilities effectively.
Understanding Chrome’s Prompt API
The Prompt API represents a fundamental shift in browser-based AI implementation. It can handle diverse tasks including classification, text composition, summarization, and translation. Developers can use this API to create text sessions and control model responses through customizable parameters like temperature and topK values.
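To make the session parameters concrete, here is a minimal sketch of creating a session with custom sampling values. It assumes the early-preview `ai.languageModel` surface described later in this article; the exact names (`defaultTopK`, `temperature`, `topK`) follow that preview API and may change between Chrome versions.

```javascript
// Sketch: create a Prompt API session with custom sampling parameters.
// Assumes the early-preview ai.languageModel API; names may change.
async function createTunedSession(ai, temperature = 0.8) {
  const caps = await ai.languageModel.capabilities();
  // temperature controls randomness; topK limits sampling to the
  // K most likely tokens. Reuse the model's default topK here.
  return ai.languageModel.create({ temperature, topK: caps.defaultTopK });
}
```

Higher temperatures produce more varied output; lower values make responses more deterministic.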
Benefits of built-in AI in Chrome
Chrome’s built-in AI has several benefits:
- Enhanced Privacy Protection: Local processing keeps sensitive data on your device; prompts never need to be transmitted to a server.
- Improved Performance: Response times and latency improve significantly because there are no server round-trips.
- Resource Optimization: The browser handles model distribution and updates based on the user’s device capabilities and hardware acceleration.
- Offline Functionality: AI capabilities keep working even without an internet connection.
Comparison with server-side AI solutions
While server-side AI solutions excel in handling complex models and supporting diverse platforms, built-in AI presents distinct advantages for specific use cases. The choice between client-side and server-side implementation depends on several factors:
| Aspect | Built-in AI | Server-side AI |
|---|---|---|
| Complexity | Optimal for specific, targeted use cases | Better for complex, large-scale operations |
| Availability | Higher availability with offline support | Dependent on network connectivity |
| Resource Usage | Leverages device hardware efficiently | Requires substantial server resources |
| Model Size | Optimized for smaller, efficient models | Supports larger, more complex models |
Task complexity, resilience needs, and hardware capabilities play crucial roles in making this choice. Developers often get the best results with a hybrid approach that combines built-in and server-side AI capabilities based on what each specific case needs.
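The hybrid approach can be sketched as a simple capability check: use the on-device model when it is ready, otherwise fall back to server-side inference. The `ai.languageModel` surface follows the early-preview API, and the server endpoint URL is a placeholder assumption.

```javascript
// Sketch of a hybrid strategy: prefer on-device inference when available,
// fall back to a server endpoint otherwise. The endpoint URL and response
// shape ({ text }) are illustrative assumptions, not a real API.
async function generate(ai, prompt, serverUrl = "/api/generate") {
  const caps = ai?.languageModel
    ? await ai.languageModel.capabilities()
    : { available: "no" };

  if (caps.available === "readily") {
    // Local path: no network round-trip, data stays on the device.
    const session = await ai.languageModel.create();
    return session.prompt(prompt);
  }

  // Fallback path: server-side inference.
  const res = await fetch(serverUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  return (await res.json()).text;
}
```

This keeps the calling code identical regardless of where inference actually runs.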
Setting Up the Prompt API Environment
The Prompt API is not yet publicly available; it is offered through Chrome’s Early Preview Program and requires specific technical configuration and minimum system requirements to work.
Hardware and software requirements
At the time of writing, the Prompt API is available (behind an experimental flag) from Chrome 128.0.6545.0+ on desktop platforms. The minimum system requirements vary by operating system:
| Operating System | Requirements |
|---|---|
| Windows | Windows 10 or later |
| macOS | macOS Ventura 13 and up |
| Linux | Not specified |
Additionally, the local machine needs 4+ GB of video RAM and 22+ GB of free storage on the volume containing your user profile.
Enabling Gemini Nano and Prompt API flags
The configuration process requires specific steps in Chrome Canary (version 128.0.6545.0 or higher):
1. Go to `chrome://flags/#optimization-guide-on-device-model` and select “Enabled BypassPerfRequirement”.
2. Go to `chrome://flags/#prompt-api-for-gemini-nano` and set it to “Enabled”.
3. Relaunch Chrome after changing each flag.
Developers should follow these key steps to verify their setup:

1. Go to `chrome://components/` and locate “Optimization Guide On Device Model”. The version should be 2024.5.21.1031 or greater. If no version is listed, click “Check for update” to force the download.
2. Once the model has downloaded and reached a version at or above the one shown, open DevTools and run `(await ai.languageModel.capabilities()).available` in the console. If this returns `"readily"`, you are all set.
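The same verification can be done defensively in application code. This sketch wraps the capability check in a function that also handles browsers where the API is missing entirely; the `ai.languageModel` names follow the early-preview surface and may change.

```javascript
// Sketch: feature-detect the Prompt API before using it.
// Returns "readily", "after-download", "no", or "unavailable"
// (the last one meaning the API is not exposed at all).
async function checkPromptApiAvailability(ai) {
  if (!ai?.languageModel) return "unavailable";
  const { available } = await ai.languageModel.capabilities();
  // "readily" = model downloaded; "after-download" = model will be
  // fetched on first use; "no" = unsupported on this device.
  return available;
}
```

Calling it as `checkPromptApiAvailability(window.ai)` lets the page degrade gracefully on unsupported browsers.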
If issues arise during verification, developers can try these solutions:
- Disable and re-enable the configured flags
- Restart the system to ensure proper initialization
The Prompt API is an exploratory API designed to support prototyping and experimentation. For Chrome Extensions, developers can test the API with real users in production through an origin trial running from Chrome 131 to Chrome 136.
Implementing AI Features with the Built-in AI
Developers can now use Chrome’s built-in AI capabilities through the Prompt API, following structured implementation patterns and established best practices.
Basic usage and syntax
The Prompt API implementation requires a text session and prompt management. A simple implementation needs API initialization and response handling:
const session = await window.ai.createTextSession();
const result = await session.prompt('Write me a poem.');
The API supports both single-shot and streaming responses. Performance measurements show that long text-generation prompts complete in 3–4 seconds on average:
Execution Time 1: 0h 0m 3s 47ms
Execution Time 2: 0h 0m 3s 870ms
Execution Time 3: 0h 0m 2s 355ms
Execution Time 4: 0h 0m 3s 176ms
Execution Time 5: 0h 0m 7s 103ms
Average Session Execution Time: 0h 0m 3s 910.2ms
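With multi-second latencies like these, streaming lets users see output as it is generated instead of waiting for the full response. This sketch assumes the early-preview `promptStreaming()` method, which returns an async-iterable stream of text chunks; depending on the Chrome version, each chunk may be a delta or the full text so far.

```javascript
// Sketch: consume a streaming response chunk by chunk.
// session.promptStreaming() follows the early-preview API surface.
async function streamResponse(session, prompt, onChunk) {
  for await (const chunk of session.promptStreaming(prompt)) {
    onChunk(chunk); // e.g. append to the DOM as text arrives
  }
}
```

A typical call would be `streamResponse(session, 'Write me a poem.', text => output.append(text))`.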
Handling responses and errors
The API provides built-in mechanisms for managing token limitations and session states:
| Error Type | Handling Strategy |
|---|---|
| QuotaExceededError | Monitor token usage with `session.tokensSoFar` |
| Session Timeout | Implement automatic session renewal |
| Invalid Response | Implement fallback mechanisms |
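These strategies can be combined into a small wrapper around `session.prompt()`. The `QuotaExceededError` name follows the table above; the fallback message is an illustrative placeholder.

```javascript
// Sketch of defensive prompting: catch token-quota exhaustion and
// return a fallback instead of crashing the UI.
async function safePrompt(session, text, fallback = "Sorry, please try again later.") {
  try {
    return await session.prompt(text);
  } catch (err) {
    if (err.name === "QuotaExceededError") {
      // Token budget exhausted: the caller should start a fresh session.
      return fallback;
    }
    throw err; // surface unexpected errors to the caller
  }
}
```

In practice the caller could also recreate the session on quota errors rather than returning static text.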
Best practices for prompt engineering
- Specificity in Prompts: Craft detailed, unambiguous prompts that clearly define the expected output.
- Context Management:
  - Utilize contextual information to improve response quality
  - Define tone and perspective explicitly
  - Provide relevant examples when necessary
const session = await ai.languageModel.create({
initialPrompts: [
{ role: "system", content: "You are a helpful assistant that completes sentences based on the given context." },
{ role: "user", content: "The capital of France is" },
{ role: "assistant", content: "Paris." },
{ role: "user", content: "The largest planet in our solar system is" },
{ role: "assistant", content: "Jupiter." },
{ role: "user", content: "The author of 'To Kill a Mockingbird' is" },
{ role: "assistant", content: "Harper Lee." }
]
});
// Clone an existing session for efficiency, instead of recreating one each time.
async function completeSentence(prompt) {
const freshSession = await session.clone();
return await freshSession.prompt(prompt);
}
const result1 = await completeSentence("The inventor of the telephone is");
const result2 = await completeSentence("The chemical symbol for gold is");
- Performance Optimization:
  - Monitor token usage through `session.countPromptTokens()`
  - Implement streaming for long-form content
Real-World Applications and Use Cases
Web browsers with built-in language models open up numerous practical applications across various domains. Developers can use these capabilities to improve user experiences and handle complex text-processing tasks more efficiently.
The Prompt API stands out in basic NLP operations, especially for classification and extraction tasks. You can implement advanced text-processing workflows through a simplified API interface that handles:
- Text classification with customizable categories
- Information extraction from unstructured content
- Language detection and translation (Language Detector API and Translator API)
- Automated content moderation
- Text summarization (Summarizer API)
- Content writing and paraphrasing (Writer and Rewriter APIs)
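As an example of the classification use case, here is a minimal sentiment classifier built on the Prompt API. It assumes the early-preview `ai.languageModel` surface shown earlier; the system prompt wording and the `destroy()` cleanup call follow that preview API and are illustrative.

```javascript
// Sketch: sentiment classification via a system prompt.
// Assumes the early-preview ai.languageModel API.
async function classifySentiment(ai, text) {
  const session = await ai.languageModel.create({
    initialPrompts: [
      {
        role: "system",
        content:
          "Classify the sentiment of the user's text as exactly one word: positive, negative, or neutral.",
      },
    ],
  });
  const label = (await session.prompt(text)).trim().toLowerCase();
  session.destroy(); // free on-device model resources when done
  return label;
}
```

Constraining the model to a fixed label set in the system prompt keeps the output easy to parse programmatically.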
Performance metrics indicate that sentiment analysis tasks typically complete within 4 seconds on standard hardware configurations. The API supports both streaming and non-streaming implementations, allowing developers to choose the most appropriate approach based on their specific use case requirements.
Conclusion
Chrome’s Prompt API represents a fundamental change in browser-based AI development. This new approach brings sophisticated machine learning capabilities right to the client side. Technical implementations show clear advantages through local processing, lower latency, and improved privacy protection. Developers now have reliable tools to build AI-powered features directly within Chrome, thanks to structured token management and robust error-handling mechanisms.
What are you building next with Chrome Built-in APIs?
Be sure to check out more such insightful blogs in my AI Language Lab: Unpacking the Secrets of AI's Language Giants series, for a deeper dive into LLMs. Stay tuned and keep learning!
FAQs
Q: How can I enable AI features in Google Chrome?
A: Join Chrome’s Early Preview Program, then enable the flags that control model download and access. Once the model is available, you can use the built-in AI APIs to access it.
Q: Is there an AI tool available in Google Chrome?
A: Yes, Google Chrome incorporates AI technology and brings an LLM directly to the local client. Developers can access it through the built-in AI APIs.
Q: What is the Chrome Built-in AI (Prompt API) and how does it work?
A: The Chrome Built-in AI, known as the Prompt API, is part of an exploratory initiative aiming to establish a cross-browser standard for embedded AI. It uses the Gemini Nano model, integrating AI directly within the browser so that large language model (LLM) operations run locally in your browser environment.