Getting started with the Gemini API and Web Apps
Learn how to use the Gemini API and the Google AI JavaScript SDK to prototype generative AI for web apps. Use the Google AI JavaScript SDK to make your first generative AI call using the Gemini API in your client-side web application. Explore a sample application and learn how to make multimodal prompts that combine images and text.
Introduction to the Gemini API and prompt engineering
Explore Google AI Studio and the capabilities of the Gemini generative AI model. Learn how to design and test the different types of prompts (freeform, structured, and chat), get an API key, and build a simple Node.js application.
This pathway is useful for further experimentation with Gemini and lays the groundwork for integrating its features into a web application. Optionally, you can also try out the Gemini API using a simple Node.js web application. Feel free to skip this step and return to client-side web development in this pathway.
There are additional security considerations for using API keys in mobile and web client applications. For client apps that move beyond prototyping into production, migrate to Vertex AI for Firebase.
Try out the Gemini API template in Project IDX
Try out the Gemini API template in Project IDX to quickly get started and experiment with a JavaScript-based web app that uses GenAI. The template contains a fully functioning app for you to quickly prototype with the Gemini API on the web.
Project IDX is an experimental, web-based integrated development environment. It supports a variety of frameworks, including development for both web and cross-platform applications. It is currently available in Public Preview.
The template uses the Vite framework to build a web app that makes multimodal prompts to the Gemini API using the Google AI SDK directly or using LangChain.
To get started, follow the steps to create a new workspace using the "Gemini API" template. Select the "JavaScript Web App" environment and follow the guide to add your Google AI API key and run the application.
Introduction to the Google AI JavaScript SDK
The Google AI JavaScript SDK enables you to build your generative AI integration with the Google AI Gemini API.
Note that calling the Gemini API directly from a client-side web app using the Google AI JavaScript SDK is only recommended for prototyping. For production and enterprise-scale apps, access the Gemini API server-side to keep your API key safe or migrate to Vertex AI for Firebase instead.
To get started with the SDK, set up a project in Google AI Studio, which includes obtaining an API key for the Gemini API. Next, add the required dependency for the SDK to your build configuration or import it directly using @google/generative-ai. Then, you can initialize the library with your API key and make your first API call.
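The initialization and first call described above might look roughly like the following. This is a minimal sketch, not the official quickstart: it assumes the @google/generative-ai package is installed and resolvable (for example through a bundler such as Vite), and the model name "gemini-1.5-flash" and the helper names createModel and generateText are illustrative choices.

```javascript
// Assumption: "gemini-1.5-flash" is a model available to your API key.
const MODEL_NAME = "gemini-1.5-flash";

// Creates a model handle from an API key (hypothetical helper).
// The dynamic import keeps this snippet usable from ES modules and CommonJS.
async function createModel(apiKey, modelName = MODEL_NAME) {
  const { GoogleGenerativeAI } = await import("@google/generative-ai");
  const genAI = new GoogleGenerativeAI(apiKey);
  return genAI.getGenerativeModel({ model: modelName });
}

// Sends a text-only prompt and returns the generated text (hypothetical helper).
async function generateText(model, prompt) {
  const result = await model.generateContent(prompt);
  return result.response.text();
}

// Example usage (prototyping only; do not ship an API key in client code):
// const model = await createModel("YOUR_API_KEY");
// console.log(await generateText(model, "Write a haiku about the web."));
```

Keeping the model creation separate from the call makes it easier to swap in a different model name or a different backend later.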
Explore the JavaScript sample app
Explore more advanced use cases for the Google AI JavaScript SDK with the sample app on GitHub.
This example app demonstrates three key use cases in more detail: generating text, photo reasoning (using multimodal inputs), and multi-turn conversations (chat). It also shows how to use content streaming to improve response time by displaying partial results.
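As a rough illustration of the streaming and chat use cases mentioned above (this is not code from the sample app), the sketch below assumes a model object created with the Google AI JavaScript SDK's getGenerativeModel. The helper names streamText and runChat are hypothetical.

```javascript
// Streams a response and passes each partial piece of text to onChunk,
// so the UI can render results before the full answer is ready.
async function streamText(model, prompt, onChunk) {
  const result = await model.generateContentStream(prompt);
  let fullText = "";
  for await (const chunk of result.stream) {
    const piece = chunk.text();
    fullText += piece;
    onChunk(piece);
  }
  return fullText;
}

// Sends several messages within one chat session and collects the replies.
// The session keeps earlier turns, so later messages can refer back to them.
async function runChat(model, messages) {
  const chat = model.startChat({ history: [] });
  const replies = [];
  for (const message of messages) {
    const result = await chat.sendMessage(message);
    replies.push(result.response.text());
  }
  return replies;
}

// Example usage, given a `model` from the SDK:
// await streamText(model, "Tell me a story.", (piece) => console.log(piece));
// await runChat(model, ["Hi, I have two dogs.", "How many paws is that?"]);
```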
Follow the steps in the README to get started. They include configuring your Google AI API key and providing it to the HTTP server included in this sample app.
Multimodal prompting using the Google AI JavaScript SDK
Multimodal prompts combine different types of media, such as text, images, and audio. For example, you could create prompts that identify objects in an image, extract text from a photo, or ask a question that refers to a picture.
To get started, read this guide about file prompting strategies and multimodal concepts, which includes best practices for designing multimodal prompts.
Next, explore the multimodal capabilities of the Gemini models in Google AI Studio by uploading or selecting a file as part of your prompt.
Learn how to use multimodal inputs with the Google AI JavaScript SDK, find the image requirements for prompts, and explore the multimodal image reasoning demo in the sample app.
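A multimodal request with the SDK could be sketched as follows, assuming the image is passed inline as base64 data alongside the text prompt. The helper names toImagePart, fileToImagePart, and describeImage are hypothetical, and fileToImagePart relies on the browser's FileReader, so it only runs client-side.

```javascript
// Wraps base64-encoded image bytes in the inline-data part shape.
function toImagePart(base64Data, mimeType) {
  return { inlineData: { data: base64Data, mimeType } };
}

// Reads a browser File (for example from <input type="file">) into an image part.
function fileToImagePart(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    // reader.result is a data URL of the form "data:<mime>;base64,<data>".
    reader.onload = () => resolve(toImagePart(reader.result.split(",")[1], file.type));
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file);
  });
}

// Sends the text prompt and the image together in a single request.
async function describeImage(model, promptText, imagePart) {
  const result = await model.generateContent([promptText, imagePart]);
  return result.response.text();
}

// Example usage in a browser, given a `model` from the SDK:
// const part = await fileToImagePart(fileInput.files[0]);
// console.log(await describeImage(model, "What objects are in this photo?", part));
```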
For further reading, see the solution Leveraging the Gemini Pro Vision model for image understanding, multimodal prompts and accessibility.
Launch your app and move to production with Vertex AI for Firebase
When you are ready to launch your application in production, transition to Vertex AI for Firebase and its SDK for JavaScript.
When you want to call the Gemini API client-side from your web app, we recommend using Vertex AI for Firebase. It offers enhanced security features for mobile and web apps, including integration with Google Cloud IAM for API authorization and Firebase App Check for integrity verification. It integrates with Google's wider Firebase developer platform (like Cloud Storage and Remote Config), while also giving you access to the tools, workflows and scale offered through Google Cloud. Among other differences, Vertex AI also supports increased request quotas and enterprise features.
Note that using the Google AI JavaScript SDK to call the Gemini API directly from a mobile or web client is only recommended for prototyping and experimentation. Follow this guide to migrate to the Vertex AI for Firebase SDK by updating your package dependencies and imports and by changing how the AI model is initialized.
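To give a sense of what changes in the initialization, here is a hedged sketch rather than the migration guide itself. It assumes the firebase package is installed, that the preview SDK is exposed under the module path "firebase/vertexai-preview" (this path may differ by release, so check the current docs), and that firebaseConfig comes from your Firebase project. The helper names are hypothetical.

```javascript
// Loads the Firebase modules needed for the migration.
// Assumption: "firebase/vertexai-preview" is the preview-era module path.
async function loadFirebaseSdk() {
  const { initializeApp } = await import("firebase/app");
  const { getVertexAI, getGenerativeModel } = await import("firebase/vertexai-preview");
  return { initializeApp, getVertexAI, getGenerativeModel };
}

// Initializes the model through Firebase instead of passing an API key
// directly to the Google AI SDK. The model name is an illustrative choice.
function createVertexModel(sdk, firebaseConfig, modelName = "gemini-1.5-flash") {
  const app = sdk.initializeApp(firebaseConfig);
  const vertexAI = sdk.getVertexAI(app);
  return sdk.getGenerativeModel(vertexAI, { model: modelName });
}

// Example usage:
// const sdk = await loadFirebaseSdk();
// const model = createVertexModel(sdk, firebaseConfig);
// const result = await model.generateContent("Hello from Firebase");
// console.log(result.response.text());
```

Once the model is created, calls such as generateContent are intended to look much the same as with the Google AI SDK, which keeps the rest of the app code largely unchanged.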
Vertex AI for Firebase is currently in public preview. Make sure to check the docs for the current release status of the SDKs since the team is committed to making these SDKs generally available as soon as possible!