Getting started with the Gemini API and Swift

Learn how to use the Gemini API and the Google AI Swift SDK to prototype generative AI in Swift. Use the Google AI Swift SDK to make your first generative AI call using the Gemini API in your application, explore a sample application, and learn how to make multimodal prompts that combine image and text.

 

Introduction to the Gemini API and prompt engineering

Pathway

The Gemini API and the Google AI SDKs are recommended for prototyping and exploring the Gemini generative AI models. For production or enterprise-scale apps, use Vertex AI for Firebase instead. It offers additional security features and integration into the Google Cloud ecosystem.

Explore Google AI Studio and the capabilities of the Gemini generative AI model. Learn how to design and test the different types of prompts (freeform, structured, and chat) and get an API key for the Gemini API.

This pathway can be useful for further experimentation with the Gemini API and lays the groundwork for integrating its features into your application. Optionally, you can also try out the API using a simple Node.js web application. If you don't already have Node.js and npm on your machine, feel free to skip this step and return to Swift later in this pathway.

There are additional security considerations for using API keys in mobile and web client applications. For client apps moving beyond prototyping into production, follow the Vertex AI SDK migration guide.

Introduction to the Google AI Swift SDK

Article

The Google AI Swift SDK is a Swift-first, cross-platform SDK that gives you access to the Gemini API for prototyping your generative AI integration, with support for Swift on most platforms.

To get started with the Google AI Swift SDK, set up a project in Google AI Studio, which includes obtaining an API key for the Gemini API. Next, add the GoogleGenerativeAI package to your app's project configuration. Then you can initialize the GenerativeModel and make your first API call.
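Once the package is added, a first call can be as short as the following sketch. The model name and placeholder API key are assumptions; substitute the key from your AI Studio project and check the docs for current model names.

```swift
import GoogleGenerativeAI

// Initialize the model with your Gemini API key from Google AI Studio.
// "gemini-pro" is used here as an example model name.
let model = GenerativeModel(name: "gemini-pro", apiKey: "YOUR_API_KEY")

// Make your first generative AI call (run from an async context,
// such as a Task or an async function).
let response = try await model.generateContent("Write a haiku about Swift.")
if let text = response.text {
    print(text)
}
```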

Walkthrough: Build LLM-powered apps with Swift

Video

Learn how to use multi-turn conversations with the Gemini model to build an interactive text adventure in Swift.

This practical example walks through building a game with SwiftUI and the Google AI Swift SDK to call the Gemini API. Learn more about designing prompts for interactive experiences and building mobile applications that seamlessly integrate generative AI.
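As a rough sketch of the multi-turn pattern the walkthrough demonstrates (the history contents, prompts, and model name here are illustrative, not taken from the video):

```swift
import GoogleGenerativeAI

let model = GenerativeModel(name: "gemini-pro", apiKey: "YOUR_API_KEY")

// Seed the conversation with prior turns so the model keeps context.
let chat = model.startChat(history: [
    ModelContent(role: "user", parts: "Let's play a text adventure set in a castle."),
    ModelContent(role: "model", parts: "You stand before the castle gates. What do you do?"),
])

// Each sendMessage call continues the same conversation.
let response = try await chat.sendMessage("I push open the gates.")
print(response.text ?? "")
```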

Explore the Swift SDK and sample apps

Code sample

Explore more advanced use cases for the Google AI Swift SDK with code samples and demo apps on GitHub.

The Swift code samples demonstrate three key use cases: generating text, photo reasoning (using multimodal inputs), and multi-turn conversations (chat). They also show how to use content streaming to improve response time by displaying partial results.
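Content streaming, for example, looks roughly like this (the model name is an assumption; the stream yields chunks you can render as they arrive):

```swift
import GoogleGenerativeAI

let model = GenerativeModel(name: "gemini-pro", apiKey: "YOUR_API_KEY")

// generateContentStream yields partial responses as they arrive,
// so you can display text before the full response is complete.
let stream = model.generateContentStream("Tell me a story about a brave knight.")
for try await chunk in stream {
    if let text = chunk.text {
        print(text, terminator: "")
    }
}
```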

Follow the steps in the README to get started, which include setting up your Gemini API key and providing it in a .plist file.
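One way to load the key at runtime is sketched below, assuming a plist named GenerativeAI-Info.plist with an API_KEY entry as described in the sample's README; adjust the file and key names to match your project.

```swift
import Foundation

// Loads the Gemini API key from a bundled plist so it stays out of source code.
// The file and key names below follow the sample app's README; adjust as needed.
enum APIKey {
    static var `default`: String {
        guard let url = Bundle.main.url(forResource: "GenerativeAI-Info", withExtension: "plist"),
              let data = try? Data(contentsOf: url),
              let plist = try? PropertyListSerialization.propertyList(from: data, format: nil) as? [String: Any],
              let key = plist["API_KEY"] as? String else {
            fatalError("Add API_KEY to GenerativeAI-Info.plist")
        }
        return key
    }
}
```

Keeping the key in a plist that is excluded from version control avoids accidentally committing it, but remember that any key shipped in a client app can be extracted; this is acceptable for prototyping only.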

Multimodal prompting using the Google AI Swift SDK

Article

Multimodal prompts combine different types of media, such as text, images, and audio. For example, you could create prompts that identify objects in an image, extract text from a photo, or reason about the contents of a picture.

To get started, read this guide about file prompting strategies and multimodal concepts, which includes best practices for designing multimodal prompts.

Next, explore the multimodal capabilities of the Gemini models in Google AI Studio by uploading or selecting a file as part of your prompt.

Learn how to use multimodal inputs with the Google AI Swift SDK, find image requirements for prompts, and explore the multimodal image demo in the Swift sample app.
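A minimal multimodal call might look like the following sketch. The asset name is hypothetical, and gemini-pro-vision is used because this pathway discusses that model; check the docs for current vision-capable model names.

```swift
import GoogleGenerativeAI
import UIKit

// Use a vision-capable model for image + text prompts.
let model = GenerativeModel(name: "gemini-pro-vision", apiKey: "YOUR_API_KEY")

// "recipe-photo" is a hypothetical asset in your app's asset catalog.
guard let image = UIImage(named: "recipe-photo") else { fatalError("Missing image") }

// Pass the image and the text prompt together as one multimodal request.
let response = try await model.generateContent(image, "What dish is shown in this photo?")
print(response.text ?? "")
```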

For further reading, see the solution Leveraging the Gemini Pro Vision model for image understanding, multimodal prompts, and accessibility.

Launch your app and move to production with Vertex AI for Firebase

Article

When you are ready to launch your application in production, transition to Vertex AI for Firebase and its SDK for Swift.

When you want to call the Gemini API directly from your Swift app, we recommend using Vertex AI for Firebase. It offers enhanced security features for mobile and web apps, including integration with Google Cloud IAM for API authorization and Firebase App Check for integrity verification. It integrates with Google's wider Firebase developer platform (like Cloud Storage and Remote Config), while also giving you access to the tools, workflows, and scale offered through Google Cloud. Among other differences, Vertex AI also supports increased request quotas and enterprise features.

Note that using the Google AI Swift SDK to call the Gemini API directly from a mobile or web client is recommended only for prototyping and experimentation. Follow this guide to migrate to the Vertex AI for Firebase SDK: update your app's dependencies and imports, and change how the AI model is initialized.
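In rough terms, the migration changes the import and model initialization along these lines. This is a sketch based on the Firebase documentation, and the model name is an example; consult the migration guide for the current API.

```swift
// Before, with the Google AI Swift SDK:
// import GoogleGenerativeAI
// let model = GenerativeModel(name: "gemini-pro", apiKey: "YOUR_API_KEY")

// After, with Vertex AI for Firebase:
import FirebaseCore
import FirebaseVertexAI

FirebaseApp.configure()

// No API key ships in the app; calls are authorized through your Firebase project
// (with Firebase App Check available for integrity verification).
let model = VertexAI.vertexAI().generativeModel(modelName: "gemini-1.5-flash")
```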

Vertex AI for Firebase is currently in public preview. Check the docs for the current release status of the SDKs; the team is committed to making them generally available as soon as possible.