Conversational Actions were deprecated on June 13, 2023. For more information, see Conversational Actions sunset.

Build conversation models

A conversation model defines what users can say to your Actions and how your Actions respond. Its main building blocks are intents, types, scenes, and prompts. After one of your Actions is invoked, Google Assistant hands the user off to that Action, which then begins a conversation based on your conversation model. The model consists of:

Valid user requests - To define what users can say to your Actions, you create a collection of intents that augment the Assistant NLU, so it can understand requests that are specific to your Actions. Each intent defines training phrases that describe what users can say to match that intent. The Assistant NLU expands these training phrases to include similar phrases, and the aggregation of those phrases results in the intent's language model.
Action logic and responses - Scenes process intents, carry out required logic, and generate prompts to return to the user.

**Figure 1.** A conversation model consists of intents, types, scenes, and prompts that define your user experience. Intents that are eligible for invocation are also valid for matching in your conversations.

Define valid user requests

To define what users can say to your Actions, you use a combination of intents and types. User intents and types let you augment the Assistant NLU with your own language models. System intents and types let you take advantage of built-in language models and event detection, such as users wanting to quit your Action or Assistant detecting no input at all.

Create user intents

User intents let you specify your own training phrases that describe what users might say to your Actions. The Assistant NLU uses these phrases to train itself to understand what your users say. When users say something that matches a user intent's language model, Assistant matches the intent and notifies your Action, so you can carry out logic and respond to users.
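
As a rough illustration of the idea, the following Python sketch models a user intent as a name plus a list of training phrases and matches an utterance against them. The `UserIntent` class, the `match_intent` function, and the sample intents are hypothetical and are not part of any Actions API. Exact matching after normalization is a simplifying assumption; the Assistant NLU also matches phrases that are similar to the training phrases.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserIntent:
    """Hypothetical in-memory stand-in for a user intent."""
    name: str
    training_phrases: List[str] = field(default_factory=list)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(text.lower().split())


def match_intent(intents: List[UserIntent], utterance: str) -> Optional[str]:
    """Return the name of the first intent with a matching training phrase.

    Exact matching is a simplification made for this sketch only.
    """
    wanted = normalize(utterance)
    for intent in intents:
        if any(normalize(phrase) == wanted for phrase in intent.training_phrases):
            return intent.name
    return None


# Illustrative intents; the names and phrases are assumptions.
INTENTS = [
    UserIntent("order_beverage", ["I want to order a coffee", "get me a drink"]),
    UserIntent("check_hours", ["when are you open", "what are your hours"]),
]
```

Calling `match_intent(INTENTS, "When are you open")` returns the intent name that your Action logic would then handle, and an utterance with no matching phrase returns `None`.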

Create system intents

System intents let you take advantage of intents with pre-defined language models for common events, such as users wanting to quit your Action or user input timing out.

Create custom types

Custom types let you create your own type specification to train the NLU to understand a set of values that should map to a single key.
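
The mapping from many spoken values to a single key can be pictured with a small Python sketch. The `BEVERAGE_SIZE` specification, its synonyms, and the `resolve_type_value` helper are hypothetical names chosen for illustration; they do not reflect the format in which custom types are actually stored.

```python
from typing import Dict, List, Optional

# Hypothetical type specification: each key lists the values that map to it.
BEVERAGE_SIZE: Dict[str, List[str]] = {
    "small": ["small", "short", "little"],
    "large": ["large", "big", "grande"],
}


def resolve_type_value(type_spec: Dict[str, List[str]], spoken: str) -> Optional[str]:
    """Return the key whose value list contains the spoken value, if any."""
    normalized = spoken.strip().lower()
    for key, synonyms in type_spec.items():
        if normalized in (synonym.lower() for synonym in synonyms):
            return key
    return None
```

With this specification, both "big" and "grande" resolve to the single key `large`, and a value that is not listed resolves to `None`.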

Build Action logic and responses

The Assistant NLU matches user requests to intents, so that your Action can process them in scenes. Scenes are powerful logic executors that let you process events during a conversation.

Create a scene

The following sections describe how to create scenes and define functionality for each scene's lifecycle stage.

Define one-time setup

When a scene first becomes active, you can carry out one-time tasks in the On enter stage. The On enter stage executes only once, and is the only stage that doesn't run inside a scene's execution loop.
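
The relationship between the On enter stage and the execution loop can be sketched in Python as follows. The `SceneSketch` class and its method names are hypothetical, and the loop body is reduced to a single placeholder step; this is a conceptual model rather than how scenes are implemented.

```python
from typing import List


class SceneSketch:
    """Hypothetical model of a scene's lifecycle ordering."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def on_enter(self) -> None:
        # One-time setup: runs once, before the execution loop starts.
        self.log.append("on_enter")

    def loop_iteration(self) -> None:
        # Placeholder for the stages that run inside the execution loop.
        self.log.append("loop")

    def activate(self, iterations: int) -> List[str]:
        """Run On enter once, then the loop the given number of times."""
        self.on_enter()
        for _ in range(iterations):
            self.loop_iteration()
        return self.log
```

However many times the loop runs, `on_enter` appears only once, at the start of the log.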

Check conditions

Conditions let you check slot filling, session storage, user storage, and home storage parameters to control scene execution flow.
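
One way to picture conditions is as predicates over the current conversation state, each paired with a scene to transition to. In the Python sketch below, the context layout (`slots`, `session`, `user`, `home`), the scene names, and the status value `"FINAL"` are illustrative assumptions, as is the policy of taking the first condition that holds.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Context = Dict[str, Dict[str, Any]]
Condition = Tuple[Callable[[Context], bool], str]


def first_matching_transition(
    conditions: List[Condition], context: Context
) -> Optional[str]:
    """Return the target scene of the first condition that holds, if any.

    Evaluating in list order is an assumption made for this sketch.
    """
    for predicate, next_scene in conditions:
        if predicate(context):
            return next_scene
    return None


# Hypothetical conditions over slot filling and session storage.
CONDITIONS: List[Condition] = [
    (lambda ctx: ctx["slots"].get("status") == "FINAL", "ConfirmOrder"),
    (lambda ctx: ctx["session"].get("visits", 0) > 3, "LoyaltyOffer"),
]
```

When no condition holds, the function returns `None`, which in this model means the scene keeps executing without a transition.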

Define slot filling

Slots let you extract typed parameters from user input.

Slot value mapping

In many cases, a previous intent match can include parameters that partially or entirely fill a corresponding scene's slot values. In these cases, all slots filled by intent parameters map to the scene's slot filling if the slot name matches the intent parameter name.

For example, if a user matches an intent to order a beverage by saying "I want to order a large vanilla coffee", existing slots for size, flavor, and beverage type are considered filled in the corresponding scene if that scene defines the same slots.
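
The name-based mapping in the beverage example can be sketched in Python. The function names, the slot names, and the extra `quantity` slot are hypothetical and only serve to show which slots end up filled and which remain open.

```python
from typing import Any, Dict, List


def map_intent_params_to_slots(
    slot_names: List[str], intent_params: Dict[str, Any]
) -> Dict[str, Any]:
    """Fill each scene slot whose name matches an intent parameter name."""
    return {name: intent_params[name] for name in slot_names if name in intent_params}


def unfilled_slots(slot_names: List[str], filled: Dict[str, Any]) -> List[str]:
    """List the slots that still need a value."""
    return [name for name in slot_names if name not in filled]


# Parameters extracted from "I want to order a large vanilla coffee".
INTENT_PARAMS = {"size": "large", "flavor": "vanilla", "beverage_type": "coffee"}

# Hypothetical scene slots; "quantity" has no matching intent parameter.
SCENE_SLOTS = ["size", "flavor", "beverage_type", "quantity"]

FILLED = map_intent_params_to_slots(SCENE_SLOTS, INTENT_PARAMS)
REMAINING = unfilled_slots(SCENE_SLOTS, FILLED)
```

Here the three slots named in the utterance are filled from the intent match, and only `quantity` is left for the scene to collect.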

Process input

During this stage, you can have the Assistant NLU match user input to intents. You can scope intent matching to a specific scene by adding the desired intents to the scene. This lets you control conversation flow by telling Assistant to match specific intents when specific scenes are active.
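
Scoping can be modeled as a lookup from the active scene to the intents that have been added to it. The scene names, intent names, and the `match_in_scene` helper in this Python sketch are hypothetical, and the sketch leaves out intents that are matched regardless of scene.

```python
from typing import Dict, Optional, Set

# Hypothetical mapping from each scene to the intents added to it.
SCENE_INTENTS: Dict[str, Set[str]] = {
    "OrderScene": {"order_beverage", "cancel_order"},
    "PaymentScene": {"confirm_payment", "cancel_order"},
}


def match_in_scene(active_scene: str, candidate_intent: str) -> Optional[str]:
    """Return the intent only if it was added to the active scene."""
    allowed = SCENE_INTENTS.get(active_scene, set())
    return candidate_intent if candidate_intent in allowed else None
```

In this model, `confirm_payment` is matched only while `PaymentScene` is active, while `cancel_order` is matched in both scenes because it was added to each.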