Webhooks allow the Gemini API to push real-time notifications to your server when asynchronous or Long-Running Operations (LROs) complete. This replaces the need to poll the API for status updates, reducing latency and overhead.
CreateWebhook
Creates a new Webhook.
Request body
The request body contains data with the following structure:
Optional. The user-provided name of the webhook.
Required. The URI to which webhook events will be sent.
Required. The events that the webhook is subscribed to.
Possible values:
- `batch.expired`: Batch has not been processed within the 48h timeframe.
- `batch.failed`: Batch job failed.
- `batch.succeeded`: Batch processing finished successfully.
- `interaction.completed`: Interaction completed successfully.
- `interaction.failed`: Interaction failed.
- `interaction.requires_action`: Interaction requires action (e.g., function calling).
- `video.generated`: Video generation completed.
Response
If successful, the response body contains data with the following structure:
Optional. The user-provided name of the webhook.
Required. The URI to which webhook events will be sent.
Required. The events that the webhook is subscribed to.
Possible values:
- `batch.expired`: Batch has not been processed within the 48h timeframe.
- `batch.failed`: Batch job failed.
- `batch.succeeded`: Batch processing finished successfully.
- `interaction.completed`: Interaction completed successfully.
- `interaction.failed`: Interaction failed.
- `interaction.requires_action`: Interaction requires action (e.g., function calling).
- `video.generated`: Video generation completed.
Output only. The timestamp when the webhook was created.
Output only. The timestamp when the webhook was last updated.
signing_secrets SigningSecret (optional)
Output only. The signing secrets associated with this webhook.
Fields
Output only. The truncated version of the signing secret.
Output only. The expiration date of the signing secret.
Output only. The state of the webhook.
Possible values:
- `enabled`: The webhook is enabled.
- `disabled`: The webhook is disabled by the user.
- `disabled_due_to_failed_deliveries`: The webhook is disabled due to failed deliveries.
Output only. The new signing secret for the webhook. Only populated on create.
Output only. The ID of the webhook.
Example
Example Response
{
  "name": "string",
  "uri": "string",
  "subscribed_events": ["string"],
  "create_time": "string",
  "update_time": "string",
  "signing_secrets": [
    {
      "truncated_secret": "string",
      "expire_time": "string"
    }
  ],
  "state": "enabled",
  "new_signing_secret": "string",
  "id": "string"
}
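As a minimal sketch, the CreateWebhook request body described above could be assembled and checked on the client before it is sent. The keys `name`, `uri`, and `subscribed_events` are taken from the example response; the helper `build_create_webhook_body` is hypothetical and not part of any SDK, and no endpoint or authentication is assumed here.

```python
from typing import Iterable, Optional

# Event names copied from the "Possible values" list above.
WEBHOOK_EVENTS = frozenset({
    "batch.succeeded", "batch.expired", "batch.failed",
    "interaction.requires_action", "interaction.completed",
    "interaction.failed", "video.generated",
})


def build_create_webhook_body(uri: str,
                              subscribed_events: Iterable[str],
                              name: Optional[str] = None) -> dict:
    """Return a CreateWebhook request body (hypothetical helper)."""
    events = list(subscribed_events)
    if not uri:
        raise ValueError("uri is required")
    if not events:
        raise ValueError("subscribed_events is required")
    unknown = sorted(set(events) - WEBHOOK_EVENTS)
    if unknown:
        raise ValueError(f"unknown events: {unknown}")
    body = {"uri": uri, "subscribed_events": events}
    if name is not None:
        body["name"] = name  # optional, user-provided
    return body
```

Rejecting unknown event names locally is a design choice for faster feedback; the service remains the authority on which events are valid.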
PingWebhook
Sends a ping event to a Webhook.
Path / Query Parameters
Required. The ID of the webhook to ping. Format: `{webhook_id}`
Request body
The request body contains data with the following structure:
Response
If successful, the response is empty.
RotateSigningSecret
Generates a new signing secret for a Webhook.
Path / Query Parameters
Required. The ID of the webhook for which to generate a signing secret. Format: `{webhook_id}`
Request body
The request body contains data with the following structure:
Optional. The revocation behavior for previous signing secrets.
Possible values:
- `revoke_previous_secrets_after_h24`: Generate a new signing secret and revoke all previous secrets after 24 hours. Default and safest option for migrations.
- `revoke_previous_secrets_immediately`: Revoke all previous secrets immediately. Use with caution as this can interrupt ongoing notifications.
Response
If successful, the response body contains data with the following structure:
Output only. The newly generated signing secret.
Example
Example Response
{
  "secret": "string"
}
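The rotation request and response above might be handled as in the following sketch. The reference does not show the key that carries the revocation behavior, so `revocation_behavior` is an assumed name passed in as a parameter; both helpers are hypothetical. Only the `secret` response key comes from the example response.

```python
from typing import Optional

# Values copied from the "Possible values" list above.
REVOCATION_BEHAVIORS = (
    "revoke_previous_secrets_after_h24",
    "revoke_previous_secrets_immediately",
)


def build_rotate_body(behavior: Optional[str] = None,
                      field_name: str = "revocation_behavior") -> dict:
    """Build a RotateSigningSecret body; `field_name` is an assumed key."""
    if behavior is None:
        return {}  # omit the field so the service applies its default
    if behavior not in REVOCATION_BEHAVIORS:
        raise ValueError(f"unknown revocation behavior: {behavior!r}")
    return {field_name: behavior}


def extract_secret(response: dict) -> str:
    """Return the new secret from a response shaped like the example above."""
    secret = response.get("secret")
    if not secret:
        raise ValueError("response does not contain a secret")
    return secret
```

Store the returned secret right away.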
ListWebhooks
Lists all Webhooks.
Path / Query Parameters
Optional. The maximum number of webhooks to return. The service may return fewer than this value. If unspecified, at most 50 webhooks will be returned. The maximum value is 1000.
Optional. A page token, received from a previous `ListWebhooks` call. Provide this to retrieve the subsequent page.
Response
If successful, the response body contains data with the following structure:
The webhooks.
A token, which can be sent as `page_token` to retrieve the next page. If this field is omitted, there are no subsequent pages.
Example
Example Response
{
  "webhooks": [
    {
      "name": "string",
      "uri": "string",
      "subscribed_events": ["string"],
      "create_time": "string",
      "update_time": "string",
      "signing_secrets": [
        {
          "truncated_secret": "string",
          "expire_time": "string"
        }
      ],
      "state": "enabled",
      "new_signing_secret": "string",
      "id": "string"
    }
  ],
  "next_page_token": "string"
}
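The paging contract above (send `page_token`, stop when `next_page_token` is omitted) can be sketched as a generator. The transport is deliberately left out: `fetch_page` is a hypothetical callable supplied by the caller that performs one `ListWebhooks` request and returns the decoded response body.

```python
from typing import Callable, Iterator, Optional


def iter_webhooks(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every webhook across all pages of a ListWebhooks result."""
    page_token: Optional[str] = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("webhooks", [])
        page_token = page.get("next_page_token")
        if not page_token:
            return  # an omitted token means there are no further pages
```

A caller would typically wrap an HTTP client in `fetch_page` and iterate with `for webhook in iter_webhooks(fetch_page): ...`.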
GetWebhook
Gets a specific Webhook.
Path / Query Parameters
Required. The ID of the webhook to retrieve.
Response
If successful, the response body contains data with the following structure:
Optional. The user-provided name of the webhook.
Required. The URI to which webhook events will be sent.
Required. The events that the webhook is subscribed to.
Possible values:
- `batch.expired`: Batch has not been processed within the 48h timeframe.
- `batch.failed`: Batch job failed.
- `batch.succeeded`: Batch processing finished successfully.
- `interaction.completed`: Interaction completed successfully.
- `interaction.failed`: Interaction failed.
- `interaction.requires_action`: Interaction requires action (e.g., function calling).
- `video.generated`: Video generation completed.
Output only. The timestamp when the webhook was created.
Output only. The timestamp when the webhook was last updated.
signing_secrets SigningSecret (optional)
Output only. The signing secrets associated with this webhook.
Fields
Output only. The truncated version of the signing secret.
Output only. The expiration date of the signing secret.
Output only. The state of the webhook.
Possible values:
- `enabled`: The webhook is enabled.
- `disabled`: The webhook is disabled by the user.
- `disabled_due_to_failed_deliveries`: The webhook is disabled due to failed deliveries.
Output only. The new signing secret for the webhook. Only populated on create.
Output only. The ID of the webhook.
Example
Example Response
{
  "name": "string",
  "uri": "string",
  "subscribed_events": ["string"],
  "create_time": "string",
  "update_time": "string",
  "signing_secrets": [
    {
      "truncated_secret": "string",
      "expire_time": "string"
    }
  ],
  "state": "enabled",
  "new_signing_secret": "string",
  "id": "string"
}
UpdateWebhook
Updates an existing Webhook.
Path / Query Parameters
Required. The ID of the webhook to update.
Optional. The list of fields to update.
Request body
The request body contains data with the following structure:
Optional. The user-provided name of the webhook.
Optional. The URI to which webhook events will be sent.
Optional. The events that the webhook is subscribed to.
Possible values:
- `batch.expired`: Batch has not been processed within the 48h timeframe.
- `batch.failed`: Batch job failed.
- `batch.succeeded`: Batch processing finished successfully.
- `interaction.completed`: Interaction completed successfully.
- `interaction.failed`: Interaction failed.
- `interaction.requires_action`: Interaction requires action (e.g., function calling).
- `video.generated`: Video generation completed.
Optional. The state of the webhook.
Possible values:
- `enabled`: The webhook is enabled.
- `disabled`: The webhook is disabled by the user.
- `disabled_due_to_failed_deliveries`: The webhook is disabled due to failed deliveries.
Response
If successful, the response body contains data with the following structure:
Optional. The user-provided name of the webhook.
Required. The URI to which webhook events will be sent.
Required. The events that the webhook is subscribed to.
Possible values:
- `batch.expired`: Batch has not been processed within the 48h timeframe.
- `batch.failed`: Batch job failed.
- `batch.succeeded`: Batch processing finished successfully.
- `interaction.completed`: Interaction completed successfully.
- `interaction.failed`: Interaction failed.
- `interaction.requires_action`: Interaction requires action (e.g., function calling).
- `video.generated`: Video generation completed.
Output only. The timestamp when the webhook was created.
Output only. The timestamp when the webhook was last updated.
signing_secrets SigningSecret (optional)
Output only. The signing secrets associated with this webhook.
Fields
Output only. The truncated version of the signing secret.
Output only. The expiration date of the signing secret.
Output only. The state of the webhook.
Possible values:
- `enabled`: The webhook is enabled.
- `disabled`: The webhook is disabled by the user.
- `disabled_due_to_failed_deliveries`: The webhook is disabled due to failed deliveries.
Output only. The new signing secret for the webhook. Only populated on create.
Output only. The ID of the webhook.
Example
Example Response
{
  "name": "string",
  "uri": "string",
  "subscribed_events": ["string"],
  "create_time": "string",
  "update_time": "string",
  "signing_secrets": [
    {
      "truncated_secret": "string",
      "expire_time": "string"
    }
  ],
  "state": "enabled",
  "new_signing_secret": "string",
  "id": "string"
}
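Because UpdateWebhook accepts a list of fields to update alongside a partial body, one possible approach is to compute both from the difference between the current and desired webhook. This is only a sketch: `diff_webhook` is a hypothetical helper, the four updatable keys are inferred from the request body description and the example response, and the name and encoding of the field-list parameter are not shown in this reference.

```python
from typing import List, Tuple

# Fields the UpdateWebhook request body describes as settable (key names assumed
# to match the example response).
UPDATABLE_FIELDS = ("name", "uri", "subscribed_events", "state")


def diff_webhook(current: dict, desired: dict) -> Tuple[List[str], dict]:
    """Return (changed field names, partial request body) for an update."""
    body = {
        key: desired[key]
        for key in UPDATABLE_FIELDS
        if key in desired and desired[key] != current.get(key)
    }
    return list(body), body
```

Sending only the changed fields keeps the update narrow and avoids overwriting values modified elsewhere.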
DeleteWebhook
Deletes a Webhook.
Path / Query Parameters
Required. The ID of the webhook to delete. Format: `{webhook_id}`
Response
If successful, the response is empty.
Resources
Webhook
A Webhook resource.
Fields
Optional. The user-provided name of the webhook.
Required. The URI to which webhook events will be sent.
Required. The events that the webhook is subscribed to.
Possible values:
- `batch.expired`: Batch has not been processed within the 48h timeframe.
- `batch.failed`: Batch job failed.
- `batch.succeeded`: Batch processing finished successfully.
- `interaction.completed`: Interaction completed successfully.
- `interaction.failed`: Interaction failed.
- `interaction.requires_action`: Interaction requires action (e.g., function calling).
- `video.generated`: Video generation completed.
Output only. The timestamp when the webhook was created.
Output only. The timestamp when the webhook was last updated.
signing_secrets SigningSecret (optional)
Output only. The signing secrets associated with this webhook.
Fields
Output only. The truncated version of the signing secret.
Output only. The expiration date of the signing secret.
Output only. The state of the webhook.
Possible values:
- `enabled`: The webhook is enabled.
- `disabled`: The webhook is disabled by the user.
- `disabled_due_to_failed_deliveries`: The webhook is disabled due to failed deliveries.
Output only. The new signing secret for the webhook. Only populated on create.
Output only. The ID of the webhook.
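For client code, the Webhook resource could be mirrored by a small typed model. The sketch below assumes the key names shown in the example responses (`name`, `uri`, `subscribed_events`, `create_time`, `update_time`, `signing_secrets`, `state`, `new_signing_secret`, `id`); the classes themselves are hypothetical and timestamps are kept as plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SigningSecret:
    truncated_secret: str
    expire_time: str


@dataclass
class Webhook:
    id: str
    uri: str
    subscribed_events: List[str]
    state: str
    name: Optional[str] = None
    create_time: Optional[str] = None
    update_time: Optional[str] = None
    signing_secrets: List[SigningSecret] = field(default_factory=list)
    new_signing_secret: Optional[str] = None  # only populated on create

    @classmethod
    def from_dict(cls, data: dict) -> "Webhook":
        """Build a Webhook from a decoded JSON response body."""
        secrets = [
            SigningSecret(s["truncated_secret"], s["expire_time"])
            for s in data.get("signing_secrets", [])
        ]
        return cls(
            id=data["id"],
            uri=data["uri"],
            subscribed_events=list(data["subscribed_events"]),
            state=data["state"],
            name=data.get("name"),
            create_time=data.get("create_time"),
            update_time=data.get("update_time"),
            signing_secrets=secrets,
            new_signing_secret=data.get("new_signing_secret"),
        )
```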
Data Models
InteractionSseEvent
Possible Types
Polymorphic discriminator: event_type
ErrorEvent
No description provided.
Always set to "error".
error Error (optional)
No description provided.
Fields
A URI that identifies the error type.
A human-readable error message.
The event_id token to be used to resume the interaction stream, from this event.
metadata StreamMetadata (optional)
Optional metadata accompanying ANY streamed event.
Fields
total_usage Usage (optional)
No description provided.
Fields
Number of tokens in the prompt (context).
input_tokens_by_modality ModalityTokens (optional)
A breakdown of input token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens in the cached part of the prompt (the cached content).
cached_tokens_by_modality ModalityTokens (optional)
A breakdown of cached token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Total number of tokens across all the generated responses.
output_tokens_by_modality ModalityTokens (optional)
A breakdown of output token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens present in tool-use prompt(s).
tool_use_tokens_by_modality ModalityTokens (optional)
A breakdown of tool-use token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens of thoughts for thinking models.
Total token count for the interaction request (prompt + responses + other internal tokens).
grounding_tool_count GroundingToolCount (optional)
Grounding tool count.
Fields
The grounding tool type associated with the count.
Possible values:
- `google_search`: Grounding with Google Web Search and Image Search, & Web Grounding for Enterprise.
- `google_maps`: Grounding with Google Maps.
- `retrieval`: Grounding with customer's data, for example, VertexAISearch.
The number of grounding tool counts.
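Since `InteractionSseEvent` is polymorphic on `event_type`, a consumer will usually route each decoded event to a handler keyed by that discriminator. The following is a minimal sketch; `dispatch_event` and the handler registry are hypothetical, and only the `event_type` key and its values `"error"` and `"interaction.completed"` come from this reference.

```python
from typing import Any, Callable, Dict, Optional

Handler = Callable[[dict], Any]


def dispatch_event(event: dict, handlers: Dict[str, Handler]) -> Optional[Any]:
    """Route a decoded event to the handler registered for its event_type."""
    handler = handlers.get(event.get("event_type"))
    if handler is None:
        return None  # unknown or unhandled event types are skipped
    return handler(event)
```

Skipping unknown types rather than raising is a design choice that keeps a consumer working if new event types appear in the stream.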
InteractionCompletedEvent
No description provided.
Always set to "interaction.completed".
interaction Interaction (required)
Required. The completed interaction with empty outputs to reduce the payload size. Use the preceding ContentDelta events for the actual output.
Fields
model ModelOption (optional)
The name of the `Model` used for generating the interaction.
Possible values:
- `gemini-2.5-computer-use-preview-10-2025`: An agentic capability model designed for direct interface interaction, allowing Gemini to perceive and navigate digital environments.
- `gemini-3.1-flash-tts-preview`: Gemini 3.1 Flash TTS: Powerful, low-latency speech generation. Enjoy natural outputs, steerable prompts, and new expressive audio tags for precise narration control.
- `gemini-2.5-flash-preview-tts`: Our 2.5 Flash text-to-speech model optimized for powerful, low-latency controllable speech generation.
- `gemini-2.5-pro-preview-tts`: Our 2.5 Pro text-to-speech audio model optimized for powerful, low-latency speech generation for more natural outputs and easier to steer prompts.
- `lyria-3-pro-preview`: Our advanced, full-song generative model with deep compositional understanding, optimized for precise structural control and complex transitions across diverse musical styles.
- `gemini-2.5-flash`: Our first hybrid reasoning model which supports a 1M token context window and has thinking budgets.
- `gemini-3.1-pro-preview`: Our latest SOTA reasoning model with unprecedented depth and nuance, and powerful multimodal understanding and coding capabilities.
- `lyria-3-clip-preview`: Our low-latency, music generation model optimized for high-fidelity audio clips and precise rhythmic control.
- `gemini-3.1-flash-lite`: Our most cost-efficient model, optimized for high-volume agentic tasks, translation, and simple data processing.
- `gemini-3.1-flash-lite-preview`: Our most cost-efficient model, optimized for high-volume agentic tasks, translation, and simple data processing.
- `gemini-3-flash-preview`: Our most intelligent model built for speed, combining frontier intelligence with superior search and grounding.
- `gemini-3.5-flash`: Our most intelligent model for sustained frontier performance in agentic and coding tasks.
- `gemini-3-pro-preview`: Our most intelligent model with SOTA reasoning and multimodal understanding, and powerful agentic and vibe coding capabilities.
- `gemini-2.5-flash-native-audio-preview-12-2025`: Our native audio models optimized for higher quality audio outputs with better pacing, voice naturalness, verbosity, and mood.
- `gemini-2.5-flash-image`: Our native image generation model, optimized for speed, flexibility, and contextual understanding. Text input and output is priced the same as 2.5 Flash.
- `gemini-2.5-flash-lite`: Our smallest and most cost effective model, built for at scale usage.
- `gemini-2.5-pro`: Our state-of-the-art multipurpose model, which excels at coding and complex reasoning tasks.
- `gemini-3.1-flash-image-preview`: Pro-level visual intelligence with Flash-speed efficiency and reality-grounded generation capabilities.
- `gemini-3-pro-image-preview`: State-of-the-art image generation and editing model.
- `gemini-2.5-flash-lite-preview-09-2025`: The latest model based on Gemini 2.5 Flash lite optimized for cost-efficiency, high throughput and high quality.
- `gemini-2.5-flash-preview-09-2025`: The latest model based on the 2.5 Flash model. 2.5 Flash Preview is best for large scale processing, low-latency, high volume tasks that require thinking, and agentic use cases.
agent AgentOption (optional)
The name of the `Agent` used for generating the interaction.
Possible values:
- `deep-research-preview-04-2026`: Gemini Deep Research Agent
- `deep-research-pro-preview-12-2025`: Gemini Deep Research Agent
- `deep-research-max-preview-04-2026`: Gemini Deep Research Max Agent
- `antigravity-preview-05-2026`: Use the Antigravity managed agent to perform multi-step tasks that require reasoning, file operations, and tool use.
Required. Output only. A unique identifier for the interaction completion.
Required. Output only. The status of the interaction.
Possible values:
- `in_progress`: The interaction is in progress.
- `requires_action`: The interaction requires action/input from the user.
- `completed`: The interaction is completed.
- `failed`: The interaction failed.
- `cancelled`: The interaction was cancelled.
- `incomplete`: The interaction is completed, but contains incomplete results (e.g. hitting max_tokens).
- `budget_exceeded`: The interaction was halted because the token budget was exceeded.
Required. Output only. The time at which the response was created in ISO 8601 format (YYYY-MM-DDThh:mm:ssZ).
Required. Output only. The time at which the response was last updated in ISO 8601 format (YYYY-MM-DDThh:mm:ssZ).
System instruction for the interaction.
tools Tool (optional)
A list of tool declarations the model may call during interaction.
Possible Types
Polymorphic discriminator: type
CodeExecution
A tool that can be used by the model to execute code.
No description provided.
Always set to "code_execution".
ComputerUse
A tool that can be used by the model to interact with the computer.
No description provided.
Always set to "computer_use".
The environment being operated.
Possible values:
- `browser`: Operates in a web browser.
The list of predefined functions that are excluded from the model call.
FileSearch
A tool that can be used by the model to search files.
No description provided.
Always set to "file_search".
The file search store names to search.
The number of semantic retrieval chunks to retrieve.
Metadata filter to apply to the semantic retrieval documents and chunks.
Function
A tool that can be used by the model.
No description provided.
Always set to "function".
The name of the function.
A description of the function.
The JSON Schema for the function's parameters.
GoogleMaps
A tool that can be used by the model to call Google Maps.
No description provided.
Always set to "google_maps".
Whether to return a widget context token in the tool call result of the response.
The latitude of the user's location.
The longitude of the user's location.
GoogleSearch
A tool that can be used by the model to search Google.
No description provided.
Always set to "google_search".
The types of search grounding to enable.
Possible values:
- `web_search`: Setting this field enables web search. Only text results are returned.
- `image_search`: Setting this field enables image search. Image bytes are returned.
- `enterprise_web_search`: Setting this field enables enterprise web search.
McpServer
An MCPServer is a server that can be called by the model to perform actions.
No description provided.
Always set to "mcp_server".
The name of the MCPServer.
The full URL for the MCPServer endpoint. Example: "https://api.example.com/mcp"
Optional: Fields for authentication headers, timeouts, etc., if needed.
allowed_tools AllowedTools (optional)
The allowed tools.
Fields
The mode of the tool choice.
Possible values:
- `auto`: Auto tool choice.
- `any`: Any tool choice.
- `none`: No tool choice.
- `validated`: Validated tool choice.
The names of the allowed tools.
Retrieval
A tool that can be used by the model to retrieve files.
No description provided.
Always set to "retrieval".
The types of file retrieval to enable.
Possible values:
- `rag_store`
- `exa_ai_search`
- `parallel_ai_search`
exa_ai_search_config ExaAISearchConfig (optional)
Used to specify configuration for ExaAISearch.
Fields
Required. The API key for ExaAiSearch.
Optional. This field can be used to pass any parameter from the Exa.ai Search API.
parallel_ai_search_config ParallelAISearchConfig (optional)
Used to specify configuration for ParallelAISearch.
Fields
Optional. The API key for ParallelAiSearch.
Optional. Custom configs for ParallelAiSearch.
rag_store_config RagStoreConfig (optional)
Used to specify configuration for RagStore.
Fields
rag_resources RagResource (optional)
Optional. The representation of the rag source.
Fields
Optional. RagCorpora resource name.
Optional. rag_file_id. The files should be in the same rag_corpus set in rag_corpus field.
rag_retrieval_config RagRetrievalConfig (optional)
Optional. The retrieval config for the Rag query.
Fields
Optional. The number of contexts to retrieve.
hybrid_search HybridSearch (optional)
Optional. Config for Hybrid Search.
Fields
Optional. Alpha value controls the weight between dense and sparse vector search results.
filter Filter (optional)
Optional. Config for filters.
Fields
Optional. Only returns contexts with vector distance smaller than the threshold.
Optional. Only returns contexts with vector similarity larger than the threshold.
Optional. String for metadata filtering.
ranking Ranking (optional)
Optional. Config for ranking and reranking.
UrlContext
A tool that can be used by the model to fetch URL context.
No description provided.
Always set to "url_context".
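The tool declarations above are distinguished by their `type` value, so a list of tools can be checked against the documented discriminators before use. This is a sketch under assumptions: the `type` values are copied from the "Always set to" lines above, while the keys `name`, `description`, and `parameters` on a function tool are assumed names for the fields described there, and both helpers are hypothetical.

```python
from typing import Iterable, List

# Discriminator values copied from the tool types listed above.
TOOL_TYPES = frozenset({
    "code_execution", "computer_use", "file_search", "function",
    "google_maps", "google_search", "mcp_server", "retrieval", "url_context",
})


def function_tool(name: str, description: str, parameters: dict) -> dict:
    """Build a function tool declaration (key names are assumptions)."""
    return {
        "type": "function",
        "name": name,
        "description": description,
        "parameters": parameters,  # JSON Schema for the function's parameters
    }


def check_tools(tools: Iterable[dict]) -> List[dict]:
    """Return the tools as a list, rejecting unknown `type` values."""
    checked = []
    for tool in tools:
        tool_type = tool.get("type")
        if tool_type not in TOOL_TYPES:
            raise ValueError(f"unknown tool type: {tool_type!r}")
        checked.append(tool)
    return checked
```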
usage Usage (optional)
Output only. Statistics on the interaction request's token usage.
Fields
Number of tokens in the prompt (context).
input_tokens_by_modality ModalityTokens (optional)
A breakdown of input token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens in the cached part of the prompt (the cached content).
cached_tokens_by_modality ModalityTokens (optional)
A breakdown of cached token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Total number of tokens across all the generated responses.
output_tokens_by_modality ModalityTokens (optional)
A breakdown of output token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens present in tool-use prompt(s).
tool_use_tokens_by_modality ModalityTokens (optional)
A breakdown of tool-use token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens of thoughts for thinking models.
Total token count for the interaction request (prompt + responses + other internal tokens).
grounding_tool_count GroundingToolCount (optional)
Grounding tool count.
Fields
The grounding tool type associated with the count.
Possible values:
- `google_search`: Grounding with Google Web Search and Image Search, & Web Grounding for Enterprise.
- `google_maps`: Grounding with Google Maps.
- `retrieval`: Grounding with customer's data, for example, VertexAISearch.
The number of grounding tool counts.
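The per-modality breakdowns in `usage` (for example `input_tokens_by_modality`) can be folded into a simple modality-to-count mapping for reporting. In this sketch the `modality` key comes from the field list above, while the key holding the count is not shown in this reference, so `tokens` is an assumed default that can be overridden; `tokens_by_modality` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def tokens_by_modality(breakdown: Optional[Iterable[dict]],
                       count_key: str = "tokens") -> Dict[str, int]:
    """Sum token counts per modality; `count_key` is an assumed field name."""
    totals: Counter = Counter()
    for entry in breakdown or []:
        totals[entry["modality"]] += int(entry.get(count_key, 0))
    return dict(totals)
```

Treating a missing breakdown as empty reflects that these fields are marked optional.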
response_modalities ResponseModality (optional)
The requested modalities of the response (TEXT, IMAGE, AUDIO).
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
The ID of the previous interaction, if any.
Output only. The environment ID for the interaction. Only populated if environment config is set in the request.
service_tier ServiceTier (optional)
The service tier for the interaction.
Possible values:
- flex: Flex service tier.
- standard: Standard service tier.
- priority: Priority service tier.
webhook_config WebhookConfig (optional)
Optional. Webhook configuration for receiving notifications when the interaction completes.
Fields
Optional. If set, these webhook URIs will be used for webhook events instead of the registered webhooks.
Optional. The user metadata that will be returned on each event emission to the webhooks.
steps Step (optional)
Required. Output only. The steps that make up the interaction.
Possible Types
Polymorphic discriminator: type
CodeExecutionCallStep
Code execution call step.
No description provided.
Always set to "code_execution_call".
arguments CodeExecutionCallStepArguments (required)
Required. The arguments to pass to the code execution.
Fields
Programming language of the `code`.
Possible values:
- python: Python >= 3.10, with numpy and simpy available.
The code to be executed.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
CodeExecutionResultStep
Code execution result step.
No description provided.
Always set to "code_execution_result".
Required. The output of the code execution.
Whether the code execution resulted in an error.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
FileSearchCallStep
File Search call step.
No description provided.
Always set to "file_search_call".
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
FileSearchResultStep
File Search result step.
No description provided.
Always set to "file_search_result".
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
FunctionCallStep
A function tool call step.
No description provided.
Always set to "function_call".
Required. The name of the tool to call.
Required. The arguments to pass to the function.
Required. A unique ID for this specific tool call.
FunctionResultStep
Result of a function tool call.
No description provided.
Always set to "function_result".
The name of the tool that was called.
Whether the tool call resulted in an error.
Required. ID to match the ID from the function call block.
The result of the tool call.
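The pairing between a function call step and its result can be sketched as below. This is a minimal illustration, not the API's client library: the `type` values `"function_call"` and `"function_result"` come from this reference, while the field names `id`, `call_id`, `name`, `arguments`, `is_error`, and `result` are assumptions, because the field names are not shown in this section.

```python
from typing import Any, Optional

Step = dict[str, Any]


def pair_calls_with_results(steps: list[Step]) -> dict[str, dict[str, Optional[Step]]]:
    """Group each function_call step with the function_result that references it.

    Assumed field names: a call carries its unique ID in "id", and a result
    points back at that call through "call_id".
    """
    pairs: dict[str, dict[str, Optional[Step]]] = {}
    for step in steps:
        if step.get("type") == "function_call":
            pairs[step["id"]] = {"call": step, "result": None}
    for step in steps:
        if step.get("type") == "function_result":
            entry = pairs.get(step["call_id"])
            if entry is not None:
                entry["result"] = step
    return pairs


def pending_call_ids(steps: list[Step]) -> list[str]:
    """Return the IDs of function calls that have no matching result yet."""
    return [cid for cid, pair in pair_calls_with_results(steps).items()
            if pair["result"] is None]


# Hypothetical steps, for illustration only.
EXAMPLE_STEPS: list[Step] = [
    {"type": "function_call", "id": "call_1", "name": "get_weather",
     "arguments": {"city": "Paris"}},
    {"type": "function_result", "call_id": "call_1", "name": "get_weather",
     "is_error": False, "result": "18C"},
    {"type": "function_call", "id": "call_2", "name": "get_time", "arguments": {}},
]
```

With these example steps, `pending_call_ids(EXAMPLE_STEPS)` reports only `call_2`, the call that still needs a result from the client.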
GoogleMapsCallStep
Google Maps call step.
No description provided.
Always set to "google_maps_call".
arguments GoogleMapsCallStepArguments (optional)
The arguments to pass to the Google Maps tool.
Fields
The queries to be executed.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
GoogleMapsResultStep
Google Maps result step.
No description provided.
Always set to "google_maps_result".
result GoogleMapsResultItem (required)
No description provided.
Fields
places GoogleMapsResultPlaces (optional)
No description provided.
Fields
No description provided.
No description provided.
No description provided.
review_snippets ReviewSnippet (optional)
No description provided.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
No description provided.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
GoogleSearchCallStep
Google Search call step.
No description provided.
Always set to "google_search_call".
arguments GoogleSearchCallStepArguments (required)
Required. The arguments to pass to Google Search.
Fields
Web search queries for the follow-up web search.
The type of search grounding enabled.
Possible values:
- web_search: Setting this field enables web search. Only text results are returned.
- image_search: Setting this field enables image search. Image bytes are returned.
- enterprise_web_search: Setting this field enables enterprise web search.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
GoogleSearchResultStep
Google Search result step.
No description provided.
Always set to "google_search_result".
result GoogleSearchResultItem (required)
Required. The results of the Google Search.
Fields
Web content snippet that can be embedded in a web page or an app webview.
Whether the Google Search resulted in an error.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
McpServerToolCallStep
MCPServer tool call step.
No description provided.
Always set to "mcp_server_tool_call".
Required. The name of the tool which was called.
Required. The name of the used MCP server.
Required. The JSON object of arguments for the function.
Required. A unique ID for this specific tool call.
McpServerToolResultStep
MCPServer tool result step.
No description provided.
Always set to "mcp_server_tool_result".
Name of the tool which is called for this specific tool call.
The name of the used MCP server.
Required. ID to match the ID from the function call block.
The output from the MCP server call. Can be simple text or rich content.
ModelOutputStep
Output generated by the model.
No description provided.
Always set to "model_output".
content Content (optional)
No description provided.
Possible Types
Polymorphic discriminator: type
AudioContent
An audio content block.
No description provided.
Always set to "audio".
The audio content.
The URI of the audio.
The mime type of the audio.
Possible values:
- audio/wav: WAV audio format
- audio/mp3: MP3 audio format
- audio/aiff: AIFF audio format
- audio/aac: AAC audio format
- audio/ogg: OGG audio format
- audio/flac: FLAC audio format
- audio/mpeg: MPEG audio format
- audio/m4a: M4A audio format
- audio/l16: L16 audio format
- audio/opus: OPUS audio format
- audio/alaw: ALAW audio format
- audio/mulaw: MULAW audio format
The number of audio channels.
The sample rate of the audio.
DocumentContent
A document content block.
No description provided.
Always set to "document".
The document content.
The URI of the document.
The mime type of the document.
Possible values:
- application/pdf: PDF document format
- text/csv: CSV document format
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
- image/png: PNG image format
- image/jpeg: JPEG image format
- image/webp: WebP image format
- image/heic: HEIC image format
- image/heif: HEIF image format
- image/gif: GIF image format
- image/bmp: BMP image format
- image/tiff: TIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
Source attributed for a portion of the text.
User provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID in case of image citations, if applicable.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
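Because the citation indices are measured in bytes rather than characters, slicing the text directly can return the wrong span when it contains non-ASCII characters. The sketch below shows one way to recover the cited segment. It assumes the text is encoded as UTF-8 (this reference does not state the encoding), and the parameter names `start_index` and `end_index` are hypothetical.

```python
def cited_segment(text: str, start_index: int, end_index: int) -> str:
    """Return the part of `text` covered by a byte range.

    `start_index` is inclusive and `end_index` is exclusive, matching the
    citation fields described above. UTF-8 is an assumption.
    """
    data = text.encode("utf-8")
    # errors="replace" keeps the call from raising if a range splits a character.
    return data[start_index:end_index].decode("utf-8", errors="replace")


# Hypothetical response text: "é" occupies two bytes in UTF-8.
EXAMPLE_TEXT = "Café au lait is popular."
```

Here `cited_segment(EXAMPLE_TEXT, 6, 8)` yields `"au"`, whereas the character slice `EXAMPLE_TEXT[6:8]` is shifted by one position because of the two-byte `é`.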
VideoContent
A video content block.
No description provided.
Always set to "video".
The video content.
The URI of the video.
The mime type of the video.
Possible values:
- video/mp4: MP4 video format
- video/mpeg: MPEG video format
- video/mpg: MPG video format
- video/mov: MOV video format
- video/avi: AVI video format
- video/x-flv: FLV video format
- video/webm: WebM video format
- video/wmv: WMV video format
- video/3gpp: 3GPP video format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
ThoughtStep
A thought step.
No description provided.
Always set to "thought".
A signature hash for backend validation.
summary ThoughtSummaryContent (optional)
A summary of the thought.
Possible Types
Polymorphic discriminator: type
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
- image/png: PNG image format
- image/jpeg: JPEG image format
- image/webp: WebP image format
- image/heic: HEIC image format
- image/heif: HEIF image format
- image/gif: GIF image format
- image/bmp: BMP image format
- image/tiff: TIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
Source attributed for a portion of the text.
User provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID in case of image citations, if applicable.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlContextCallStep
URL context call step.
No description provided.
Always set to "url_context_call".
arguments UrlContextCallStepArguments (required)
Required. The arguments to pass to the URL context.
Fields
The URLs to fetch.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
UrlContextResultStep
URL context result step.
No description provided.
Always set to "url_context_result".
result UrlContextResultItem (required)
Required. The results of the URL context.
Fields
The URL that was fetched.
The status of the URL retrieval.
Possible values:
- success
- error
- paywall
- unsafe
Whether the URL context resulted in an error.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
UserInputStep
Input provided by the user.
content Content (optional)
No description provided.
Possible Types
Polymorphic discriminator: type
AudioContent
An audio content block.
No description provided.
Always set to "audio".
The audio content.
The URI of the audio.
The mime type of the audio.
Possible values:
- audio/wav: WAV audio format
- audio/mp3: MP3 audio format
- audio/aiff: AIFF audio format
- audio/aac: AAC audio format
- audio/ogg: OGG audio format
- audio/flac: FLAC audio format
- audio/mpeg: MPEG audio format
- audio/m4a: M4A audio format
- audio/l16: L16 audio format
- audio/opus: OPUS audio format
- audio/alaw: ALAW audio format
- audio/mulaw: MULAW audio format
The number of audio channels.
The sample rate of the audio.
DocumentContent
A document content block.
No description provided.
Always set to "document".
The document content.
The URI of the document.
The mime type of the document.
Possible values:
- application/pdf: PDF document format
- text/csv: CSV document format
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
- image/png: PNG image format
- image/jpeg: JPEG image format
- image/webp: WebP image format
- image/heic: HEIC image format
- image/heif: HEIF image format
- image/gif: GIF image format
- image/bmp: BMP image format
- image/tiff: TIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
Source attributed for a portion of the text.
User provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID in case of image citations, if applicable.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
VideoContent
A video content block.
No description provided.
Always set to "video".
The video content.
The URI of the video.
The mime type of the video.
Possible values:
- video/mp4: MP4 video format
- video/mpeg: MPEG video format
- video/mpg: MPG video format
- video/mov: MOV video format
- video/avi: AVI video format
- video/x-flv: FLV video format
- video/webm: WebM video format
- video/wmv: WMV video format
- video/3gpp: 3GPP video format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
No description provided.
Always set to "user_input".
The input for the interaction.
Enforces that the generated response is a JSON object that complies with the JSON schema specified in this field.
The environment configuration for the interaction. Can be an object specifying remote environment sources or a string referencing an existing environment ID.
The name of the cached content used as context to serve the prediction. Note: only used in explicit caching, where users can have control over caching (e.g. what content to cache) and enjoy guaranteed cost savings. Format: `projects/{project}/locations/{location}/cachedContents/{cachedContent}`
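The resource name format above can be validated before a request is sent. The following is a small sketch using only the standard library. The helper names `parse_cached_content_name` and `format_cached_content_name` are hypothetical, and the assumption that each path segment contains no `/` is not stated in this reference.

```python
import re
from typing import NamedTuple

# Mirrors projects/{project}/locations/{location}/cachedContents/{cachedContent}.
_CACHED_CONTENT_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/cachedContents/(?P<cached_content>[^/]+)$"
)


class CachedContentName(NamedTuple):
    project: str
    location: str
    cached_content: str


def parse_cached_content_name(name: str) -> CachedContentName:
    """Split a cached content resource name into its three components."""
    match = _CACHED_CONTENT_RE.match(name)
    if match is None:
        raise ValueError(f"not a cached content resource name: {name!r}")
    return CachedContentName(**match.groupdict())


def format_cached_content_name(parts: CachedContentName) -> str:
    """Build the resource name back from its components."""
    return (f"projects/{parts.project}/locations/{parts.location}"
            f"/cachedContents/{parts.cached_content}")
```

A name that omits the `projects/` and `locations/` segments raises `ValueError`, which catches one likely mistake (passing only the trailing ID) early.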
agent_config object (optional)
Configuration parameters for the agent interaction.
Possible Types
Polymorphic discriminator: type
DeepResearchAgentConfig
Configuration for the Deep Research agent.
No description provided.
Always set to "deep-research".
thinking_summaries ThinkingSummaries (optional)
Whether to include thought summaries in the response.
Possible values:
- auto: Auto thinking summaries.
- none: No thinking summaries.
Whether to include visualizations in the response.
Possible values:
- off: Do not include visualizations.
- auto: Automatically include visualizations.
Enables human-in-the-loop planning for the Deep Research agent. If set to true, the Deep Research agent will provide a research plan in its response. The agent will then proceed only if the user confirms the plan in the next turn.
Enables the BigQuery tool for the Deep Research agent.
DynamicAgentConfig
Configuration for dynamic agents.
No description provided.
Always set to "dynamic".
FindRequest
Request parameters specific to FIND sessions, used for discovering vulnerabilities in a codebase.
No description provided.
Always set to "find_request".
source_files FileContent (optional)
A list of source files to provide as context for the scan.
Fields
The relative path of the file from the project root.
The UTF-8 encoded text content of the file.
The identifier of a specific finding to verify. This is primarily used in VERIFY mode to focus the agent's execution-based validation on a single vulnerability.
Additional context or custom instructions provided by the user to guide the vulnerability analysis.
Parameter for grouping multiple interactions that belong to the same CodeMender session.
session_config SessionConfig (optional)
Optional session-specific configurations to override default agent behavior.
Fields
The pipeline mode of a CodeMender session. It can only be used for a find session.
Possible values:
- scan: Fast scan using only the initial classifier.
- verify: Performs classification followed by detailed investigation.
The cognitive architecture or "thinking" topology used by the agent (e.g. "default", "deep").
The maximum number of interaction rounds the agent is allowed to perform before reaching a timeout.
FixRequest
Request parameters specific to FIX sessions, used for generating and validating security patches.
No description provided.
Always set to "fix_request".
source_files FileContent (optional)
A list of source files providing context for the remediation. These files are typically the ones containing the identified vulnerability.
Fields
The relative path of the file from the project root.
The UTF-8 encoded text content of the file.
The identifier of the specific security finding to be remediated. This ID maps to a previously discovered vulnerability.
Additional context or custom instructions provided by the user to guide the patch generation process.
Parameter for grouping multiple interactions that belong to the same CodeMender session.
session_config SessionConfig (optional)
Optional session-specific configurations to override default agent behavior.
Fields
The pipeline mode of a CodeMender session. It can only be used for a find session.
Possible values:
- scan: Fast scan using only the initial classifier.
- verify: Performs classification followed by detailed investigation.
The cognitive architecture or "thinking" topology used by the agent (e.g. "default", "deep").
The maximum number of interaction rounds the agent is allowed to perform before reaching a timeout.
The event_id token to be used to resume the interaction stream, from this event.
metadata StreamMetadata (optional)
Optional metadata accompanying ANY streamed event.
Fields
total_usage Usage (optional)
No description provided.
Fields
Number of tokens in the prompt (context).
input_tokens_by_modality ModalityTokens (optional)
A breakdown of input token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens in the cached part of the prompt (the cached content).
cached_tokens_by_modality ModalityTokens (optional)
A breakdown of cached token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Total number of tokens across all the generated responses.
output_tokens_by_modality ModalityTokens (optional)
A breakdown of output token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens present in tool-use prompt(s).
tool_use_tokens_by_modality ModalityTokens (optional)
A breakdown of tool-use token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens of thoughts for thinking models.
Total token count for the interaction request (prompt + responses + other internal tokens).
grounding_tool_count GroundingToolCount (optional)
Grounding tool count.
Fields
The grounding tool type associated with the count.
Possible values:
- google_search: Grounding with Google Web Search and Image Search, and Web Grounding for Enterprise.
- google_maps: Grounding with Google Maps.
- retrieval: Grounding with the customer's data, for example, Vertex AI Search.
The count for this grounding tool type.
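A per-modality breakdown in the usage object can be collapsed into simple totals, as sketched below. The breakdown field names (`input_tokens_by_modality`, `output_tokens_by_modality`, and so on) and `modality` appear in this reference. The token-count field name `tokens`, and the assumption that each breakdown is a list of entries, are hypothetical.

```python
from typing import Any


def modality_totals(usage: dict[str, Any], breakdown_field: str) -> dict[str, int]:
    """Sum token counts per modality for one breakdown field of a usage object.

    Returns an empty dict when the optional breakdown is absent.
    "tokens" is an assumed name for the per-modality token count.
    """
    totals: dict[str, int] = {}
    for entry in usage.get(breakdown_field) or []:
        modality = entry.get("modality", "unknown")
        totals[modality] = totals.get(modality, 0) + entry.get("tokens", 0)
    return totals


# Hypothetical usage payload, for illustration only.
EXAMPLE_USAGE = {
    "input_tokens_by_modality": [
        {"modality": "text", "tokens": 120},
        {"modality": "image", "tokens": 258},
    ],
    "output_tokens_by_modality": [
        {"modality": "text", "tokens": 64},
    ],
}
```

Each breakdown is summed on its own rather than merged, because the cached tokens are described as part of the prompt, so adding the input and cached breakdowns together could count the same tokens twice.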
InteractionCreatedEvent
No description provided.
Always set to "interaction.created".
interaction Interaction (required)
No description provided.
Fields
model ModelOption (optional)
The name of the `Model` used for generating the interaction.
Possible values:
- gemini-2.5-computer-use-preview-10-2025: An agentic capability model designed for direct interface interaction, allowing Gemini to perceive and navigate digital environments.
- gemini-3.1-flash-tts-preview: Gemini 3.1 Flash TTS: Powerful, low-latency speech generation. Enjoy natural outputs, steerable prompts, and new expressive audio tags for precise narration control.
- gemini-2.5-flash-preview-tts: Our 2.5 Flash text-to-speech model optimized for powerful, low-latency controllable speech generation.
- gemini-2.5-pro-preview-tts: Our 2.5 Pro text-to-speech audio model optimized for powerful, low-latency speech generation for more natural outputs and easier-to-steer prompts.
- lyria-3-pro-preview: Our advanced, full-song generative model with deep compositional understanding, optimized for precise structural control and complex transitions across diverse musical styles.
- gemini-2.5-flash: Our first hybrid reasoning model which supports a 1M token context window and has thinking budgets.
- gemini-3.1-pro-preview: Our latest SOTA reasoning model with unprecedented depth and nuance, and powerful multimodal understanding and coding capabilities.
- lyria-3-clip-preview: Our low-latency music generation model optimized for high-fidelity audio clips and precise rhythmic control.
- gemini-3.1-flash-lite: Our most cost-efficient model, optimized for high-volume agentic tasks, translation, and simple data processing.
- gemini-3.1-flash-lite-preview: Our most cost-efficient model, optimized for high-volume agentic tasks, translation, and simple data processing.
- gemini-3-flash-preview: Our most intelligent model built for speed, combining frontier intelligence with superior search and grounding.
- gemini-3.5-flash: Our most intelligent model for sustained frontier performance in agentic and coding tasks.
- gemini-3-pro-preview: Our most intelligent model with SOTA reasoning and multimodal understanding, and powerful agentic and vibe coding capabilities.
- gemini-2.5-flash-native-audio-preview-12-2025: Our native audio models optimized for higher quality audio outputs with better pacing, voice naturalness, verbosity, and mood.
- gemini-2.5-flash-image: Our native image generation model, optimized for speed, flexibility, and contextual understanding. Text input and output is priced the same as 2.5 Flash.
- gemini-2.5-flash-lite: Our smallest and most cost-effective model, built for at-scale usage.
- gemini-2.5-pro: Our state-of-the-art multipurpose model, which excels at coding and complex reasoning tasks.
- gemini-3.1-flash-image-preview: Pro-level visual intelligence with Flash-speed efficiency and reality-grounded generation capabilities.
- gemini-3-pro-image-preview: State-of-the-art image generation and editing model.
- gemini-2.5-flash-lite-preview-09-2025: The latest model based on Gemini 2.5 Flash lite optimized for cost-efficiency, high throughput and high quality.
- gemini-2.5-flash-preview-09-2025: The latest model based on the 2.5 Flash model. 2.5 Flash Preview is best for large scale processing, low-latency, high volume tasks that require thinking, and agentic use cases.
agent AgentOption (optional)
The name of the `Agent` used for generating the interaction.
Possible values:
-
deep-research-preview-04-2026Gemini Deep Research Agent
-
deep-research-pro-preview-12-2025Gemini Deep Research Agent
-
deep-research-max-preview-04-2026Gemini Deep Research Max Agent
-
antigravity-preview-05-2026Use the Antigravity managed agent to perform multi-step tasks that require reasoning, file operations, and tool use.
Required. Output only. A unique identifier for the interaction completion.
Required. Output only. The status of the interaction.
Possible values:
-
in_progressThe interaction is in progress.
-
requires_actionThe interaction requires action/input from the user.
-
completedThe interaction is completed.
-
failedThe interaction failed.
-
cancelledThe interaction was cancelled.
-
incompleteThe interaction is completed, but contains incomplete results (e.g. hitting max_tokens).
-
budget_exceededThe interaction was halted because the token budget was exceeded.
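A minimal sketch of how a client might react to the status values listed above. The status strings come from the list; the grouping into "stop", "respond", and "wait" is an assumption made for illustration, not behavior documented here, and `next_step` is a hypothetical helper.

```python
# Status strings are taken from the list above. The grouping below is an
# assumption for this sketch, not something the reference states.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete", "budget_exceeded"}
NEEDS_INPUT_STATUSES = {"requires_action"}


def next_step(status: str) -> str:
    """Map an interaction status to a coarse client action."""
    if status in TERMINAL_STATUSES:
        return "stop"      # nothing more will arrive for this interaction
    if status in NEEDS_INPUT_STATUSES:
        return "respond"   # the interaction is waiting on the caller
    if status == "in_progress":
        return "wait"      # keep waiting for a later update
    raise ValueError(f"unknown interaction status: {status!r}")


print(next_step("requires_action"))
```

Raising on an unrecognized value, rather than treating it as terminal, keeps the sketch from silently mishandling statuses added later.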
Required. Output only. The time at which the response was created in ISO 8601 format (YYYY-MM-DDThh:mm:ssZ).
Required. Output only. The time at which the response was last updated in ISO 8601 format (YYYY-MM-DDThh:mm:ssZ).
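The two timestamps above can be parsed with the standard library. This sketch handles only the exact format stated (YYYY-MM-DDThh:mm:ssZ); whether real responses also carry fractional seconds or numeric offsets is not stated here, so treat that as an open assumption. `parse_timestamp` and the sample values are hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse 'YYYY-MM-DDThh:mm:ssZ' into a timezone-aware UTC datetime."""
    naive = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return naive.replace(tzinfo=timezone.utc)


# Sample values invented for the example.
created = parse_timestamp("2026-01-15T09:30:00Z")
updated = parse_timestamp("2026-01-15T09:31:30Z")
elapsed_seconds = (updated - created).total_seconds()
print(elapsed_seconds)
```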
System instruction for the interaction.
tools Tool (optional)
A list of tool declarations the model may call during interaction.
Possible Types
Polymorphic discriminator: type
CodeExecution
A tool that can be used by the model to execute code.
No description provided.
Always set to "code_execution".
ComputerUse
A tool that can be used by the model to interact with the computer.
No description provided.
Always set to "computer_use".
The environment being operated.
Possible values:
-
browserOperates in a web browser.
The list of predefined functions that are excluded from the model call.
FileSearch
A tool that can be used by the model to search files.
No description provided.
Always set to "file_search".
The file search store names to search.
The number of semantic retrieval chunks to retrieve.
Metadata filter to apply to the semantic retrieval documents and chunks.
Function
A tool that can be used by the model.
No description provided.
Always set to "function".
The name of the function.
A description of the function.
The JSON Schema for the function's parameters.
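As an illustration, a Function tool declaration might be assembled as below. The discriminator key `type` and its value "function" come from this reference; the key names `name`, `description`, and `parameters` are assumptions inferred from the field descriptions, and `make_function_tool` and `get_weather` are hypothetical.

```python
import json


def make_function_tool(name: str, description: str, parameters: dict) -> dict:
    """Build a tool declaration dict; key names other than 'type' are assumed."""
    return {
        "type": "function",          # discriminator value from the reference
        "name": name,                # assumed key name
        "description": description,  # assumed key name
        "parameters": parameters,    # assumed key name; holds a JSON Schema
    }


weather_tool = make_function_tool(
    name="get_weather",
    description="Look up the current weather for a city.",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
print(json.dumps(weather_tool, indent=2))
```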
GoogleMaps
A tool that can be used by the model to call Google Maps.
No description provided.
Always set to "google_maps".
Whether to return a widget context token in the tool call result of the response.
The latitude of the user's location.
The longitude of the user's location.
GoogleSearch
A tool that can be used by the model to search Google.
No description provided.
Always set to "google_search".
The types of search grounding to enable.
Possible values:
-
web_searchSetting this field enables web search. Only text results are returned.
-
image_searchSetting this field enables image search. Image bytes are returned.
-
enterprise_web_searchSetting this field enables enterprise web search.
McpServer
An MCPServer is a server that can be called by the model to perform actions.
No description provided.
Always set to "mcp_server".
The name of the MCPServer.
The full URL for the MCPServer endpoint. Example: "https://api.example.com/mcp"
Optional: Fields for authentication headers, timeouts, etc., if needed.
allowed_tools AllowedTools (optional)
The allowed tools.
Fields
The mode of the tool choice.
Possible values:
-
autoAuto tool choice.
-
anyAny tool choice.
-
noneNo tool choice.
-
validatedValidated tool choice.
The names of the allowed tools.
Retrieval
A tool that can be used by the model to retrieve files.
No description provided.
Always set to "retrieval".
The types of file retrieval to enable.
Possible values:
-
rag_store
-
exa_ai_search
-
parallel_ai_search
exa_ai_search_config ExaAISearchConfig (optional)
Used to specify configuration for ExaAISearch.
Fields
Required. The API key for ExaAiSearch.
Optional. This field can be used to pass any parameter from the Exa.ai Search API.
parallel_ai_search_config ParallelAISearchConfig (optional)
Used to specify configuration for ParallelAISearch.
Fields
Optional. The API key for ParallelAiSearch.
Optional. Custom configs for ParallelAiSearch.
rag_store_config RagStoreConfig (optional)
Used to specify configuration for RagStore.
Fields
rag_resources RagResource (optional)
Optional. The representation of the rag source.
Fields
Optional. RagCorpora resource name.
Optional. The rag_file_id values. The files must be in the same rag_corpus that is set in the rag_corpus field.
rag_retrieval_config RagRetrievalConfig (optional)
Optional. The retrieval config for the Rag query.
Fields
Optional. The number of contexts to retrieve.
hybrid_search HybridSearch (optional)
Optional. Config for Hybrid Search.
Fields
Optional. Alpha value controls the weight between dense and sparse vector search results.
filter Filter (optional)
Optional. Config for filters.
Fields
Optional. Only returns contexts with vector distance smaller than the threshold.
Optional. Only returns contexts with vector similarity larger than the threshold.
Optional. String for metadata filtering.
ranking Ranking (optional)
Optional. Config for ranking and reranking.
UrlContext
A tool that can be used by the model to fetch URL context.
No description provided.
Always set to "url_context".
usage Usage (optional)
Output only. Statistics on the interaction request's token usage.
Fields
Number of tokens in the prompt (context).
input_tokens_by_modality ModalityTokens (optional)
A breakdown of input token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
-
textIndicates the model should return text.
-
imageIndicates the model should return images.
-
audioIndicates the model should return audio.
-
videoIndicates the model should return video.
-
documentIndicates the model should return documents.
Number of tokens for the modality.
Number of tokens in the cached part of the prompt (the cached content).
cached_tokens_by_modality ModalityTokens (optional)
A breakdown of cached token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
-
textIndicates the model should return text.
-
imageIndicates the model should return images.
-
audioIndicates the model should return audio.
-
videoIndicates the model should return video.
-
documentIndicates the model should return documents.
Number of tokens for the modality.
Total number of tokens across all the generated responses.
output_tokens_by_modality ModalityTokens (optional)
A breakdown of output token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
-
textIndicates the model should return text.
-
imageIndicates the model should return images.
-
audioIndicates the model should return audio.
-
videoIndicates the model should return video.
-
documentIndicates the model should return documents.
Number of tokens for the modality.
Number of tokens present in tool-use prompt(s).
tool_use_tokens_by_modality ModalityTokens (optional)
A breakdown of tool-use token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
-
textIndicates the model should return text.
-
imageIndicates the model should return images.
-
audioIndicates the model should return audio.
-
videoIndicates the model should return video.
-
documentIndicates the model should return documents.
Number of tokens for the modality.
Number of tokens of thoughts for thinking models.
Total token count for the interaction request (prompt + responses + other internal tokens).
grounding_tool_count GroundingToolCount (optional)
Grounding tool count.
Fields
The grounding tool type associated with the count.
Possible values:
-
google_searchGrounding with Google Web Search and Image Search, & Web Grounding for Enterprise.
-
google_mapsGrounding with Google Maps.
-
retrievalGrounding with customer's data, for example, VertexAISearch.
The number of times the grounding tool was used.
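The per-modality breakdowns in the usage object can be folded into simple totals. In this sketch `input_tokens_by_modality`, `output_tokens_by_modality`, and `modality` are names from this reference; the per-entry count key `tokens` is an assumption, since the field name is not shown here, and the sample numbers are invented.

```python
from collections import Counter


def tokens_by_modality(entries: list[dict]) -> dict[str, int]:
    """Sum token counts per modality across a list of ModalityTokens entries."""
    totals: Counter = Counter()
    for entry in entries:
        # "tokens" is an assumed key name for "Number of tokens for the modality".
        totals[entry["modality"]] += entry.get("tokens", 0)
    return dict(totals)


# Invented sample shaped like the usage object described above.
usage = {
    "input_tokens_by_modality": [
        {"modality": "text", "tokens": 120},
        {"modality": "image", "tokens": 258},
    ],
    "output_tokens_by_modality": [{"modality": "text", "tokens": 64}],
}

input_totals = tokens_by_modality(usage["input_tokens_by_modality"])
output_totals = tokens_by_modality(usage["output_tokens_by_modality"])
print(input_totals, output_totals)
```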
response_modalities ResponseModality (optional)
The requested modalities of the response (TEXT, IMAGE, AUDIO).
Possible values:
-
textIndicates the model should return text.
-
imageIndicates the model should return images.
-
audioIndicates the model should return audio.
-
videoIndicates the model should return video.
-
documentIndicates the model should return documents.
The ID of the previous interaction, if any.
Output only. The environment ID for the interaction. Only populated if environment config is set in the request.
service_tier ServiceTier (optional)
The service tier for the interaction.
Possible values:
-
flexFlex service tier.
-
standardStandard service tier.
-
priorityPriority service tier.
webhook_config WebhookConfig (optional)
Optional. Webhook configuration for receiving notifications when the interaction completes.
Fields
Optional. If set, these webhook URIs will be used for webhook events instead of the registered webhooks.
Optional. The user metadata that will be returned on each event emission to the webhooks.
steps Step (optional)
Required. Output only. The steps that make up the interaction.
Possible Types
Polymorphic discriminator: type
CodeExecutionCallStep
Code execution call step.
No description provided.
Always set to "code_execution_call".
arguments CodeExecutionCallStepArguments (required)
Required. The arguments to pass to the code execution.
Fields
Programming language of the `code`.
Possible values:
-
pythonPython >= 3.10, with numpy and simpy available.
The code to be executed.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
CodeExecutionResultStep
Code execution result step.
No description provided.
Always set to "code_execution_result".
Required. The output of the code execution.
Whether the code execution resulted in an error.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
FileSearchCallStep
File Search call step.
No description provided.
Always set to "file_search_call".
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
FileSearchResultStep
File Search result step.
No description provided.
Always set to "file_search_result".
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
FunctionCallStep
A function tool call step.
No description provided.
Always set to "function_call".
Required. The name of the tool to call.
Required. The arguments to pass to the function.
Required. A unique ID for this specific tool call.
FunctionResultStep
Result of a function tool call.
No description provided.
Always set to "function_result".
The name of the tool that was called.
Whether the tool call resulted in an error.
Required. ID to match the ID from the function call block.
The result of the tool call.
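One way to use the call and result steps together is to find function calls that have no matching result yet. The `type` values "function_call" and "function_result" come from this reference; the key names `id` (on the call) and `call_id` (on the result) are assumptions standing in for the two ID fields described above, and `pending_function_calls` is a hypothetical helper.

```python
def pending_function_calls(steps: list[dict]) -> list[dict]:
    """Return function_call steps that no function_result step refers to."""
    answered = {
        step["call_id"]  # assumed key: ID matching the originating call
        for step in steps
        if step.get("type") == "function_result"
    }
    return [
        step
        for step in steps
        if step.get("type") == "function_call" and step["id"] not in answered
    ]


# Invented sample: the first call has a result, the second does not.
steps = [
    {"type": "function_call", "id": "call-1", "name": "get_weather", "arguments": {"city": "Oslo"}},
    {"type": "function_result", "call_id": "call-1", "name": "get_weather", "result": "4 C"},
    {"type": "function_call", "id": "call-2", "name": "get_time", "arguments": {}},
]

pending = pending_function_calls(steps)
print([step["id"] for step in pending])
```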
GoogleMapsCallStep
Google Maps call step.
No description provided.
Always set to "google_maps_call".
arguments GoogleMapsCallStepArguments (optional)
The arguments to pass to the Google Maps tool.
Fields
The queries to be executed.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
GoogleMapsResultStep
Google Maps result step.
No description provided.
Always set to "google_maps_result".
result GoogleMapsResultItem (required)
No description provided.
Fields
places GoogleMapsResultPlaces (optional)
No description provided.
Fields
No description provided.
No description provided.
No description provided.
review_snippets ReviewSnippet (optional)
No description provided.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
No description provided.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
GoogleSearchCallStep
Google Search call step.
No description provided.
Always set to "google_search_call".
arguments GoogleSearchCallStepArguments (required)
Required. The arguments to pass to Google Search.
Fields
Web search queries for the follow-up web search.
The type of search grounding enabled.
Possible values:
-
web_searchSetting this field enables web search. Only text results are returned.
-
image_searchSetting this field enables image search. Image bytes are returned.
-
enterprise_web_searchSetting this field enables enterprise web search.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
GoogleSearchResultStep
Google Search result step.
No description provided.
Always set to "google_search_result".
result GoogleSearchResultItem (required)
Required. The results of the Google Search.
Fields
Web content snippet that can be embedded in a web page or an app webview.
Whether the Google Search resulted in an error.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
McpServerToolCallStep
MCPServer tool call step.
No description provided.
Always set to "mcp_server_tool_call".
Required. The name of the tool which was called.
Required. The name of the used MCP server.
Required. The JSON object of arguments for the function.
Required. A unique ID for this specific tool call.
McpServerToolResultStep
MCPServer tool result step.
No description provided.
Always set to "mcp_server_tool_result".
Name of the tool which is called for this specific tool call.
The name of the used MCP server.
Required. ID to match the ID from the function call block.
The output from the MCP server call. Can be simple text or rich content.
ModelOutputStep
Output generated by the model.
No description provided.
Always set to "model_output".
content Content (optional)
No description provided.
Possible Types
Polymorphic discriminator: type
AudioContent
An audio content block.
No description provided.
Always set to "audio".
The audio content.
The URI of the audio.
The mime type of the audio.
Possible values:
-
audio/wavWAV audio format
-
audio/mp3MP3 audio format
-
audio/aiffAIFF audio format
-
audio/aacAAC audio format
-
audio/oggOGG audio format
-
audio/flacFLAC audio format
-
audio/mpegMPEG audio format
-
audio/m4aM4A audio format
-
audio/l16L16 audio format
-
audio/opusOPUS audio format
-
audio/alawALAW audio format
-
audio/mulawMULAW audio format
The number of audio channels.
The sample rate of the audio.
DocumentContent
A document content block.
No description provided.
Always set to "document".
The document content.
The URI of the document.
The mime type of the document.
Possible values:
-
application/pdfPDF document format
-
text/csvCSV document format
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
-
image/pngPNG image format
-
image/jpegJPEG image format
-
image/webpWebP image format
-
image/heicHEIC image format
-
image/heifHEIF image format
-
image/gifGIF image format
-
image/bmpBMP image format
-
image/tiffTIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
-
lowLow resolution.
-
mediumMedium resolution.
-
highHigh resolution.
-
ultra_highUltra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
Source attributed for a portion of the text.
User provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID in case of image citations, if applicable.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
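Because the citation start and end indexes are measured in bytes and the end is exclusive, slicing the Python string directly gives the wrong span once the text contains non-ASCII characters. The sketch below slices the encoded bytes instead. It assumes the offsets refer to the UTF-8 encoding of the text, which this reference does not state; `start_index`, `end_index`, and `cited_text` are hypothetical names.

```python
def cited_text(text: str, start_index: int, end_index: int) -> str:
    """Return the segment covered by byte offsets [start_index, end_index)."""
    data = text.encode("utf-8")  # assumption: offsets count UTF-8 bytes
    return data[start_index:end_index].decode("utf-8")


text = "Café prices rose 5% in 2025."

# "é" occupies two bytes, so byte offsets run one ahead of character offsets
# for everything after it.
by_bytes = cited_text(text, 6, 12)
by_chars = text[6:12]
print(repr(by_bytes), repr(by_chars))
```

If an offset ever falls inside a multi-byte character, `decode` raises `UnicodeDecodeError`, which is a useful signal that the encoding assumption is wrong.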
VideoContent
A video content block.
No description provided.
Always set to "video".
The video content.
The URI of the video.
The mime type of the video.
Possible values:
-
video/mp4MP4 video format
-
video/mpegMPEG video format
-
video/mpgMPG video format
-
video/movMOV video format
-
video/aviAVI video format
-
video/x-flvFLV video format
-
video/webmWebM video format
-
video/wmvWMV video format
-
video/3gpp3GPP video format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
-
lowLow resolution.
-
mediumMedium resolution.
-
highHigh resolution.
-
ultra_highUltra high resolution.
ThoughtStep
A thought step.
No description provided.
Always set to "thought".
A signature hash for backend validation.
summary ThoughtSummaryContent (optional)
A summary of the thought.
Possible Types
Polymorphic discriminator: type
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
-
image/pngPNG image format
-
image/jpegJPEG image format
-
image/webpWebP image format
-
image/heicHEIC image format
-
image/heifHEIF image format
-
image/gifGIF image format
-
image/bmpBMP image format
-
image/tiffTIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
-
lowLow resolution.
-
mediumMedium resolution.
-
highHigh resolution.
-
ultra_highUltra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
Source attributed for a portion of the text.
User provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID in case of image citations, if applicable.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlContextCallStep
URL context call step.
No description provided.
Always set to "url_context_call".
arguments UrlContextCallStepArguments (required)
Required. The arguments to pass to the URL context.
Fields
The URLs to fetch.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
UrlContextResultStep
URL context result step.
No description provided.
Always set to "url_context_result".
result UrlContextResultItem (required)
Required. The results of the URL context.
Fields
The URL that was fetched.
The status of the URL retrieval.
Possible values:
-
successThe URL was retrieved successfully.
-
errorThe URL retrieval failed due to an error.
-
paywallThe URL retrieval failed because the content is behind a paywall.
-
unsafeThe URL retrieval failed because the content is unsafe.
Whether the URL context resulted in an error.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
UserInputStep
Input provided by the user.
content Content (optional)
No description provided.
Possible Types
Polymorphic discriminator: type
AudioContent
An audio content block.
No description provided.
Always set to "audio".
The audio content.
The URI of the audio.
The mime type of the audio.
Possible values:
-
audio/wavWAV audio format
-
audio/mp3MP3 audio format
-
audio/aiffAIFF audio format
-
audio/aacAAC audio format
-
audio/oggOGG audio format
-
audio/flacFLAC audio format
-
audio/mpegMPEG audio format
-
audio/m4aM4A audio format
-
audio/l16L16 audio format
-
audio/opusOPUS audio format
-
audio/alawALAW audio format
-
audio/mulawMULAW audio format
The number of audio channels.
The sample rate of the audio.
DocumentContent
A document content block.
No description provided.
Always set to "document".
The document content.
The URI of the document.
The mime type of the document.
Possible values:
-
application/pdfPDF document format
-
text/csvCSV document format
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
-
image/pngPNG image format
-
image/jpegJPEG image format
-
image/webpWebP image format
-
image/heicHEIC image format
-
image/heifHEIF image format
-
image/gifGIF image format
-
image/bmpBMP image format
-
image/tiffTIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
-
lowLow resolution.
-
mediumMedium resolution.
-
highHigh resolution.
-
ultra_highUltra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
Source attributed for a portion of the text.
User provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID in case of image citations, if applicable.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
VideoContent
A video content block.
No description provided.
Always set to "video".
The video content.
The URI of the video.
The mime type of the video.
Possible values:
-
video/mp4MP4 video format
-
video/mpegMPEG video format
-
video/mpgMPG video format
-
video/movMOV video format
-
video/aviAVI video format
-
video/x-flvFLV video format
-
video/webmWebM video format
-
video/wmvWMV video format
-
video/3gpp3GPP video format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
-
lowLow resolution.
-
mediumMedium resolution.
-
highHigh resolution.
-
ultra_highUltra high resolution.
No description provided.
Always set to "user_input".
The input for the interaction.
Enforces that the generated response is a JSON object that complies with the JSON schema specified in this field.
The environment configuration for the interaction. Can be an object specifying remote environment sources or a string referencing an existing environment ID.
The name of the cached content used as context to serve the prediction. Note: only used in explicit caching, where users can have control over caching (e.g. what content to cache) and enjoy guaranteed cost savings. Format: `projects/{project}/locations/{location}/cachedContents/{cachedContent}`
agent_config object (optional)
Configuration parameters for the agent interaction.
Possible Types
Polymorphic discriminator: type
DeepResearchAgentConfig
Configuration for the Deep Research agent.
No description provided.
Always set to "deep-research".
thinking_summaries ThinkingSummaries (optional)
Whether to include thought summaries in the response.
Possible values:
-
autoAuto thinking summaries.
-
noneNo thinking summaries.
Whether to include visualizations in the response.
Possible values:
-
offDo not include visualizations.
-
autoAutomatically include visualizations.
Enables human-in-the-loop planning for the Deep Research agent. If set to true, the Deep Research agent will provide a research plan in its response. The agent will then proceed only if the user confirms the plan in the next turn.
Enables the BigQuery tool for the Deep Research agent.
DynamicAgentConfig
Configuration for dynamic agents.
No description provided.
Always set to "dynamic".
FindRequest
Request parameters specific to FIND sessions, used for discovering vulnerabilities in a codebase.
No description provided.
Always set to "find_request".
source_files FileContent (optional)
A list of source files to provide as context for the scan.
Fields
The relative path of the file from the project root.
The UTF-8 encoded text content of the file.
The identifier of a specific finding to verify. This is primarily used in VERIFY mode to focus the agent's execution-based validation on a single vulnerability.
Additional context or custom instructions provided by the user to guide the vulnerability analysis.
Parameter for grouping multiple interactions that belong to the same CodeMender session.
session_config SessionConfig (optional)
Optional session-specific configurations to override default agent behavior.
Fields
The pipeline mode of a CodeMender session. It can only be used for a find session.
Possible values:
-
scanFast scan using only the initial classifier.
-
verifyPerforms classification followed by detailed investigation.
The cognitive architecture or "thinking" topology used by the agent (e.g. "default", "deep").
The maximum number of interaction rounds the agent is allowed to perform before reaching a timeout.
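A rough sketch of collecting `source_files` for a FindRequest from a local project directory. The discriminator value "find_request" and the name `source_files` come from this reference; the per-file keys `path` and `content` are assumptions based on the field descriptions (relative path from the project root, UTF-8 text content). The helper names and the suffix filter are hypothetical, and how the resulting object is attached to a request is not shown here.

```python
from pathlib import Path


def collect_source_files(root: Path, suffixes: tuple[str, ...] = (".py",)) -> list[dict]:
    """Read matching files under root as path/content entries."""
    files = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            files.append({
                # Assumed key names; paths are relative to the project root.
                "path": path.relative_to(root).as_posix(),
                "content": path.read_text(encoding="utf-8"),
            })
    return files


def make_find_request(root: Path) -> dict:
    """Build a FindRequest-shaped dict with the discriminator from the reference."""
    return {"type": "find_request", "source_files": collect_source_files(root)}
```

Sorting the paths keeps the file order stable between runs, which makes repeated scans of the same tree easier to compare.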
FixRequest
Request parameters specific to FIX sessions, used for generating and validating security patches.
No description provided.
Always set to "fix_request".
source_files FileContent (optional)
A list of source files providing context for the remediation. These files are typically the ones containing the identified vulnerability.
Fields
The relative path of the file from the project root.
The UTF-8 encoded text content of the file.
The identifier of the specific security finding to be remediated. This ID maps to a previously discovered vulnerability.
Additional context or custom instructions provided by the user to guide the patch generation process.
Parameter for grouping multiple interactions that belong to the same CodeMender session.
session_config SessionConfig (optional)
Optional session-specific configurations to override default agent behavior.
Fields
The pipeline mode of a CodeMender session. It can only be used for a find session.
Possible values:
-
scanFast scan using only the initial classifier.
-
verifyPerforms classification followed by detailed investigation.
The cognitive architecture or "thinking" topology used by the agent (e.g. "default", "deep").
The maximum number of interaction rounds the agent is allowed to perform before reaching a timeout.
The event_id token to be used to resume the interaction stream, from this event.
metadata StreamMetadata (optional)
Optional metadata accompanying ANY streamed event.
Fields
total_usage Usage (optional)
No description provided.
Fields
Number of tokens in the prompt (context).
input_tokens_by_modality ModalityTokens (optional)
A breakdown of input token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens in the cached part of the prompt (the cached content).
cached_tokens_by_modality ModalityTokens (optional)
A breakdown of cached token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Total number of tokens across all the generated responses.
output_tokens_by_modality ModalityTokens (optional)
A breakdown of output token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens present in tool-use prompt(s).
tool_use_tokens_by_modality ModalityTokens (optional)
A breakdown of tool-use token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens of thoughts for thinking models.
Total token count for the interaction request (prompt + responses + other internal tokens).
grounding_tool_count GroundingToolCount (optional)
Grounding tool count.
Fields
The grounding tool type associated with the count.
Possible values:
- google_search: Grounding with Google Web Search, Image Search, and Web Grounding for Enterprise.
- google_maps: Grounding with Google Maps.
- retrieval: Grounding with the customer's data, for example, VertexAISearch.
The count recorded for this grounding tool type.
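As a rough illustration of how a per-modality breakdown might be consumed, the sketch below sums a list of ModalityTokens-like entries. The key names `modality` and `tokens` are assumptions (the reference above does not show the field names), and the sample payload is invented for the example.

```python
from collections import Counter


def tokens_by_modality(breakdown):
    """Sum token counts per modality from a list of ModalityTokens-like dicts."""
    totals = Counter()
    for entry in breakdown or []:
        # "modality" and "tokens" are assumed key names, not confirmed by the reference.
        totals[entry.get("modality", "unknown")] += entry.get("tokens", 0)
    return dict(totals)


# Hypothetical usage payload, for illustration only.
usage = {
    "input_tokens_by_modality": [
        {"modality": "text", "tokens": 120},
        {"modality": "image", "tokens": 258},
        {"modality": "text", "tokens": 30},
    ]
}

input_totals = tokens_by_modality(usage.get("input_tokens_by_modality"))
print(input_totals)
```

A missing breakdown is treated as empty, since every `*_by_modality` field is marked optional.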
InteractionStatusUpdate
No description provided.
Always set to "interaction.status_update".
No description provided.
No description provided.
Possible values:
- in_progress: The interaction is in progress.
- requires_action: The interaction requires action or input from the user.
- completed: The interaction is completed.
- failed: The interaction failed.
- cancelled: The interaction was cancelled.
- incomplete: The interaction completed but contains incomplete results (e.g., after hitting max_tokens).
- budget_exceeded: The interaction was halted because the token budget was exceeded.
The event_id token to be used to resume the interaction stream, from this event.
metadata StreamMetadata (optional)
Optional metadata accompanying ANY streamed event.
Fields
total_usage Usage (optional)
No description provided.
Fields
Number of tokens in the prompt (context).
input_tokens_by_modality ModalityTokens (optional)
A breakdown of input token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens in the cached part of the prompt (the cached content).
cached_tokens_by_modality ModalityTokens (optional)
A breakdown of cached token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Total number of tokens across all the generated responses.
output_tokens_by_modality ModalityTokens (optional)
A breakdown of output token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens present in tool-use prompt(s).
tool_use_tokens_by_modality ModalityTokens (optional)
A breakdown of tool-use token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens of thoughts for thinking models.
Total token count for the interaction request (prompt + responses + other internal tokens).
grounding_tool_count GroundingToolCount (optional)
Grounding tool count.
Fields
The grounding tool type associated with the count.
Possible values:
- google_search: Grounding with Google Web Search, Image Search, and Web Grounding for Enterprise.
- google_maps: Grounding with Google Maps.
- retrieval: Grounding with the customer's data, for example, VertexAISearch.
The count recorded for this grounding tool type.
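A minimal sketch of consuming status updates from a stream: it remembers the most recent event_id so the stream could be resumed from that point, and stops once a status other than in_progress or requires_action arrives. The key names `event_type`, `status`, and `event_id` are assumptions, as is the choice of which statuses count as final; the events are plain dicts invented for the example.

```python
# Treating these two statuses as non-final is an assumption, not stated by the reference.
NON_FINAL = {"in_progress", "requires_action"}


def track(events):
    """Return (last_event_id, last_status) after scanning streamed events."""
    last_event_id = None
    last_status = None
    for event in events:
        # Keep the latest event_id so a dropped stream could be resumed from it.
        if event.get("event_id") is not None:
            last_event_id = event["event_id"]
        if event.get("event_type") == "interaction.status_update":
            last_status = event.get("status")
            if last_status not in NON_FINAL:
                break
    return last_event_id, last_status


# Hypothetical event sequence, for illustration only.
sample_events = [
    {"event_type": "interaction.status_update", "status": "in_progress", "event_id": "e1"},
    {"event_type": "step.delta", "event_id": "e2"},
    {"event_type": "interaction.status_update", "status": "completed", "event_id": "e3"},
    {"event_type": "step.delta", "event_id": "e4"},
]

resume_from, final_status = track(sample_events)
print(resume_from, final_status)
```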
StepDelta
No description provided.
Always set to "step.delta".
No description provided.
delta StepDeltaData (required)
No description provided.
Possible Types
Polymorphic discriminator: type
ArgumentsDelta
No description provided.
Always set to "arguments_delta".
No description provided.
AudioDelta
No description provided.
Always set to "audio".
No description provided.
No description provided.
No description provided.
Possible values:
- audio/wav: WAV audio format
- audio/mp3: MP3 audio format
- audio/aiff: AIFF audio format
- audio/aac: AAC audio format
- audio/ogg: OGG audio format
- audio/flac: FLAC audio format
- audio/mpeg: MPEG audio format
- audio/m4a: M4A audio format
- audio/l16: L16 audio format
- audio/opus: OPUS audio format
- audio/alaw: ALAW audio format
- audio/mulaw: MULAW audio format
The sample rate of the audio.
The number of audio channels.
CodeExecutionCallDelta
No description provided.
Always set to "code_execution_call".
arguments CodeExecutionCallArguments (required)
No description provided.
Fields
Programming language of the `code`.
Possible values:
- python: Python >= 3.10, with numpy and simpy available.
The code to be executed.
A signature hash for backend validation.
CodeExecutionResultDelta
No description provided.
Always set to "code_execution_result".
No description provided.
No description provided.
A signature hash for backend validation.
DocumentDelta
No description provided.
Always set to "document".
No description provided.
No description provided.
No description provided.
Possible values:
- application/pdf: PDF document format
- text/csv: CSV document format
FileSearchCallDelta
No description provided.
Always set to "file_search_call".
A signature hash for backend validation.
FileSearchResultDelta
No description provided.
Always set to "file_search_result".
result FileSearchResult (required)
No description provided.
A signature hash for backend validation.
FunctionResultDelta
No description provided.
Always set to "function_result".
No description provided.
No description provided.
Required. ID to match the ID from the function call block.
No description provided.
GoogleMapsCallDelta
No description provided.
Always set to "google_maps_call".
arguments GoogleMapsCallArguments (optional)
The arguments to pass to the Google Maps tool.
Fields
The queries to be executed.
A signature hash for backend validation.
GoogleMapsResultDelta
No description provided.
Always set to "google_maps_result".
result GoogleMapsResult (optional)
The results of the Google Maps call.
Fields
places Places (optional)
The places that were found.
Fields
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Resource name of the Google Maps widget context token.
A signature hash for backend validation.
GoogleSearchCallDelta
No description provided.
Always set to "google_search_call".
arguments GoogleSearchCallArguments (required)
No description provided.
Fields
Web search queries for the follow-up web search.
A signature hash for backend validation.
GoogleSearchResultDelta
No description provided.
Always set to "google_search_result".
result GoogleSearchResult (required)
No description provided.
Fields
Web content snippet that can be embedded in a web page or an app webview.
No description provided.
A signature hash for backend validation.
ImageDelta
No description provided.
Always set to "image".
No description provided.
No description provided.
No description provided.
Possible values:
- image/png: PNG image format
- image/jpeg: JPEG image format
- image/webp: WebP image format
- image/heic: HEIC image format
- image/heif: HEIF image format
- image/gif: GIF image format
- image/bmp: BMP image format
- image/tiff: TIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
McpServerToolCallDelta
No description provided.
Always set to "mcp_server_tool_call".
No description provided.
No description provided.
No description provided.
McpServerToolResultDelta
No description provided.
Always set to "mcp_server_tool_result".
No description provided.
No description provided.
No description provided.
TextAnnotationDelta
No description provided.
Always set to "text_annotation_delta".
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
Source attributed for a portion of the text.
User provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID in the case of image citations, if applicable.
Start of the segment of the response that is attributed to this source, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of the segment of the response that is attributed to this source, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of the segment of the response that is attributed to this source, measured in bytes.
End of the attributed segment, exclusive.
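Because citation indices are measured in bytes and the end index is exclusive, slicing the response as a Python string (which indexes by code point) can return the wrong span once the text contains non-ASCII characters. The sketch below slices the encoded bytes instead. It assumes the offsets refer to the UTF-8 encoding of the text, which the reference does not state, and the parameter names `start_index` and `end_index` are hypothetical.

```python
def cited_segment(text, start_index, end_index):
    """Return the part of `text` covered by byte offsets [start_index, end_index)."""
    raw = text.encode("utf-8")  # assumption: offsets count UTF-8 bytes
    return raw[start_index:end_index].decode("utf-8", errors="replace")


text = "Café menu: crêpes"
# "Café " occupies 6 bytes in UTF-8 because "é" takes 2, so "menu" spans bytes 6 to 10.
segment = cited_segment(text, 6, 10)
print(segment)
```

Slicing the string directly with the same offsets, `text[6:10]`, would start one character late, which is why the bytes are sliced before decoding.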
TextDelta
No description provided.
Always set to "text".
No description provided.
ThoughtSignatureDelta
No description provided.
Always set to "thought_signature".
Signature to match the backend source to be part of the generation.
ThoughtSummaryDelta
No description provided.
Always set to "thought_summary".
content Content (optional)
A new summary item to be added to the thought.
Possible Types
Polymorphic discriminator: type
AudioContent
An audio content block.
No description provided.
Always set to "audio".
The audio content.
The URI of the audio.
The mime type of the audio.
Possible values:
- audio/wav: WAV audio format
- audio/mp3: MP3 audio format
- audio/aiff: AIFF audio format
- audio/aac: AAC audio format
- audio/ogg: OGG audio format
- audio/flac: FLAC audio format
- audio/mpeg: MPEG audio format
- audio/m4a: M4A audio format
- audio/l16: L16 audio format
- audio/opus: OPUS audio format
- audio/alaw: ALAW audio format
- audio/mulaw: MULAW audio format
The number of audio channels.
The sample rate of the audio.
DocumentContent
A document content block.
No description provided.
Always set to "document".
The document content.
The URI of the document.
The mime type of the document.
Possible values:
- application/pdf: PDF document format
- text/csv: CSV document format
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
- image/png: PNG image format
- image/jpeg: JPEG image format
- image/webp: WebP image format
- image/heic: HEIC image format
- image/heif: HEIF image format
- image/gif: GIF image format
- image/bmp: BMP image format
- image/tiff: TIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
Source attributed for a portion of the text.
User provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID in the case of image citations, if applicable.
Start of the segment of the response that is attributed to this source, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of the segment of the response that is attributed to this source, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of the segment of the response that is attributed to this source, measured in bytes.
End of the attributed segment, exclusive.
VideoContent
A video content block.
No description provided.
Always set to "video".
The video content.
The URI of the video.
The mime type of the video.
Possible values:
- video/mp4: MP4 video format
- video/mpeg: MPEG video format
- video/mpg: MPG video format
- video/mov: MOV video format
- video/avi: AVI video format
- video/x-flv: FLV video format
- video/webm: WebM video format
- video/wmv: WMV video format
- video/3gpp: 3GPP video format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
UrlContextCallDelta
No description provided.
Always set to "url_context_call".
arguments UrlContextCallArguments (required)
No description provided.
Fields
The URLs to fetch.
A signature hash for backend validation.
UrlContextResultDelta
No description provided.
Always set to "url_context_result".
result UrlContextResult (required)
No description provided.
Fields
The URL that was fetched.
The status of the URL retrieval.
Possible values:
- success: URL retrieval succeeded.
- error: URL retrieval failed due to an error.
- paywall: URL retrieval failed because the content is behind a paywall.
- unsafe: URL retrieval failed because the content is unsafe.
No description provided.
A signature hash for backend validation.
VideoDelta
No description provided.
Always set to "video".
No description provided.
No description provided.
No description provided.
Possible values:
- video/mp4: MP4 video format
- video/mpeg: MPEG video format
- video/mpg: MPG video format
- video/mov: MOV video format
- video/avi: AVI video format
- video/x-flv: FLV video format
- video/webm: WebM video format
- video/wmv: WMV video format
- video/3gpp: 3GPP video format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
The event_id token to be used to resume the interaction stream, from this event.
metadata StreamMetadata (optional)
Optional metadata accompanying ANY streamed event.
Fields
total_usage Usage (optional)
No description provided.
Fields
Number of tokens in the prompt (context).
input_tokens_by_modality ModalityTokens (optional)
A breakdown of input token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens in the cached part of the prompt (the cached content).
cached_tokens_by_modality ModalityTokens (optional)
A breakdown of cached token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Total number of tokens across all the generated responses.
output_tokens_by_modality ModalityTokens (optional)
A breakdown of output token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens present in tool-use prompt(s).
tool_use_tokens_by_modality ModalityTokens (optional)
A breakdown of tool-use token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- text: Indicates the model should return text.
- image: Indicates the model should return images.
- audio: Indicates the model should return audio.
- video: Indicates the model should return video.
- document: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens of thoughts for thinking models.
Total token count for the interaction request (prompt + responses + other internal tokens).
grounding_tool_count GroundingToolCount (optional)
Grounding tool count.
Fields
The grounding tool type associated with the count.
Possible values:
- google_search: Grounding with Google Web Search, Image Search, and Web Grounding for Enterprise.
- google_maps: Grounding with Google Maps.
- retrieval: Grounding with the customer's data, for example, VertexAISearch.
The count recorded for this grounding tool type.
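Since each StepDelta carries a `delta` whose concrete shape is selected by the `type` discriminator, a consumer typically branches on that value. The sketch below concatenates text deltas and counts every other delta type. Only the values "step.delta" and "text" and the field names `delta` and `type` come from the reference above; the `event_type` and `text` key names are assumptions, and the events are invented for the example.

```python
def collect_text(events):
    """Concatenate text deltas and count the other delta types seen."""
    parts = []
    other_counts = {}
    for event in events:
        if event.get("event_type") != "step.delta":  # assumed key name
            continue
        delta = event.get("delta", {})
        kind = delta.get("type")  # polymorphic discriminator
        if kind == "text":
            parts.append(delta.get("text", ""))  # "text" payload key is an assumption
        else:
            other_counts[kind] = other_counts.get(kind, 0) + 1
    return "".join(parts), other_counts


# Hypothetical stream, for illustration only.
stream = [
    {"event_type": "step.delta", "delta": {"type": "text", "text": "Hel"}},
    {"event_type": "step.delta", "delta": {"type": "thought_signature"}},
    {"event_type": "step.delta", "delta": {"type": "text", "text": "lo"}},
    {"event_type": "interaction.status_update", "status": "completed"},
]

full_text, other_counts = collect_text(stream)
print(full_text, other_counts)
```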
StepStart
No description provided.
Always set to "step.start".
No description provided.
step Step (required)
No description provided.
Possible Types
Polymorphic discriminator: type
CodeExecutionCallStep
Code execution call step.
No description provided.
Always set to "code_execution_call".
arguments CodeExecutionCallStepArguments (required)
Required. The arguments to pass to the code execution.
Fields
Programming language of the `code`.
Possible values:
- python: Python >= 3.10, with numpy and simpy available.
The code to be executed.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
CodeExecutionResultStep
Code execution result step.
No description provided.
Always set to "code_execution_result".
Required. The output of the code execution.
Whether the code execution resulted in an error.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
FileSearchCallStep
File Search call step.
No description provided.
Always set to "file_search_call".
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
FileSearchResultStep
File Search result step.
No description provided.
Always set to "file_search_result".
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
FunctionCallStep
A function tool call step.
No description provided.
Always set to "function_call".
Required. The name of the tool to call.
Required. The arguments to pass to the function.
Required. A unique ID for this specific tool call.
FunctionResultStep
Result of a function tool call.
No description provided.
Always set to "function_result".
The name of the tool that was called.
Whether the tool call resulted in an error.
Required. ID to match the ID from the function call block.
The result of the tool call.
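The call and result steps are linked by ID: the result carries the ID of the function call it answers. The sketch below runs a local function for a function_call step and builds the matching function_result step. The key names `id`, `call_id`, `name`, `arguments`, `is_error`, and `result` are assumptions (the reference above omits field names), and `get_weather` is a hypothetical tool.

```python
def run_function_call(step, registry):
    """Execute a function_call step and build the matching function_result step."""
    func = registry.get(step["name"])
    try:
        if func is None:
            raise ValueError(f"unknown tool: {step['name']}")
        result, is_error = func(**step["arguments"]), False
    except Exception as exc:
        # Report the failure in the result instead of raising.
        result, is_error = str(exc), True
    return {
        "type": "function_result",
        "name": step["name"],
        "call_id": step["id"],  # ties the result back to the originating call
        "is_error": is_error,
        "result": result,
    }


def get_weather(city):
    """Hypothetical tool used only for this example."""
    return f"Sunny in {city}"


call_step = {
    "type": "function_call",
    "id": "call_1",
    "name": "get_weather",
    "arguments": {"city": "Paris"},
}

result_step = run_function_call(call_step, {"get_weather": get_weather})
print(result_step)
```

Returning an error result rather than raising keeps the call and result paired even when the tool fails.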
GoogleMapsCallStep
Google Maps call step.
No description provided.
Always set to "google_maps_call".
arguments GoogleMapsCallStepArguments (optional)
The arguments to pass to the Google Maps tool.
Fields
The queries to be executed.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
GoogleMapsResultStep
Google Maps result step.
No description provided.
Always set to "google_maps_result".
result GoogleMapsResultItem (required)
No description provided.
Fields
places GoogleMapsResultPlaces (optional)
No description provided.
Fields
No description provided.
No description provided.
No description provided.
review_snippets ReviewSnippet (optional)
No description provided.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
No description provided.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
GoogleSearchCallStep
Google Search call step.
No description provided.
Always set to "google_search_call".
arguments GoogleSearchCallStepArguments (required)
Required. The arguments to pass to Google Search.
Fields
Web search queries for the follow-up web search.
The type of search grounding enabled.
Possible values:
- web_search: Setting this field enables web search. Only text results are returned.
- image_search: Setting this field enables image search. Image bytes are returned.
- enterprise_web_search: Setting this field enables enterprise web search.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
GoogleSearchResultStep
Google Search result step.
No description provided.
Always set to "google_search_result".
result GoogleSearchResultItem (required)
Required. The results of the Google Search.
Fields
Web content snippet that can be embedded in a web page or an app webview.
Whether the Google Search resulted in an error.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
McpServerToolCallStep
MCP server tool call step.
No description provided.
Always set to "mcp_server_tool_call".
Required. The name of the tool which was called.
Required. The name of the used MCP server.
Required. The JSON object of arguments for the function.
Required. A unique ID for this specific tool call.
McpServerToolResultStep
MCP server tool result step.
No description provided.
Always set to "mcp_server_tool_result".
The name of the tool that was called for this specific tool call.
The name of the used MCP server.
Required. ID to match the ID from the function call block.
The output from the MCP server call. Can be simple text or rich content.
ModelOutputStep
Output generated by the model.
No description provided.
Always set to "model_output".
content Content (optional)
No description provided.
Possible Types
Polymorphic discriminator: type
AudioContent
An audio content block.
No description provided.
Always set to "audio".
The audio content.
The URI of the audio.
The mime type of the audio.
Possible values:
- audio/wav: WAV audio format
- audio/mp3: MP3 audio format
- audio/aiff: AIFF audio format
- audio/aac: AAC audio format
- audio/ogg: OGG audio format
- audio/flac: FLAC audio format
- audio/mpeg: MPEG audio format
- audio/m4a: M4A audio format
- audio/l16: L16 audio format
- audio/opus: OPUS audio format
- audio/alaw: ALAW audio format
- audio/mulaw: MULAW audio format
The number of audio channels.
The sample rate of the audio.
DocumentContent
A document content block.
No description provided.
Always set to "document".
The document content.
The URI of the document.
The mime type of the document.
Possible values:
- application/pdf: PDF document format
- text/csv: CSV document format
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
- image/png: PNG image format
- image/jpeg: JPEG image format
- image/webp: WebP image format
- image/heic: HEIC image format
- image/heif: HEIF image format
- image/gif: GIF image format
- image/bmp: BMP image format
- image/tiff: TIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- low: Low resolution.
- medium: Medium resolution.
- high: High resolution.
- ultra_high: Ultra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
Source attributed for a portion of the text.
User provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID in the case of image citations, if applicable.
Start of the segment of the response that is attributed to this source, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of the segment of the response that is attributed to this source, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of the segment of the response that is attributed to this source, measured in bytes.
End of the attributed segment, exclusive.
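Because the start and end indices are measured in bytes rather than characters, slicing the text by character position can misalign a citation whenever the response contains non-ASCII characters. The following is a minimal sketch of recovering the cited segment. It assumes that the offsets refer to the UTF-8 encoding of the text, and the key names `start_index` and `end_index` are hypothetical, since the field names are not shown in this reference.

```python
def cited_segment(text: str, start_index: int, end_index: int) -> str:
    """Return the part of `text` covered by a citation.

    Assumption: the offsets count bytes of the UTF-8 encoding of `text`.
    `end_index` is exclusive, as stated in the field description.
    """
    encoded = text.encode("utf-8")
    # Slice on bytes, not characters, then decode back to a string.
    # errors="replace" avoids an exception if an offset splits a character.
    return encoded[start_index:end_index].decode("utf-8", errors="replace")


# Hypothetical annotation; the key names are illustrative only.
text = "Café prices rose."
annotation = {"type": "url_citation", "start_index": 0, "end_index": 5}
segment = cited_segment(text, annotation["start_index"], annotation["end_index"])
```

In this example "é" occupies two bytes, so the first word spans bytes 0 to 5 even though it is only four characters long.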
VideoContent
A video content block.
No description provided.
Always set to "video".
The video content.
The URI of the video.
The mime type of the video.
Possible values:
- `video/mp4`: MP4 video format
- `video/mpeg`: MPEG video format
- `video/mpg`: MPG video format
- `video/mov`: MOV video format
- `video/avi`: AVI video format
- `video/x-flv`: FLV video format
- `video/webm`: WebM video format
- `video/wmv`: WMV video format
- `video/3gpp`: 3GPP video format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- `low`: Low resolution.
- `medium`: Medium resolution.
- `high`: High resolution.
- `ultra_high`: Ultra high resolution.
ThoughtStep
A thought step.
No description provided.
Always set to "thought".
A signature hash for backend validation.
summary ThoughtSummaryContent (optional)
A summary of the thought.
Possible Types
Polymorphic discriminator: type
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
- `image/png`: PNG image format
- `image/jpeg`: JPEG image format
- `image/webp`: WebP image format
- `image/heic`: HEIC image format
- `image/heif`: HEIF image format
- `image/gif`: GIF image format
- `image/bmp`: BMP image format
- `image/tiff`: TIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- `low`: Low resolution.
- `medium`: Medium resolution.
- `high`: High resolution.
- `ultra_high`: Ultra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
The source attributed to a portion of the text.
User-provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID for image citations, if applicable.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlContextCallStep
URL context call step.
No description provided.
Always set to "url_context_call".
arguments UrlContextCallStepArguments (required)
Required. The arguments to pass to the URL context.
Fields
The URLs to fetch.
Required. A unique ID for this specific tool call.
A signature hash for backend validation.
UrlContextResultStep
URL context result step.
No description provided.
Always set to "url_context_result".
result UrlContextResultItem (required)
Required. The results of the URL context.
Fields
The URL that was fetched.
The status of the URL retrieval.
Possible values:
- `success`
- `error`
- `paywall`
- `unsafe`
Whether the URL context resulted in an error.
Required. ID to match the ID from the function call block.
A signature hash for backend validation.
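A result step is tied back to its call step through the call ID, and each fetched URL carries its own retrieval status. The sketch below shows one way to pair the two and collect the URLs that did not return `success`. The key names `id`, `call_id`, `url`, and `status` are assumptions (the field names are not shown in this reference), as is the choice to treat `result` as a list of items.

```python
from typing import Any


def failed_urls(steps: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each URL context call ID to the URLs whose status is not "success".

    Assumed key names: `id` on call steps, `call_id` on result steps,
    and `url` / `status` on each item of `result`.
    """
    # Index the call steps by ID so results can be matched back to them.
    calls = {s["id"]: s for s in steps if s.get("type") == "url_context_call"}
    failures: dict[str, list[str]] = {}
    for step in steps:
        if step.get("type") != "url_context_result":
            continue
        call_id = step["call_id"]
        if call_id not in calls:
            continue  # Result without a matching call step; skip it.
        bad = [item["url"] for item in step["result"] if item["status"] != "success"]
        if bad:
            failures[call_id] = bad
    return failures


# Hypothetical steps, shaped after the descriptions above.
steps = [
    {"type": "url_context_call", "id": "call_1",
     "arguments": {"urls": ["https://example.com/a", "https://example.com/b"]}},
    {"type": "url_context_result", "call_id": "call_1",
     "result": [
         {"url": "https://example.com/a", "status": "success"},
         {"url": "https://example.com/b", "status": "paywall"},
     ]},
]
report = failed_urls(steps)
```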
UserInputStep
Input provided by the user.
content Content (optional)
No description provided.
Possible Types
Polymorphic discriminator: type
AudioContent
An audio content block.
No description provided.
Always set to "audio".
The audio content.
The URI of the audio.
The mime type of the audio.
Possible values:
- `audio/wav`: WAV audio format
- `audio/mp3`: MP3 audio format
- `audio/aiff`: AIFF audio format
- `audio/aac`: AAC audio format
- `audio/ogg`: OGG audio format
- `audio/flac`: FLAC audio format
- `audio/mpeg`: MPEG audio format
- `audio/m4a`: M4A audio format
- `audio/l16`: L16 audio format
- `audio/opus`: OPUS audio format
- `audio/alaw`: ALAW audio format
- `audio/mulaw`: MULAW audio format
The number of audio channels.
The sample rate of the audio.
DocumentContent
A document content block.
No description provided.
Always set to "document".
The document content.
The URI of the document.
The mime type of the document.
Possible values:
- `application/pdf`: PDF document format
- `text/csv`: CSV document format
ImageContent
An image content block.
No description provided.
Always set to "image".
The image content.
The URI of the image.
The mime type of the image.
Possible values:
- `image/png`: PNG image format
- `image/jpeg`: JPEG image format
- `image/webp`: WebP image format
- `image/heic`: HEIC image format
- `image/heif`: HEIF image format
- `image/gif`: GIF image format
- `image/bmp`: BMP image format
- `image/tiff`: TIFF image format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- `low`: Low resolution.
- `medium`: Medium resolution.
- `high`: High resolution.
- `ultra_high`: Ultra high resolution.
TextContent
A text content block.
No description provided.
Always set to "text".
Required. The text content.
annotations Annotation (optional)
Citation information for model-generated content.
Possible Types
Polymorphic discriminator: type
FileCitation
A file citation annotation.
No description provided.
Always set to "file_citation".
The URI of the file.
The name of the file.
The source attributed to a portion of the text.
User-provided metadata about the retrieved context.
Page number of the cited document, if applicable.
Media ID for image citations, if applicable.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
PlaceCitation
A place citation annotation.
No description provided.
Always set to "place_citation".
The ID of the place, in `places/{place_id}` format.
Title of the place.
URI reference of the place.
review_snippets ReviewSnippet (optional)
Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.
Fields
Title of the review.
A link that corresponds to the user review on Google Maps.
The ID of the review snippet.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
UrlCitation
A URL citation annotation.
No description provided.
Always set to "url_citation".
The URL.
The title of the URL.
Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes.
End of the attributed segment, exclusive.
VideoContent
A video content block.
No description provided.
Always set to "video".
The video content.
The URI of the video.
The mime type of the video.
Possible values:
- `video/mp4`: MP4 video format
- `video/mpeg`: MPEG video format
- `video/mpg`: MPG video format
- `video/mov`: MOV video format
- `video/avi`: AVI video format
- `video/x-flv`: FLV video format
- `video/webm`: WebM video format
- `video/wmv`: WMV video format
- `video/3gpp`: 3GPP video format
resolution MediaResolution (optional)
The resolution of the media.
Possible values:
- `low`: Low resolution.
- `medium`: Medium resolution.
- `high`: High resolution.
- `ultra_high`: Ultra high resolution.
No description provided.
Always set to "user_input".
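Since `content` is polymorphic on `type`, client code typically branches on that discriminator before reading any other field. A minimal sketch of such a dispatch follows. The `type` values are the ones listed above, while the key names `uri` and `mime_type` are assumptions, because the field names are not shown in this reference.

```python
MEDIA_TYPES = {"audio", "document", "image", "video"}


def describe_content(block: dict) -> str:
    """Return a one-line summary of a content block, keyed on its `type`."""
    kind = block.get("type")
    if kind == "text":
        return f"text: {block.get('text', '')}"
    if kind in MEDIA_TYPES:
        # `mime_type` and `uri` are assumed key names for the media fields.
        mime = block.get("mime_type", "unknown MIME type")
        location = block.get("uri", "inline data")
        return f"{kind}: {mime} at {location}"
    # Fail loudly on unknown discriminators rather than guessing a shape.
    raise ValueError(f"unsupported content type: {kind!r}")


# Hypothetical content blocks.
summary_text = describe_content({"type": "text", "text": "Hello"})
summary_image = describe_content(
    {"type": "image", "mime_type": "image/png", "uri": "https://example.com/cat.png"}
)
```

Raising on an unrecognized `type` is a design choice for this sketch; a client that must tolerate newly added content types may prefer to skip them instead.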
The event_id token to be used to resume the interaction stream, from this event.
metadata StreamMetadata (optional)
Optional metadata accompanying ANY streamed event.
Fields
total_usage Usage (optional)
No description provided.
Fields
Number of tokens in the prompt (context).
input_tokens_by_modality ModalityTokens (optional)
A breakdown of input token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens in the cached part of the prompt (the cached content).
cached_tokens_by_modality ModalityTokens (optional)
A breakdown of cached token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Total number of tokens across all the generated responses.
output_tokens_by_modality ModalityTokens (optional)
A breakdown of output token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens present in tool-use prompt(s).
tool_use_tokens_by_modality ModalityTokens (optional)
A breakdown of tool-use token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens of thoughts for thinking models.
Total token count for the interaction request (prompt + responses + other internal tokens).
grounding_tool_count GroundingToolCount (optional)
Grounding tool count.
Fields
The grounding tool type associated with the count.
Possible values:
- `google_search`: Grounding with Google Web Search and Image Search, and Web Grounding for Enterprise.
- `google_maps`: Grounding with Google Maps.
- `retrieval`: Grounding with the customer's data, for example, Vertex AI Search.
The count for this grounding tool type.
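The per-modality breakdowns in `total_usage` can be reshaped into a single lookup table for logging or cost tracking. The sketch below is one way to do that. It assumes each breakdown is a list of entries, and the key name `tokens` for the per-modality count is hypothetical, since that field name is not shown in this reference.

```python
from typing import Any

# Breakdown field names as listed above, mapped to short labels.
BREAKDOWN_KEYS = {
    "input": "input_tokens_by_modality",
    "cached": "cached_tokens_by_modality",
    "output": "output_tokens_by_modality",
    "tool_use": "tool_use_tokens_by_modality",
}


def breakdown_by_modality(usage: dict[str, Any]) -> dict[str, dict[str, int]]:
    """Return {label: {modality: token_count}} for each usage breakdown."""
    summary: dict[str, dict[str, int]] = {}
    for label, key in BREAKDOWN_KEYS.items():
        per_modality: dict[str, int] = {}
        # A breakdown is optional, so treat a missing one as empty.
        for entry in usage.get(key) or []:
            modality = entry["modality"]
            # `tokens` is an assumed key name for the per-modality count.
            per_modality[modality] = per_modality.get(modality, 0) + entry.get("tokens", 0)
        summary[label] = per_modality
    return summary


# Hypothetical usage payload.
usage = {
    "input_tokens_by_modality": [
        {"modality": "text", "tokens": 12},
        {"modality": "image", "tokens": 258},
    ],
    "output_tokens_by_modality": [{"modality": "text", "tokens": 40}],
}
table = breakdown_by_modality(usage)
```

The breakdowns are kept separate rather than summed, because cached tokens are described as part of the prompt and adding them to the input count could double count.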
StepStop
No description provided.
Always set to "step.stop".
No description provided.
The event_id token to be used to resume the interaction stream, from this event.
metadata StreamMetadata (optional)
Optional metadata accompanying ANY streamed event.
Fields
total_usage Usage (optional)
No description provided.
Fields
Number of tokens in the prompt (context).
input_tokens_by_modality ModalityTokens (optional)
A breakdown of input token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens in the cached part of the prompt (the cached content).
cached_tokens_by_modality ModalityTokens (optional)
A breakdown of cached token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Total number of tokens across all the generated responses.
output_tokens_by_modality ModalityTokens (optional)
A breakdown of output token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens present in tool-use prompt(s).
tool_use_tokens_by_modality ModalityTokens (optional)
A breakdown of tool-use token usage by modality.
Fields
modality ResponseModality (optional)
The modality associated with the token count.
Possible values:
- `text`: Indicates the model should return text.
- `image`: Indicates the model should return images.
- `audio`: Indicates the model should return audio.
- `video`: Indicates the model should return video.
- `document`: Indicates the model should return documents.
Number of tokens for the modality.
Number of tokens of thoughts for thinking models.
Total token count for the interaction request (prompt + responses + other internal tokens).
grounding_tool_count GroundingToolCount (optional)
Grounding tool count.
Fields
The grounding tool type associated with the count.
Possible values:
- `google_search`: Grounding with Google Web Search and Image Search, and Web Grounding for Enterprise.
- `google_maps`: Grounding with Google Maps.
- `retrieval`: Grounding with the customer's data, for example, Vertex AI Search.
The count for this grounding tool type.
Examples
Error Event
{ "event_type": "error", "error": { "message": "Failed to get completed interaction: Result not found.", "code": "not_found" } }
Interaction Completed
{ "event_type": "interaction.completed", "interaction": { "id": "v1_ChdXS0l4YWZXTk9xbk0xZThQczhEcmlROBIXV0tJeGFmV05PcW5NMWU4UHM4RHJpUTg", "model": "gemini-3.5-flash", "status": "completed", "created": "2025-12-04T15:01:45Z", "updated": "2025-12-04T15:01:45Z" }, "event_id": "evt_123" }
Interaction Created
{ "event_type": "interaction.created", "interaction": { "id": "v1_ChdXS0l4YWZXTk9xbk0xZThQczhEcmlROBIXV0tJeGFmV05PcW5NMWU4UHM4RHJpUTg", "model": "gemini-3.5-flash", "status": "in_progress", "created": "2025-12-04T15:01:45Z", "updated": "2025-12-04T15:01:45Z" }, "event_id": "evt_123" }
Interaction Status Update
{ "event_type": "interaction.status_update", "interaction_id": "v1_ChdTMjQ0YWJ5TUF1TzcxZThQdjRpcnFRcxIXUzI0NGFieU1BdU83MWU4UHY0aXJxUXM", "status": "in_progress" }
Step Delta
{ "event_type": "step.delta", "index": 0, "delta": { "type": "text", "text": "Hello" } }
Step Start
{ "event_type": "step.start", "index": 0, "step": { "type": "model_output" } }
Step Stop
{ "event_type": "step.stop", "index": 0 }
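The example events above can be consumed with a small loop that concatenates the text deltas for each step index and remembers the most recent `event_id` for resuming the stream. This is only a sketch: it assumes each event has already been received as one JSON string (the transport is out of scope here), it handles only `text` deltas, and the function name `collect_text` is hypothetical.

```python
import json
from typing import Iterable, Optional


def collect_text(lines: Iterable[str]) -> tuple[dict[int, str], Optional[str]]:
    """Accumulate text deltas per step index from JSON-encoded events.

    Returns the text per step index and the last `event_id` seen, if any.
    """
    parts: dict[int, list[str]] = {}
    last_event_id: Optional[str] = None
    for line in lines:
        event = json.loads(line)
        # Keep the latest event_id so the stream could be resumed from it.
        last_event_id = event.get("event_id", last_event_id)
        kind = event.get("event_type")
        if kind == "error":
            raise RuntimeError(event["error"]["message"])
        if kind == "step.start":
            parts.setdefault(event["index"], [])
        elif kind == "step.delta" and event["delta"].get("type") == "text":
            parts.setdefault(event["index"], []).append(event["delta"]["text"])
    return {index: "".join(chunks) for index, chunks in parts.items()}, last_event_id


# Events modeled on the examples above; the second delta is made up.
stream = [
    '{ "event_type": "step.start", "index": 0, "step": { "type": "model_output" } }',
    '{ "event_type": "step.delta", "index": 0, "delta": { "type": "text", "text": "Hello" } }',
    '{ "event_type": "step.delta", "index": 0, "delta": { "type": "text", "text": ", world" } }',
    '{ "event_type": "step.stop", "index": 0 }',
    '{ "event_type": "interaction.completed", "interaction": { "status": "completed" }, "event_id": "evt_123" }',
]
texts, resume_from = collect_text(stream)
```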