Context caching allows you to save and reuse precomputed input tokens that you wish to use repeatedly, for example when asking different questions about the same media file. This can lead to cost and speed savings, depending on the usage. For a detailed introduction, see the Context caching guide.
Method: cachedContents.create
Creates CachedContent resource.
Endpoint
posthttps: / /generativelanguage.googleapis.com /v1beta /cachedContents
The URL uses gRPC Transcoding syntax.
Request body
The request body contains an instance of CachedContent
.
Optional. Input only. Immutable. The content to cache.
Optional. Input only. Immutable. A list of Tools
the model may use to generate the next response
expiration
Union type
expiration
can be only one of the following:Timestamp in UTC of when this resource is considered expired. This is always provided on output, regardless of what was sent on input.
A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z"
and "2014-10-02T15:01:23.045123456Z"
.
Input only. New TTL for this resource, input only.
A duration in seconds with up to nine fractional digits, ending with 's
'. Example: "3.5s"
.
name
string
Optional. Identifier. The resource name referring to the cached content. Format: cachedContents/{id}
displayName
string
Optional. Immutable. The user-generated meaningful display name of the cached content. Maximum 128 Unicode characters.
model
string
Required. Immutable. The name of the Model
to use for cached content Format: models/{model}
Optional. Input only. Immutable. Developer set system instruction. Currently text only.
Optional. Input only. Immutable. Tool config. This config is shared for all tools.
Example request
Basic
Python
Node.js
Go
Shell
From name
Python
Node.js
Go
From chat
Python
Node.js
Go
Response body
If successful, the response body contains a newly created instance of CachedContent
.
Method: cachedContents.list
Lists CachedContents.
Endpoint
gethttps: / /generativelanguage.googleapis.com /v1beta /cachedContents
The URL uses gRPC Transcoding syntax.
Query parameters
pageSize
integer
Optional. The maximum number of cached contents to return. The service may return fewer than this value. If unspecified, some default (under maximum) number of items will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000.
pageToken
string
Optional. A page token, received from a previous cachedContents.list
call. Provide this to retrieve the subsequent page.
When paginating, all other parameters provided to cachedContents.list
must match the call that provided the page token.
Request body
The request body must be empty.
Response body
Response with CachedContents list.
If successful, the response body contains data with the following structure:
List of cached contents.
nextPageToken
string
A token, which can be sent as pageToken
to retrieve the next page. If this field is omitted, there are no subsequent pages.
JSON representation |
---|
{
"cachedContents": [
{
object ( |
Method: cachedContents.get
Reads CachedContent resource.
Endpoint
gethttps: / /generativelanguage.googleapis.com /v1beta /{name=cachedContents /*}
The URL uses gRPC Transcoding syntax.
Path parameters
name
string
Required. The resource name referring to the content cache entry. Format: cachedContents/{id}
It takes the form cachedContents/{cachedcontent}
.
Request body
The request body must be empty.
Example request
Python
Node.js
Go
Shell
Response body
If successful, the response body contains an instance of CachedContent
.
Method: cachedContents.patch
Updates CachedContent resource (only expiration is updatable).
Endpoint
patchhttps: / /generativelanguage.googleapis.com /v1beta /{cachedContent.name=cachedContents /*}
PATCH https://generativelanguage.googleapis.com/v1beta/{cachedContent.name=cachedContents/*}
The URL uses gRPC Transcoding syntax.
Path parameters
cachedContent.name
string
Optional. Identifier. The resource name referring to the cached content. Format: cachedContents/{id}
It takes the form cachedContents/{cachedcontent}
.
Query parameters
The list of fields to update.
This is a comma-separated list of fully qualified names of fields. Example: "user.displayName,photo"
.
Request body
The request body contains an instance of CachedContent
.
expiration
Union type
expiration
can be only one of the following:Timestamp in UTC of when this resource is considered expired. This is always provided on output, regardless of what was sent on input.
A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z"
and "2014-10-02T15:01:23.045123456Z"
.
Input only. New TTL for this resource, input only.
A duration in seconds with up to nine fractional digits, ending with 's
'. Example: "3.5s"
.
name
string
Optional. Identifier. The resource name referring to the cached content. Format: cachedContents/{id}
Example request
Python
Node.js
Go
Shell
Response body
If successful, the response body contains an instance of CachedContent
.
Method: cachedContents.delete
Deletes CachedContent resource.
Endpoint
deletehttps: / /generativelanguage.googleapis.com /v1beta /{name=cachedContents /*}
The URL uses gRPC Transcoding syntax.
Path parameters
name
string
Required. The resource name referring to the content cache entry Format: cachedContents/{id}
It takes the form cachedContents/{cachedcontent}
.
Request body
The request body must be empty.
Example request
Python
Node.js
Go
Shell
Response body
If successful, the response body is empty.
REST Resource: cachedContents
- Resource: CachedContent
- Content
- Part
- Blob
- FunctionCall
- FunctionResponse
- FileData
- ExecutableCode
- Language
- CodeExecutionResult
- Outcome
- Tool
- FunctionDeclaration
- Schema
- Type
- GoogleSearchRetrieval
- DynamicRetrievalConfig
- Mode
- CodeExecution
- ToolConfig
- FunctionCallingConfig
- Mode
- UsageMetadata
- Methods
Resource: CachedContent
Content that has been preprocessed and can be used in subsequent request to GenerativeService.
Cached content can be only used with model it was created for.
Optional. Input only. Immutable. The content to cache.
Optional. Input only. Immutable. A list of Tools
the model may use to generate the next response
Output only. Creation time of the cache entry.
A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z"
and "2014-10-02T15:01:23.045123456Z"
.
Output only. When the cache entry was last updated in UTC time.
A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z"
and "2014-10-02T15:01:23.045123456Z"
.
Output only. Metadata on the usage of the cached content.
expiration
Union type
expiration
can be only one of the following:Timestamp in UTC of when this resource is considered expired. This is always provided on output, regardless of what was sent on input.
A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z"
and "2014-10-02T15:01:23.045123456Z"
.
Input only. New TTL for this resource, input only.
A duration in seconds with up to nine fractional digits, ending with 's
'. Example: "3.5s"
.
name
string
Optional. Identifier. The resource name referring to the cached content. Format: cachedContents/{id}
displayName
string
Optional. Immutable. The user-generated meaningful display name of the cached content. Maximum 128 Unicode characters.
model
string
Required. Immutable. The name of the Model
to use for cached content Format: models/{model}
Optional. Input only. Immutable. Developer set system instruction. Currently text only.
Optional. Input only. Immutable. Tool config. This config is shared for all tools.
JSON representation |
---|
{ "contents": [ { object ( |
Content
The base structured datatype containing multi-part content of a message.
A Content
includes a role
field designating the producer of the Content
and a parts
field containing multi-part data that contains the content of the message turn.
Ordered Parts
that constitute a single message. Parts may have different MIME types.
role
string
Optional. The producer of the content. Must be either 'user' or 'model'.
Useful to set for multi-turn conversations, otherwise can be left blank or unset.
JSON representation |
---|
{
"parts": [
{
object ( |
Part
A datatype containing media that is part of a multi-part Content
message.
A Part
consists of data which has an associated datatype. A Part
can only contain one of the accepted types in Part.data
.
A Part
must have a fixed IANA MIME type identifying the type and subtype of the media if the inlineData
field is filled with raw bytes.
data
Union type
data
can be only one of the following:text
string
Inline text.
Inline media bytes.
A predicted FunctionCall
returned from the model that contains a string representing the FunctionDeclaration.name
with the arguments and their values.
The result output of a FunctionCall
that contains a string representing the FunctionDeclaration.name
and a structured JSON object containing any output from the function is used as context to the model.
URI based data.
Code generated by the model that is meant to be executed.
Result of executing the ExecutableCode
.
JSON representation |
---|
{ // data "text": string, "inlineData": { object ( |
Blob
Raw media bytes.
Text should not be sent as raw bytes, use the 'text' field.
mimeType
string
The IANA standard MIME type of the source data. Examples: - image/png - image/jpeg If an unsupported MIME type is provided, an error will be returned. For a complete list of supported types, see Supported file formats.
Raw bytes for media formats.
A base64-encoded string.
JSON representation |
---|
{ "mimeType": string, "data": string } |
FunctionCall
A predicted FunctionCall
returned from the model that contains a string representing the FunctionDeclaration.name
with the arguments and their values.
name
string
Required. The name of the function to call. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 63.
Optional. The function parameters and values in JSON object format.
JSON representation |
---|
{ "name": string, "args": { object } } |
FunctionResponse
The result output from a FunctionCall
that contains a string representing the FunctionDeclaration.name
and a structured JSON object containing any output from the function is used as context to the model. This should contain the result of aFunctionCall
made based on model prediction.
name
string
Required. The name of the function to call. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 63.
Required. The function response in JSON object format.
JSON representation |
---|
{ "name": string, "response": { object } } |
FileData
URI based data.
mimeType
string
Optional. The IANA standard MIME type of the source data.
fileUri
string
Required. URI.
JSON representation |
---|
{ "mimeType": string, "fileUri": string } |
ExecutableCode
Code generated by the model that is meant to be executed, and the result returned to the model.
Only generated when using the CodeExecution
tool, in which the code will be automatically executed, and a corresponding CodeExecutionResult
will also be generated.
Required. Programming language of the code
.
code
string
Required. The code to be executed.
JSON representation |
---|
{
"language": enum ( |
Language
Supported programming languages for the generated code.
Enums | |
---|---|
LANGUAGE_UNSPECIFIED |
Unspecified language. This value should not be used. |
PYTHON |
Python >= 3.10, with numpy and simpy available. |
CodeExecutionResult
Result of executing the ExecutableCode
.
Only generated when using the CodeExecution
, and always follows a part
containing the ExecutableCode
.
Required. Outcome of the code execution.
output
string
Optional. Contains stdout when code execution is successful, stderr or other description otherwise.
JSON representation |
---|
{
"outcome": enum ( |
Outcome
Enumeration of possible outcomes of the code execution.
Enums | |
---|---|
OUTCOME_UNSPECIFIED |
Unspecified status. This value should not be used. |
OUTCOME_OK |
Code execution completed successfully. |
OUTCOME_FAILED |
Code execution finished but with a failure. stderr should contain the reason. |
OUTCOME_DEADLINE_EXCEEDED |
Code execution ran for too long, and was cancelled. There may or may not be a partial output present. |
Tool
Tool details that the model may use to generate response.
A Tool
is a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of knowledge and scope of the model.
Optional. A list of FunctionDeclarations
available to the model that can be used for function calling.
The model or system does not execute the function. Instead the defined function may be returned as a FunctionCall
with arguments to the client side for execution. The model may decide to call a subset of these functions by populating FunctionCall
in the response. The next conversation turn may contain a FunctionResponse
with the Content.role
"function" generation context for the next model turn.
Optional. Retrieval tool that is powered by Google search.
Optional. Enables the model to execute code as part of generation.
JSON representation |
---|
{ "functionDeclarations": [ { object ( |
FunctionDeclaration
Structured representation of a function declaration as defined by the OpenAPI 3.03 specification. Included in this declaration are the function name and parameters. This FunctionDeclaration is a representation of a block of code that can be used as a Tool
by the model and executed by the client.
name
string
Required. The name of the function. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 63.
description
string
Required. A brief description of the function.
Optional. Describes the parameters to this function. Reflects the Open API 3.03 Parameter Object string Key: the name of the parameter. Parameter names are case sensitive. Schema Value: the Schema defining the type used for the parameter.
JSON representation |
---|
{
"name": string,
"description": string,
"parameters": {
object ( |
Schema
The Schema
object allows the definition of input and output data types. These types can be objects, but also primitives and arrays. Represents a select subset of an OpenAPI 3.0 schema object.
Required. Data type.
format
string
Optional. The format of the data. This is used only for primitive datatypes. Supported formats: for NUMBER type: float, double for INTEGER type: int32, int64 for STRING type: enum
description
string
Optional. A brief description of the parameter. This could contain examples of use. Parameter description may be formatted as Markdown.
nullable
boolean
Optional. Indicates if the value may be null.
enum[]
string
Optional. Possible values of the element of Type.STRING with enum format. For example we can define an Enum Direction as : {type:STRING, format:enum, enum:["EAST", NORTH", "SOUTH", "WEST"]}
Optional. Maximum number of the elements for Type.ARRAY.
Optional. Minimum number of the elements for Type.ARRAY.
Optional. Properties of Type.OBJECT.
An object containing a list of "key": value
pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }
.
required[]
string
Optional. Required properties of Type.OBJECT.
Optional. Schema of the elements of Type.ARRAY.
Type
Type contains the list of OpenAPI data types as defined by https://spec.openapis.org/oas/v3.0.3#data-types
Enums | |
---|---|
TYPE_UNSPECIFIED |
Not specified, should not be used. |
STRING |
String type. |
NUMBER |
Number type. |
INTEGER |
Integer type. |
BOOLEAN |
Boolean type. |
ARRAY |
Array type. |
OBJECT |
Object type. |
GoogleSearchRetrieval
Tool to retrieve public web data for grounding, powered by Google.
Specifies the dynamic retrieval configuration for the given source.
JSON representation |
---|
{
"dynamicRetrievalConfig": {
object ( |
DynamicRetrievalConfig
Describes the options to customize dynamic retrieval.
The mode of the predictor to be used in dynamic retrieval.
dynamicThreshold
number
The threshold to be used in dynamic retrieval. If not set, a system default value is used.
JSON representation |
---|
{
"mode": enum ( |
Mode
The mode of the predictor to be used in dynamic retrieval.
Enums | |
---|---|
MODE_UNSPECIFIED |
Always trigger retrieval. |
MODE_DYNAMIC |
Run retrieval only when system decides it is necessary. |
CodeExecution
This type has no fields.
Tool that executes code generated by the model, and automatically returns the result to the model.
See also ExecutableCode
and CodeExecutionResult
which are only generated when using this tool.
ToolConfig
The Tool configuration containing parameters for specifying Tool
use in the request.
Optional. Function calling config.
JSON representation |
---|
{
"functionCallingConfig": {
object ( |
FunctionCallingConfig
Configuration for specifying function calling behavior.
Optional. Specifies the mode in which function calling should execute. If unspecified, the default value will be set to AUTO.
allowedFunctionNames[]
string
Optional. A set of function names that, when provided, limits the functions the model will call.
This should only be set when the Mode is ANY. Function names should match [FunctionDeclaration.name]. With mode set to ANY, model will predict a function call from the set of function names provided.
JSON representation |
---|
{
"mode": enum ( |
Mode
Defines the execution behavior for function calling by defining the execution mode.
Enums | |
---|---|
MODE_UNSPECIFIED |
Unspecified function calling mode. This value should not be used. |
AUTO |
Default model behavior, model decides to predict either a function call or a natural language response. |
ANY |
Model is constrained to always predicting a function call only. If "allowedFunctionNames" are set, the predicted function call will be limited to any one of "allowedFunctionNames", else the predicted function call will be any one of the provided "functionDeclarations". |
NONE |
Model will not predict any function call. Model behavior is same as when not passing any function declarations. |
UsageMetadata
Metadata on the usage of the cached content.
totalTokenCount
integer
Total number of tokens that the cached content consumes.
JSON representation |
---|
{ "totalTokenCount": integer } |