Generate memories

Memory Bank lets you construct long-term memories from conversations between the user and your agent. This page describes how memory generation works, how you can customize how memories are extracted, and how to trigger memory generation.

To complete the steps demonstrated in this guide, you must first follow the steps in Set up for Memory Bank.

Understanding memory generation

Memory Bank extracts memories from source data and self-curates memories for a specific collection of memories (defined by a scope) by adding, updating, and removing memories over time.

When you trigger memory generation, Memory Bank performs the following operations:

Extraction: Extracts information about the user from their conversations with the agent. Only information that matches at least one of your instance's memory topics will be persisted.
Consolidation: Identifies if existing memories with the same scope should be deleted or updated based on the extracted information. Memory Bank checks that new memories are not duplicative or contradictory before merging them with existing memories. If existing memories don't overlap with the new information, a new memory will be created.

Memory topics

"Memory topics" identify what information Memory Bank considers to be meaningful and should thus be persisted as generated memories. Memory Bank supports two types of memory topics:

Managed topics: Label and instructions are defined by Memory Bank. You only need to provide the name of the managed topic. For example:

Dictionary

memory_topic = {
    "managed_memory_topic": {
        "managed_topic_enum": "USER_PERSONAL_INFO"
    }
}

Class-based

from vertexai.types import ManagedTopicEnum
from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
from vertexai.types import MemoryBankCustomizationConfigMemoryTopicManagedMemoryTopic as ManagedMemoryTopic

memory_topic = MemoryTopic(
    managed_memory_topic=ManagedMemoryTopic(
        managed_topic_enum=ManagedTopicEnum.USER_PERSONAL_INFO
    )
)

Custom topics: Label and instructions are defined by you when setting up your Memory Bank instance. They will be used in the prompt for Memory Bank's extraction step. For example:

Dictionary

memory_topic = {
    "custom_memory_topic": {
        "label": "business_feedback",
        "description": """Specific user feedback about their experience at
the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
staff friendliness, service speed, cleanliness, and any suggestions for
improvement."""
    }
}

Class-based

from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
from vertexai.types import MemoryBankCustomizationConfigMemoryTopicCustomMemoryTopic as CustomMemoryTopic

memory_topic = MemoryTopic(
    custom_memory_topic=CustomMemoryTopic(
        label="business_feedback",
        description="""Specific user feedback about their experience at
the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
staff friendliness, service speed, cleanliness, and any suggestions for
improvement."""
    )
)

When using custom topics, it's recommended to also provide few-shot examples to demonstrate how memories should be extracted from your conversation.

By default, Memory Bank persists all of the following managed topics:

Personal information (USER_PERSONAL_INFO): Significant personal information about the user, like names, relationships, hobbies, and important dates. For example, "I work at Google" or "My wedding anniversary is on December 31".
User preferences (USER_PREFERENCES): Stated or implied likes, dislikes, preferred styles, or patterns. For example, "I prefer the middle seat."
Key conversation events and task outcomes (KEY_CONVERSATION_DETAILS): Important milestones or conclusions within the dialogue. For example, "I booked plane tickets for a round trip between JFK and SFO. I leave on June 1, 2025 and return on June 7, 2025."
Explicit remember / forget instructions (EXPLICIT_INSTRUCTIONS): Information that the user explicitly asks the agent to remember or forget. For example, if the user says "Remember that I primarily use Python," Memory Bank generates a memory such as "I primarily use Python."

This is equivalent to using the following set of managed memory topics:

Dictionary

  memory_topics = [
      {"managed_memory_topic": {"managed_topic_enum": "USER_PERSONAL_INFO"}},
      {"managed_memory_topic": {"managed_topic_enum": "USER_PREFERENCES"}},
      {"managed_memory_topic": {"managed_topic_enum": "KEY_CONVERSATION_DETAILS"}},
      {"managed_memory_topic": {"managed_topic_enum": "EXPLICIT_INSTRUCTIONS"}},
  ]

Class-based

from vertexai.types import ManagedTopicEnum
from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
from vertexai.types import MemoryBankCustomizationConfigMemoryTopicManagedMemoryTopic as ManagedMemoryTopic

memory_topics = [
  MemoryTopic(
      managed_memory_topic=ManagedMemoryTopic(
          managed_topic_enum=ManagedTopicEnum.USER_PERSONAL_INFO)),
  MemoryTopic(
      managed_memory_topic=ManagedMemoryTopic(
          managed_topic_enum=ManagedTopicEnum.USER_PREFERENCES)),
  MemoryTopic(
      managed_memory_topic=ManagedMemoryTopic(
          managed_topic_enum=ManagedTopicEnum.KEY_CONVERSATION_DETAILS)),
  MemoryTopic(
      managed_memory_topic=ManagedMemoryTopic(
          managed_topic_enum=ManagedTopicEnum.EXPLICIT_INSTRUCTIONS)),
]

If you want to customize what topics Memory Bank persists, set the memory topics in your customization configuration when setting up Memory Bank.

Triggering memory generation

You can trigger memory generation using GenerateMemories at the end of a session or at regular intervals within a session. Memory generation extracts key context from source conversations and combines it with existing memories for the same scope. For example, you can create session-level memories by using a scope such as {"user_id": "123", "session_id": "456"}. Memories with the same scope can be consolidated and retrieved together.

GenerateMemories is a long-running operation. Once the operation is done, the AgentEngineGenerateMemoriesOperation contains a list of generated memories, if any are generated:

AgentEngineGenerateMemoriesOperation(
  name="projects/.../locations/.../reasoningEngines/.../operations/...",
  done=True,
  response=GenerateMemoriesResponse(
    generatedMemories=[
      GenerateMemoriesResponseGeneratedMemory(
        memory=Memory(
          "name": "projects/.../locations/.../reasoningEngines/.../memories/..."
        ),
        action="CREATED",
      ),
      GenerateMemoriesResponseGeneratedMemory(
        memory=Memory(
          "name": "projects/.../locations/.../reasoningEngines/.../memories/..."
        ),
        action="UPDATED",
      ),
      GenerateMemoriesResponseGeneratedMemory(
        memory=Memory(
          "name": "projects/.../locations/.../reasoningEngines/.../memories/..."
        ),
        action="DELETED",
      ),
    ]
  )
)

Each generated memory includes the action that was performed on that memory:

CREATED: Indicates that a new memory was added, representing a novel concept that wasn't captured by existing memories.
UPDATED: Indicates that an existing memory was updated, which happens if the memory covered similar concepts as the newly extracted information. The memory's fact may be updated with new information or remain the same.
DELETED: Indicates that the existing memory was deleted, because its information was contradictory to new information extracted from the conversation.

For CREATED or UPDATED memories, you can use GetMemories to retrieve the full content of the memory. Retrieving DELETED memories results in a 404 error.

Generating memories in the background

GenerateMemories is a long-running operation. By default, client.agent_engines.generate_memories is a blocking function that polls the operation until the operation completes. Executing memory generation as a blocking operation is helpful when you want to manually inspect generated memories or notify end users about what memories were generated.

However, for production agents, you generally want to run memory generation in the background as an asynchronous process. In most cases, the client doesn't need to use the output for the current run, so it's unnecessary to incur additional latency waiting for a response. If you want memory generation to execute in the background, set wait_for_completion to False:

client.agent_engines.memories.generate(
    ...,
    config={
        "wait_for_completion": False
    }
)

Data sources

There are multiple way to provide source data for memory generation:

Provide events directly in the payload.
Provide events using Vertex AI Agent Engine Sessions.
Provide pre-extracted facts to consolidate them with existing memories for the same scope.

When you provide events directly in the payload or use Vertex AI Agent Engine Sessions, information is extracted from the conversation and consolidated with existing memories. If you only want to extract information from these data sources, you can disable consolidation:

client.agent_engines.memories.generate(
    ...
    config={
        "disable_consolidation": True
    }
)

Using events in payload as the data source

Use direct_contents_source when you want to generate memories using events provided directly in the payload. Meaningful information is extracted from these events and consolidated with existing information for the same scope. This approach can be used if you're using a different session storage from Vertex AI Agent Engine Sessions.

Dictionary

The events should include Content dictionaries.

events =  [
  {
    "content": {
      "role": "user",
      "parts": [
        {"text": "I work with LLM agents!"}
      ]
    }
  }
]

client.agent_engines.memories.generate(
    name=agent_engine.api_resource.name,
    direct_contents_source={
      "events": EVENTS
    },
    # For example, `scope={"user_id": "123"}`.
    scope=SCOPE,
    config={
        "wait_for_completion": True
    }
)

Replace the following:

SCOPE: A dictionary, representing the scope of the generated memories. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation.

Class-based

The events should include Content objects.

from google import genai
import vertexai

events = [
  vertexai.types.GenerateMemoriesRequestDirectContentsSourceEvent(
    content=genai.types.Content(
      role="user",
      parts=[
        genai.types.Part.from_text(text="I work with LLM agents!")
      ]
    )
  )
]

client.agent_engines.memories.generate(
    name=agent_engine.api_resource.name,
    direct_contents_source={
      "events": events
    },
    # For example, `scope={"user_id": "123"}`.
    scope=SCOPE,
    config={
        "wait_for_completion": True
    }
)

Replace the following:

SCOPE: A dictionary, representing the scope of the generated memories. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation.

Using Vertex AI Agent Engine Sessions as the data source

With Agent Engine Sessions, Memory Bank uses session events as the source conversation for memory generation.

To scope the generated memories, Memory Bank extracts and uses the user ID from the session by default. For example, the memories' scope is stored as {"user_id": "123"} if the session's user_id is "123". You can also provide a scope directly, which overrides using the session's user_id as the scope.

Dictionary

client.agent_engines.memories.generate(
  name=agent_engine.api_resource.name,
  vertex_session_source={
      # For example, projects/.../locations/.../reasoningEngines/.../sessions/...
      "session": "SESSION_NAME"
  },
  # Optional when using Agent Engine Sessions. Defaults to {"user_id": session.user_id}.
  scope=SCOPE,
  config={
      "wait_for_completion": True
  }
)

Replace the following:

SESSION_NAME: The fully-qualified session name.
(Optional) SCOPE: A dictionary, representing the scope of the generated memories. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation. If not provided, {"user_id": session.user_id} is used.

Class-based

client.agent_engines.memories.generate(
  name=agent_engine.api_resource.name,
  vertex_session_source=vertexai.types.GenerateMemoriesRequestVertexSessionSource(
      # For example, projects/.../locations/.../reasoningEngines/.../sessions/...
      session="SESSION_NAME"
  ),
  # Optional when using Agent Engine Sessions. Defaults to {"user_id": session.user_id}.
  scope=SCOPE,
  config={
      "wait_for_completion": True
  }
)

Optionally, you can provide a time range indicating which events in the Session should be included. If not provided, all events in the Session are included.

Dictionary

import datetime

client.agent_engines.memories.generate(
  name=agent_engine.api_resource.name,
  vertex_session_source={
      "session": "SESSION_NAME",
      # Extract memories from the last hour of events.
      "start_time": datetime.datetime.now(tz=datetime.timezone.utc) - datetime.timedelta(seconds=24 * 60),
      "end_time": datetime.datetime.now(tz=datetime.timezone.utc)
  },
  scope=SCOPE
)

Class-based

import datetime

client.agent_engines.memories.generate(
  name=agent_engine.api_resource.name,
  vertex_session_source=vertexai.types.GenerateMemoriesRequestVertexSessionSource(
      session="SESSION_NAME",
      # Extract memories from the last hour of events.
      start_time=datetime.datetime.now(tz=datetime.timezone.utc) - datetime.timedelta(seconds=24 * 60),
      end_time=datetime.datetime.now(tz=datetime.timezone.utc)
  ),
  scope=SCOPE
)

Consolidating pre-extracted memories

As an alternative to using Memory Bank's automatic extraction process, you can directly provide pre-extracted memories. Direct source memories will be consolidated with existing memories for the same scope. This can be useful for when you want your agent or a human-in-the-loop to be responsible for extracting memories, but you still want to take advantage of Memory Bank's consolidation to ensure there are no duplicate or contradictory memories.

client.agent_engines.memories.generate(
    name=agent_engine.api_resource.name,
    direct_memories_source={"direct_memories": [{"fact": "FACT"}]},
    scope=SCOPE
)

Replace the following:

FACT: The pre-extracted fact that should be consolidated with existing memories. You can provide up to 5 pre-extracted facts in a list like the following:
```
{"direct_memories": [{"fact": "fact 1"}, {"fact": "fact 2"}]}
```
SCOPE: A dictionary, representing the scope of the generated memories. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation.

Using multimodal input

You can extract memories from multimodal input. However, memories are only extracted from text, inline files, and file data in the source content. All other content, including function calls and responses, are ignored when generating memories.

Memories can be extracted from images, video, and audio provided by the user. If the context provided by the multimodal input is judged by Memory Bank to be meaningful for future interactions, then a textual memory may be created including information extracted from the input. For example, if the user provides an image of a golden retriever with the text "This is my dog" then Memory Bank generates a memory such as "My dog is a golden retriever."

For example, you can provide an image and context for the image in the payload:

Dictionary

with open(file_name, "rb") as f:
    inline_data = f.read()

events =  [
  {
    "content": {
      "role": "user",
      "parts": [
        {"text": "This is my dog"},
        {
          "inline_data": {
            "mime_type": "image/jpeg",
            "data": inline_data
          }
        },
        {
          "file_data": {
            "file_uri": "gs://cloud-samples-data/generative-ai/image/dog.jpg",
            "mime_type": "image/jpeg"
          }
        },
      ]
    }
  }
]

Class-based

from google import genai
import vertexai

with open(file_name, "rb") as f:
    inline_data = f.read()

events = [
  vertexai.types.GenerateMemoriesRequestDirectContentsSourceEvent(
    content=genai.types.Content(
      role="user",
      parts=[
        genai.types.Part.from_text(text="This is my dog"),
        genai.types.Part.from_bytes(
          data=inline_data,
          mime_type="image/jpeg",
        ),
        genai.types.Part.from_uri(
          file_uri="gs://cloud-samples-data/generative-ai/image/dog.jpg",
          mime_type="image/jpeg",
        )
      ]
    )
  )
]

When using Vertex AI Agent Engine Sessions as the data source, the multimodal content is provided directly in the Session's events.

Generate memories

Understanding memory generation

Memory topics

Dictionary

Class-based

Dictionary

Class-based

Dictionary

Class-based

Triggering memory generation

Generating memories in the background

Data sources

Using events in payload as the data source

Dictionary

Class-based

Using Vertex AI Agent Engine Sessions as the data source

Dictionary

Class-based

Dictionary

Class-based

Consolidating pre-extracted memories

Using multimodal input

Dictionary

Class-based

What's next