Generate memories

Memory Bank lets you construct long-term memories from conversations between the user and your agent. This page describes how memory generation works, how to customize which memories are extracted, and how to trigger memory generation.

Understanding memory generation

Memory Bank extracts memories from source data and self-curates the collection of memories for a given scope by adding, updating, and removing memories over time.

When you trigger memory generation, Memory Bank performs the following operations:

  • Extraction: Extracts information about the user from their conversations with the agent. Only information that matches at least one of your instance's memory topics will be persisted.

  • Consolidation: Determines whether existing memories with the same scope should be deleted or updated based on the extracted information. Memory Bank checks that new memories are not duplicative or contradictory before merging them with existing memories. If existing memories don't overlap with the new information, a new memory is created.

Memories can only be extracted from text, inline files, and file data in the source content. All other content, including function calls and responses, is ignored when generating memories.

Memories can be extracted from images, video, and audio provided by the user. If the context provided by the multimodal input is judged by Memory Bank to be meaningful for future interactions, then a textual memory may be created including information extracted from the input. For example, if the user provides an image of a golden retriever with the text "This is my dog" then Memory Bank generates a memory such as "My dog is a golden retriever."
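The content rule above can be pictured with a small client-side filter. This is only an illustrative sketch and not Memory Bank's implementation. The part keys (text, inline_data, file_data, function_call) are assumed to follow the snake_case fields of the Content format, and lookup_breed is a hypothetical function name.

```python
# Part keys this sketch treats as usable for extraction (assumed field names).
EXTRACTABLE_PART_KEYS = {"text", "inline_data", "file_data"}


def extractable_parts(content):
    """Return the parts of a Content dictionary that carry extractable data."""
    return [
        part
        for part in content.get("parts", [])
        if any(key in EXTRACTABLE_PART_KEYS for key in part)
    ]


content = {
    "role": "user",
    "parts": [
        {"text": "This is my dog"},
        {"inline_data": {"mime_type": "image/jpeg", "data": "..."}},
        # Function calls are ignored during memory generation.
        {"function_call": {"name": "lookup_breed", "args": {}}},
    ],
}

kept = extractable_parts(content)
```

Here kept holds the text part and the inline image part, while the function call is dropped.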

Memory topics

"Memory topics" identify what information Memory Bank considers to be meaningful and should thus be persisted as generated memories. Memory Bank supports two types of memory topics:

  • Managed topics: Label and instructions are defined by Memory Bank. You only need to provide the name of the managed topic. For example:

    Dictionary

    memory_topic = {
        "managed_memory_topic": {
            "managed_topic_enum": "USER_PERSONAL_INFO"
        }
    }
    

    Class-based

    from vertexai.types import ManagedTopicEnum
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopicManagedMemoryTopic as ManagedMemoryTopic
    
    memory_topic = MemoryTopic(
        managed_memory_topic=ManagedMemoryTopic(
            managed_topic_enum=ManagedTopicEnum.USER_PERSONAL_INFO
        )
    )
    
  • Custom topics: Label and instructions are defined by you when setting up your Memory Bank instance. They will be used in the prompt for Memory Bank's extraction step. For example:

    Dictionary

    memory_topic = {
        "custom_memory_topic": {
            "label": "business_feedback",
            "description": """Specific user feedback about their experience at
    the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
    staff friendliness, service speed, cleanliness, and any suggestions for
    improvement."""
        }
    }
    

    Class-based

    from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopicCustomMemoryTopic as CustomMemoryTopic
    
    memory_topic = MemoryTopic(
        custom_memory_topic=CustomMemoryTopic(
            label="business_feedback",
            description="""Specific user feedback about their experience at
    the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
    staff friendliness, service speed, cleanliness, and any suggestions for
    improvement."""
        )
    )
    

    When you use custom topics, also provide few-shot examples that demonstrate how memories should be extracted from your conversations.

By default, Memory Bank persists all of the following managed topics:

  • Personal information (USER_PERSONAL_INFO): Significant personal information about the user, like names, relationships, hobbies, and important dates. For example, "I work at Google" or "My wedding anniversary is on December 31".
  • User preferences (USER_PREFERENCES): Stated or implied likes, dislikes, preferred styles, or patterns. For example, "I prefer the middle seat."
  • Key conversation events and task outcomes (KEY_CONVERSATION_DETAILS): Important milestones or conclusions within the dialogue. For example, "I booked plane tickets for a round trip between JFK and SFO. I leave on June 1, 2025 and return on June 7, 2025."
  • Explicit remember / forget instructions (EXPLICIT_INSTRUCTIONS): Information that the user explicitly asks the agent to remember or forget. For example, if the user says "Remember that I primarily use Python," Memory Bank generates a memory such as "I primarily use Python."

This is equivalent to using the following set of managed memory topics:

Dictionary

  memory_topics = [
      {"managed_memory_topic": {"managed_topic_enum": "USER_PERSONAL_INFO"}},
      {"managed_memory_topic": {"managed_topic_enum": "USER_PREFERENCES"}},
      {"managed_memory_topic": {"managed_topic_enum": "KEY_CONVERSATION_DETAILS"}},
      {"managed_memory_topic": {"managed_topic_enum": "EXPLICIT_INSTRUCTIONS"}},
  ]

Class-based

from vertexai.types import ManagedTopicEnum
from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
from vertexai.types import MemoryBankCustomizationConfigMemoryTopicManagedMemoryTopic as ManagedMemoryTopic

memory_topics = [
  MemoryTopic(
      managed_memory_topic=ManagedMemoryTopic(
          managed_topic_enum=ManagedTopicEnum.USER_PERSONAL_INFO)),
  MemoryTopic(
      managed_memory_topic=ManagedMemoryTopic(
          managed_topic_enum=ManagedTopicEnum.USER_PREFERENCES)),
  MemoryTopic(
      managed_memory_topic=ManagedMemoryTopic(
          managed_topic_enum=ManagedTopicEnum.KEY_CONVERSATION_DETAILS)),
  MemoryTopic(
      managed_memory_topic=ManagedMemoryTopic(
          managed_topic_enum=ManagedTopicEnum.EXPLICIT_INSTRUCTIONS)),
]

If you want to customize what topics Memory Bank persists, set the memory topics in your customization configuration when setting up Memory Bank.
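As a minimal sketch of a customized topic list, the helpers below build the dictionary form shown earlier and combine two managed topics with one custom topic. The helper names managed_topic and custom_topic are hypothetical conveniences, not part of the SDK, and the sketch stops short of attaching the list to a customization configuration.

```python
def managed_topic(topic_enum):
    """Build the dictionary form of a managed memory topic."""
    return {"managed_memory_topic": {"managed_topic_enum": topic_enum}}


def custom_topic(label, description):
    """Build the dictionary form of a custom memory topic."""
    return {"custom_memory_topic": {"label": label, "description": description}}


# Persist only preferences, explicit instructions, and coffee shop feedback.
memory_topics = [
    managed_topic("USER_PREFERENCES"),
    managed_topic("EXPLICIT_INSTRUCTIONS"),
    custom_topic(
        "business_feedback",
        "Specific user feedback about their experience at the coffee shop.",
    ),
]
```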

Generating memories

To complete the steps demonstrated in this guide, you must first follow the steps in Set up for Memory Bank.

You can trigger memory generation using GenerateMemories at the end of a session or at regular intervals within a session. Memory generation extracts key context from source conversations and combines it with existing memories for the same scope. For example, you can create session-level memories by using a scope such as {"user_id": "123", "session_id": "456"}. Memories with the same scope can be consolidated and retrieved together.

When calling GenerateMemories, you must provide the source conversation either through Agent Engine Sessions or directly in JSON format:

JSON format

Provide the source conversation directly in JSON format if you're using a different session storage from Agent Engine Sessions:

client.agent_engines.generate_memories(
    name=agent_engine.api_resource.name,
    direct_contents_source={
      "events": EVENTS
    },
    scope=SCOPE,
    config={
        "wait_for_completion": True
    }
)

Replace the following:

  • EVENTS: List of event dictionaries, each wrapping a Content dictionary. For example:

    [
      {
        "content": {
          "role": "user",
          "parts": [
            {"text": "I work with LLM agents!"}
          ]
        }
      }
    ]

  • SCOPE: A dictionary, representing the scope of the generated memories. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation.
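If your conversation history lives in your own storage, you need to convert it into the EVENTS shape before calling GenerateMemories. The following is a minimal sketch that assumes the history is available as (role, text) pairs. The to_events helper is hypothetical, and the "model" role for agent turns is an assumption.

```python
def to_events(turns):
    """Convert (role, text) pairs into the EVENTS list format."""
    return [
        {"content": {"role": role, "parts": [{"text": text}]}}
        for role, text in turns
    ]


turns = [
    ("user", "I work with LLM agents!"),
    ("model", "That sounds interesting. Which frameworks do you use?"),
]

# Pass this list as direct_contents_source={"events": events}.
events = to_events(turns)
```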

Agent Engine Sessions

With Agent Engine Sessions, Memory Bank uses session events as the source conversation for memory generation.

To scope the generated memories, Memory Bank extracts and uses the user ID from the session by default. For example, the memories' scope is stored as {"user_id": "123"} if the session's user_id is "123". You can also provide a scope directly, which overrides using the session's user_id as the scope.

client.agent_engines.generate_memories(
  name=agent_engine.api_resource.name,
  vertex_session_source={
    "session": "SESSION_NAME"
  },
  # Optional when using Agent Engine Sessions. Defaults to {"user_id": session.user_id}.
  scope=SCOPE,
  config={
      "wait_for_completion": True
  }
)

Replace the following:

  • SESSION_NAME: The session name.

  • (Optional) SCOPE: A dictionary, representing the scope of the generated memories. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation. If not provided, {"user_id": session.user_id} is used.
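The scope default described above can be summarized in a few lines. This sketch only illustrates the stated rule. It is not service code, and resolve_scope is a hypothetical name.

```python
def resolve_scope(session_user_id, scope=None):
    """Return the explicit scope if given, else the session's user ID scope."""
    if scope is not None:
        return scope
    return {"user_id": session_user_id}


default_scope = resolve_scope("123")
explicit_scope = resolve_scope("123", {"session_id": "MY_SESSION"})
```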

GenerateMemories is a long-running operation. Once the operation is done, the AgentEngineGenerateMemoriesOperation will contain a list of generated memories, if any are generated:

AgentEngineGenerateMemoriesOperation(
  name="projects/.../locations/.../reasoningEngines/.../operations/...",
  done=True,
  response=GenerateMemoriesResponse(
    generatedMemories=[
      GenerateMemoriesResponseGeneratedMemory(
        memory=Memory(
          name="projects/.../locations/.../reasoningEngines/.../memories/..."
        ),
        action="CREATED",
      ),
      GenerateMemoriesResponseGeneratedMemory(
        memory=Memory(
          name="projects/.../locations/.../reasoningEngines/.../memories/..."
        ),
        action="UPDATED",
      ),
      GenerateMemoriesResponseGeneratedMemory(
        memory=Memory(
          name="projects/.../locations/.../reasoningEngines/.../memories/..."
        ),
        action="DELETED",
      ),
    ]
  )
)

Each generated memory includes the action that was performed on that memory:

  • CREATED: Indicates that a new memory was added, representing a novel concept that wasn't captured by existing memories.
  • UPDATED: Indicates that an existing memory was updated, which happens if the memory covered similar concepts as the newly extracted information. The memory's fact may be updated with new information or remain the same.
  • DELETED: Indicates that the existing memory was deleted, because its information was contradictory to new information extracted from the conversation.

For CREATED or UPDATED memories, you can use GetMemories to retrieve the full content of the memory. Retrieving DELETED memories results in a 404 error.
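One way to act on these results is to group memory names by action and fetch only the ones that still exist. The sketch below uses SimpleNamespace stand-ins that mirror the shape shown above (a memory with a name, plus an action). The short memory names and the plain string actions are placeholders, and the attribute you read the list from on a real operation object may differ.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_names_by_action(generated_memories):
    """Map each action to the names of the memories it was applied to."""
    grouped = defaultdict(list)
    for item in generated_memories:
        grouped[item.action].append(item.memory.name)
    return dict(grouped)


# Stand-ins for the generated memories in a finished operation's response.
generated = [
    SimpleNamespace(memory=SimpleNamespace(name="memories/a"), action="CREATED"),
    SimpleNamespace(memory=SimpleNamespace(name="memories/b"), action="UPDATED"),
    SimpleNamespace(memory=SimpleNamespace(name="memories/c"), action="DELETED"),
]

grouped = group_names_by_action(generated)

# Only CREATED and UPDATED memories can still be retrieved.
retrievable = grouped.get("CREATED", []) + grouped.get("UPDATED", [])
```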

Generating memories in the background

GenerateMemories is a long-running operation. By default, client.agent_engines.generate_memories is a blocking function and will continue polling the operation until it's done. Executing memory generation as a blocking operation is helpful when you want to manually inspect generated memories or when you want to notify end users about what memories were generated.

However, for production agents, you generally want to run memory generation in the background as an asynchronous process. In most cases, the client doesn't need to use the output for the current run, so it's unnecessary to incur additional latency waiting for a response. If you want memory generation to execute in the background, set wait_for_completion to False:

client.agent_engines.generate_memories(
    ...,
    config={
        "wait_for_completion": False
    }
)
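To keep the background setting in one place, you might wrap the call in a small helper. The sketch below reuses the call shape shown earlier on this page. The generate_in_background helper is hypothetical, and StubClient is a stand-in that only records calls so that the example runs without credentials.

```python
class _StubAgentEngines:
    """Stand-in for client.agent_engines that records keyword arguments."""

    def __init__(self):
        self.calls = []

    def generate_memories(self, **kwargs):
        self.calls.append(kwargs)


class StubClient:
    """Stand-in for the real client. Replace with your configured client."""

    def __init__(self):
        self.agent_engines = _StubAgentEngines()


def generate_in_background(client, agent_engine_name, events, scope):
    """Trigger memory generation without waiting for the operation to finish."""
    return client.agent_engines.generate_memories(
        name=agent_engine_name,
        direct_contents_source={"events": events},
        scope=scope,
        config={"wait_for_completion": False},
    )


client = StubClient()
events = [{"content": {"role": "user", "parts": [{"text": "I prefer the middle seat."}]}}]
generate_in_background(client, "AGENT_ENGINE_NAME", events, {"user_id": "123"})
```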

Consolidating pre-extracted memories

As an alternative to Memory Bank's automatic extraction process, you can directly provide pre-extracted memories. Direct source memories are consolidated with existing memories for the same scope. This is useful when you want your agent or a human-in-the-loop to be responsible for extracting memories, but you still want to use Memory Bank's consolidation to ensure there are no duplicate or contradictory memories.

client.agent_engines.generate_memories(
    name=agent_engine.api_resource.name,
    direct_memories_source={"direct_memories": [{"fact": "FACT"}]},
    scope=SCOPE
)

Replace the following:

  • FACT: The pre-extracted fact that should be consolidated with existing memories. You can provide up to 5 pre-extracted facts in a list like the following:

    {"direct_memories": [{"fact": "fact 1"}, {"fact": "fact 2"}]}
    
  • SCOPE: A dictionary, representing the scope of the generated memories. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation.
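If you have more than 5 pre-extracted facts, one option is to split them across several requests. This is a minimal sketch under that assumption. The batch_direct_memories helper is hypothetical, and each resulting dictionary is meant to be passed as direct_memories_source in its own call.

```python
MAX_DIRECT_MEMORIES = 5  # Limit on pre-extracted facts stated above.


def batch_direct_memories(facts, batch_size=MAX_DIRECT_MEMORIES):
    """Split facts into direct_memories_source payloads of limited size."""
    return [
        {"direct_memories": [{"fact": fact} for fact in facts[start:start + batch_size]]}
        for start in range(0, len(facts), batch_size)
    ]


facts = [f"fact {n}" for n in range(1, 8)]
sources = batch_direct_memories(facts)
```

With seven facts, this produces two payloads: one with five facts and one with two.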

What's next