Source: src/core/memory.rs

Memory

Long-term memory across sessions: the MemoryService trait, the in-memory backend, and the load_memory and preload_memory tools.

Sessions remember one conversation; memory remembers across them. A MemoryService ingests completed sessions into a long-term store and answers free-text queries with snippets — either on demand through the load_memory tool, or automatically via preload_memory at the start of every turn.

MemoryEntry and SearchMemoryResponse

The data model in adk_rs::core::memory is deliberately small. A MemoryEntry carries the recalled snippet as a Content plus optional provenance mirrored from the original event:

| Field | Type | Meaning |
| --- | --- | --- |
| `content` | `Content` | The memory content, typically a text part. |
| `author` | `Option<String>` | Author of the original event (`"user"` or an agent name). |
| `timestamp` | `Option<f64>` | Original event timestamp in seconds. |

Searches return a SearchMemoryResponse { memories: Vec<MemoryEntry> } envelope.

The MemoryService trait

adk_rs::core::MemoryService has exactly two methods. Ingestion is explicit — nothing is written to memory automatically; you (or the server endpoint) decide when a session is worth remembering.

async fn add_session_to_memory(&self, session: &Session) -> Result<()>: Index a session’s events into long-term memory.
async fn search_memory(&self, app_name: &str, user_id: &str, query: &str) -> Result<SearchMemoryResponse>: Search the (app, user) store for entries matching query.

InMemoryMemoryService

The simplest bundled backend, adk_rs::services::mem::InMemoryMemoryService, keeps one bucket per (app_name, user_id). add_session_to_memory walks the session’s events and stores one MemoryEntry per event with non-empty text content, preserving the author and timestamp. search_memory is a case-insensitive substring match over each entry’s text — good enough for tests and quickstarts. For semantic recall, use VectorMemoryService below.
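
The behaviour described above can be sketched with the standard library alone. This is an illustration, not the library's implementation: `SubstringMemory`, `Entry`, and `Event` are hypothetical stand-ins (plain text instead of `Content`, synchronous methods instead of the async trait), and the sketch assumes an empty string is what counts as "empty text".

```rust
use std::collections::HashMap;

/// Simplified stand-in for `MemoryEntry` (plain text instead of `Content`).
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    text: String,
    author: Option<String>,
    timestamp: Option<f64>,
}

/// Hypothetical event shape: only the fields this sketch needs.
struct Event {
    text: String,
    author: String,
    timestamp: f64,
}

#[derive(Default)]
struct SubstringMemory {
    // One bucket of entries per (app_name, user_id).
    buckets: HashMap<(String, String), Vec<Entry>>,
}

impl SubstringMemory {
    /// Store one entry per event with non-empty text, keeping author and timestamp.
    fn add_session(&mut self, app: &str, user: &str, events: &[Event]) {
        let bucket = self
            .buckets
            .entry((app.to_string(), user.to_string()))
            .or_default();
        for event in events {
            if event.text.is_empty() {
                continue;
            }
            bucket.push(Entry {
                text: event.text.clone(),
                author: Some(event.author.clone()),
                timestamp: Some(event.timestamp),
            });
        }
    }

    /// Case-insensitive substring match within a single (app, user) bucket.
    fn search(&self, app: &str, user: &str, query: &str) -> Vec<Entry> {
        let needle = query.to_lowercase();
        self.buckets
            .get(&(app.to_string(), user.to_string()))
            .map(|bucket| {
                bucket
                    .iter()
                    .filter(|entry| entry.text.to_lowercase().contains(&needle))
                    .cloned()
                    .collect()
            })
            .unwrap_or_default()
    }
}

fn main() {
    let mut memory = SubstringMemory::default();
    memory.add_session(
        "hotel",
        "alice",
        &[
            Event { text: "I prefer rooms on HIGH floors.".into(), author: "user".into(), timestamp: 1.0 },
            Event { text: String::new(), author: "concierge".into(), timestamp: 2.0 },
        ],
    );
    let hits = memory.search("hotel", "alice", "high floors");
    for hit in &hits {
        println!("{:?} at {:?}: {}", hit.author, hit.timestamp, hit.text);
    }
}
```

Because matching is purely lexical, a query such as "upper storeys" would miss the entry above, which is the gap the vector backend is meant to close.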

VectorMemoryService (semantic search)

adk_rs::services::mem::VectorMemoryService swaps substring matching for embedding-based retrieval. Entries are embedded once at ingest time through the Embedder trait; each search embeds the query and ranks entries by cosine similarity, returning the top k above an optional similarity floor. Storage is still process-local — the upgrade is retrieval quality, not durability.

trait Embedder { async fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>> }: Batch text → vector. Implementations ship with the provider features: GeminiEmbedder (gemini) and OpenAiEmbedder (openai); implement it yourself to bridge any other backend. Exported from adk_rs::core.
VectorMemoryService::new(embedder: Arc<dyn Embedder>) -> Self: Construct with defaults: top 5 results, no similarity floor.
with_top_k(self, k: usize) -> Self: Maximum results per search.
with_min_score(self, score: f32) -> Self: Minimum cosine similarity (in [-1, 1]) for an entry to be returned.
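
The ranking step behind `with_top_k` and `with_min_score` can be sketched as follows. This is a minimal illustration over precomputed vectors, not the library's code: the `cosine` and `rank` helpers are hypothetical, and the sketch assumes the floor is inclusive (a score equal to the minimum is kept).

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score every stored vector against the query, drop scores below `min_score`
/// (when a floor is set), sort by descending similarity, and keep at most `top_k`.
/// Returns (index into `entries`, score) pairs.
fn rank(
    query: &[f32],
    entries: &[Vec<f32>],
    top_k: usize,
    min_score: Option<f32>,
) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = entries
        .iter()
        .enumerate()
        .map(|(index, vector)| (index, cosine(query, vector)))
        .filter(|(_, score)| min_score.map_or(true, |floor| *score >= floor))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}

fn main() {
    let entries = vec![
        vec![1.0, 0.0],  // same direction as the query
        vec![0.0, 1.0],  // orthogonal to the query
        vec![-1.0, 0.0], // opposite direction
        vec![1.0, 1.0],  // 45 degrees away from the query
    ];
    let hits = rank(&[1.0, 0.0], &entries, 5, Some(0.3));
    for (index, score) in &hits {
        println!("entry {index}: {score:.3}");
    }
}
```

With a floor of 0.3, the orthogonal and opposite vectors are filtered out before the top-k cut, so a search can return fewer than `k` results.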

Semantic memory with a Gemini embedder (Rust)

use adk_rs::providers::gemini::GeminiEmbedder;
use adk_rs::runner::Runner;
use adk_rs::services::mem::VectorMemoryService;
use std::sync::Arc;

// Keep the default top 5 results, but require a cosine similarity of at least 0.3.
let memory = VectorMemoryService::new(
    Arc::new(GeminiEmbedder::from_env("gemini-embedding-001")?),
)
.with_top_k(5)
.with_min_score(0.3);

// Drop-in replacement for InMemoryMemoryService.
// `agent` and `sessions` are assumed to be constructed elsewhere.
let runner = Runner::builder()
    .app_name("hotel")
    .agent(agent)
    .session_service(sessions)
    .memory_service(Arc::new(memory))
    .build()?;

The load_memory tool (active recall)

adk_rs::tools::load_memory_tool() returns a tool that the model calls when it decides it needs prior context. The tool declares a single required string parameter, query, runs search_memory for the current (app, user), and returns { "memories": [...] }. If the runner has no memory service, the call fails with a config error.

The preload_memory tool (passive recall)

adk_rs::tools::preload_memory_tool(max_entries) is a passive tool: its declaration() returns None, so it is never advertised to the model and the model cannot call it. Instead it implements process_llm_request, which runs at the start of each turn: it queries memory using the invocation’s user content as the search text and, when there are hits, appends a bullet list headed "Relevant prior context:" (capped at max_entries) to the request’s system text. It silently does nothing when no memory service is configured, the user content is empty, or the search returns no entries.

| Tool | Trigger | Effect |
| --- | --- | --- |
| `load_memory` | Model issues a function call with a `query`. | Returns matching `MemoryEntry` values as the tool result. |
| `preload_memory` | Every turn, before the LLM call. | Inlines up to `max_entries` matching snippets into the system prompt. |
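
The passive path can be sketched as a plain function. This is an illustration under stated assumptions, not the library's code: `preload_memory` here is a hypothetical free function, the `search` closure stands in for `search_memory`, and the exact separator and bullet formatting are guesses. Only the "Relevant prior context:" header, the `max_entries` cap, and the three no-op cases come from the description above.

```rust
/// Append a "Relevant prior context:" bullet list to `system_text`.
/// `search` is a hypothetical stand-in for the configured memory service.
fn preload_memory(
    system_text: &mut String,
    user_text: &str,
    search: Option<&dyn Fn(&str) -> Vec<String>>,
    max_entries: usize,
) {
    // No memory service configured: do nothing.
    let Some(search) = search else { return };
    // Empty user content: there is nothing to search for.
    if user_text.is_empty() {
        return;
    }
    // The user's message is used verbatim as the query.
    let hits = search(user_text);
    // No hits: do nothing. (Skipping on a zero cap is an assumption of this sketch.)
    if hits.is_empty() || max_entries == 0 {
        return;
    }
    if !system_text.is_empty() {
        system_text.push_str("\n\n");
    }
    system_text.push_str("Relevant prior context:\n");
    for snippet in hits.iter().take(max_entries) {
        system_text.push_str("- ");
        system_text.push_str(snippet);
        system_text.push('\n');
    }
}

fn main() {
    let search = |_query: &str| -> Vec<String> {
        vec![
            "I prefer rooms on high floors.".to_string(),
            "I am allergic to feather pillows.".to_string(),
            "I usually arrive late.".to_string(),
        ]
    };
    let mut system_text = String::from("You are a hotel concierge.");
    preload_memory(&mut system_text, "Book me a room like last time.", Some(&search), 2);
    println!("{system_text}");
}
```

Because the whole user message is the query, retrieval quality depends on the backend: a substring backend only hits when the message text itself appears inside a stored entry, while an embedding backend can match paraphrases.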

Ingesting via the HTTP server

With the server feature, PATCH /apps/:app/users/:user/memory with body { "sessionId": "..." } loads the named session and passes it to add_session_to_memory. It returns 400 when no memory service is configured and 404 when the session does not exist. See Server.
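
A client-side view of this endpoint can be sketched with the standard library. The helpers below are hypothetical and only build the request path and body and label the two documented status codes; sending the request is left to whatever HTTP client you use. The sketch assumes the app and user identifiers need no percent-encoding, and it treats any other status as opaque because this page does not document the success response.

```rust
/// Build the path and JSON body for the memory-ingest request.
fn ingest_request(app: &str, user: &str, session_id: &str) -> (String, String) {
    let path = format!("/apps/{app}/users/{user}/memory");
    // Minimal JSON string escaping for the session id (backslash and quote only).
    let escaped = session_id.replace('\\', "\\\\").replace('"', "\\\"");
    let body = format!("{{\"sessionId\":\"{escaped}\"}}");
    (path, body)
}

/// The two failure cases documented for this endpoint; other codes pass through.
#[derive(Debug, PartialEq)]
enum IngestOutcome {
    /// 400: the server has no memory service configured.
    NoMemoryService,
    /// 404: the named session does not exist.
    SessionNotFound,
    /// Any other status code, uninterpreted.
    Other(u16),
}

fn classify(status: u16) -> IngestOutcome {
    match status {
        400 => IngestOutcome::NoMemoryService,
        404 => IngestOutcome::SessionNotFound,
        other => IngestOutcome::Other(other),
    }
}

fn main() {
    let (path, body) = ingest_request("hotel", "alice", "trip-1");
    println!("PATCH {path}");
    println!("{body}");
    println!("{:?}", classify(404));
}
```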

Example: wiring it together

Memory service + load_memory + preload_memory + explicit ingestion (Rust)

use adk_rs::agents::LlmAgent;
use adk_rs::core::{GetSessionConfig, MemoryService, SessionService};
use adk_rs::providers::gemini::Gemini;
use adk_rs::runner::Runner;
use adk_rs::services::mem::{InMemoryMemoryService, InMemorySessionService};
use adk_rs::tools::{load_memory_tool, preload_memory_tool};
use futures::StreamExt;
use std::sync::Arc;

#[tokio::main]
async fn main() -> adk_rs::Result<()> {
    let sessions: Arc<dyn SessionService> = Arc::new(InMemorySessionService::new());
    let memory: Arc<dyn MemoryService> = Arc::new(InMemoryMemoryService::new());

    let agent = LlmAgent::builder("concierge")
        .model(Arc::new(Gemini::from_env("gemini-2.5-flash")?))
        .instruction("Recall prior conversations with load_memory when useful.")
        .tool(load_memory_tool())
        .tool(preload_memory_tool(5))
        .build()?;

    let runner = Runner::builder()
        .app_name("hotel")
        .agent(Arc::new(agent))
        .session_service(sessions.clone())
        .memory_service(memory.clone())
        .auto_create_session(true)
        .build()?;

    // First conversation. Drain the event stream; the events themselves
    // are not needed here.
    let s1 = runner
        .run("alice", Some("trip-1"), "I prefer rooms on high floors.")
        .await?;
    s1.collect::<Vec<_>>().await;

    // Ingest the finished session into long-term memory.
    let session = sessions
        .get_session("hotel", "alice", "trip-1", GetSessionConfig::default())
        .await?
        .expect("session exists");
    memory.add_session_to_memory(&session).await?;

    // A later session can now recall the preference.
    let mut s2 = runner
        .run("alice", Some("trip-2"), "Book me a room like last time.")
        .await?;
    while let Some(event) = s2.next().await {
        if let Some(content) = event?.response.content {
            println!("{}", content.text_concat());
        }
    }
    Ok(())
}

Related pages

Sessions & state: short-term, per-conversation storage.
Built-in tools: the rest of the bundled toolset.
The Runner: wiring memory_service into the builder.
Server: the memory ingest endpoint.
