Memory
Long-term memory across sessions: the MemoryService trait, the in-memory backend, and the load_memory and preload_memory tools.
Sessions remember one conversation; memory remembers across them. A MemoryService ingests completed sessions into a long-term store and answers free-text queries with snippets — either on demand through the load_memory tool, or automatically via preload_memory at the start of every turn.
MemoryEntry and SearchMemoryResponse
The data model in adk_rs::core::memory is deliberately small. A MemoryEntry carries the recalled snippet as a Content plus optional provenance mirrored from the original event:
| Field | Type | Meaning |
|---|---|---|
| content | Content | The memory content, typically a text part. |
| author | Option<String> | Author of the original event ("user" or an agent name). |
| timestamp | Option<f64> | Original event timestamp in seconds. |
Searches return a SearchMemoryResponse { memories: Vec<MemoryEntry> } envelope.
The MemoryService trait
adk_rs::core::MemoryService has exactly two methods. Ingestion is explicit — nothing is written to memory automatically; you (or the server endpoint) decide when a session is worth remembering.
- async fn add_session_to_memory(&self, session: &Session) -> Result<()>: Index a session's events into long-term memory.
- async fn search_memory(&self, app_name: &str, user_id: &str, query: &str) -> Result<SearchMemoryResponse>: Search the (app, user) store for entries matching query.
InMemoryMemoryService
The simplest bundled backend, adk_rs::services::mem::InMemoryMemoryService, keeps one bucket per (app_name, user_id). add_session_to_memory walks the session’s events and stores one MemoryEntry per event with non-empty text content, preserving the author and timestamp. search_memory is a case-insensitive substring match over each entry’s text — good enough for tests and quickstarts. For semantic recall, use VectorMemoryService below.
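The bucketing and matching behavior described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: Snippet and SubstringStore are hypothetical names, and the text is held as a plain String where the real MemoryEntry carries a Content.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for MemoryEntry, with the content reduced to plain text.
#[derive(Debug, Clone, PartialEq)]
pub struct Snippet {
    pub text: String,
    pub author: Option<String>,
    pub timestamp: Option<f64>,
}

/// Hypothetical store: one bucket of snippets per (app_name, user_id).
#[derive(Default)]
pub struct SubstringStore {
    buckets: HashMap<(String, String), Vec<Snippet>>,
}

impl SubstringStore {
    /// Store one snippet per event, skipping events whose text is empty.
    pub fn add_events(&mut self, app: &str, user: &str, events: Vec<Snippet>) {
        let bucket = self
            .buckets
            .entry((app.to_string(), user.to_string()))
            .or_default();
        bucket.extend(events.into_iter().filter(|e| !e.text.is_empty()));
    }

    /// Case-insensitive substring match, scoped to a single (app, user) bucket.
    pub fn search(&self, app: &str, user: &str, query: &str) -> Vec<Snippet> {
        let needle = query.to_lowercase();
        self.buckets
            .get(&(app.to_string(), user.to_string()))
            .map(|bucket| {
                bucket
                    .iter()
                    .filter(|e| e.text.to_lowercase().contains(&needle))
                    .cloned()
                    .collect()
            })
            .unwrap_or_default()
    }
}
```

In this sketch a query for "high floors" finds an entry stored as "I prefer rooms on HIGH floors.", while the same query under a different user returns nothing. An empty query matches every entry here; the bundled backend may treat that case differently.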
VectorMemoryService (semantic search)
adk_rs::services::mem::VectorMemoryService swaps substring matching for embedding-based retrieval. Entries are embedded once at ingest time through the Embedder trait; each search embeds the query and ranks entries by cosine similarity, returning the top k above an optional similarity floor. Storage is still process-local — the upgrade is retrieval quality, not durability.
- trait Embedder { async fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>> }: Batch text → vector. Implementations ship with the provider features: GeminiEmbedder (gemini) and OpenAiEmbedder (openai); implement it yourself to bridge any other backend. Exported from adk_rs::core.
- VectorMemoryService::new(embedder: Arc<dyn Embedder>) -> Self: Construct with defaults: top 5 results, no similarity floor.
- with_top_k(self, k: usize) -> Self: Maximum results per search.
- with_min_score(self, score: f32) -> Self: Minimum cosine similarity (in [-1, 1]) for an entry to be returned.
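The ranking step behind top-k and the similarity floor can be sketched as follows. This is a standalone illustration under stated assumptions, not the crate's code: cosine and rank are hypothetical helpers, the floor is treated as inclusive, and mismatched or zero-length vectors score 0.0.

```rust
/// Cosine similarity of two vectors; 0.0 on length mismatch or zero norm (assumed convention).
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score every stored vector against the query, drop those below the optional
/// floor, and return (index, score) pairs for the best `top_k`, highest first.
pub fn rank(
    query: &[f32],
    stored: &[Vec<f32>],
    top_k: usize,
    min_score: Option<f32>,
) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine(query, v)))
        .filter(|&(_, s)| min_score.map_or(true, |floor| s >= floor))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}
```

With a floor of 0.3, a stored vector orthogonal to the query (score 0) or pointing the opposite way (score -1) is filtered out before the top-k cut, so a search can return fewer than k entries.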
use adk_rs::providers::gemini::GeminiEmbedder;
use adk_rs::runner::Runner;
use adk_rs::services::mem::VectorMemoryService;
use std::sync::Arc;

// Embed with Gemini; return at most 5 entries, with a similarity floor of 0.3.
let memory = VectorMemoryService::new(
    Arc::new(GeminiEmbedder::from_env("gemini-embedding-001")?),
)
.with_top_k(5)
.with_min_score(0.3);

// Drop-in replacement for InMemoryMemoryService.
// `agent` and `sessions` are built as in the full example at the end of this page.
let runner = Runner::builder()
    .app_name("hotel")
    .agent(agent)
    .session_service(sessions)
    .memory_service(Arc::new(memory))
    .build()?;

The load_memory tool (active recall)
adk_rs::tools::load_memory_tool() returns a tool the model calls when it decides it needs prior context. It declares a single required query string parameter, runs search_memory for the current (app, user), and returns { "memories": [...] }. It errors with a config error if the runner has no memory service.
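The control flow of the tool call can be sketched as below. Everything here is a simplified stand-in: ToolError, Search, and load_memory are hypothetical names, the memory backend is reduced to a closure, and the order of the two checks is an assumption rather than documented behavior.

```rust
/// Hypothetical error type; the crate's own error enum is not shown on this page.
#[derive(Debug, PartialEq)]
pub enum ToolError {
    /// The runner was built without a memory service.
    Config(String),
    /// The model omitted the required `query` argument.
    InvalidArgs(String),
}

/// Stand-in for a memory backend: takes a query, returns matching snippets.
pub type Search<'a> = &'a dyn Fn(&str) -> Vec<String>;

/// Require `query`, require a memory service, then return the matches that
/// would populate the "memories" array of the tool result.
pub fn load_memory(
    service: Option<Search<'_>>,
    query: Option<&str>,
) -> Result<Vec<String>, ToolError> {
    let query = query
        .ok_or_else(|| ToolError::InvalidArgs("missing required `query`".to_string()))?;
    let search = service
        .ok_or_else(|| ToolError::Config("no memory service configured".to_string()))?;
    Ok(search(query))
}
```

Calling load_memory(None, Some("floors")) yields the Config error in this sketch, mirroring the documented failure when the runner has no memory service.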
The preload_memory tool (passive recall)
adk_rs::tools::preload_memory_tool(max_entries) is a passive tool: its declaration() is None, so it is never advertised to the model and cannot be called. Instead it implements process_llm_request, which runs at turn start: it queries memory with the invocation’s user content as the search text and, when there are hits, appends a Relevant prior context: bullet list (capped at max_entries) to the request’s system text. It silently does nothing when no memory service is configured, the user content is empty, or the search returns no entries.
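The prompt augmentation described above can be sketched with plain strings. This is an illustration only: preload_memory here is a hypothetical free function, the memory backend is reduced to a closure, and the exact layout of the appended block (blank-line separator, "- " bullets) is an assumption.

```rust
/// Append a "Relevant prior context:" bullet list to the system text,
/// capped at `max_entries`. Leaves the text untouched when there is no
/// memory service, the user text is empty, or the search finds nothing.
pub fn preload_memory(
    system_text: &mut String,
    user_text: &str,
    search: Option<&dyn Fn(&str) -> Vec<String>>,
    max_entries: usize,
) {
    // No memory service configured: silently do nothing.
    let Some(search) = search else { return };
    // Nothing to search with.
    if user_text.is_empty() {
        return;
    }
    // The user's message doubles as the search query.
    let bullets: Vec<String> = search(user_text)
        .iter()
        .take(max_entries)
        .map(|hit| format!("- {hit}"))
        .collect();
    if bullets.is_empty() {
        return;
    }
    if !system_text.is_empty() {
        system_text.push_str("\n\n");
    }
    system_text.push_str("Relevant prior context:\n");
    system_text.push_str(&bullets.join("\n"));
}
```

Because the model never sees a declaration for this tool, the only observable effect is the extra system text; with max_entries set to 2 and three hits, only the first two snippets are inlined.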
| Tool | Trigger | Effect |
|---|---|---|
| load_memory | Model issues a function call with a query. | Returns matching MemoryEntry values as the tool result. |
| preload_memory | Every turn, before the LLM call. | Inlines up to max_entries matching snippets into the system prompt. |
Ingesting via the HTTP server
With the server feature, PATCH /apps/:app/users/:user/memory with body { "sessionId": "..." } loads the named session and passes it to add_session_to_memory. It returns 400 when no memory service is configured and 404 when the session does not exist. See Server.
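For reference, the request shape can be sketched as a raw HTTP/1.1 message built with the standard library. The helper names, the host value, and the Content-Type header are assumptions for illustration; the sketch also assumes the app and user ids are URL-safe and does not percent-encode them.

```rust
/// Escape the two characters that would break a JSON string literal in this sketch.
fn json_escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Build the ingest request for a given session. `host` is a placeholder
/// for wherever the server is listening.
pub fn ingest_request(host: &str, app: &str, user: &str, session_id: &str) -> String {
    let body = format!("{{\"sessionId\":\"{}\"}}", json_escape(session_id));
    format!(
        "PATCH /apps/{app}/users/{user}/memory HTTP/1.1\r\n\
         Host: {host}\r\n\
         Content-Type: application/json\r\n\
         Content-Length: {}\r\n\
         Connection: close\r\n\
         \r\n\
         {body}",
        body.len()
    )
}
```

The resulting string can be written to a TCP connection or used as a reference when issuing the same request from an HTTP client; a 400 or 404 response then maps to the two failure cases described above.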
Example: wiring it together
use adk_rs::agents::LlmAgent;
use adk_rs::core::{GetSessionConfig, MemoryService, SessionService};
use adk_rs::providers::gemini::Gemini;
use adk_rs::runner::Runner;
use adk_rs::services::mem::{InMemoryMemoryService, InMemorySessionService};
use adk_rs::tools::{load_memory_tool, preload_memory_tool};
use futures::StreamExt;
use std::sync::Arc;

#[tokio::main]
async fn main() -> adk_rs::Result<()> {
    let sessions: Arc<dyn SessionService> = Arc::new(InMemorySessionService::new());
    let memory: Arc<dyn MemoryService> = Arc::new(InMemoryMemoryService::new());

    let agent = LlmAgent::builder("concierge")
        .model(Arc::new(Gemini::from_env("gemini-2.5-flash")?))
        .instruction("Recall prior conversations with load_memory when useful.")
        .tool(load_memory_tool())
        .tool(preload_memory_tool(5))
        .build()?;

    let runner = Runner::builder()
        .app_name("hotel")
        .agent(Arc::new(agent))
        .session_service(sessions.clone())
        .memory_service(memory.clone())
        .auto_create_session(true)
        .build()?;

    // First conversation.
    let s1 = runner
        .run("alice", Some("trip-1"), "I prefer rooms on high floors.")
        .await?;
    s1.collect::<Vec<_>>().await;

    // Ingest the finished session into long-term memory.
    let session = sessions
        .get_session("hotel", "alice", "trip-1", GetSessionConfig::default())
        .await?
        .expect("session exists");
    memory.add_session_to_memory(&session).await?;

    // A later session can now recall the preference.
    let mut s2 = runner
        .run("alice", Some("trip-2"), "Book me a room like last time.")
        .await?;
    while let Some(event) = s2.next().await {
        if let Some(content) = event?.response.content {
            println!("{}", content.text_concat());
        }
    }
    Ok(())
}

Related pages
- Sessions & state: short-term, per-conversation storage.
- Built-in tools: the rest of the bundled toolset.
- The Runner: wiring memory_service into the builder.
- Server: the memory ingest endpoint.