Memory Schema

JSON Schema for memory configuration

Memory Configuration Schema
Config defines the structure for a memory resource configuration.
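
A minimal complete configuration might look like the following sketch, written as YAML like the field examples below. The flushing and persistence sub-keys (strategy, backend, ttl) are assumptions for illustration; this page only defines the top-level fields.

resource: memory
id: user_conversation
description: Recent conversation history for the support agent
version: "1.0.0"
type: token_based
max_tokens: 4000
flushing:
  strategy: simple_fifo   # sub-key name assumed
persistence:
  backend: redis          # sub-key names assumed
  ttl: 72h
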
resource
string

Resource type identifier; must be "memory". The autoloader system uses this field to identify and register this configuration as a memory resource.

id
string

ID is the unique identifier for this memory resource within the project. This ID is used by agents to reference the memory in their configuration.

  • Examples: "user_conversation", "session_context", "agent_workspace"
description
string

Description provides a human-readable explanation of the memory resource's purpose. This helps developers understand what kind of data this memory stores and how it should be used within workflows.

version
string

Version allows tracking changes to the memory resource definition and can be used for migration strategies when the memory schema evolves. Format: semantic versioning (e.g., "1.0.0", "2.1.0-beta").

type
string

Type indicates the primary memory management strategy:

  • "token_based": Manages memory based on token count limits (recommended for LLM contexts)
  • "message_count_based": Manages memory based on message count limits
  • "buffer": Simple buffer that stores messages up to a limit without sophisticated eviction
max_tokens
integer

MaxTokens is the hard limit on the number of tokens this memory can hold. Only applicable when Type is "token_based". When this limit is reached, the flushing strategy determines how to make room for new content.

  • Example: 4000 (roughly 3000 words)
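
When the cap is reached, the configured flushing strategy (described below) makes room; a sketch of that pairing, with the strategy sub-key name assumed:

type: token_based
max_tokens: 4000
flushing:
  strategy: simple_fifo   # sub-key name assumed; evicts oldest messages at the 4000-token cap
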
max_messages
integer

MaxMessages is the hard limit on the number of messages this memory can store. Applicable for "message_count_based" type or as a secondary limit for "token_based".

  • Example: 100 (keeps last 100 messages in conversation)
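
Both limits can also be combined, with max_messages acting as the secondary cap for a token-based memory (a sketch; values are illustrative):

type: token_based
max_tokens: 4000
max_messages: 100   # secondary cap alongside the token limit
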
max_context_ratio
number

MaxContextRatio specifies the maximum portion of an LLM's context window this memory should use. Must be a value between 0 and 1. The effective MaxTokens is derived dynamically from the model's context window.

  • Example: 0.5 means use at most 50% of the model's context window for memory, leaving the rest for system prompts and current task context.
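
As a worked example, assuming a model with an 8192-token context window:

max_context_ratio: 0.5
# effective MaxTokens = 0.5 * 8192 = 4096 tokens reserved for memory
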
token_allocation
object

TokenAllocation defines how the token budget is distributed across different categories. Only applicable for the token_based memory type. The fractions must sum to 1.0.

token_allocation:
  short_term: 0.6  # 60% for recent messages
  long_term: 0.3   # 30% for summarized context
  system: 0.1      # 10% for system prompts
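
With max_tokens: 4000, the allocation above translates into concrete budgets:

# short_term: 0.6 * 4000 = 2400 tokens for recent messages
# long_term:  0.3 * 4000 = 1200 tokens for summarized context
# system:     0.1 * 4000 =  400 tokens for system prompts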

flushing
object

Flushing defines how memory is managed when limits are approached or reached. Available strategies:

  • "simple_fifo": Removes oldest messages first (fastest, no LLM required)
  • "lru": Removes least recently used messages (tracks access patterns)
  • "hybrid_summary": Summarizes old messages before removal (requires LLM, preserves context)
  • "token_aware_lru": LRU that considers token cost of messages (optimizes token usage)

persistence
object

Persistence defines how memory instances are persisted beyond the process lifetime. This required field specifies the storage backend and retention policy. Supported backends:

  • "redis": Production-grade persistence with distributed locking and TTL support
  • "in_memory": Testing/development only, data lost on restart

privacy_policy
object

PrivacyPolicy defines rules for handling sensitive data within this memory. Can specify redaction patterns, non-persistable message types, and custom redaction strings for compliance with data protection regulations.

privacy_policy:
  redact_patterns: ["\\b\\d{3}-\\d{2}-\\d{4}\\b"]  # SSN pattern
  non_persistable_message_types: ["payment_info"]
  default_redaction_string: "[REDACTED]"

locking
object

Locking configures distributed lock timeouts for concurrent memory operations. Critical for preventing race conditions when multiple agents access the same memory. Timeouts can be configured per operation type:

  • append_ttl: Timeout for adding new messages (default: 30s)
  • clear_ttl: Timeout for clearing memory (default: 10s)
  • flush_ttl: Timeout for flush operations (default: 5m)
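
Expressed in configuration, the defaults above would look like this (duration string format assumed):

locking:
  append_ttl: 30s
  clear_ttl: 10s
  flush_ttl: 5m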

token_provider
object

TokenProvider configures provider-specific token counting for accurate limits. It supports OpenAI, Anthropic, and other providers with their specific tokenizers, and can specify API keys for real-time token counting or fallback strategies.
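
A sketch of a provider configuration; all sub-key names here are assumptions, since this page does not define them:

token_provider:
  provider: openai            # sub-key names assumed
  model: gpt-4o               # model whose tokenizer is used for counting
  api_key_env: OPENAI_API_KEY # environment variable holding the API key (assumed)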
