Algorithm A · Retrieval fusion score
Every L3/L4 candidate returned from the sqlite-vec ANN search is re-ranked by a weighted sum of four signals. The weights live in src/echovessel/memory/retrieve.py.
total = 0.5 · recency + 3.0 · relevance + 2.0 · |impact| + 1.0 · relational_bonus
| Signal | Weight | What it measures | Formula | Range |
|---|---|---|---|---|
| R · recency | 0.5 | how fresh this memory is | exp(-ln(2) · days_since / 14), i.e. a 14-day half-life | [0, 1] |
| V · relevance | 3.0 | semantic similarity of query → candidate | 1 - distance/2, clamped; min floor 0.4 drops orthogonal hits | [0, 1] |
| I · \|impact\| | 2.0 | how emotionally loaded the memory is | min(\|emotional_impact\| / 10, 1.0) | [0, 1] |
| G · relational_bonus | 1.0 | has the node got relational_tags? | 0.5 if relational_tags non-empty, else 0 | {0, 0.5} |
Why these weights? Relevance (3.0) dominates — what a memory is about matters most. Impact (2.0) ensures peak emotional moments surface even when their semantics drift from the query. Recency (0.5) gently prefers fresh memories without drowning old-but-important ones. Relational bonus (1.0) pulls in graph-connected nodes. The min-relevance floor of 0.4 is important: without it, strictly orthogonal candidates would occasionally bubble up on pure |impact| × relational_bonus, causing false-positive recall. With the floor, truly unrelated memories can't enter the ranked set even if they're emotionally loaded.
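The whole scoring path can be sketched in a few lines. This is a minimal sketch assuming cosine distance in [0, 2]; the function name, argument shape, and exact clamping are illustration-only assumptions, not the actual retrieve.py API:

```python
import math

RELEVANCE_FLOOR = 0.4   # min-relevance floor from Algorithm A
HALF_LIFE_DAYS = 14     # recency half-life, in days

def fusion_score(days_since, distance, emotional_impact, relational_tags):
    """Hypothetical sketch of the 4-signal fusion score (not the real API)."""
    recency = math.exp(-math.log(2) * days_since / HALF_LIFE_DAYS)
    relevance = min(max(1 - distance / 2, 0.0), 1.0)  # assumes distance in [0, 2]
    if relevance < RELEVANCE_FLOOR:
        return None  # orthogonal hit: excluded from the ranked set entirely
    impact = min(abs(emotional_impact) / 10, 1.0)
    relational_bonus = 0.5 if relational_tags else 0.0
    return 0.5 * recency + 3.0 * relevance + 2.0 * impact + 1.0 * relational_bonus
```

A two-week-old but perfectly relevant, mood-neutral memory scores 0.5 · 0.5 + 3.0 · 1.0 = 3.25, while an orthogonal candidate (distance ≈ 2) is rejected outright no matter how emotionally loaded it is.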
Algorithm B · Strong-emotion override (consolidation)
When deciding whether a session is "trivial" (and extraction can be skipped), a session below the threshold can still be promoted to L3 if it contains any strong-emotion keyword. This keyword-matched safety net catches peak emotional moments that occur in short sessions.
| Category | zh keywords | en keywords | Why this list | Effect |
|---|---|---|---|---|
| 💧 Bereavement / loss | 走了 · 去世 · 死了 · 离世 · 葬礼 · 没了 | died · passed away · funeral | loss events are almost always L3-worthy regardless of session length | override trivial gate |
| ⚠️ Crisis | 撑不住 · 不想活 · 活不下去 · 自杀 · 崩溃 | can't go on · suicide · breakdown | safety-critical; never silently dropped | override trivial gate |
| 🎭 Major milestones | 分手 · 离婚 · 被裁 | breakup · divorce · fired | large identity shifts; the persona should remember these even from a one-liner | override trivial gate |
Logic: is_trivial(session, messages) returns True only when the session is both below threshold and free of strong-emotion keywords. _has_strong_emotion(messages) is a case-insensitive substring match, tuned for recall rather than precision. False positives occasionally push a mundane sentence through extraction; that's an acceptable cost compared with losing a late-night single line about a breakup.
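The gate can be sketched as below. The keyword lists are abbreviated subsets of the table above, and the message shape and threshold value are assumptions for illustration, not the actual memory/consolidate.py code:

```python
# Abbreviated subset of the strong-emotion keyword table; the threshold
# value and message shape are invented for this sketch.
STRONG_EMOTION_KEYWORDS = [
    "去世", "葬礼", "自杀", "崩溃", "分手", "离婚",      # zh (subset)
    "passed away", "funeral", "suicide", "breakup",      # en (subset)
]
TRIVIAL_MESSAGE_THRESHOLD = 4  # assumed threshold, not the real constant

def _has_strong_emotion(messages):
    # Case-insensitive substring match: recall over precision.
    return any(
        kw in msg.lower() for msg in messages for kw in STRONG_EMOTION_KEYWORDS
    )

def is_trivial(messages):
    # Trivial only when BOTH short AND emotionally flat; a strong-emotion
    # keyword overrides the length gate.
    return len(messages) < TRIVIAL_MESSAGE_THRESHOLD and not _has_strong_emotion(messages)
```

Note how a one-message session mentioning a breakup is promoted even though it is far below the length threshold.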
Algorithm C · Reflection triggers (shock · timer · hard gate)
Once a session clears extraction (L3 events exist), the next question is whether to run reflection (L4 thoughts). Three decision rules converge.
| Trigger | Rule | Constant | Defined in |
|---|---|---|---|
| ≥ SHOCK_IMPACT_THRESHOLD | If any freshly-extracted L3 event has \|emotional_impact\| ≥ 8, force reflection immediately, even if the timer hasn't elapsed. Rationale: peak moments should reshape the persona's impression now, not 24 hours later. | SHOCK_IMPACT_THRESHOLD = 8 | memory/consolidate.py |
| ⏲ TIMER_REFLECTION_HOURS | Even without a shock event, if more than 24 hours have passed since the last reflection, run one. Keeps the persona's long-term impressions slowly updating even during routine chats. | TIMER_REFLECTION_HOURS = 24 | memory/consolidate.py |
| ✕ REFLECTION_HARD_LIMIT_24H | Regardless of shock or timer, allow no more than 3 reflections per rolling 24-hour window. Prevents L4 explosion on chatty days or debugging sessions. The hardest of the three gates: it wins over shock and timer. | REFLECTION_HARD_LIMIT_24H = 3 | (configurable via consolidate.reflection_hard_gate_24h) |
Decision order:
1. count reflections in last 24h → if ≥ 3, skip (hard gate wins)
2. any fresh event with |impact| ≥ 8? → reflect (shock path)
3. last reflection > 24h ago? → reflect (timer path)
4. otherwise → skip this session
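The four-step decision order can be condensed into one function. This is a sketch under assumed names; the function signature and how the counters are obtained are not the actual memory/consolidate.py API:

```python
SHOCK_IMPACT_THRESHOLD = 8
TIMER_REFLECTION_HOURS = 24
REFLECTION_HARD_LIMIT_24H = 3

def should_reflect(fresh_impacts, reflections_last_24h, hours_since_last):
    """Hypothetical sketch of the three-gate reflection decision."""
    # 1. Hard gate wins over everything else.
    if reflections_last_24h >= REFLECTION_HARD_LIMIT_24H:
        return False
    # 2. Shock path: a peak-emotion event forces reflection now.
    if any(abs(i) >= SHOCK_IMPACT_THRESHOLD for i in fresh_impacts):
        return True
    # 3. Timer path: routine refresh after 24 quiet hours.
    if hours_since_last > TIMER_REFLECTION_HOURS:
        return True
    # 4. Otherwise skip this session.
    return False
```

Checking the hard gate first is what makes it "the hardest of the three": a shock event during a chatty day still gets skipped once the window is full.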
Where does emotional_impact come from?
The emotional_impact signed integer on each L3 event isn't computed by a formula — it's produced by the extraction LLM itself. The prompt in prompts/extract.py instructs the model to rate -10 (catastrophic loss / grief) to +10 (peak joy / breakthrough), with 0 for mood-neutral facts. The algorithm layer here is the prompt engineering rather than a numerical rule, which is why tweaking it requires editing prompt text, not code.
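For illustration only, a rating instruction of this shape could sit in the extraction prompt; the wording below is invented and the real text in prompts/extract.py differs:

```python
# Invented example of a -10..+10 impact-rating instruction; not the
# project's actual prompt text in prompts/extract.py.
IMPACT_RATING_INSTRUCTION = (
    "For each extracted event, set emotional_impact to a signed integer: "
    "-10 for catastrophic loss or grief, +10 for peak joy or a breakthrough, "
    "and 0 for mood-neutral facts."
)
```

Because the scale lives in prompt text, recalibrating it (say, reserving ±9..10 for true extremes) is a prompt edit rather than a code change.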