EchoVessel
Local-first digital persona engine with long-term memory, voice, and channel integrations — carry an echo long enough for it to become presence.
- Type: CLI
- Role: Solo
- Status: Active
- Tech: Python 3.11+, FastAPI, SQLite + sqlite-vec, React 19, Vite, TypeScript, sentence-transformers, FishAudio TTS, Whisper, discord.py, pytest
- Started: Apr 2026
EchoVessel is an open-source digital persona engine. You define or distill a persona from your own settings and source material, then run it as a long-lived companion that remembers, speaks, and grows with you — instead of resetting after every reply.
The core idea: a persona shouldn’t feel like a new tab every time you open it. It should feel continuous.
Why this exists
Most chat tools treat memory as a vector dump and identity as a system prompt. The result is responsive, but never present. EchoVessel asks a different question: what does it take for a digital persona to feel like the same person tomorrow as it was today? The answer turned out to be a system, not a feature — one where memory, voice, and behavior all serve the same continuity.
What it actually does
Five modules cooperating inside a single local daemon:
- memory — long-term persona memory, hierarchical (L1–L4)
- voice — text-to-speech, speech-to-text, voice cloning
- proactive — autonomous outreach with policy gates
- channels — pluggable transports (Web, Discord, more on the way)
- runtime — the daemon that ties everything together
Open the full architecture diagram →
Memory is the heart of it
Most “AI memory” is a search problem: find the most similar past chunk, paste it into the prompt. EchoVessel treats memory as a structure — four layers, each answering a different question about the persona’s relationship with you.
| Layer | Question it answers | What it stores | Written when | Role at read time |
|---|---|---|---|---|
| L1 · core blocks | “Who am I right now?” | Short, stable text: persona, self, mood, user, relationship | Manually, on admin edits, or via import | Always injected into the prompt, unconditionally |
| L2 · recall messages | “What was literally said?” | Every user and persona message, verbatim, FTS5-indexed | On every turn, immediately | Ground-truth archive; expands context around L3 hits |
| L3 · events | “What happened in that conversation?” | One-line episodic facts, emotional impact, tags, embedding | When a session closes (extraction pass) | The primary target of vector retrieval |
| L4 · thoughts | “What do I believe about this person across many conversations?” | Longer-term observations distilled from many events | Reflection passes, after high-impact events or every 24h | Also vector-retrieved; keeps pointers back to source events |
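As a rough illustration of the table above, the four layers could be modeled as plain records, with a small check for when a reflection pass is due. This is a sketch under stated assumptions, not EchoVessel's actual schema: the class and field names, the `should_reflect` helper, and the high-impact threshold of 8 are all hypothetical. Only the 24-hour interval and the layer responsibilities come from the table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CoreBlock:
    """L1: short, stable text that is always injected into the prompt."""
    label: str  # e.g. "persona", "self", "mood", "user", "relationship"
    text: str


@dataclass
class RecallMessage:
    """L2: one verbatim message, written on every turn."""
    role: str  # "user" or "persona"
    text: str
    sent_at: datetime


@dataclass
class Event:
    """L3: a one-line episodic fact extracted when a session closes."""
    id: int
    summary: str
    emotional_impact: int  # assumed signed scale, roughly -10..10
    tags: list[str]
    embedding: list[float]


@dataclass
class Thought:
    """L4: a longer-term observation distilled from many events."""
    text: str
    embedding: list[float]
    source_event_ids: list[int]  # pointers back to the L3 events it came from


REFLECTION_INTERVAL = timedelta(hours=24)  # from the table above
HIGH_IMPACT_THRESHOLD = 8  # hypothetical cutoff, not from the source


def should_reflect(new_events: list[Event], last_reflection: datetime, now: datetime) -> bool:
    """Return True if a reflection pass is due: 24h elapsed or a high-impact event arrived."""
    if now - last_reflection >= REFLECTION_INTERVAL:
        return True
    return any(abs(e.emotional_impact) >= HIGH_IMPACT_THRESHOLD for e in new_events)
```

Keeping `source_event_ids` on each thought is what would let a retrieved L4 belief be traced back to the L3 events, and from there to the verbatim L2 messages, that produced it.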
How a memory gets picked
When the persona is about to reply, every L3/L4 candidate gets ranked by a four-factor score:
score = 0.5 · recency + 3.0 · relevance + 2.0 · impact + 1.0 · relational_bonus
- recency: exponential decay with a 14-day half-life
- relevance: vector similarity to the current query, normalized to [0, 1]
- impact: |emotional_impact| / 10, so peak moments outweigh forgettable ones on ties
- relational_bonus: a flat +0.5 whenever a memory carries an identity-bearing, vulnerability, turning-point, commitment, or correction tag
A min_relevance floor (default 0.4) drops orthogonal matches before scoring, so a high-impact unrelated event cannot sneak in on the back of the impact weight. The shape of this formula owes a debt to the Stanford “Generative Agents” paper; the relational tag bonus is the part tailored to persona memory.
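The scoring rule and relevance floor described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the `Candidate` class, the `score` and `rank` helpers, the exact tag spellings, and the assumption that `emotional_impact` is a signed value in roughly -10..10 are hypothetical. The weights, half-life, bonus, and default floor are the ones stated in the text.

```python
from dataclasses import dataclass
from typing import Optional

HALF_LIFE_DAYS = 14.0  # recency half-life from the text
MIN_RELEVANCE = 0.4    # default floor from the text
# Tag spellings are assumed; the text only names the categories.
RELATIONAL_TAGS = frozenset(
    {"identity-bearing", "vulnerability", "turning-point", "commitment", "correction"}
)


@dataclass
class Candidate:
    text: str
    age_days: float        # time since the memory was written
    relevance: float       # vector similarity, already normalized to [0, 1]
    emotional_impact: int  # assumed signed scale, roughly -10..10
    tags: frozenset = frozenset()


def score(c: Candidate, min_relevance: float = MIN_RELEVANCE) -> Optional[float]:
    """Return the four-factor score, or None if the candidate is below the floor."""
    if c.relevance < min_relevance:
        return None  # dropped before scoring, however high its impact
    recency = 0.5 ** (c.age_days / HALF_LIFE_DAYS)  # halves every 14 days
    impact = abs(c.emotional_impact) / 10
    relational_bonus = 0.5 if c.tags & RELATIONAL_TAGS else 0.0
    return 0.5 * recency + 3.0 * c.relevance + 2.0 * impact + 1.0 * relational_bonus


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Drop candidates below the floor and sort the rest by descending score."""
    scored = [(s, c) for c in candidates if (s := score(c)) is not None]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]
```

As a worked example under these assumptions, a 14-day-old memory with relevance 0.8, emotional impact -5, and a commitment tag scores 0.5 · 0.5 + 3.0 · 0.8 + 2.0 · 0.5 + 1.0 · 0.5 = 4.15, while a maximally intense memory with relevance 0.1 never reaches the scoring step.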
The hard problem isn’t storage. It’s deciding what to remember, how to represent it, and when it should wake up and influence the next reply.
Open the interactive memory diagram →
How a single message wakes the system up
Every message triggers a small choreography across layers: which memories surface, which get written, which get distilled into longer-term form. The companion runtime-flow page traces this turn-by-turn against a real conversation.
Open the runtime flow diagram →
Voice as identity
Voice isn’t a TTS afterthought. Each persona has its own voice (cloned or selected) that speaks across every channel — including native Discord voice messages, indistinguishable from the bubble a human friend would send.
Relationships without affection meters
EchoVessel doesn’t have a “likeability score.” A persona’s bond with you is visible in behavior — tone shifts, naming changes, deeper recall, more initiative — not a progress bar.
Local-first by default
Your persona lives on your machine. The data file sits in ~/.echovessel/memory.db. The embedder runs locally. The only network traffic is to the LLM endpoint you configure. No telemetry, no phone-home, no gradual creep into the cloud.
Ethics & open source
EchoVessel is for fictional characters, original characters, your own self-persona, consented digital counterparts, and creative or memorial reconstructions. It is not an impersonation tool for pretending to be a real person in external communication.
It stays open-source because digital presence and intimate computing tools should not belong only to closed commercial platforms.
Name
EchoVessel — carry an echo long enough for it to become presence.