Your agent follows instructions perfectly — but doesn't learn from results. OpenExp adds outcome-based learning: approaches that led to commits, closed deals, and shipped code surface first next time.
Every session makes the next one smarter. Q-learning — from the same reinforcement-learning family that powered AlphaGo — applied to your AI's working memory.
1. Top memories are injected, ranked by Q-value
2. Every action is captured as an observation
3. The session ends — was it productive?
4. Good session? The memories used get higher scores
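The cycle above can be sketched as a tabular update: each memory carries a Q-value, retrieval ranks by it, and the session reward nudges the Q-values of the memories that were used. This is a minimal illustration under assumed names (`rank`, `update`, `ALPHA`), not OpenExp's actual code:

```python
# Sketch of outcome-based memory scoring (hypothetical names, not OpenExp's API).
ALPHA = 0.3  # learning rate: how fast new outcomes override old scores

def rank(memories: dict[str, float], top_k: int = 2) -> list[str]:
    """Return the top-k memory ids by current Q-value."""
    return sorted(memories, key=memories.get, reverse=True)[:top_k]

def update(memories: dict[str, float], used: list[str], reward: float) -> None:
    """Move each used memory's Q-value toward the session reward."""
    for mem_id in used:
        memories[mem_id] += ALPHA * (reward - memories[mem_id])

memories = {"approach_a": 0.0, "approach_b": 0.0}
update(memories, ["approach_a"], reward=0.8)   # approach A closed a deal
update(memories, ["approach_b"], reward=-0.1)  # approach B produced nothing
print(rank(memories))  # approach_a now surfaces first
```

After enough sessions, approaches that repeatedly led to good outcomes dominate the ranking — which is the "every session makes the next one smarter" behavior described above.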
You wrote a skill once: "how to work with CRM." The agent follows it perfectly. But it doesn't know that approach A closed deals and approach B didn't. Tomorrow it'll do the same thing as yesterday — even if yesterday didn't work.
Your agent sent 200 emails this month. Which ones got replies? Which formulations closed deals? Which debugging approaches actually fixed bugs on the first try? Your skills don't know. There's no feedback loop.
Mem0, Zep, LangMem store and retrieve. But every memory is equally important. A critical decision and a random grep have the same weight. Storage without learning is just a database.
Your skills say how. OpenExp learns what actually works — from real results.
Every action in your Claude Code session — file edits, commits, commands, decisions — is automatically recorded. Hooks handle it. Zero manual work.
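For reference, Claude Code hooks are registered in settings files like `~/.claude/settings.local.json`. Roughly this shape is what the setup script adds — the `openexp capture` command here is illustrative, not the project's actual command:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "",
        "hooks": [
          { "type": "command", "command": "openexp capture" }
        ]
      }
    ]
  }
}
```

An empty `matcher` fires on every tool call, which is how file edits, commits, and commands can all be recorded without manual work.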
Before each response, the system finds the most relevant memories. Not by similarity alone — by proven usefulness. Five ranking signals.
After every session, the system evaluates what happened. Productive sessions reward the memories that were used. Empty sessions penalize them.
After each session, OpenExp checks what was produced and assigns a reward score.
| Session outcome | Reward |
|---|---|
| Code committed | +0.30 |
| Pull request created | +0.20 |
| Deployed to production | +0.10 |
| Tests passed | +0.10 |
| Deal closed (CRM) | +0.80 |
| Nothing produced | -0.10 |
One memory can be valuable in one context and worthless in another. Define what "productive" means for your workflow.
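Conceptually, scoring a session is just summing per-outcome weights, and making "productive" workflow-specific means making those weights configurable. A sketch under assumed names (the real config keys may differ), using the values from the table above:

```python
# Sketch: sum per-outcome rewards for a finished session.
# Default weights mirror the reward table; override them per workflow.
DEFAULT_REWARDS = {
    "commit": 0.30,
    "pull_request": 0.20,
    "deploy": 0.10,
    "tests_passed": 0.10,
    "deal_closed": 0.80,
    "nothing": -0.10,
}

def session_reward(outcomes: set[str], weights: dict[str, float] = DEFAULT_REWARDS) -> float:
    """Empty sessions get the penalty; otherwise outcomes stack."""
    if not outcomes:
        return weights["nothing"]
    return sum(weights[o] for o in outcomes)

print(session_reward({"commit", "tests_passed"}))  # 0.4

# A sales profile can redefine what counts as productive:
sales_rewards = {**DEFAULT_REWARDS, "deal_closed": 1.0}
```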
| Feature | OpenExp | Mem0 | Zep | LangMem |
|---|---|---|---|---|
| Learns from outcomes | Q-learning | No | No | No |
| Process-aware | Pipeline stages + signals | No | No | No |
| Memory type filtering | Rewards decision-type memories only | No | No | No |
| Hybrid retrieval | 5 signals | Vector only | Graph + vector | Vector only |
| Claude Code native | Zero-config hooks | Integration required | Integration required | Integration required |
| Fully local | Qdrant + FastEmbed | Cloud API | Cloud or self-hosted | Cloud API |
Not just "find similar text." Five signals weighted together. After 100 sessions, your retrieval is personalized by actual outcomes.
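The five signals aren't enumerated here, so the sketch below uses plausible stand-ins — similarity, Q-value, recency, usage count, and type match — with made-up weights. It illustrates the weighted-sum idea, not OpenExp's actual formula:

```python
import math

# Illustrative signal weights, not OpenExp's real five.
WEIGHTS = {"similarity": 0.4, "q_value": 0.3, "recency": 0.15, "usage": 0.1, "type_match": 0.05}

def score(memory: dict, query_similarity: float, now: float) -> float:
    """Combine several ranking signals into one retrieval score."""
    age_days = (now - memory["created_at"]) / 86400
    signals = {
        "similarity": query_similarity,            # cosine similarity from the vector store
        "q_value": memory["q"],                    # learned usefulness
        "recency": math.exp(-age_days / 30),       # decays over roughly a month
        "usage": min(memory["uses"] / 10, 1.0),    # capped usage frequency
        "type_match": 1.0 if memory["type"] == "decision" else 0.0,
    }
    return sum(WEIGHTS[s] * signals[s] for s in WEIGHTS)
```

Under this kind of scoring, a memory with middling similarity but a high Q-value can outrank a near-duplicate that never helped — the "proven usefulness" behavior described above.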
No data leaves your machine. All data lives under ~/.openexp/. You own everything.
Vector DB in a Docker container on your machine
Local embeddings, no API calls needed
JSON file on disk, fully inspectable
5-level audit trail from raw logs to LLM reasoning
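A local setup like the one described is typically a few lines of Docker Compose. A minimal sketch — the official `qdrant/qdrant` image and port 6333 are real; the service name and volume path are illustrative:

```yaml
# docker-compose.yml — local, restart-safe Qdrant (paths illustrative)
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"                           # HTTP API
    volumes:
      - ~/.openexp/qdrant:/qdrant/storage     # vectors survive container restarts
    restart: unless-stopped                   # comes back up after reboots
```

With the volume and restart policy in place, the worst-case loss is observations from the current unfinished session.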
Real questions from developers, founders, sales teams, and skeptics.
**Installation:** `./setup.sh` — the script creates a venv, starts Qdrant in Docker, creates the collection, copies `.env`, and registers the MCP server and hooks in Claude Code. Requires Python 3.11+ and Docker. No API key is needed for core functionality — embeddings run locally via FastEmbed. The first launch downloads the model (~1 min); after that it's cached.

**Uninstalling:** (1) `docker stop`/`rm` the Qdrant container, (2) `rm -rf ~/.openexp/`, (3) remove the `openexp` block from `~/.claude/settings.local.json`, (4) delete the `openexp` folder. Nothing installs system-wide; zero leftover files.

**Non-coding workflows:** `OPENEXP_EXPERIENCE=sales`. But honestly — these profiles are new and haven't been battle-tested by many users yet. For other workflows you can create your own via `openexp experience create`.

**Reward calibration:** `calibrate_experience_q`. By default the system is biased toward "visible productivity."

**Explanations:** `OPENEXP_EXPLANATION_ENABLED=false` turns them off; this doesn't affect core functionality.

**Durability:** Qdrant runs with `restart: unless-stopped`. The Q-cache is also on disk. The only thing you might lose is observations from the current unfinished session.

**Outside Claude Code:** `search`, `QCache`, and `add_memory()` are available. But you'll need to: (1) capture observations yourself instead of relying on the PostToolUse hook, (2) determine session end and its productivity, (3) integrate retrieval into your pipeline. A REST API or LangChain package is not available yet.

**Multiple projects:** set `OPENEXP_COLLECTION` via `.env` for different projects.

**Inspecting what it learned:** `experience_insights` shows the most valuable memory types, `experience_top_memories` shows the top memories by Q-value, and `explain_q` explains in plain language why a specific memory has its rating. But be realistic — the system needs time to accumulate data.

Skills say how. OpenExp teaches what works. Open source. MIT license.