Claude Code for Knowledge Work

And other learnings about building agents
And practical advice for building a personal agent

The Type of Work I Do

Understand

Customer calls
Competitive research
Data deep dives
Closed-lost deal reviews
Domain learning
Reading industry trends

→

Choose the Right Bets

Spec & memo writing
Prototypes to stress-test ideas
Prioritization & trade-offs
Data analysis
Stakeholder alignment

→

Bring to Market

Specs for engineering
Unblocking the team
GTM & launch planning
Writing code
Validating outcomes

Goals

No Repeated Context

When I tell the agent something, I don't have to repeat it session to session.

Optimal Context per Task

Each session loads exactly what it needs through progressive disclosure. Research shows LLM performance degrades as context grows.

Intern → Co‑worker

Intern: extends my work — does my tasks faster.

Co-worker: does work I didn't even imagine that pushes me and my team closer to our goals.

Basic Principle

My Space — only I write

~/Obsidian/
├── CLAUDE.md ← the harness
├── projects/
│ ├── construction/
│ ├── dev-api/
│ └── personal/
├── goals/
│ └── current/
├── resources/
│ └── my-ideas/
├── inbox.md
└── journal/

~/.claude/skills/
├── daily-brief/
├── deal-review/
├── process-meeting/
└── … 40+ skills

Agent Space — only agent writes

agent/
├── registry.md ← map of the vault
├── memories/
│ ├── self-assessment.md
│ ├── knowledge-gaps.md
│ └── skills/
├── events/
│ └── 2026-03-04.md
├── projects/ ← self-directed
└── scratch/ ← ephemeral

Tool Layer — MCPs · CLIs · Slack · Obsidian · Granola · Git

Guardrails live in the harness. The agent manages its own space freely within them.
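
A minimal sketch of how the "only agent writes" boundary could be enforced mechanically, assuming a path check runs before every write. The function name `agent_may_write` and the example vault location are hypothetical; only the `agent/` directory name comes from the layout above.

```python
from pathlib import Path


def agent_may_write(target: str, vault_root: str) -> bool:
    """Return True only if target resolves to a path inside <vault_root>/agent/."""
    root = Path(vault_root).resolve()
    agent_space = root / "agent"
    # Resolve before comparing so "../" segments cannot escape the agent space.
    resolved = (root / target).resolve()
    return agent_space in resolved.parents


if __name__ == "__main__":
    vault = "/tmp/example-vault"  # hypothetical location, not the real vault
    for candidate in ("agent/scratch/notes.md",
                      "projects/dev-api/spec.md",
                      "agent/../CLAUDE.md"):
        print(candidate, agent_may_write(candidate, vault))
```

The check is deliberately allow-list shaped: anything that does not resolve under `agent/` is refused, so new top-level folders in my space are protected by default.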

The Harness: Two Layers

Goal: keep total harness under ~4k tokens

Layer 1 — Claude Code itself

Claude Code system prompt

Built-in instructions that ship with the tool

Ramp enterprise CLAUDE.md

Org-wide guardrails pushed to all managed Macs

Layer 2 — mine

~/.claude/CLAUDE.md

Personal global rules — applies everywhere

~/Obsidian/CLAUDE.md

Vault harness — orientation, memory protocol, guardrails

Skills loaded per task

Prompt templates injected on demand

All of these stack into one system prompt. Keep it lean — every token costs attention.
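
A rough way to watch the ~4k budget, sketched under a loud assumption: tokens are estimated at about 4 characters each, which is only a rule of thumb for English text and not the model's real tokenizer. The helper names and the placeholder layer contents are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token, rounded up."""
    return (len(text) + 3) // 4


def harness_report(layers: dict[str, str], budget: int = 4000) -> tuple[int, bool]:
    """Sum estimated tokens across harness layers and compare with the budget."""
    total = sum(estimate_tokens(body) for body in layers.values())
    return total, total <= budget


if __name__ == "__main__":
    layers = {
        "~/.claude/CLAUDE.md": "x" * 2000,   # placeholder text, not real rules
        "~/Obsidian/CLAUDE.md": "x" * 6000,  # placeholder text, not real rules
    }
    total, within_budget = harness_report(layers)
    print(f"estimated tokens: {total}, within budget: {within_budget}")
```

Skills loaded per task would be added to `layers` for the session being measured, since they stack into the same prompt.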

The Memory Layer

Just flat markdown files. No vector store, no database. Portable and inspectable.

Tiered retrieval

L1 — Hot Registry

~20 most-used files. Regenerated nightly. Agent reads this first for fast routing.

L2 — Full Registry

Complete routing table of the vault. Fallback when L1 doesn't cover it.

L3 — Memory Content

The actual files — self-assessment, knowledge gaps, skill calibrations. Loaded only when needed.
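
The three tiers can be sketched as a lookup that only touches file content at the last step. This assumes each registry has already been parsed into a topic-to-path mapping; the real registries are markdown files, and the names `route` and `load_memory` are hypothetical.

```python
from pathlib import Path
from typing import Optional


def route(topic: str, hot: dict[str, str], full: dict[str, str]) -> Optional[tuple[str, str]]:
    """Return (tier, relative path) for a topic, trying L1 before L2."""
    if topic in hot:
        return ("L1", hot[topic])
    if topic in full:
        return ("L2", full[topic])
    return None


def load_memory(vault: Path, topic: str, hot: dict[str, str], full: dict[str, str]) -> Optional[str]:
    """L3: read file content only after a registry has pointed at it."""
    hit = route(topic, hot, full)
    if hit is None:
        return None
    return (vault / hit[1]).read_text(encoding="utf-8")


if __name__ == "__main__":
    hot = {"self-assessment": "agent/memories/self-assessment.md"}
    full = {**hot, "knowledge-gaps": "agent/memories/knowledge-gaps.md"}
    print(route("self-assessment", hot, full))
    print(route("knowledge-gaps", hot, full))
    print(route("unknown-topic", hot, full))
```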

How it stays healthy

Provenance tags

observed: the agent noticed something
corrected: I confirmed or fixed it
Corrected entries never get pruned.

/sleep — nightly consolidation

Runs at midnight. Sets line budgets per memory file, then lets the agent freely organize within those constraints. Goal: optimize for progressive disclosure. Prunes stale, consolidates duplicates, enforces provenance.
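
A sketch of what one consolidation pass could look like, not the actual /sleep skill (which leaves the organizing to the agent). The `Entry` fields, the staleness threshold, and the rule of dropping the oldest observed entries first are all assumptions; the provenance tags, line budget, and "corrected never pruned" rule come from this slide.

```python
from dataclasses import dataclass
from datetime import date

VALID_TAGS = {"observed", "corrected"}


@dataclass(frozen=True)
class Entry:
    text: str
    provenance: str  # "observed" or "corrected"
    last_seen: date


def _rank(entry: Entry) -> tuple[bool, date]:
    # Corrected beats observed; within a tag, newer beats older.
    return (entry.provenance == "corrected", entry.last_seen)


def consolidate(entries: list[Entry], today: date,
                max_age_days: int, line_budget: int) -> list[Entry]:
    # Enforce provenance: every entry must carry a known tag.
    for entry in entries:
        if entry.provenance not in VALID_TAGS:
            raise ValueError(f"missing or unknown provenance: {entry.text!r}")
    # Consolidate duplicates: one entry per normalized text.
    best: dict[str, Entry] = {}
    for entry in entries:
        key = entry.text.strip().lower()
        if key not in best or _rank(entry) > _rank(best[key]):
            best[key] = entry
    # Prune stale observed entries; corrected entries are never pruned.
    corrected = [e for e in best.values() if e.provenance == "corrected"]
    observed = [e for e in best.values()
                if e.provenance == "observed"
                and (today - e.last_seen).days <= max_age_days]
    # Apply the line budget to observed entries only, newest first.
    observed.sort(key=lambda e: e.last_seen, reverse=True)
    room = max(line_budget - len(corrected), 0)
    return corrected + observed[:room]
```

If corrected entries alone exceed the budget, this sketch keeps them all and returns an over-budget file rather than break the never-prune rule.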

Progressive disclosure

The agent decides its own memory structure — what to keep, how to organize — within the constraints. The harness just says: load the least context needed to do the task well.

The Agent Workspace

Event logging

agent/events/2026-03-04.md

sleep — 00:00
Pruned 20 entries, consolidated duplicates

daily-brief — 06:00
3 priorities, 7 Slack drafts, 3 meeting preps

independent-work — 06:00
Shipped construction PM brief

autopilot — 07:00, 10:00
Slack scans, inbox updates, no new urgents

Every session logs topic, actions, outcomes, and files accessed. Builds a queryable history.
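
A minimal sketch of an append-only event logger, assuming one markdown file per day as in the example path above. The block layout (a heading plus `topic:`, `action:`, `outcome:`, `file:` lines) and the function names are hypothetical; only the four logged fields and the dated filename come from this slide.

```python
from datetime import datetime
from pathlib import Path


def format_event(when: datetime, skill: str, topic: str,
                 actions: list[str], outcomes: list[str], files: list[str]) -> str:
    """Render one session as a small markdown block (layout is an assumption)."""
    lines = [f"## {skill} {when:%H:%M}", f"topic: {topic}"]
    lines += [f"action: {a}" for a in actions]
    lines += [f"outcome: {o}" for o in outcomes]
    lines += [f"file: {p}" for p in files]
    return "\n".join(lines) + "\n\n"


def append_event(events_dir: Path, when: datetime, **fields) -> Path:
    """Append the session to that day's log, for example events/2026-03-04.md."""
    events_dir.mkdir(parents=True, exist_ok=True)
    path = events_dir / f"{when:%Y-%m-%d}.md"
    with path.open("a", encoding="utf-8") as handle:
        handle.write(format_event(when, **fields))
    return path


if __name__ == "__main__":
    print(format_event(
        datetime(2026, 3, 4, 6, 0),
        skill="daily-brief",
        topic="morning priorities",
        actions=["scanned Slack", "read calendar"],
        outcomes=["3 priorities", "7 Slack drafts"],
        files=["goals/current"],
    ))
```

Keeping one prefixed line per field means the history stays queryable with plain text search, which fits the no-database approach.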

/independent-work — self-directed projects

How it works

Scan goals, inbox, Slack, journals for opportunities
Triage candidates — max 3 active at a time
Execute — research, write, build, analyze
Ship artifact, surface to Teddy via DM

Constraints

· One hour budget per session
· Must produce a concrete artifact
· Must trace back to a goal
· Never send messages — draft only

The agent decides what to work on. I review what it shipped async.
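
The triage step and its constraints can be sketched as a filter plus a cap. The `Candidate` fields and the numeric `score` are hypothetical; in practice the agent makes this call itself, and only the limit of 3 active projects, the goal requirement, and the artifact requirement come from this slide.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ACTIVE = 3  # from the triage rule above


@dataclass
class Candidate:
    title: str
    goal: Optional[str]   # the goal this traces back to, if any
    artifact: str         # the concrete thing to ship; empty if none
    score: float          # hypothetical priority estimate


def triage(candidates: list[Candidate], active_count: int) -> list[Candidate]:
    """Pick new projects without exceeding MAX_ACTIVE active at a time."""
    slots = max(MAX_ACTIVE - active_count, 0)
    # Drop anything with no goal to trace back to or no concrete artifact.
    eligible = [c for c in candidates if c.goal and c.artifact]
    eligible.sort(key=lambda c: c.score, reverse=True)
    return eligible[:slots]


if __name__ == "__main__":
    picks = triage(
        [
            Candidate("Construction PM brief", "win construction segment", "brief.md", 0.9),
            Candidate("Tidy old notes", None, "notes.md", 0.8),
            Candidate("Explore an idea", "grow dev-api usage", "", 0.7),
            Candidate("API pricing comparison", "grow dev-api usage", "comparison.md", 0.6),
        ],
        active_count=2,
    )
    print([c.title for c in picks])
```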

Skills: Repeatable Workflows as Code

How to build a skill

01

Tell it what, not how

Define the goal and constraints. Let the model figure out the steps.

02

Build in progress tracking

Let the agent measure its own quality over time so you can see improvement.

03

Monitor closely first

Run the skill 4-5 times manually in real workflows. Give it feedback, let it iterate. Once you trust it, let it run autonomously.

Example: /daily-brief

What it does

Load goals + mental model for prioritization
Run through all inputs — Slack, calendar, inbox, news, weather, sports, lunch order, restaurant recs
Build recommended brief with top priorities and focus areas
Re-filter through original goals lens, then write the brief
Ask for feedback, track accuracy over time

Accuracy over time

Day 1: 36%

Day 10: 100%

Ran it manually ~10 times, gave feedback every run. Now it auto-runs at 6am.
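
One possible way to compute a score like this, shown only as an assumption since the slide does not say how accuracy is measured: treat each brief item as right or wrong based on my feedback, and report the share marked right per run. The function names are hypothetical and the sample data is made up for illustration.

```python
def brief_accuracy(feedback: list[bool]) -> float:
    """Share of brief items marked correct in one run (True means correct)."""
    if not feedback:
        raise ValueError("no feedback recorded for this run")
    return sum(feedback) / len(feedback)


def accuracy_trend(runs: list[list[bool]]) -> list[int]:
    """Accuracy per run as whole percentages, in run order."""
    return [round(100 * brief_accuracy(run)) for run in runs]


if __name__ == "__main__":
    # Illustrative feedback only, not the real daily-brief history.
    runs = [
        [True, False, False, False],
        [True, True, False, False],
        [True, True, True, True],
    ]
    print(accuracy_trend(runs))
```

Storing the per-run list rather than just the percentage keeps the feedback available to the agent when it revises the skill.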

The Slack Bot

Headless Claude Code processes with an identity layer on top

Architecture

Claude Code harness
├── CLAUDE.md ← vault rules
├── Skills ← 40+ workflows
├── Memories ← learned context
└── MCPs ← Slack, Clockwise, Linear…

+ Identity (a markdown file — voice, personality)
+ Bolt app (routing, scheduling, process mgmt)

What it does

Responds to DMs and @mentions
DMs me results of recurring tasks
Every hour, scans Slack for new signals
Surfaces completed follow-ups
Quick summary: what needs attention

Guardrails

DM: full vault, all skills + memories
Channels: project context only, restricted skills + memories, no personal data
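
The two guardrail tiers amount to a small policy lookup keyed on where the message came from. A sketch under assumptions: it is written in Python for illustration rather than the JS the bot uses, the name `scope_for` and the returned keys are hypothetical, and mapping "project context" to a `projects/<name>/` folder is a guess based on the vault layout.

```python
from typing import Optional


def scope_for(source: str, project: Optional[str] = None) -> dict:
    """Return what a session may load, based on where the message came from."""
    if source == "dm":
        return {"vault": "full", "skills": "all",
                "memories": "all", "personal_data": True}
    if source == "channel":
        if not project:
            raise ValueError("channel messages need a project to scope context")
        return {"vault": f"projects/{project}/", "skills": "restricted",
                "memories": "restricted", "personal_data": False}
    # Unknown sources get nothing rather than a permissive default.
    raise ValueError(f"unknown message source: {source!r}")


if __name__ == "__main__":
    print(scope_for("dm"))
    print(scope_for("channel", project="construction"))
```

Raising on unknown sources keeps the policy default-deny, so a new message type cannot inherit DM-level access by accident.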

~800 lines of JS glue. All the intelligence lives in the Claude Code harness.

Automated Processes

Independent Work

Runs daily at 9am.

Agent proposes its own tasks based on goals and open work. I review what it did async.

Proactive Scan

Monitors signals without being asked — Slack channels, deal signals, customer asks. Processes Granola transcripts. DMs me the results.

Sleep

Runs nightly at midnight.

Consolidate memories. Prune stale entries. Validate org health.

Coming soon — still manually refining

Closed-lost deal reviews
Monitoring Gong for signals that match roadmap features
Updating Linear projects and Notion
Keeping internal documentation up to date

Limitations

01

Runs Locally

Everything runs on my laptop. If it's closed, the agent is dead. No cloud deployment yet.

02

Slack Permissions

The bot needs explicit approval for every scope — DMs, reading channels, posting messages. Each permission is a separate ask.

03

Cost

All those automated processes use tokens. It adds up.

Best Practices

Handcraft Your Harness

Claude is not good at writing its own harness — it doesn't understand its own limitations well enough. You need to handcraft this yourself. It's the highest-leverage thing you can do.

Tell It What, Not How

Set the goal, set the framing, set constraints, give it the tools it needs. Don't give step-by-step instructions. Always have a final calibration step where it reruns output through the original goals and constraints.

Refine Hands-On First

Your first version won't be good. Run it, watch it, refine it — with real workflows, not test cases. This is the hardest part of building for users too: you need hands-on sessions watching them run skills and refining together.

Measure Progress

During calibration, you need a way to evaluate whether you're hitting the outputs you want. I haven't fully nailed an evals framework, but the agent needs benchmarks to assess itself — like the daily brief accuracy score.

Starter Kit

I'm sharing a repo with everything you need to start.

CLAUDE.md Template

Annotated harness — your role, projects, rules, file layout, memory protocol. ~30 min to customize.

Agent Space Scaffold

Pre-built directory structure — registry, memories, events, scratch. Ready to go.

Four starter skills

Daily Brief

Morning priorities with calibration scoring

Proactive Scan

Hourly signal monitoring across channels

Sleep

Nightly memory consolidation and pruning

Independent Work

Self-directed goal-aligned projects

Start with Claude Code. Write your harness. Refine a handful of skills.
Protect your own space — the agent can draft for you, but your harness is your source of truth.

github.com/riker-t/claude-for-knowledge-work