ARI ZHOU

Mental Health · AI · 0→1

From Crisis Counseling → Vera → Familiar AI

Building everyday emotional support

  • 50 beta users in 4 weeks
  • 1K+ signups
  • ~100% onboarding completion
  • ~30% day-14 retention

TL;DR

I started as a Crisis Text Line counselor and learned most late-night conversations aren't crises - they're people needing someone to listen. I built Vera (human peer support at ~$12/session vs. $150+ therapy), then Familiar AI once ChatGPT showed AI can carry emotional conversations. We went from team formation to 50 beta users in 4 weeks, grew to 1K+ signups, and saw ~100% onboarding completion with ~30% day-14 retention among engaged testers. I paused mainly for team/product fit - I want a clinically experienced partner co-leading the care model, safety, and outcomes. Funding and polish matter; the clinical co-lead is non-negotiable.

The Problem I Couldn't Ignore

At 2 a.m. on Crisis Text Line, I heard the same themes: loneliness, relationship friction, everyday anxiety. Not ER-level - just very real - with nowhere affordable to turn. There's a gap between free crisis lines and $150+ therapy.

(Shout-out to What If VC Accelerator and the On Deck Health Fellowship - both sharpened my pattern recognition on what's viable, what isn't, and why.)

Act I: Vera (Human Peer Support)

Idea

Text or talk to a trained peer today for the price of lunch - $12, not $150.

What we built

Lightweight marketplace (profiles, availability, instant/scheduled sessions, simple follow-ups). 3 peer counselors, ~10 early users, real paid sessions.

Why it didn't scale

True on-demand = dense supply + routing + training/supervision/QA. At $12/session, unit economics didn't work for a tiny team. The need was right; the mechanism wasn't viable at our stage.

Act II: Familiar AI (Multimodal + Memory)

When ChatGPT landed, the path changed. No marketplace constraints. No "who's online?" - just instant, affordable support.

What we built

A friend-in-your-pocket - not therapy, not a game:

  • Multimodal: start in text, switch to voice when typing feels like too much.
  • Memory: it remembers what you say matters (privately), so you don't start from zero.
  • Boundaries: warm, platonic, clearly non-clinical, with crisis-aware language and real resources.
  • Freemium: voice, memory insights, and light personalization as premium unlocks.

Early signals

Team formation → MVP → 50 testers in 4 weeks; ~100% onboarding completion (a 60–90s flow); ~30% day-14 retention among engaged testers; 1K+ signups from simple content.

Users said versions of: "It's ChatGPT but it actually gets me," and "This helps when I don't want to burden friends."

How I Made a Generative, Multimodal Product Feel Trustworthy (the craft)

1) Multimodal = better context and safer flow

  • I modeled the counseling arc - rapport → explore → goal → next steps - as conversation phases.
  • A lightweight phase classifier (lexical cues + dialog state; for voice: prosody, speech rate, pause length) guided prompt scaffolds: if we're in rapport, bias toward reflection; if goal-setting, bias toward summarizing and concrete next steps (see the sketch below).
  • This stage awareness reduced whiplash and made the companion feel present, not generic.
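
Here's a minimal sketch of that kind of phase classifier. The phase names follow the arc above, but the cue lists, sticky-bias weights, and scaffold wording are illustrative assumptions, not the production logic:

```python
from dataclasses import dataclass

# Conversation phases follow the counseling arc: rapport → explore → goal → next steps.
PHASES = ["rapport", "explore", "goal", "next_steps"]

# Illustrative lexical cues per phase (assumed, not the real cue lists).
PHASE_CUES = {
    "rapport": ["hi", "hey", "rough day", "can i vent"],
    "explore": ["because", "i feel", "it started when", "keeps happening"],
    "goal": ["i want to", "i wish", "how do i", "what should i"],
    "next_steps": ["tomorrow", "i'll try", "plan", "next time"],
}

# Prompt scaffolds biased by phase (illustrative wording).
PHASE_SCAFFOLDS = {
    "rapport": "Reflect feelings back warmly. No advice yet.",
    "explore": "Ask one open question. Stay curious, not prescriptive.",
    "goal": "Summarize what you heard, then name one concrete goal together.",
    "next_steps": "Offer two or three small, doable next steps and a gentle check-in.",
}

@dataclass
class DialogState:
    last_phase: str = "rapport"

def classify_phase(user_text: str, state: DialogState) -> str:
    """Score each phase by cue hits, with a mild bias toward staying in the
    current phase or advancing one step - this is what reduces whiplash."""
    text = user_text.lower()
    scores = {p: float(sum(cue in text for cue in PHASE_CUES[p])) for p in PHASES}
    idx = PHASES.index(state.last_phase)
    scores[state.last_phase] += 0.5
    if idx + 1 < len(PHASES):
        scores[PHASES[idx + 1]] += 0.25
    return max(PHASES, key=lambda p: scores[p])

def scaffold_for(user_text: str, state: DialogState) -> str:
    """Pick the prompt scaffold for the current phase and remember the phase."""
    phase = classify_phase(user_text, state)
    state.last_phase = phase
    return PHASE_SCAFFOLDS[phase]
```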

2) Memory users can feel (not just store)

  • Per-user vector store (dense embeddings) + structured memory (entities: people, places, goals) for hybrid retrieval (semantic + keyword) - sketched below.
  • Selective injection: only top-k memories relevant to the current phase to avoid prompt bloat.
  • User-visible recap after each session - "what you shared that matters" - which doubles as a private emotional journal.
  • PII scrubbing before embedding; encryption at rest; explicit consent before saving sensitive details.
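
A simplified sketch of hybrid retrieval with selective top-k injection. The 0.7/0.3 weights, the phase boost, the regex-based PII scrub, and the `embed` stub are stand-ins for whatever the real pipeline used:

```python
import re
import numpy as np

# Very rough PII scrub before embedding (illustrative patterns only).
PII_PATTERNS = [r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b",   # phone numbers
                r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"]      # emails

def scrub_pii(text: str) -> str:
    for pat in PII_PATTERNS:
        text = re.sub(pat, "[redacted]", text)
    return text

def remember(memories, text, embed, phases=()):
    """Ingest path: scrub PII, then embed and store (embed() is a stand-in)."""
    clean = scrub_pii(text)
    memories.append({"text": clean, "vec": embed(clean), "phases": list(phases)})

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def keyword_overlap(query: str, memory_text: str) -> float:
    q = set(query.lower().split())
    m = set(memory_text.lower().split())
    return len(q & m) / (len(q) or 1)

def retrieve_memories(query_text, query_vec, memories, phase, k=3):
    """Hybrid retrieval: semantic similarity + keyword overlap, lightly boosted
    when a memory is tagged for the current conversation phase. Only the top-k
    results are injected into the prompt, which keeps it from bloating."""
    scored = []
    for mem in memories:  # mem: {"text", "vec", "phases"}
        score = 0.7 * cosine(query_vec, mem["vec"]) \
              + 0.3 * keyword_overlap(query_text, mem["text"])
        if phase in mem.get("phases", []):
            score += 0.1
        scored.append((score, mem["text"]))
    scored.sort(key=lambda s: s[0], reverse=True)
    return [text for _, text in scored[:k]]
```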

3) Streaming replies > silence

  • In emotional moments, a fast "I'm here" beats a perfect sentence three seconds later.
  • I optimized for sub-second first token (tight system prompts, retrieval gating, caching) and sent streaming partials + brief "thinking" tokens to avoid dead air (see the sketch below).
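
A toy version of that pattern: an immediate acknowledgement, then streamed partials. `stream_model_tokens` is a stand-in for a real streaming LLM client, and the sleep just simulates latency:

```python
import asyncio

ACKNOWLEDGEMENTS = ["I'm here.", "Take your time.", "I'm listening."]

async def stream_model_tokens(prompt: str):
    """Stand-in for a streaming LLM call; yields tokens as they arrive."""
    for token in "That sounds really heavy. What happened tonight?".split():
        await asyncio.sleep(0.05)            # simulated network/model latency
        yield token + " "

async def reply(prompt: str):
    """Send a fast acknowledgement immediately, then stream partials,
    so the user never stares at dead air while the model thinks."""
    yield ACKNOWLEDGEMENTS[0] + " "          # sub-second first token
    async for token in stream_model_tokens(prompt):
        yield token

async def main():
    async for chunk in reply("I had a fight with my roommate"):
        print(chunk, end="", flush=True)
    print()

if __name__ == "__main__":
    asyncio.run(main())
```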

4) Voice that feels human (and safe)

  • Barge-in + VAD (voice activity detection) let users interrupt mid-response - critical for de-escalation and venting (sketched below).
  • Short guided turns and explicit handoff cues (e.g., "I'll summarize what I heard…") kept latency predictable and prevented rambling.
  • ASR/TTS ran in low-latency streaming mode; transcripts were ephemeral unless the user chose to save.
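
A bare-bones sketch of the barge-in idea: the VAD layer interrupts a cancellable TTS player the moment the user starts speaking. The `TTSPlayer` class and the VAD callback are hypothetical stand-ins for the real audio stack:

```python
import threading

class TTSPlayer:
    """Minimal stand-in for a streaming TTS player that can be cut off mid-utterance."""
    def __init__(self):
        self._stop = threading.Event()

    def speak(self, text: str) -> None:
        # Speak sentence by sentence so an interrupt takes effect quickly.
        for sentence in text.split(". "):
            if self._stop.is_set():
                return                        # barge-in: yield the floor immediately
            print(f"[speaking] {sentence}")   # real code would stream audio here

    def interrupt(self) -> None:
        self._stop.set()

def on_vad_speech_detected(player: TTSPlayer) -> None:
    """Callback wired to the VAD layer: the user started talking while the
    assistant was still speaking, so we stop and listen."""
    player.interrupt()

# Handoff cues keep turns short and latency predictable, e.g.:
# player.speak("I'll summarize what I heard. The fight left you feeling drained.")
```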

5) Guardrails > single filters

  • Layered safety: stance + boundary prompts + intent/risk classifiers + deterministic fallbacks for edge phrases (see the sketch below).
  • Non-clinical defaults: validate, offer options, route to real resources; never diagnose or give medical advice; no romance or role-play.
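
A sketch of how those layers could stack. The phrase lists, cues, threshold, and `generate` stub are illustrative; real classifiers and vetted responses would be far more careful:

```python
from enum import Enum

class Route(Enum):
    NORMAL = "normal"
    SUPPORTIVE_RESOURCES = "supportive_resources"
    DETERMINISTIC_FALLBACK = "deterministic_fallback"

# Layer 3 (last resort): deterministic edge phrases get a fixed, vetted response.
EDGE_PHRASES = ["i want to hurt myself", "i want to die"]   # illustrative list

FALLBACK_RESPONSE = (
    "I'm really glad you told me. I'm not able to help with a crisis, but you "
    "deserve real support right now - in the US you can call or text 988 to "
    "reach the Suicide & Crisis Lifeline."
)

# Softer cues a learned risk classifier might pick up (illustrative).
SOFT_RISK_CUES = ["can't take it", "no point anymore", "everyone would be better off"]

def risk_classifier(text: str) -> float:
    """Stand-in for a learned intent/risk classifier returning a 0-1 score."""
    hits = sum(cue in text.lower() for cue in SOFT_RISK_CUES)
    return min(1.0, 0.4 * hits)

def route_message(text: str) -> Route:
    # Layer 1: deterministic phrase matches bypass the model entirely.
    if any(p in text.lower() for p in EDGE_PHRASES):
        return Route.DETERMINISTIC_FALLBACK
    # Layer 2: classifier-flagged messages get resource-forward handling.
    if risk_classifier(text) > 0.5:
        return Route.SUPPORTIVE_RESOURCES
    # Layer 3: everything else flows through the stance/boundary prompts.
    return Route.NORMAL

def respond(text: str, generate) -> str:
    """generate(text, system_hint) is a stand-in for the prompted model call."""
    route = route_message(text)
    if route is Route.DETERMINISTIC_FALLBACK:
        return FALLBACK_RESPONSE
    if route is Route.SUPPORTIVE_RESOURCES:
        return generate(text, system_hint="Validate, offer options, share real resources.")
    return generate(text, system_hint="Warm, platonic, non-clinical companion.")
```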

6) Latency & cost are product decisions

  • Prompt compression, retrieval gating, and per-user token budgets kept responsiveness high and costs sane (sketched below).
  • Cost as UX: voice is magical and expensive; I gated it behind premium to keep COGS predictable and to signal value.
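
A small sketch of treating cost as a product lever: per-user token budgets plus retrieval gating. The numbers are made up; the shape is the point:

```python
from dataclasses import dataclass

@dataclass
class TokenBudget:
    """Per-user daily token budget; voice costs more, so it's a premium unlock
    (illustrative limits, not real pricing)."""
    daily_limit: int = 20_000
    used_today: int = 0
    premium: bool = False

    def can_spend(self, tokens: int, voice: bool = False) -> bool:
        if voice and not self.premium:
            return False                      # voice gated behind premium
        return self.used_today + tokens <= self.daily_limit

    def spend(self, tokens: int) -> None:
        self.used_today += tokens

def gate_retrieval(budget: TokenBudget, memory_tokens: int, cap: int = 600) -> int:
    """Retrieval gating: trim injected memory tokens to a fixed cap, and skip
    retrieval entirely when the user is close to their daily budget."""
    if budget.used_today > 0.9 * budget.daily_limit:
        return 0
    return min(memory_tokens, cap)
```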

7) Evals that mirror reality

  • Scenario playbooks ("lonely after a fight") with accept/reject criteria for tone, boundary-keeping, and next-step quality (see the sketch below).
  • Human-in-the-loop ratings (empathy/helpfulness) + red-team prompts to catch romantic drift/jailbreaks across model updates.
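
A minimal sketch of a scenario playbook runner. The keyword checks are crude stand-ins for the human ratings and richer rubrics described above, and `generate` is whatever produces the model's reply:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Scenario:
    name: str
    user_message: str
    accept: List[Callable[[str], bool]]   # all must pass
    reject: List[Callable[[str], bool]]   # none may fire

def contains_any(words):
    return lambda reply: any(w in reply.lower() for w in words)

SCENARIOS = [
    Scenario(
        name="lonely after a fight",
        user_message="My roommate and I had a huge fight and now I'm alone.",
        accept=[contains_any(["sounds", "hear you", "that's hard"]),      # tone
                contains_any(["would you", "one thing", "tomorrow"])],    # next step
        reject=[contains_any(["you should just", "diagnos"]),             # boundaries
                contains_any(["i love you", "date me"])],                 # romantic drift
    ),
]

def run_scenario(scenario: Scenario, generate: Callable[[str], str]) -> bool:
    """Generate a reply and score it against the scenario's accept/reject criteria."""
    reply = generate(scenario.user_message)
    ok = all(check(reply) for check in scenario.accept) and \
         not any(check(reply) for check in scenario.reject)
    print(f"{scenario.name}: {'PASS' if ok else 'FAIL'}")
    return ok
```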

Why I Paused (primary reason: team/product fit)

I want a clinically experienced co-lead at the helm with me to:

  • Own the care model (boundaries, crisis logic, outcomes)
  • Co-design the evaluation harness (what "good" looks like beyond DAU)
  • Guide ladder-up pathways to human care and safe handoffs
  • Keep us honest on ethics, privacy, and scope

Could we raise $2-3M, add design horsepower, and push forward? Yes - but without the right clinical partner, we'd be optimizing a sensitive product without the person best equipped to steward care standards. That's not how I want to build in mental health.

What I Learned

Network = speed

On Deck Health + What If VC surfaced patterns and mentors that saved months.

Humans validate; AI scales

Vera proved willingness-to-pay; Familiar delivered support sustainably.

Engagement ≠ health

Our most active users weren't always our healthiest - so I tracked quality of next steps, not just usage.

Trust is a product choice

Latency, memory depth, voice tone - every decision either earns or erodes trust.

Healthcare is different

You don't "move fast and break things" with people's mental health.

What's Next (if the fit is right)

Not building this today. If I revisit, I'll want:

  • A clinical co-founder or deep advisory board + a clear eval harness
  • Resources for beautiful, trustworthy UX (not just working UX)
  • Distribution via universities or health plans, not only D2C
  • Selective on-device processing for truly private moments

Building consumer AI is hard. Building mental-health AI is harder. Building mental-health AI people trust enough to actually use? That takes more than good code and good intentions - it takes the right clinical partner at the helm with me.