
For the EUR hackathon
How HOMING works under the hood.
Four pipelines turn a 90-second voice note into a real-life meet-up. Voice gets captured, language gets understood, a graph finds the people, and verification keeps everyone safe before details are shared.
00 / Summary
From a 90-second voice note to a real meet-up
One slide. Four pipelines. Three providers. One graph. Everything below is a visual you can point at.
Voice in
MediaRecorder · ElevenLabs
Make sense
Ollama · gpt-oss
Find people
Graph DB · Cypher
Meet for real
iDIN · selfie
API + LLM calls
Graph schema
Security & GDPR
Device
- audio
- Whisper
- edits
Server
- topics
- verified
Never
- ID doc
- selfie
- legal name
One source of truth. Every stage writes to the same graph and reads what the others wrote. Any one pipeline can be rewritten in isolation.
01 / Pipeline
Voice → Transcript
We treat the recording as ephemeral. Audio bytes live only as long as they need to.
Tap the mic
MediaRecorder spins up a capture stream from the browser's microphone permission. No native shell, no native SDK.
navigator.mediaDevices.getUserMedia()
Stream chunks to memory
Audio is encoded as Opus inside a WebM container. Chunks accumulate in a Blob — never written to disk.
MediaRecorder · audio/webm;codecs=opus
Stash the blob for handoff
When recording stops, the Blob is held in a module-level variable so the next page can read it without serialising through history state.
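The handoff described above can be sketched as a small take-once holder. This is only an illustration: the real lib/audioStash.ts is not shown on this page, so createStash, put and take are hypothetical names, and the read-once behaviour is an assumption.

```typescript
// Hypothetical take-once stash. The real lib/audioStash.ts is not shown,
// so every name and behaviour here is an assumption.
function createStash<T>() {
  // Lives in module memory only: no disk, no history state.
  let held: T | null = null;

  return {
    // Called by the recording page when the recorder stops.
    put(value: T): void {
      held = value;
    },
    // Called by the next page. Clears the slot so the value is read once.
    take(): T | null {
      const value = held;
      held = null;
      return value;
    },
  };
}

// In the browser this would hold a Blob, e.g. createStash<Blob>().
const audioStashDemo = createStash<string>();
audioStashDemo.put("recorded-audio-placeholder");
const handedOver = audioStashDemo.take(); // the slot is empty afterwards
```

Clearing the slot on read means the recording cannot linger in memory after the next page has picked it up, which fits the "ephemeral" framing of this pipeline.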
lib/audioStash.ts
POST to /api/transcribe
Multipart form-data upload to a Vercel function. The function forwards the audio to ElevenLabs Scribe and proxies the JSON back.
fetch · FormData · app/api/transcribe
Speech-to-text
ElevenLabs Scribe (scribe_v1) returns a transcript with word-level timestamps. We discard timestamps and keep the text.
api.elevenlabs.io/v1/speech-to-text
Transcript returned
UTF-8 string handed to the analysis pipeline. The audio Blob is dropped — never persisted server-side.
utf-8 · no audio retention
Production promise
In the production build, the transcription model runs locally via WebAssembly (Whisper-small). The demo routes to Scribe so we can ship today — same input, same output shape.
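One way to read "same input, same output shape" is a single transcriber type that both backends satisfy. The sketch below assumes the Scribe response carries a text field plus a words array with timestamps. The Transcriber type, scribeToTranscript and wrapScribe are hypothetical names, not taken from the project.

```typescript
// Assumed subset of the Scribe JSON response. Field names are an assumption.
interface ScribeWord {
  text: string;
  start?: number;
  end?: number;
}

interface ScribeResponse {
  text: string;
  words?: ScribeWord[];
}

// Hypothetical shared shape: audio bytes in, plain transcript out.
// A local Whisper backend could implement the same type.
type Transcriber = (audio: Uint8Array) => Promise<string>;

// Keep only the text. The word-level timestamps are discarded.
function scribeToTranscript(response: ScribeResponse): string {
  return response.text.replace(/\s+/g, " ").trim();
}

// Adapts any function that returns a Scribe-style response to Transcriber.
function wrapScribe(
  call: (audio: Uint8Array) => Promise<ScribeResponse>
): Transcriber {
  return async (audio) => scribeToTranscript(await call(audio));
}
```

Because callers depend only on Transcriber, swapping the demo backend for an on-device one would not change the analysis pipeline that consumes the string.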
What never leaves the device in the production build
- The raw audio Blob
- Recording timestamps
- Mic device identifier
02 / Pipeline
Transcript → Atomic topics
The model has one job: split what you said into separable interests so the graph can index each one independently.
Send transcript
JSON POST to /api/analyze. We also pass a demoMode flag that pads short transcripts with sensible context, since hackathon recordings tend to be brief.
POST /api/analyze
Call Ollama Cloud
OpenAI-compatible endpoint, gpt-oss:120b model. response_format=json_object so we never have to parse free text.
ollama.com/v1 · gpt-oss:120b
Atomic separation rule
The system prompt drills the model with explicit examples: 'cooking' and 'Korean food' are TWO topics, 'board games' and 'Catan' are TWO topics.
prompt-engineered atomicity
Structured response
Returns topics with explanations + tag arrays, plus minor interests, languages, activity types, and three concrete activity suggestions.
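The client-side check on this response might look like the sketch below. It is a dependency-free stand-in for the zod schema, not the project's actual validator. It covers only the topics and activities fields, and parseAnalysis and the interface names are hypothetical.

```typescript
interface Topic {
  title: string;
  explanation: string;
  tags: string[];
}

interface Analysis {
  topics: Topic[];
  activities: { title: string }[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isTopic(value: unknown): value is Topic {
  return (
    isRecord(value) &&
    typeof value.title === "string" &&
    typeof value.explanation === "string" &&
    Array.isArray(value.tags) &&
    value.tags.every((tag) => typeof tag === "string")
  );
}

// Returns the parsed analysis, or null if the model output is not valid JSON
// or does not match the expected shape.
function parseAnalysis(raw: string): Analysis | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;

  const topics = data.topics;
  if (!Array.isArray(topics) || !topics.every(isTopic)) return null;

  // Treat a missing activities array as empty (an assumption of this sketch).
  const rawActivities = Array.isArray(data.activities) ? data.activities : [];
  const activities: { title: string }[] = [];
  for (const item of rawActivities) {
    if (!isRecord(item) || typeof item.title !== "string") return null;
    activities.push({ title: item.title });
  }

  return { topics, activities };
}
```

Rejecting the whole payload on the first malformed topic keeps half-parsed interests out of the graph. A more lenient validator could drop only the bad entries.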
JSON · zod-validated client-side
Return JSON. Split into ATOMIC topics,
each interest standing on its own:
• "Cooking" and "Korean food" are TWO topics.
• "Board games" and "Catan" are TWO topics
  ("Catan" is more specific).
For each topic emit:
  title, explanation, tags[]
Then propose 3 concrete one-off activities.
{
  "topics": [
    {
      "title": "Catan",
      "explanation": "Wants a chill round again.",
      "tags": ["catan", "specific game"]
    },
    {
      "title": "Board games",
      "explanation": "Casual strategy, low-pressure.",
      "tags": ["board games", "strategy"]
    }
  ],
  "activities": [
    { "title": "Start a Catan round", ... }
  ]
}
03 / Pipeline
Topics → People
Interests are many-to-many, friendship is sparse, and avoid-pairs are private. That's a graph problem, not a SQL one.
[Interactive graph: hover a node, click to pin, or pick a query. Real graph queries walk these same edges.]
MATCH (a:Activity {id: $aid})-[:REQUIRES]->(t:Topic)
      <-[:LIKES]-(u:User)
WHERE u.id <> $creator
  AND any(l IN a.languages WHERE l IN u.languages)
  AND NOT (u)-[:AVOID]-(:User {id: $creator})
  AND (u)-[:AVAILABLE_AT]->(:TimeSlot {day: a.day})
RETURN u.id, count(DISTINCT t) AS score
ORDER BY score DESC
LIMIT 5;
Multi-hop without joins
Friend-of-friend pathing is one extra hop, not a recursive CTE.
Private edges, real privacy
:AVOID edges are user-scoped — never exposed in match output, just filtered.
Sparse and dense both fast
Most users like few topics; index lookups beat join planning.
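To make the matching query above concrete, here is an in-memory sketch of the same filters and scoring. It is an illustration of what the Cypher computes, not how the graph database executes it. The User, Activity and AvoidPair shapes and the function name matchCandidates are assumptions.

```typescript
interface User {
  id: string;
  languages: string[];
  likes: string[]; // topic ids, standing in for :LIKES edges
  availableDays: string[]; // days of :AVAILABLE_AT time slots
}

interface Activity {
  creator: string;
  requires: string[]; // topic ids, standing in for :REQUIRES edges
  languages: string[];
  day: string;
}

// An :AVOID edge between two users. Direction is ignored, as in the query.
type AvoidPair = [string, string];

function matchCandidates(
  activity: Activity,
  users: User[],
  avoid: AvoidPair[],
  limit = 5
): { id: string; score: number }[] {
  const required = new Set(activity.requires);
  const avoids = (a: string, b: string) =>
    avoid.some(([x, y]) => (x === a && y === b) || (x === b && y === a));

  return users
    .filter((u) => u.id !== activity.creator)
    .filter((u) => activity.languages.some((l) => u.languages.includes(l)))
    .filter((u) => !avoids(u.id, activity.creator))
    .filter((u) => u.availableDays.includes(activity.day))
    .map((u) => ({
      id: u.id,
      // count(DISTINCT t): required topics this user likes
      score: new Set(u.likes.filter((t) => required.has(t))).size,
    }))
    .filter((m) => m.score > 0) // the MATCH needs at least one shared topic
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Note that the avoid pairs are only consulted as a filter and never appear in the returned rows, which mirrors the point about private edges. Ties keep input order here, whereas the Cypher query leaves tie order unspecified.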
04 / Boundaries
Where the data lives
We log only what makes the next match better. ID documents and selfies never touch our storage.
On-device
Raw audio recording
MediaRecorder Blob, dropped after upload
Transcription model (prod)
WebAssembly Whisper in browser
Topic edits
Local state until 'Looks right'
Server
Atomic topics (text)
Graph :Topic nodes attached to your :User
Activity records
Graph :Activity nodes with :REQUIRES edges
Verified-true flag
One boolean on your :User node
Never stored
ID document
Provider returns true/false only
Selfie photo
Liveness check happens on-device
Full legal name
We keep first name; nothing else
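The "never stored" column can be sketched as a reduction step that runs before anything is written. The types and the function toStoredUser are hypothetical, and the assumption that the server briefly sees a full name is ours, not something the page states.

```typescript
// Assumed result of the identity check: the provider returns true/false only.
interface VerificationResult {
  verified: boolean;
}

// The only user fields this sketch allows into storage.
interface StoredUser {
  id: string;
  firstName: string;
  verified: boolean;
}

// Reduces transient input to the stored record. No ID document, selfie or
// surname is part of the return type, so none of them can be persisted here.
function toStoredUser(
  id: string,
  fullName: string,
  result: VerificationResult
): StoredUser {
  const firstName = fullName.trim().split(/\s+/)[0] ?? "";
  return { id, firstName, verified: result.verified === true };
}
```

Encoding the boundary in the return type means a later change that tries to persist more than a first name and one boolean has to alter StoredUser explicitly, which is easy to spot in review.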
05 / Principles
What HOMING refuses to do
Each refusal is a design choice. Together they're why this enables human connection instead of replacing or surveilling it — the BCG X brief made flesh.
Won't do
Profile browsing or swiping
Instead
Activity-first matching — you choose what to do, the graph finds people who said they want the same thing.
Won't do
Public popularity scores or 'social capital'
Instead
Match scores stay internal to the matching service. Users never rank or rate each other.
Won't do
An AI chatbot that replaces conversation
Instead
Homi can draft the first message; you read it, edit it, send it. The chat is between humans from line one.
Won't do
Notifying people that someone declined them
Instead
Declines are private and invisible to the declined side. No signal, no read-receipt, no shame.
Won't do
Engagement-time as the success metric
Instead
Success is the group meeting without us. The /group screen literally invites you to start a WhatsApp.
Won't do
Mixing 16- and 17-year-olds with adults in the same pool
Instead
EUR pilot is 18-29 only. A younger track needs separate safeguarding before it ships.
The boundary isn't a feature list. It's the product.
06 / Compliance
GDPR and the AI Act, by design
The architecture above is also the compliance story. Privacy by design means we satisfy these obligations as a side-effect of how the system is built — not as a bolt-on.
GDPR
Consent is granular and revocable per data class: voice recording, topics, matching, and availability are separate toggles.
Deleting a :User node also removes its :LIKES, :AVAILABLE_AT and :AVOID edges in one transaction.
JSON export of every node and edge the user authored, downloadable from settings.
On-device transcription, anonymous-until-verified flow, no profile browsing.
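The erasure and export behaviour listed above can be illustrated on a tiny in-memory graph. This is a sketch under assumptions: the Graph and Edge shapes and the function names are hypothetical, and the Cypher in the comment is one plausible equivalent rather than the project's actual query.

```typescript
interface Edge {
  from: string;
  type: string;
  to: string;
}

interface Graph {
  nodes: string[];
  edges: Edge[];
}

// Plausible Cypher equivalent (an assumption):
//   MATCH (u:User {id: $id}) DETACH DELETE u
// DETACH DELETE removes the node together with every relationship on it.
function eraseUser(graph: Graph, userId: string): Graph {
  return {
    nodes: graph.nodes.filter((n) => n !== userId),
    // Drop edges in both directions, including :AVOID edges pointing at the user.
    edges: graph.edges.filter((e) => e.from !== userId && e.to !== userId),
  };
}

// Exports the edges the user authored, i.e. those starting at their node.
function exportUser(graph: Graph, userId: string): string {
  const authored = graph.edges.filter((e) => e.from === userId);
  return JSON.stringify({ user: userId, edges: authored }, null, 2);
}
```

Returning a new graph instead of mutating the old one mimics the all-or-nothing effect of a transaction: callers see either the graph with the user or the graph without them.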
AI Act
Used to assist matching, not to grade or rank humans. Final action is always the user's.
Every Homi suggestion is labeled. Users see the topics extracted and can edit them before anything is used.
LLM provider contract: no training on prompts, no log retention beyond 24h.
Match outcomes audited by sex / language / faculty cohort. Findings published.
Operations
Designated Data Protection Officer reports directly to the founders.
Data Protection Impact Assessment, reviewed with the EUR data office.
Third-party penetration test + report; remediation tracked publicly.
Every vendor that touches data is named in the privacy policy with a 30-day change notice.
We classify as limited-risk under the AI Act because the model assists with matching but the final action — accept, decline, verify, meet — is always a human decision. We publish the DPIA, the bias audit, and the sub-processor list before opening enrollment.
07 / Delivery
€2M, 18 months, EUR-first
MVP at month 3. Closed EUR pilot at month 6. Three to four Dutch universities by month 18 — under budget and politically survivable.
€2.0M
Total budget
18mo
To 4-uni rollout
6 FTE
Eng · DPO · design
4 uni
Dutch network · M18
Roadmap
Production MVP
The flow you saw — voice in, topics, match, verify — hardened, monitored, documented. No public users yet.
Closed EUR pilot · 50 students
Hand-picked across faculties. DPIA finalised. Weekly cadence. We measure outcomes, not screens.
EUR open + audited
Public to all EUR students. DPIA + bias audit published. Security review by a third party.
3–4 Dutch universities
TU Delft, Leiden, Utrecht, Wageningen. Same product, university-scoped graphs.
Built for the EUR hackathon
Activity-first. Profile never. Built to be deleted.
Next 15 · TS · Tailwind v4 · Ollama · ElevenLabs · graph DB