Personal Project: Home Hub

What happens when a techie moves into a new place and procrastinates by building a smart home from scratch
Started Feb 2026 · Active Build
🏠 Self-Hosted Stack · 🤖 Local AI (qwen3:8b) · 🔧 Building a Robot Now

Moved into a new place. Chronic procrastination set in. For a techie, there's really only one acceptable form of procrastination: building a smart home from scratch.

Two months later: a self-hosted stack on a Beelink mini-PC, a local AI agent named Nova, a 3D Three.js dashboard, an iPhone-as-house-remote, and a small physical robot being assembled to give Nova a body.

commits · and counting
days active · feb 24 → apr 15
services · docker compose
subsystems · router, dashboard, robot…
api endpoints · fastapi router
18 nova intents · regex first, llm fallback
21 design docs · every feature planned
3 ESP32 firmwares · head, body, effects

how it snowballed

Feb 24 · Bootstrap
Docker · Home Assistant · MQTT · Pi-hole
Mar 2 · Nova foundation
FastAPI · Telegram · pattern matcher
Mar 29 · Network + 3D
TR-064 · InfluxDB · Three.js · Spotify TV
Apr 4 · Voice + iOS
Whisper · Piper · OpenWakeWord · Capacitor
Apr 8 · Media parity
Spotify search · YouTube Lounge · album art on TV
Apr 12 · Robot + scenes
ESP32 · drag-drop blocks · ffmpeg GIF · hello-lizard
- Act 2 -

The Brain

Local · regex-fast · increasingly opinionated

Section 01

Meet Nova

The local AI agent that ties the apartment together. FastAPI + Ollama (qwen3:8b), running entirely on the Beelink. No cloud LLM in the hot path.

nova · request flow
     Telegram         iPhone App      Samsung TV / Mac
        │                  │                  │
        ▼                  ▼                  ▼
   ┌────────┐         ┌────────┐         ┌──────────┐
   │  bot   │         │ voice  │         │   http   │
   │polling │         │   ws   │         │  /chat   │
   └────┬───┘         └────┬───┘         └────┬─────┘
        └──────────────────┼──────────────────┘
                           ▼
                ┌────────────────────┐
                │   NovaDispatcher   │  ← single entry point
                └──────────┬─────────┘
                           ▼
               ┌───────────┴───────────┐
               ▼                       ▼
        ┌─────────────┐         ┌──────────────┐
        │   regex     │   no    │   ollama     │
        │  pattern    │ ──────► │  qwen3:8b    │
        │  matcher    │  match  │ (local LLM)  │
        └──────┬──────┘         └──────┬───────┘
               │ match                 │ classify
               ▼                       ▼
        ┌──────────────────────────────────────┐
        │          18 intent handlers          │
        │  HOME_CONTROL · TV · ROUTINE · ...   │
        └──────────────────────────────────────┘

Two-layer classifier

Most messages hit a regex pattern in under a millisecond, using zero LLM tokens. Only the ambiguous ones fall through to qwen3:8b for classification. That keeps intent classification under 1 ms for ~80% of commands.
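A minimal sketch of the two-layer idea, not Nova's actual code: the patterns, the `classify` signature, and the `fake_llm` stand-in are all assumptions made up for illustration. It only shows the shape of "regex first, LLM fallback": the model is never called when a pattern matches.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical patterns; the real matcher's rules are not shown on this page.
PATTERNS: List[Tuple["re.Pattern[str]", str]] = [
    (re.compile(r"\b(turn|switch)\s+(on|off)\b|\bdim\b", re.I), "HOME_CONTROL"),
    (re.compile(r"\b(weather|forecast|temperature)\b", re.I), "WEATHER"),
    (re.compile(r"\b(timer|alarm)\b", re.I), "TIMER_ALARM"),
]


def match_pattern(text: str) -> Optional[str]:
    """Layer 1: return the first intent whose regex matches, else None."""
    for pattern, intent in PATTERNS:
        if pattern.search(text):
            return intent
    return None


def classify(text: str, llm_classify: Callable[[str], str]) -> Tuple[str, str]:
    """Return (intent, layer). The LLM is only called when no regex matches."""
    intent = match_pattern(text)
    if intent is not None:
        return intent, "regex"
    return llm_classify(text), "llm"


def fake_llm(text: str) -> str:
    """Stand-in for the Ollama call so the sketch runs offline."""
    return "CONVERSATION"
```

In a real setup `llm_classify` would wrap a request to the local Ollama server; keeping it as an injected callable makes the fast path easy to test without a model running.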

Single dispatcher

Telegram, voice WebSocket, and HTTP all funnel into one NovaDispatcher.dispatch(). Channels handle their own UI quirks (keyboards, TTS); the routing logic is shared.

Voice & wake word

A self-trained OpenWakeWord model recognizes "Hey Nova". Voice replies come from Piper TTS using a cloned Jarvis voice from Iron Man. (Iron Man tax: paid.)

Section 02

Three channels, one dispatcher

Different transports, same brain. Channel-specific UI (inline keyboards, TTS audio) is handled at the edge; the routing logic is shared.

Telegram

bot polling

Text messages, /commands, receipt photos (→ Gemini OCR → Notion)

📱 chat screenshot

Voice WebSocket

/ws/nova/voice

16 kHz PCM audio from the iPhone app. Self-trained wake word → Whisper STT → dispatch → Piper (Jarvis voice) TTS

🎙 voice activity capture

HTTP

POST /api/nova/chat

Plain JSON for the dashboard and any web client

🖼 dashboard chat panel
↓ all funnel into ↓
NovaDispatcher.dispatch(text, chat_id, channel)
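The funnel above can be sketched roughly like this. Only the name `NovaDispatcher.dispatch(text, chat_id, channel)` comes from the page; `Reply`, `register`, `render`, and the handler signature are hypothetical, and the real dispatcher is presumably async since it lives in FastAPI. The point is that classification and routing happen once, and channels only differ in how they wrap the reply.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Handler = Callable[[str, str], str]  # (text, chat_id) -> reply text (assumed shape)


@dataclass
class Reply:
    text: str
    intent: str
    channel: str


class NovaDispatcher:
    """Single entry point: classify once, route to one handler."""

    def __init__(self, classify: Callable[[str], str]) -> None:
        self._classify = classify
        self._handlers: Dict[str, Handler] = {}

    def register(self, intent: str, handler: Handler) -> None:
        self._handlers[intent] = handler

    def dispatch(self, text: str, chat_id: str, channel: str) -> Reply:
        intent = self._classify(text)
        handler = self._handlers.get(intent, self._unknown)
        return Reply(handler(text, chat_id), intent, channel)

    @staticmethod
    def _unknown(text: str, chat_id: str) -> str:
        return "Sorry, I can't do that yet."


def render(reply: Reply) -> dict:
    """Edge formatting: the only place where channels differ."""
    if reply.channel == "telegram":
        return {"text": reply.text, "keyboard": []}
    if reply.channel == "voice":
        return {"speak": reply.text}
    return {"reply": reply.text, "intent": reply.intent}
```

With this split, adding a fourth channel means writing one more branch at the edge, not touching any intent handler.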
Section 03

What Nova does

18 intents across 8 domains. Plus pronoun resolution ('turn it off'), confirmation state machine ('Turn off all 6 lights? [Yes][No]'), and a tool-calling PlannerAgent for compound commands.
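Pronoun resolution and the confirmation step both come down to a little per-chat state. The sketch below is one way that could look, under stated assumptions: `ChatState`, `CONFIRM_THRESHOLD`, and every function name are invented for illustration, and the actual device calls are left out.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

CONFIRM_THRESHOLD = 3  # hypothetical: ask before touching more devices than this
PRONOUN = re.compile(r"\b(it|that|them)\b", re.I)


@dataclass
class ChatState:
    last_entity: Optional[str] = None    # what "it" currently refers to
    pending: Optional[List[str]] = None  # targets waiting for Yes / No


def remember(state: ChatState, entity: str) -> None:
    state.last_entity = entity


def resolve_pronoun(text: str, state: ChatState) -> str:
    """Replace the first pronoun with the last entity mentioned, if any."""
    if state.last_entity is None:
        return text
    return PRONOUN.sub(lambda m: f"the {state.last_entity}", text, count=1)


def request_turn_off(state: ChatState, targets: List[str]) -> str:
    """IDLE -> AWAITING_CONFIRMATION for large actions, else act right away."""
    if len(targets) > CONFIRM_THRESHOLD:
        state.pending = targets
        return f"Turn off all {len(targets)} lights? [Yes][No]"
    return f"Turned off {len(targets)} light(s)."


def answer(state: ChatState, reply: str) -> str:
    """AWAITING_CONFIRMATION -> IDLE, executing only on an explicit yes."""
    if state.pending is None:
        return "Nothing to confirm."
    targets, state.pending = state.pending, None
    if reply.strip().lower() == "yes":
        return f"Turned off {len(targets)} light(s)."
    return "Cancelled."
```

Clearing `pending` before checking the reply means a stale "Yes" sent twice cannot trigger the action a second time.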

Home
HOME_CONTROL

Lights, switches, brightness, fans

HOME_VACUUM

Roborock: start, pause, dock, room

HOME_STATUS

"Is the kitchen light on?"

PRESENCE_QUERY

Who's home, by MAC

Entertainment
TV_CONTROL

Power, apps, volume, d-pad

MEDIA_PLAY

"play X on spotify/youtube"

MEDIA_PLAYBACK

Pause, skip, what's playing

Time & info
WEATHER

Weather, forecast, temperature

TIMER_ALARM

Natural-language timers

CALENDAR

Schedule, events

Memory
MEMORY

Reminders, shopping list, notes

Routines & scenes
ROUTINE

Good morning, bedtime, leaving

SCENE

Movie mode, party, date night

Multi-step
COMPOUND

PlannerAgent, tool-calling

Utility
WIFI_QR

Show WiFi QR on TV

REMOTE_QR

Show TV remote link QR

Conversation
CONVERSATION

Open-ended chat (qwen3 /think)

INFO_QUERY

Factual questions (/nothink)

Section 04

Scene Editor

Visual programming for the home: drag-and-drop blocks with trigger / condition / action types, branching flowcharts, and live execution tracking. Build a routine without writing code.

scene · movie_mode · shipped
◉ phrase: "movie mode"
▒ lights: living_room → off
▒ tv: open Netflix
▒ govee: ambient
executed · all actions complete
scene · wine_mode · shipped
◉ phrase: "wine mode"
▒ spotify: play "80s playlist"
▒ lights: warm 30%
▒ govee: red waveform
executed · all actions complete
scene · welcome_home · sun-aware
◉ event: arrive home
◇ if: after sunset
▒ lights: entrance + living on
▒ nova: briefing message
executed · all actions complete
Drag-and-drop blocks
Trigger, condition, and action types. Connect freely on the canvas.
Branching flowchart
Conditions split into yes / no branches with their own action chains.
Live execution tracking
When a scene runs, the currently-executing block pulses on the canvas.
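The trigger / condition / action blocks and the yes / no branching can be modelled as a small tree. This is a sketch under assumptions, not the editor's real data model: the class names, the `on_step` hook (standing in for the "pulse the current block" behaviour), and the placement of the briefing outside the sunset branch are all guesses.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Union


@dataclass
class Action:
    target: str
    command: str


@dataclass
class Condition:
    name: str
    test: Callable[[dict], bool]
    yes: List["Step"] = field(default_factory=list)
    no: List["Step"] = field(default_factory=list)


Step = Union[Action, Condition]


@dataclass
class Scene:
    name: str
    trigger: str  # a phrase or an event name
    steps: List[Step]


def run(scene: Scene, context: dict,
        on_step: Optional[Callable[[Step], None]] = None) -> List[str]:
    """Walk the flowchart depth-first and return the actions that ran."""
    log: List[str] = []

    def walk(steps: List[Step]) -> None:
        for step in steps:
            if on_step is not None:
                on_step(step)  # e.g. tell the canvas which block is executing
            if isinstance(step, Condition):
                walk(step.yes if step.test(context) else step.no)
            else:
                log.append(f"{step.target}: {step.command}")

    walk(scene.steps)
    return log


welcome_home = Scene(
    name="welcome_home",
    trigger="arrive home",
    steps=[
        Condition(
            name="after sunset",
            test=lambda ctx: ctx["after_sunset"],
            yes=[Action("lights", "entrance + living on")],
        ),
        Action("nova", "briefing message"),
    ],
)
```

Running `run(welcome_home, {"after_sunset": False})` skips the lights branch and still sends the briefing, which is the behaviour the branching flowchart describes.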
🖼 screenshot · scene editor canvas with blocks
- Act 3 -

The Body

Screens · lights · and, soon, limbs

Section 05

Surfaces

What you actually see and touch. The iframes below are real demo builds of the apps: click around, toggle lights, trigger scenes, chat with Nova. Mock state lives in your browser and syncs across iframes.

3D Apartment Dashboard

Three.js + React + Vite · entry: mac.html

A 3D scene of the actual apartment: rooms, lights, the TV. Click a lamp and the real lamp turns on. The first version was for the Mac browser.

Then I wanted it on the Samsung TV. Turns out Samsung doesn't exactly want you uploading custom apps to their TVs, but if you find the right magic incantation in dev mode, you can sideload one anyway. I did. Same Vite build, three entry points: /app (mobile), mac.html, tv.html.

apartment.glb (4.7 MB) · react-three-fiber · drei · offline mode (no HA token)

iPhone Dashboard

React + Capacitor · tabbed bottom nav

Three tabs: Lights (5 rooms with toggles + brightness sliders + "2 of 3 on" counter), Remote (Spotify / YouTube / TV, try the rickroll), Nova (real chat with chip shortcuts).

About Netflix and Prime: their integration is, quote, painfully thin. You can open the app. That is the entire API surface they expose.

Custom wake word · Jarvis voice · Whisper STT · AudioWorklet · SSE chat

Network Monitor (v3 → v4)

Pi-hole DNS · FRITZ!Box TR-064 · InfluxDB · Grafana

"People-first" presence dashboard. Detects who's home by MAC (FRITZ!Box TR-064 + Pi-hole DNS queries), tracks usage in InfluxDB, and visualizes it in Grafana. Drives the morning briefing and Welcome-Home routine.
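Presence by MAC boils down to a lookup plus a set difference between polls. The sketch below assumes the list of active MAC addresses has already been fetched (the TR-064 and Pi-hole calls are deliberately left out); the device map, the names in it, and all function names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def normalize(mac: str) -> str:
    """Lower-case and use ':' separators so lookups are consistent."""
    return mac.strip().lower().replace("-", ":")


def who_is_home(active_macs: Iterable[str],
                known: Dict[str, str]) -> Tuple[Set[str], Set[str]]:
    """Return (people at home, unknown MACs treated as guest devices)."""
    people: Set[str] = set()
    guests: Set[str] = set()
    for mac in map(normalize, active_macs):
        if mac in known:
            people.add(known[mac])
        else:
            guests.add(mac)
    return people, guests


def presence_diff(previous: Set[str], current: Set[str]) -> Dict[str, list]:
    """Compare two polls; arrivals are what a welcome-home routine would react to."""
    return {
        "arrived": sorted(current - previous),
        "left": sorted(previous - current),
    }
```

Called once per polling interval, `presence_diff` yields the arrive / leave events, while the second set from `who_is_home` gives a simple starting point for guest detection.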

v3 → v4 rewrite · people-first · 60s polling · guest detection · 30 commits / day peak
overview · query volume + devices
live · real-time DNS feed
Section 06

A body for Nova

The current procrastination project: moving Nova out of a kitchen phone and into a real robot.

Subsystem · in progress · being assembled

Three ESP32 codebases, one robot

p5.js · canvas

The first version of always-listening Nova ran on a 10-year-old phone in the kitchen: voice app on a permanent charger, microphone open. It worked. It was also, undeniably, a phone on a charger.

The replacement: a small physical robot with an animated face, body effects, and a microphone. The web preview above is the actual implementation, with the same p5.js eye styles (roboeyes / cozmo / kawaii / statusring), same canvas body renderers (waveform / lavalamp / horror / matrix / gauges / status), same scene system. Switch the buttons.

The hardest part: GIF streaming over MQTT. RGB565 little-endian frame decoding took longer to debug than I'd like to admit. On the actual ESP32 hardware, GIFs play directly on the body screen via the AnimatedGIF library off SPIFFS. Here's what the GREETING intent renders:

Custom wake word · Jarvis voice · GIF over MQTT · always listening · p5.js eyes · canvas body
actual ESP32 output · 240×280 · RGB565
Hello lizard GIF that plays on the robot body screen
hello-lizard.gif · GREETING intent → say-hello scene
📷 photo · robot WIP on the workbench
real components, real solder
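For anyone chasing the same RGB565 byte-order bug: each pixel is 16 bits (5 red, 6 green, 5 blue), and "little-endian" means the low byte arrives first. The sketch below is a generic decoder written for illustration, not the project's firmware or preview code; the function names are made up. At 240×280 a full frame is 240 × 280 × 2 = 134,400 bytes.

```python
from typing import List, Tuple


def rgb888_to_rgb565_le(r: int, g: int, b: int) -> bytes:
    """Pack one 8-bit-per-channel pixel into 2 bytes, low byte first."""
    value = ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
    return value.to_bytes(2, "little")


def rgb565_le_to_rgb888(frame: bytes) -> List[Tuple[int, int, int]]:
    """Decode a little-endian RGB565 buffer into (r, g, b) tuples."""
    if len(frame) % 2:
        raise ValueError("RGB565 frame must have an even number of bytes")
    pixels = []
    for i in range(0, len(frame), 2):
        value = frame[i] | (frame[i + 1] << 8)  # low byte first
        r5 = (value >> 11) & 0x1F
        g6 = (value >> 5) & 0x3F
        b5 = value & 0x1F
        # Expand to 8 bits by repeating the top bits, so 0x1F maps to 255.
        pixels.append((
            (r5 << 3) | (r5 >> 2),
            (g6 << 2) | (g6 >> 4),
            (b5 << 3) | (b5 >> 2),
        ))
    return pixels
```

The classic symptom of getting the order wrong: pure red (`00 F8` on the wire) decoded with the bytes swapped comes out as a dark blue, which is why swapped frames look recoloured instead of simply broken.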
esp-head/

Animated eye display: blink, look, expressions, GIF playback

Waveshare ESP32-S3 LCD
esp-body/

Body controller: movement, scene orchestration

ESP32 + ESPHome
esp-body-effects/

Visual effects library: lava lamp, horror, wave standby (15+ board configs)

PlatformIO + HAGL
Recent commits:
GIF playback working: AnimatedGIF + SPIFFS
Wire GREETING intent → say-hello scene
Fix RGB565 byte order debugging
- Epilogue -

The Craft

Details · surprises · process · dead ends

Section 07

The little things

Side quests, surprises, and 'wait, it does that too?' moments.

Receipt OCR → Notion
Snap a receipt to Telegram → Gemini Vision → structured rows in Notion
Presence detection
Polls network-monitor every 60s. "Who's home?" by MAC address
Morning briefing
Proactive Telegram message: weather + presence + lights status
"Lights left on" alerts
After hours, Nova nudges if anything's still on
Self-trained wake word
"Hey Nova": custom OpenWakeWord model, all on-device
Anime watchlist
HiAnime scraper: search, watch, track. Why not.
GPS dwell tracking
Where you spent time, how long. Stored in SQLite.
WiFi QR generator
Show WiFi QR on TV or via Telegram
Robot eyes for timers
Set a 10-min timer → robot face shows countdown
Shopping lists & chores
Persistent in SQLite. "I vacuumed" updates status
Pronoun resolution
"Turn it off" β€” Nova remembers what "it" was
Album art on the TV mesh
Spotify cover renders as a texture on the TV in the 3D scene
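As an example of how small some of these side quests are, the WiFi QR code is just a specially formatted string handed to a QR renderer. The sketch below builds that string using the widely used `WIFI:` payload convention (the one phone cameras understand); the function name and defaults are assumptions, and turning the string into an image is left to whatever QR library is at hand.

```python
def wifi_qr_payload(ssid: str, password: str,
                    auth: str = "WPA", hidden: bool = False) -> str:
    """Build the text payload for a WiFi-join QR code."""

    def esc(value: str) -> str:
        # Backslash must be escaped first, then the other reserved characters.
        for ch in ("\\", ";", ",", '"', ":"):
            value = value.replace(ch, "\\" + ch)
        return value

    payload = f"WIFI:T:{auth};S:{esc(ssid)};P:{esc(password)};"
    if hidden:
        payload += "H:true;"
    return payload + ";"
```

For instance, `wifi_qr_payload("HomeHub", "pa;ss")` yields `WIFI:T:WPA;S:HomeHub;P:pa\;ss;;`, with the semicolon in the password escaped so scanners don't cut it short.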
Section · what didn't make it

Dead ends

Mature projects have graveyards. Three paths sketched, started, or experimented with β€” then cut. Knowing what to remove matters as much as knowing what to ship.

consolidated
Legacy executor pipeline
Replaced by unified dispatcher Apr 10: one router for all channels.
unbuilt
OpenClaw AI agent container
Sketched as a separate service, then absorbed into in-process Nova.
experimental
Anime scraper kiosk playback
HiAnime watchlist works, but kiosk playback on the TV never shipped.
Section 08

The full stack

Layered, mostly local, mostly open-source.

Hardware
Beelink SER5 MAX · Ubuntu 24.04 · ESP32 (Waveshare) · Sonoff ZBDongle-E · Govee RGB bar · Samsung TV
Infra
Docker Compose · Traefik · Mosquitto MQTT · Zigbee2MQTT · Pi-hole · Home Assistant · InfluxDB · Grafana · Portainer
AI / Voice
Ollama (qwen3:8b) · Whisper STT · Piper TTS (Jarvis) · OpenWakeWord (self-trained) · Gemini Vision (cloud) · OpenRouter (cloud)
Apps
FastAPI (Python) · React 18 · Three.js · Vite · Capacitor (iOS) · TypeScript · Tailwind · SQLite (aiosqlite)
Firmware
ESPHome · PlatformIO · AnimatedGIF · HAGL graphics · SPIFFS
Section 09

Built one design doc at a time

Every meaningful feature gets a written design plan before any code. 21 documents, ordered by build date. Peak velocity: 35 commits on Apr 8, 30 on Apr 2, 29 on Apr 9.

2026-02-24 · home-hub v1 · init
2026-03-02 · unified dashboard design + implementation · dashboard
2026-03-03 · location dwell tracking · data
2026-03-03 · sqlite persistent storage · data
2026-03-07 · beelink hardware setup · infra
2026-03-29 · network monitor v3 · network
2026-03-31 · nova m2: TV control · nova
2026-04-01 · 3D dashboard · dashboard
2026-04-02 · network monitor v4 · network
2026-04-03 · TV dashboard · dashboard
2026-04-04 · network v5: guests redesign · network
2026-04-05 · HA state subscription · infra
2026-04-07 · nova app: swipeable pages · app
2026-04-09 · nova chat tab redesign · app
2026-04-10 · documentation + cleanup · meta
2026-04-12 · scene editor · nova
2026-04-15 · nova bedtime · nova
2026-04-15 · timer eyes · robot
Nova · online · responding from Beelink :8000

Personal project · Andrei Zitti · started Feb 2026, still going