
A Raspberry Pi that wakes up before my wife and I, researches the day's AI and tech news, and delivers a personalized morning brief to each of us over Telegram - in Japanese and English, tailored to what the two of us care about.
Berry is a small autonomous agent that lives on a Raspberry Pi 5 in our apartment in Tokyo and runs around the clock. Every morning at 07:30 JST it pulls the latest AI and tech news from a curated set of sources, deduplicates against everything it has already shown us previously, and uses Claude to write two short briefs - one for me in English, one for my wife Miyuki in Japanese - each ranked against a profile of what that person cares about. It runs on the Hermes Agent runtime, is controlled over Telegram, is reachable from anywhere over Tailscale, and costs about 0.05€ a night. It is both a real daily-use tool and a deliberate experiment of local models, orchestration, and agentic workflows.
Email newsletters are so last season. But the AI field moves fast enough that staying current feels mandatory and staying sane feels impossible.
We have a particular version of the problem. I'm a product manager who builds with AI tooling all day; Miyuki is almost a year into a software development career. The same news matters to the two of us but from different angles. Even though we both speak Japanese as well as English, we each read most comfortably in our native language.
One thing we have recently been frustrated about is the increasing phone use. News are a time sink and learning what's going on forces a lot of screen time; we wanted something that did the scrolling for us overnight and handed each of us, in our own language, the three-to-five things that genuinely mattered to us - and then got out of the way.
The pain is information triage under two constraints most tools ignore: personal relevance and language.
Generic AI newsletters optimize for a median reader who is neither of us. An RSS feed reader solves discovery but not ranking - it hands you everything and makes you the filter, which is the work we wanted to remove. Translation tools turn English news into stiff Japanese but don't decide what's worth translating in the first place. And nothing on the market understands that a single household can contain two people with different jobs, different fluencies, and different definitions of "relevant" reading the same morning.
So the job I want done is: When I start my day, help me understand what in AI and tech actually matters to me - in my language, without doomscrolling - so I can stay sharp without drowning in noise.
The opportunity wasn't a market opportunity; it was a personal one to help me keep up with a field that is accelerating fast. A genuinely useful daily tool for two real users, and an honest testbed for a question I was also curious to answer: can I run local AI models on an ~$80 computer and what would that look like today?
Berry is a nightly pipeline wrapped in an always-on agent. The output is three Telegram messages that land in a shared family group every morning:
Under the hood it's a three-tier pipeline:
The whole pipeline is triggered by the Hermes Agent runtime's built-in scheduler as a no-LLM cron job at 07:30 JST; a non-zero exit DMs me privately so a silent failure can't happen. Per-person relevance is driven by two plain Markdown profile files I can edit in a text editor - no retraining, no database. Remote access is over Tailscale; the heavy synthesis runs against my own Anthropic API key.
The design principle throughout: the agent writes; scripts do everything else. More on why below (because I actually wanted to do something different initially but ran into some issues).
The premise I most wanted to test: a Raspberry Pi 5 with 8GB of RAM can run a small language model locally via Ollama, so how much of Berry could run for free on the device, with the cloud reserved only for the heavy lifting?
I benchmarked it honestly before deciding. Gemma 3 4B (quantized) on the Pi ran at ~4.5 tokens/second with an ~8.8-second cold load - fine for an overnight batch, painfully slow for anything interactive, where a single multi-step agent turn would run into minutes. Small models are also unreliable at the tool-calling that an agent loop depends on. Meanwhile the only genuinely model-heavy task in the whole system - the bilingual, per-person synthesis - is exactly the one where quality is non-negotiable, so it seemingly has to be cloud regardless.
The uncomfortable and frankly boring insight that fell out of the benchmark: most of Berry isn't an LLM job at all. Fetching, deduplicating, formatting, and delivering are deterministic code. The "free because it's local" savings I'd imagined mostly came from realizing those steps never needed a model in the first place. So the architecture became: scripts for the mechanical 80%, one constrained Sonnet call for the judgment, and the local model on the bench - not because local is useless, but because in this system its only honest home is non-time-critical batch work (like grading the eval, a planned next step), not the critical path. Documenting that decision, with the number that drove it, was as much the point of the project as the briefs themselves.
(One adjacent decision worth noting: I run synthesis through my own metered API key, not a consumer subscription. Anthropic's consumer terms prohibit automated, programmatic access through Pro/Max accounts - the sanctioned path for an always-on agent is the API. It's also the only path that gives me a clean per-night cost number and a hard spend cap.)
The entire user base is two people in one apartment, which is exactly why it's a good test.
Me - the over-informed builder. A senior Product Manager who lives in the AI tooling and needs actionable knowledge, not headlines: what a release means for the products and roles I think about, digested fast enough that I'm not the last to know (as I feel I often am). Low tolerance for fluff, high tolerance for density.
Miyuki - the early-career navigator. Eight months into software development, reading the same "AI is going to replace engineers" headlines I do but with real career stakes underneath them. Her brief has a deliberate editorial stance baked into her profile: clear-eyed about where the industry is heading, but framed around concrete skills and opportunities - never doom. That framing is a product decision, not sugar-coating: she asked for the honest picture, and the honest picture is more useful when it's actionable.
Berry is days old, so the honest "result" isn't a usage chart — it's a working system, its economics, and a set of decisions and findings I can defend.
The most interesting result, though, is a negative one I didn't expect, and it's the best thing the project taught me - see the first learning below.
What's next is the same instinct that drove the project: close the loop. Run the local-vs-cloud experiment for real by having the Pi's Gemma model grade source-accuracy head-to-head against cloud Haiku (agreement, latency, cost - actual numbers behind the routing decision); widen and tune the source list against a week of relevance data; and revisit the in-the-moment feedback UX now that I know its constraints.