Every time you use a cloud AI service, your data leaves your device. Your queries, your context, your conversation history — all transmitted to servers you don't control. For many use cases, this is an unacceptable tradeoff.
The Problem with Cloud AI
Cloud-based AI services have fundamental privacy limitations:
- Data transmission: Every query is sent over the network
- Server storage: Conversations may be logged, analyzed, or used for training
- Third-party access: Server operators can read your data
- Regulatory risk: Data may cross jurisdictional boundaries
- Single point of failure: One breach exposes all users
The Browser as a Privacy Boundary
Modern browsers provide a natural security sandbox:
- Same-origin policy: Code can only access its own resources
- No filesystem access: Web apps can't read your files unless you explicitly grant access
- Network visibility: You can inspect all network requests in DevTools
- Deterministic behavior: The same code produces the same results
How ReLU.chat Achieves Zero Data Collection
ReLU.chat is architecturally incapable of collecting your data:
- No server-side inference: All NLP runs in the browser
- No telemetry: Zero analytics, tracking pixels, or phone-home calls
- No accounts: No login, no email required, no user profiles
- No storage: No server-side database of conversations
- Open source: The code is public — verify the claims yourself
The Technical Stack
Making this work required solving several engineering challenges:
Model Size
The all-MiniLM-L6-v2 sentence transformer is ~90MB in its original form. Through ONNX quantization, we reduced it to ~22MB, small enough to load on a mobile connection in a few seconds.
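The roughly fourfold reduction is what one would expect when 32-bit float weights are stored as 8-bit integers: each weight shrinks from four bytes to one, and 90MB / 4 is about 22MB. The sketch below illustrates the idea with a simple symmetric, per-tensor scheme. It is an assumption made for illustration, not the ONNX tooling itself and not necessarily the scheme ReLU.chat used; the names `quantizeInt8` and `dequantize` are hypothetical.

```typescript
// Illustrative symmetric int8 quantization of a weight tensor.
// Assumption: a single scale per tensor; real quantizers offer other schemes.
interface QuantizedTensor {
  values: Int8Array; // 1 byte per weight instead of 4
  scale: number;     // multiply by this to recover approximate floats
}

function quantizeInt8(weights: Float32Array): QuantizedTensor {
  // The largest magnitude is mapped to 127.
  let maxAbs = 0;
  for (let i = 0; i < weights.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(weights[i]));
  }
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;

  const values = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    values[i] = Math.round(weights[i] / scale);
  }
  return { values, scale };
}

function dequantize(q: QuantizedTensor): Float32Array {
  const out = new Float32Array(q.values.length);
  for (let i = 0; i < q.values.length; i++) {
    out[i] = q.values[i] * q.scale;
  }
  return out;
}

const sampleWeights = new Float32Array([0.5, -1.0, 0.25, 0.0]);
const quantized = quantizeInt8(sampleWeights);
console.log(sampleWeights.byteLength, quantized.values.byteLength); // 16 4
```

The price of the smaller file is a small rounding error per weight, bounded here by half of `scale`.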
Inference Speed
ONNX Runtime WebAssembly provides near-native inference speed. Our benchmark: ~15ms per embedding computation on a mid-range laptop. That's fast enough for real-time conversation.
Offline Support
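Offline operation in a browser usually rests on a service worker that answers requests from a local cache. A minimal sketch of that cache-first lookup follows. To keep it runnable outside a browser, the cache and the network are injected as plain interfaces; `AssetCache`, `NetworkFetch`, `cacheFirst`, and `memoryCache` are hypothetical names, and this is not taken from the ReLU.chat source.

```typescript
// Cache-first lookup: serve from the cache when possible, otherwise fetch
// once and store the result for later offline use.
// Assumption: in a real service worker, the cache would be the Cache API and
// the network would be fetch(), called from a "fetch" event handler.
interface AssetCache {
  get(url: string): Promise<string | undefined>;
  put(url: string, body: string): Promise<void>;
}

type NetworkFetch = (url: string) => Promise<string>;

async function cacheFirst(
  url: string,
  cache: AssetCache,
  network: NetworkFetch
): Promise<string> {
  const cached = await cache.get(url);
  if (cached !== undefined) {
    return cached; // served without touching the network
  }
  const body = await network(url); // first load: fetch, then store
  await cache.put(url, body);
  return body;
}

// In-memory stand-in for the browser cache, for demonstration only.
function memoryCache(): AssetCache {
  const store = new Map<string, string>();
  return {
    get: async (url) => store.get(url),
    put: async (url, body) => {
      store.set(url, body);
    },
  };
}
```

Once an asset has been stored, a later call succeeds even if the injected network function throws, which is the behavior that makes offline use possible.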
A service worker caches all assets after the first load. The chatbot works offline: no internet required after initial setup.
Knowledge Base
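One plausible way to pair sentence embeddings with a curated knowledge base is nearest-neighbor retrieval: embed the user's message, compare it against precomputed fragment embeddings by cosine similarity, and answer only when the best match clears a threshold. The sketch below assumes that design; the source does not spell out ReLU.chat's matching logic. The type names, the 0.5 threshold, and the toy 3-dimensional vectors are hypothetical (all-MiniLM-L6-v2 itself produces 384-dimensional embeddings).

```typescript
// Retrieval over curated fragments by cosine similarity.
interface KnowledgeFragment {
  text: string;
  embedding: number[]; // precomputed offline in this sketch
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Returns the closest fragment, or null when nothing scores at least minScore.
function bestFragment(
  query: number[],
  fragments: KnowledgeFragment[],
  minScore = 0.5 // hypothetical threshold
): KnowledgeFragment | null {
  let best: KnowledgeFragment | null = null;
  let bestScore = minScore;
  for (const fragment of fragments) {
    const score = cosineSimilarity(query, fragment.embedding);
    if (score >= bestScore) {
      best = fragment;
      bestScore = score;
    }
  }
  return best;
}

const knowledge: KnowledgeFragment[] = [
  { text: "Inference runs in the browser.", embedding: [1, 0, 0] },
  { text: "Assets are cached for offline use.", embedding: [0, 1, 0] },
];
const matched = bestFragment([0.9, 0.1, 0], knowledge);
console.log(matched ? matched.text : "no confident match");
// prints "Inference runs in the browser."
```

Returning `null` when no fragment is close enough is what keeps answers inside the curated scope instead of guessing.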
Instead of training a massive model on the entire internet, we use curated knowledge fragments. This limits scope but guarantees accuracy, a worthwhile tradeoff for domain-specific chatbots.
When Cloud AI Makes Sense
Browser-based AI isn't always the right choice:
- Open-ended generation: LLMs excel at creative, unconstrained text
- Massive knowledge: Cloud models trained on trillions of tokens know more
- Complex reasoning: Multi-step reasoning benefits from larger models
The Future
WebGPU will bring GPU-accelerated inference to browsers, enabling larger models at faster speeds. WebNN provides a native neural network API. The browser is becoming a first-class AI platform.
ReLU.chat is an early example of what's possible. As browser capabilities grow, the range of viable on-device AI applications will expand dramatically.
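A page that wants to use these APIs as they become available would likely feature-detect them and fall back to WebAssembly. The sketch below checks for `navigator.gpu` (WebGPU) and `navigator.ml` (WebNN). The preference order and the names `pickBackend` and `NavigatorLike` are assumptions, and a navigator-like object is passed in so the logic can run outside a browser.

```typescript
// Feature detection with a WebAssembly fallback.
type Backend = "webgpu" | "webnn" | "wasm";

interface NavigatorLike {
  gpu?: unknown; // present when the browser exposes WebGPU
  ml?: unknown;  // present when the browser exposes WebNN
}

function pickBackend(nav: NavigatorLike): Backend {
  // Assumed preference order: WebGPU, then WebNN, then WebAssembly.
  if (nav.gpu !== undefined) return "webgpu";
  if (nav.ml !== undefined) return "webnn";
  return "wasm";
}

console.log(pickBackend({}));          // prints "wasm"
console.log(pickBackend({ gpu: {} })); // prints "webgpu"
```

In a browser this could be called as `pickBackend(navigator as NavigatorLike)`. The presence of `navigator.gpu` does not guarantee a usable adapter, so a real implementation would still need to handle `requestAdapter()` resolving to `null`.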
Try It
Experience privacy-first AI: visit ReLU.chat. Open DevTools, watch the network tab — you'll see zero outgoing requests during conversation. Your data stays yours.
The code is open source on GitHub. Verify the claims yourself.
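The network-tab check can also be approximated from the console with the Resource Timing API, which records the URL and start time of resources the page has requested. The helper below is a hypothetical sketch: it lists the requests that started after a given timestamp. `ResourceEntryLike` and `requestsAfter` are illustrative names, and the sample entries use placeholder URLs.

```typescript
// List resources requested after a given moment.
// The fields mirror those on browser PerformanceResourceTiming entries.
interface ResourceEntryLike {
  name: string;      // the requested URL
  startTime: number; // milliseconds since the page's time origin
}

function requestsAfter(entries: ResourceEntryLike[], sinceMs: number): string[] {
  return entries
    .filter((entry) => entry.startTime >= sinceMs)
    .map((entry) => entry.name);
}

// Placeholder data standing in for performance.getEntriesByType("resource").
const sampleEntries: ResourceEntryLike[] = [
  { name: "https://example.com/app.js", startTime: 120 },
  { name: "https://example.com/late.json", startTime: 5400 },
];
console.log(requestsAfter(sampleEntries, 1000));
// prints [ 'https://example.com/late.json' ]
```

In a browser, one could record `const t0 = performance.now()` before chatting and afterwards call `requestsAfter(performance.getEntriesByType("resource"), t0)`; an empty array is consistent with no requests during the conversation. Resource Timing does not cover everything (WebSocket traffic, for instance, and its buffer is size-limited), so the DevTools network tab remains the more complete view.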