AI Story Companion Ecosystem — Hybrid Edge-Cloud Architecture
The AI Story Companion platform uses a hybrid edge-cloud architecture. The StoryCube device handles lightweight inference and audio processing locally, while the heavier AI tasks (story generation, illustration, and music) are offloaded to cloud services. Keeping wake-word detection and audio I/O on the device reduces latency and preserves basic offline operation; sending generative workloads to the cloud keeps device cost down without limiting model quality.
| # | Stage | Component | Location | Technology |
|---|---|---|---|---|
| 1 | Voice Input | Wake word + VAD | Edge (device) | TensorFlow Lite / Porcupine |
| 2 | Speech-to-Text | ASR Engine | Edge + Cloud fallback | Whisper (local small) / Google STT |
| 3 | Safety Check | Content Filter | Cloud | LLM guard / custom classifier |
| 4 | Story Orchestrator | Narrative Manager | Cloud | LLM (GPT-4o / Claude 3 Sonnet) |
| 5 | Memory Retrieval | Vector Store | Cloud | Pinecone / pgvector |
| 6 | Language Model | Story Generator | Cloud | Anthropic / OpenAI API |
| 7 | Music Engine | Adaptive Audio | Hybrid | MusicGen / curated library |
| 8 | Illustration Engine | Scene Images | Cloud | DALL-E 3 / Stable Diffusion |
| 9 | Text-to-Speech | Voice Synthesis | Cloud + Edge cache | ElevenLabs / local TTS |
| 10 | Output | Audio + Projection | Edge (device) | Custom renderer |
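
As a rough illustration, the pipeline table above can be encoded as data so that the device can reason about which stages still have an on-device path when the network is down. This is only a sketch: the `Location`, `Stage`, and `offline_capable` names are hypothetical, and collapsing "Edge + Cloud fallback", "Hybrid", and "Cloud + Edge cache" into a single `HYBRID` value is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    EDGE = "edge"
    CLOUD = "cloud"
    HYBRID = "hybrid"  # assumption: any stage with both an edge and a cloud path


@dataclass(frozen=True)
class Stage:
    order: int
    name: str
    location: Location


# Stage names and order are taken from the table; locations are simplified.
PIPELINE = [
    Stage(1, "Voice Input", Location.EDGE),
    Stage(2, "Speech-to-Text", Location.HYBRID),
    Stage(3, "Safety Check", Location.CLOUD),
    Stage(4, "Story Orchestrator", Location.CLOUD),
    Stage(5, "Memory Retrieval", Location.CLOUD),
    Stage(6, "Language Model", Location.CLOUD),
    Stage(7, "Music Engine", Location.HYBRID),
    Stage(8, "Illustration Engine", Location.CLOUD),
    Stage(9, "Text-to-Speech", Location.HYBRID),
    Stage(10, "Output", Location.EDGE),
]


def offline_capable(pipeline):
    """Return the names of stages that have some on-device path."""
    return [s.name for s in pipeline if s.location is not Location.CLOUD]
```

Calling `offline_capable(PIPELINE)` lists the five stages that are not cloud-only, which is the set an offline mode would have to build on.
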
The StoryCube runs a custom Linux-based OS built with Buildroot or Yocto. The embedded software stack handles wake-word detection, the audio I/O pipeline, local ASR fallback, BLE/Wi-Fi connectivity, projection control, LED ring animation, OTA firmware updates, and power management.
| Layer | Component | Notes |
|---|---|---|
| OS | Linux (Yocto/Buildroot) | Lightweight, minimal attack surface |
| Runtime | Python 3.11 + asyncio | Main application runtime |
| Audio | PulseAudio / ALSA | Microphone array + speaker management |
| Wake Word | Porcupine SDK (on-device) | Always-on low power mode |
| Local ASR | Whisper.cpp (tiny/base model) | Offline fallback STT |
| Local TTS | Piper / Coqui TTS | Cache common phrases offline |
| Connectivity | NetworkManager + BlueZ | Wi-Fi + BLE management |
| Camera | libcamera + OpenCV | Face/object detection, scene capture |
| OTA | Mender / SWUpdate | Secure firmware update delivery |
| Security | TPM 2.0 + LUKS encryption | Secure boot + storage encryption |
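
The "offline fallback STT" role of the local ASR engine could be wired into the asyncio runtime roughly as follows. This is a minimal sketch under stated assumptions: `cloud_stt` and `local_stt` are hypothetical stand-ins (no real cloud STT or Whisper.cpp calls are made), and the two-second timeout is an illustrative value, not a figure from this document.

```python
import asyncio


async def cloud_stt(audio: bytes) -> str:
    """Hypothetical stand-in for a cloud speech-to-text request."""
    await asyncio.sleep(0.01)
    return "cloud transcript"


async def local_stt(audio: bytes) -> str:
    """Hypothetical stand-in for on-device Whisper.cpp inference."""
    await asyncio.sleep(0.01)
    return "local transcript"


async def transcribe(audio: bytes, cloud=cloud_stt, local=local_stt,
                     timeout_s: float = 2.0) -> str:
    """Try the cloud engine first; use the local model on timeout or network error."""
    try:
        return await asyncio.wait_for(cloud(audio), timeout=timeout_s)
    except (asyncio.TimeoutError, OSError):
        # OSError covers ConnectionError and similar network failures.
        return await local(audio)
```

Passing the engines as parameters keeps the fallback logic testable without a network, and lets the device swap in a different local model without touching the control flow.
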
The cloud backend is designed as a set of microservices deployed on Kubernetes, with a focus on horizontal scalability, fault isolation, and independent deployment cycles.
| Service | Responsibility | Tech Stack |
|---|---|---|
| API Gateway | Auth, rate limiting, routing | Kong / AWS API Gateway |
| Auth Service | JWT, OAuth2, device registration | Python/FastAPI + Keycloak |
| Story Orchestrator | Narrative state machine, LLM calls | Python/FastAPI + LangChain |
| Memory Service | Child profile, story history, vector search | PostgreSQL + pgvector / Pinecone |
| Content Safety | AI output moderation, age filter | Python + custom LLM classifier |
| Music Service | Dynamic music selection and generation | Python + MusicGen / S3 library |
| Illustration Service | Scene image generation | Python + DALL-E 3 / SD API |
| TTS Service | Voice synthesis, caching | Python + ElevenLabs / Azure TTS |
| OTA Service | Firmware update orchestration | Go + Mender API |
| Parent App API | Profile mgmt, controls, analytics | Python/FastAPI |
| Analytics Service | Usage tracking, sleep reports | Python + ClickHouse |
| Notification Service | Push notifications, email summaries | Python + FCM / SendGrid |
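
The Story Orchestrator is described above as a narrative state machine. One possible shape for it is a small transition table, sketched below. The states and event names are hypothetical and chosen only for illustration; the document does not define them.

```python
from enum import Enum, auto


class StoryState(Enum):
    IDLE = auto()
    LISTENING = auto()
    GENERATING = auto()
    NARRATING = auto()
    ENDED = auto()


# (current state, event) -> next state. All events are illustrative assumptions.
TRANSITIONS = {
    (StoryState.IDLE, "wake"): StoryState.LISTENING,
    (StoryState.LISTENING, "utterance"): StoryState.GENERATING,
    (StoryState.GENERATING, "story_ready"): StoryState.NARRATING,
    (StoryState.NARRATING, "child_reply"): StoryState.LISTENING,
    (StoryState.NARRATING, "goodnight"): StoryState.ENDED,
}


def step(state: StoryState, event: str) -> StoryState:
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} not allowed in state {state.name}"
        ) from None
```

Keeping the transitions in a plain dictionary makes illegal moves explicit, for example an LLM call cannot start unless the session is in `LISTENING` and an utterance has arrived.
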
The platform handles sensitive data about children, so the data architecture follows privacy by design: each data type has a defined storage location, retention period, and compliance basis.
| Data Type | Storage | Retention | Compliance |
|---|---|---|---|
| Child profiles & preferences | PostgreSQL (encrypted) | Account lifetime | GDPR/COPPA |
| Story history & memory | pgvector + PostgreSQL | 12 months rolling | GDPR |
| Voice recordings | Processed in-flight, not stored | Real-time only | COPPA |
| Sleep motion data | Time-series DB (InfluxDB) | 30 days + aggregated | GDPR |
| Usage analytics (anonymised) | ClickHouse | 24 months | GDPR Art. 89 |
| Generated illustrations | S3 with per-child prefix | 30 days | GDPR |
| Firmware binaries | S3 + CDN | All versions retained | Internal |
| Auth tokens | Redis (short TTL) | 15 min access / 7 days refresh | OWASP |
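
The retention periods in the table lend themselves to a single lookup that a cleanup job could consult. The sketch below assumes that "12 months" and "24 months" can be approximated as 365 and 730 days, and that the 30-day figure for sleep motion data applies to the raw samples; the key names and the `is_expired` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Retention periods from the table; month-based values are approximated in days.
RETENTION = {
    "story_history": timedelta(days=365),    # "12 months rolling"
    "sleep_motion_raw": timedelta(days=30),  # assumption: raw samples only
    "usage_analytics": timedelta(days=730),  # "24 months"
    "illustrations": timedelta(days=30),
    "access_token": timedelta(minutes=15),
    "refresh_token": timedelta(days=7),
}


def is_expired(data_type: str, created_at: datetime, now: datetime = None) -> bool:
    """True if a record of this type has outlived its retention period."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - created_at > RETENTION[data_type]
```

A record is kept up to and including its last retained instant, so an illustration created exactly 30 days ago is not yet expired. Data types with no fixed period in the table (child profiles, firmware binaries) are deliberately left out of the lookup.
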
Given the child-facing nature of the product, security is a first-class concern at every layer.
Device security: secure boot (TPM 2.0), encrypted storage (LUKS), signed firmware updates, no debug ports in production, and certificate pinning for cloud communication.
API security: mTLS for device-to-cloud traffic, JWTs with short expiry, rate limiting per device and per account, and OWASP Top 10 hardening.
Content safety: a multi-layer chain of system prompt guardrails → LLM content classifier → output post-processing → age-appropriate filter → parent override.
Privacy: no voice recording storage (COPPA), a GDPR data subject rights API, parental consent flows, and right to erasure.
Network security: TLS 1.3 for all traffic, VPC isolation for services, a WAF on the API Gateway, and DDoS protection via Cloudflare.
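
The multi-layer content safety chain listed above could be modelled as an ordered list of checks that stops at the first rejection. The sketch below covers only three output-side layers (classifier, post-processing, age filter); system prompt guardrails and the parent override are left out. Every function, the term list, and the word budget are hypothetical placeholders, not the platform's actual classifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A layer returns the (possibly rewritten) text, or None to block it.
Layer = Tuple[str, Callable[[str], Optional[str]]]

BLOCKED_TERMS = {"weapon", "blood"}  # hypothetical placeholder for a real classifier


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    text: str
    blocked_by: Optional[str] = None


def content_classifier(text: str) -> Optional[str]:
    """Stand-in for the LLM content classifier: block on any flagged term."""
    words = set(text.lower().split())
    return None if words & BLOCKED_TERMS else text


def post_process(text: str) -> Optional[str]:
    """Output post-processing: here, only whitespace normalisation."""
    return " ".join(text.split())


def age_filter(text: str, max_words: int = 200) -> Optional[str]:
    """Stand-in age filter: reject passages over a hypothetical word budget."""
    return text if len(text.split()) <= max_words else None


def moderate(text: str, layers: List[Layer]) -> Verdict:
    """Run the layers in order and stop at the first one that blocks."""
    for name, layer in layers:
        result = layer(text)
        if result is None:
            return Verdict(allowed=False, text="", blocked_by=name)
        text = result
    return Verdict(allowed=True, text=text)


OUTPUT_LAYERS: List[Layer] = [
    ("content_classifier", content_classifier),
    ("post_process", post_process),
    ("age_filter", age_filter),
]
```

Recording which layer blocked a passage in `blocked_by` gives the Content Safety service an audit trail without retaining the rejected text itself.
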