Software Requirements · Dev Needs & Infrastructure · March 2026

Software Development & Infrastructure Analysis

AI Story Companion Ecosystem — Functional requirements, technology stack, infrastructure architecture, and AI API cost estimates.

1. Functional Software Requirements

1.1 Device / Embedded Requirements

ID	Requirement	Priority	Component
D-01	System must detect wake word with <500ms latency in ambient noise up to 60dB	MUS	Firmware
D-02	ASR must transcribe child speech with >90% accuracy (ages 4–10)	MUS	FW + Cloud
D-03	Device must operate in offline mode for basic storytelling (pre-cached stories)	MUS	FW
D-04	OTA firmware update must complete without user interaction; rollback on failure	MUS	FW + DevOps
D-05	Device must boot to ready state in <15 seconds	SHO	FW
D-06	Audio output must support stereo at ≥48kHz / 16-bit	MUS	FW
D-07	Camera must capture 1080p at 30fps for personalisation features	SHO	FW
D-08	BLE pairing must complete within 30 seconds with mobile app	MUS	FW + Mobile
D-09	Battery must support minimum 4 hours continuous playback	MUS	HW + FW
D-10	LED ring must animate in sync with story events (<100ms latency)	SHO	FW
D-11	All microphone audio must be discarded immediately after processing and never stored	MUS	FW + Cloud
D-12	Device must support projection sync with DreamDome over Wi-Fi (<200ms)	SHO	FW

1.2 Cloud Backend Requirements

ID	Requirement	Priority	Component
C-01	Story generation API must return first audio chunk in <800ms (p95)	MUS	Story Orchestrator
C-02	System must apply child-safety filtering to every LLM output	MUS	Safety Service
C-03	All child data must be stored encrypted at rest (AES-256)	MUS	All services
C-04	System must support horizontal auto-scaling to 100K concurrent devices	MUS	Infra / K8s
C-05	Parent must be able to delete all child data; deletion must propagate to all services within 30 days	MUS	All services
C-06	Story memory must persist across sessions and be retrievable by child ID	MUS	Memory Service
C-07	Music engine must produce contextually appropriate audio within 2s	MUS	Music Service
C-08	Illustration engine must generate scene image within 5 seconds	SHO	Illustration Svc
C-09	System must support multi-language TTS (min: EN, FR, DE, ES at launch)	SHO	TTS Service
C-10	API must return structured error codes; no raw exceptions to client	MUS	API Gateway
C-11	All API endpoints must require authentication; no unauthenticated access	MUS	Auth Service
C-12	System must log all safety filter triggers for audit and review	MUS	Safety Service
C-13	Sleep motion data must be anonymised before analytics processing	MUS	Analytics Svc
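
Requirement C-01 sets its latency budget at the 95th percentile rather than the mean. A minimal sketch of how such a target might be checked against measured samples is shown below. It uses the nearest-rank percentile definition, which is an assumption (the document does not say which percentile method applies), and the function names are hypothetical.

```python
import math


def percentile_nearest_rank(samples_ms: list[float], pct: int) -> float:
    """Return the pct-th percentile using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_c01(samples_ms: list[float], budget_ms: float = 800.0) -> bool:
    """True if p95 time-to-first-audio-chunk is under the budget."""
    return percentile_nearest_rank(samples_ms, 95) < budget_ms
```

With 20 samples the p95 is the 19th-smallest value, so a single slow outlier does not fail the check, but two do.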

1.3 Mobile App Requirements

ID	Requirement	Priority	Component
M-01	App must support iOS 16+ and Android 12+	MUS	Mobile App
M-02	Parent must be able to create and manage up to 5 child profiles	MUS	Mobile App
M-03	App must provide content filtering controls (themes, age level, topics)	MUS	Mobile App
M-04	App must display sleep summary with story and motion timeline	SHO	Mobile App
M-05	App must support bedtime schedule configuration with automatic enforcement	MUS	Mobile App
M-06	App must allow purchase and management of subscriptions	MUS	Mobile App
M-07	App must work offline for settings management (sync when reconnected)	SHO	Mobile App
M-08	App must support push notifications for sleep summary delivery and usage alerts	COUld	Mobile App
M-09	App must display story history and allow playback of saved stories	COUld	Mobile App

Priority Key: MUS = Must Have (MVP); SHO = Should Have (v1.0); COUld = Could Have (post-launch)

2. Technology Stack

Layer	Technology	Justification
Embedded OS	Linux (Yocto/Buildroot 2024)	Minimal footprint, full control, wide hardware support
Embedded Language	Python 3.11 + asyncio	Rapid prototyping, async I/O for audio/network
Wake Word	Porcupine SDK (on-device)	Privacy-first, no cloud dependency, <5mW
Local STT	Whisper.cpp (tiny/base)	Offline fallback, acceptable accuracy for simple commands
Cloud STT	Google Cloud Speech / AWS Transcribe	High accuracy, multi-language, child voice models
LLM	Anthropic Claude 3 Sonnet (primary), GPT-4o (fallback)	Safety features, quality, cost balance
Orchestration	LangChain + LangGraph	Narrative state machine, tool use, memory integration
Vector DB	pgvector (PostgreSQL) / Pinecone	Story memory, semantic search, low-latency retrieval
TTS	ElevenLabs (primary), Azure Cognitive Services (fallback)	Natural child-friendly voices, low latency
Music	MusicGen (HuggingFace) + S3 curated library	Dynamic generation + reliable fallback
Image Generation	DALL-E 3 / Stable Diffusion XL	Quality illustrations, child-safe content filters
Backend Language	Python 3.11	Async, fast, typed, excellent ecosystem
Backend Framework	FastAPI + Pydantic v2	Auto OpenAPI docs, validation, performance
Message Queue	RabbitMQ / AWS SQS	Async task dispatch for generation services
Primary DB	PostgreSQL 16	ACID, pgvector, mature, excellent managed options
Cache	Redis 7	Session tokens, TTS cache, rate limiting
Time-Series DB	InfluxDB 2 / TimescaleDB	Sleep/motion sensor data
Analytics	ClickHouse	High-volume anonymised event analytics
Object Storage	AWS S3 / Cloudflare R2	Illustrations, audio cache, firmware bins
CDN	Cloudflare	Global low-latency media delivery
Container Runtime	Docker + Kubernetes (EKS/GKE)	Scalable microservices, managed K8s
CI/CD	GitHub Actions + ArgoCD	GitOps, automated deploy, rollback
Monitoring	Prometheus + Grafana + Loki	Metrics, dashboards, log aggregation
Alerting	PagerDuty + Slack	On-call rotation, incident management
Mobile	React Native + Expo	Cross-platform iOS + Android, code sharing
Auth	Keycloak + JWT	OIDC/OAuth2, SSO, parental consent flows
IaC	Terraform + Helm	Reproducible infra, version-controlled deployments

3. Infrastructure Architecture

The platform is deployed on AWS (primary) with GCP as a failover/multi-cloud option. All services run containerised on Kubernetes. Environments are separated into dev, staging, and production.

Service	AWS Service	Sizing (initial)	Scaling
Kubernetes Cluster	EKS (Kubernetes 1.30)	3 × m6i.xlarge nodes	Auto-scale 3–20 nodes
Relational DB	RDS PostgreSQL 16	db.t3.large (Multi-AZ)	Read replicas at 10K DAU
Cache	ElastiCache Redis 7	cache.t3.medium (cluster)	Scale with session volume
Message Queue	Amazon SQS	Standard queues	Managed, auto-scales
Object Storage	S3 Standard + Intelligent Tiering	Unlimited	Managed
CDN	CloudFront + Cloudflare	Global PoPs	Managed
Container Registry	ECR	Private repos per service	Managed
Secrets	AWS Secrets Manager	Per-service secrets	Managed
DNS & Load Balancer	Route53 + ALB	Regional ALB	Managed, auto-scales
Monitoring	CloudWatch + managed Prometheus Workspace	—	Managed
Log Aggregation	CloudWatch Logs + Loki (Grafana Cloud)	Retained 30 days	Managed
CI/CD Runners	GitHub Actions (managed)	8-core runners	Managed
Analytics DB	ClickHouse Cloud (startup tier)	2 shards	Scale with data volume

4. AI API Cost Estimates

Assumptions: 20 minutes of average daily usage per device, 30 days per month, and ~1,200 LLM tokens per story exchange. Estimates are for 10,000 Monthly Active Devices (MAD).

Service	Volume / month (10K MAD)	Unit Cost	Monthly Total
LLM (Claude Sonnet / GPT-4o)	360M tokens in + 90M out	$3 in / $15 out per 1M tokens	~$4,700
STT (Google Cloud Speech)	~6,000 hours audio	$0.016/min	~$5,760
TTS (ElevenLabs Pro)	~12,000 hours audio	$0.18/1K chars	~$3,200
DALL-E 3 (illustrations)	~200K images	$0.04/image	~$8,000
AWS Infra (K8s + DB + S3)	Fixed + variable	—	~$2,500
CDN + Storage	Media delivery	$0.02/GB out	~$800
Total estimated			~$25,000/mo
Per active device			~$2.50/device/mo
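
The roll-up in the table can be reproduced with a few lines of arithmetic. The sketch below sums the listed monthly totals and recomputes the rows whose volume and unit cost are both stated in a directly multipliable form. The variable names are illustrative, and all figures are taken from the table.

```python
# Monthly totals as listed in the table (USD, at 10K MAD).
MONTHLY_TOTALS_USD = {
    "LLM": 4_700,
    "STT": 5_760,
    "TTS": 3_200,
    "Images": 8_000,
    "AWS infra": 2_500,
    "CDN + storage": 800,
}
DEVICES = 10_000

total = sum(MONTHLY_TOTALS_USD.values())  # 24,960, rounded to ~$25,000 in the table
per_device = total / DEVICES              # about 2.50 USD per device per month

# Rows that can be recomputed from volume x unit cost.
stt_cost = 6_000 * 60 * 0.016             # hours -> minutes -> USD
image_cost = 200_000 * 0.04               # images x USD per image
llm_at_listed_rates = 360 * 3 + 90 * 15   # millions of tokens x USD per 1M

print(f"total={total}, per_device={per_device:.2f}")
print(f"stt={stt_cost:.0f}, images={image_cost:.0f}, llm={llm_at_listed_rates}")
```

The STT and image rows match the table. At the listed $3/$15 rates, the stated token volumes alone come to about $2,430, so the table's ~$4,700 LLM figure presumably includes overhead that the table does not itemise.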

Gross AI+Infra Margin at 10K MAD: ~64%
Margin at 50K MAD (volume discounts): >75%
Premium Subscription Price: €7/mo

Unit Economics Note: At the €7/mo Premium subscription price, gross AI+infra margin is ~64% at 10K MAD. At 50K MAD, volume discounts on LLM and STT push the margin above 75%. Cost per device falls significantly when TTS and simple ASR tasks move to local models.
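
The margin figures follow from a simple ratio. The sketch below assumes the €7 price and the ~$2.50 per-device cost are compared at parity (no currency conversion), which is the only reading under which the stated ~64% is reproduced; the function names are hypothetical.

```python
def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of price."""
    return (price - cost) / price


def max_cost_for_margin(price: float, target_margin: float) -> float:
    """Highest per-device cost that still achieves target_margin."""
    return price * (1 - target_margin)


margin_10k = gross_margin(7.00, 2.50)           # about 0.643
cost_ceiling_75 = max_cost_for_margin(7.00, 0.75)

print(f"margin at 10K MAD: {margin_10k:.1%}")
print(f"cost ceiling for 75% margin: {cost_ceiling_75:.2f}")
```

Under the same parity assumption, a margin above 75% at 50K MAD requires the per-device cost to fall below about 1.75 per month.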