Cost Modeling & FinOps

IoT deployments have a cost structure that is fundamentally different from web applications. Costs scale with device count, message frequency, and data retention — not user count or API calls. Surprises at invoice time are common because the cost model is not intuitive until you have run it at scale. The most common shock: at 10,000+ devices, the broker connection cost dominates everything else on managed IoT services, and teams that sized based on message count alone find their bill 10× higher than expected.

24.1 Per-Device Cost Breakdown

The following estimates use 2024/2025 list pricing. Actual costs vary by region, committed use discounts, and negotiated contracts. Use as a starting point for budget models, not as final numbers.

Assumptions: 10 messages/device/minute, 200 bytes/message, 90 days raw data retention, 2 years aggregate retention.

| Cost Component | 1,000 Devices | 10,000 Devices | 100,000 Devices |
|---|---|---|---|
| Managed broker (AWS IoT Core) | ~$0.90/device/month | ~$0.90/device/month | ~$0.90/device/month |
| Self-hosted broker (EMQX on k8s, 3× m5.xlarge) | ~$1.80/device/month | ~$0.18/device/month | ~$0.06/device/month |
| Time-series storage (TimescaleDB on managed Postgres) | ~$0.15/device/month | ~$0.12/device/month | ~$0.10/device/month |
| Data transfer egress (to dashboards/API) | ~$0.05/device/month | ~$0.04/device/month | ~$0.03/device/month |
| Ingestion compute (Kafka + workers) | ~$0.50/device/month | ~$0.08/device/month | ~$0.03/device/month |
| Total (managed broker) | ~$1.60/device/month | ~$1.14/device/month | ~$1.06/device/month |
| Total (self-hosted broker) | ~$2.50/device/month | ~$0.42/device/month | ~$0.22/device/month |

For the broker component, self-hosted becomes cheaper than managed at roughly 2,000 devices once operations overhead is included (Section 24.2 works through the numbers). For storage, self-hosted TimescaleDB on reserved instances becomes cheaper than managed Postgres around 2,000–5,000 devices.

24.2 AWS IoT Core vs Self-Hosted EMQX: TCO at 10,000 Devices

Detailed calculation at 10,000 devices, 10 messages/device/minute:

Monthly message volume: 10,000 devices × 10 messages/min × 60 min/hr × 24 hr/day × 30 days = 4,320,000,000 messages/month

AWS IoT Core pricing (us-east-1, 2025):

- Messaging: 4,320M × $0.08/1M = $345.60/month
- Connection: $0.0012/device-hour × 10,000 devices × 720 hours = $8,640/month
- Rules engine (if used): additional per-rule-execution cost
- Total AWS IoT Core: ~$8,986/month (~$0.90/device/month)

Connection pricing dominates at large device count. AWS IoT Core connection cost alone exceeds the total self-hosted cost at this scale.

Self-hosted EMQX (3× m5.xlarge, us-east-1):

- EC2 (3× m5.xlarge reserved 1yr): ~$400/month
- EBS storage: ~$60/month
- Load balancer: ~$50/month
- Operations overhead (estimate 0.1 FTE at $150k/year): ~$1,250/month
- Total self-hosted EMQX: ~$1,760/month (~$0.18/device/month)

At 10,000 devices, self-hosted EMQX costs roughly one-fifth as much as AWS IoT Core. The crossover point (where managed cost equals self-hosted cost, including operations overhead) is approximately 2,000 devices. Below this, managed services are cheaper once the operational burden is accounted for.
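The comparison above can be reproduced with a few lines of arithmetic. The sketch below is a minimal model, not a billing calculator: the rates are the ones quoted in this section and should be checked against the provider's current price list, the function names are illustrative, and the self-hosted bill is assumed to stay flat as devices are added, which only holds while the 3-node cluster has headroom.

```python
HOURS_PER_MONTH = 24 * 30                 # 720 hours, the 30-day month used above
MINUTES_PER_MONTH = 60 * HOURS_PER_MONTH  # 43,200 minutes


def monthly_messages(devices: int, msgs_per_device_per_min: int = 10) -> int:
    """Messages published by the whole fleet in a 30-day month."""
    return devices * msgs_per_device_per_min * MINUTES_PER_MONTH


def managed_broker_cost(devices: int,
                        msgs_per_device_per_min: int = 10,
                        usd_per_million_msgs: float = 0.08,
                        usd_per_device_hour: float = 0.0012) -> float:
    """Monthly managed-broker bill: messaging plus connection charges."""
    messaging = (monthly_messages(devices, msgs_per_device_per_min)
                 / 1e6 * usd_per_million_msgs)
    connection = usd_per_device_hour * devices * HOURS_PER_MONTH
    return messaging + connection


def self_hosted_broker_cost(ec2: float = 400.0,
                            ebs: float = 60.0,
                            load_balancer: float = 50.0,
                            fte_fraction: float = 0.1,
                            fte_salary: float = 150_000.0) -> float:
    """Monthly self-hosted bill; assumed flat regardless of device count."""
    operations = fte_fraction * fte_salary / 12
    return ec2 + ebs + load_balancer + operations


def crossover_devices() -> float:
    """Device count at which the managed bill equals the flat self-hosted bill."""
    managed_per_device = managed_broker_cost(1)
    return self_hosted_broker_cost() / managed_per_device


if __name__ == "__main__":
    managed = managed_broker_cost(10_000)
    hosted = self_hosted_broker_cost()
    print(f"managed:     ${managed:,.2f}/month")
    print(f"self-hosted: ${hosted:,.2f}/month")
    print(f"ratio:       {managed / hosted:.1f}x")
    print(f"crossover:   {crossover_devices():,.0f} devices")
```

With these inputs the managed bill is about five times the self-hosted one at 10,000 devices, and the crossover lands just under 2,000 devices. Changing `fte_fraction` is the quickest way to see how sensitive the crossover is to the operations estimate.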

24.3 Storage Cost Optimisation

| Data Tier | Volume (10,000 devices) | Storage Cost | Retention |
|---|---|---|---|
| Raw 1s telemetry (hot, TimescaleDB) | ~50 GB/day → 350 GB/week | ~$35/month (managed Postgres) | 7 days |
| Raw telemetry (cold, Parquet on S3 Standard) | ~50 GB/day → 1.5 TB/month | ~$34.50/month | 12 months |
| Raw telemetry (archive, S3 Glacier) | Historical accumulation | ~$4/TB/month | 7 years (compliance) |
| 1-min aggregates (TimescaleDB, 90 days) | ~3.4 GB/day → 306 GB/90d | ~$7.20/month | 90 days |
| 1-hr aggregates (TimescaleDB, permanent) | ~140 MB/day → 50 GB/year | ~$1.20/month/year | Permanent |

Recommended tiering strategy:

1. Raw data: 7 days on hot TimescaleDB storage (dashboards query this for recent views)
2. Raw data: export to Parquet on S3 Standard as chunks close (queryable with Athena)
3. Raw data: transition to S3 Glacier after 90 days (compliance archive)
4. 1-minute aggregates: retain 90 days in TimescaleDB (the primary dashboard query target)
5. 1-hour aggregates: retain permanently (negligible cost; essential for long-term trend analysis)
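Step 3 maps naturally onto an S3 lifecycle rule. The sketch below only builds the rule document; the `raw/` prefix and the rule ID are hypothetical, and expiring objects once the 7-year window ends is an assumption (some compliance regimes set a minimum retention without requiring deletion).

```python
import json

GLACIER_AFTER_DAYS = 90            # step 3: move raw Parquet to Glacier
ARCHIVE_RETENTION_DAYS = 7 * 365   # 7-year compliance window from the table


def raw_telemetry_lifecycle(prefix: str = "raw/") -> dict:
    """Build an S3 lifecycle configuration for the raw telemetry prefix.

    Both day counts are measured from object creation, so an object spends
    90 days in S3 Standard and the remainder of the 7 years in Glacier.
    """
    return {
        "Rules": [
            {
                "ID": "raw-telemetry-to-glacier",   # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},        # hypothetical key prefix
                "Transitions": [
                    {"Days": GLACIER_AFTER_DAYS, "StorageClass": "GLACIER"}
                ],
                # Assumption: delete once the compliance window has passed.
                "Expiration": {"Days": ARCHIVE_RETENTION_DAYS},
            }
        ]
    }


if __name__ == "__main__":
    # With boto3 this dict would be passed as LifecycleConfiguration to
    # s3_client.put_bucket_lifecycle_configuration(Bucket=..., ...).
    print(json.dumps(raw_telemetry_lifecycle(), indent=2))
```

Keeping the rule as data rather than clicking it together in a console means the 90-day and 7-year figures live in version control next to the cost model that depends on them.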

Applying this tiering, the blended storage cost for 10,000 devices is approximately $80–120/month rather than $3,500+/month if all raw data were kept in TimescaleDB permanently.
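To see why untiered storage gets expensive, it helps to model how the bill grows month by month. The sketch below is a rough illustration: the $0.10/GB-month rate is inferred from the hot-tier row (350 GB for about $35/month) rather than taken from a price list, and it ignores compression and database compute.

```python
RAW_GB_PER_DAY = 50.0          # raw telemetry volume from the table
HOT_USD_PER_GB_MONTH = 0.10    # assumption: implied by 350 GB costing ~$35/month
HOT_RETENTION_DAYS = 7         # hot-tier retention from the tiering strategy


def all_hot_monthly_cost(months_elapsed: int) -> float:
    """Monthly storage bill after N 30-day months if no raw data is ever moved."""
    stored_gb = RAW_GB_PER_DAY * 30 * months_elapsed
    return stored_gb * HOT_USD_PER_GB_MONTH


def tiered_hot_monthly_cost() -> float:
    """Monthly bill for the hot tier once it is capped at 7 days of raw data."""
    return RAW_GB_PER_DAY * HOT_RETENTION_DAYS * HOT_USD_PER_GB_MONTH


def months_until(threshold_usd: float) -> int:
    """First month in which the all-hot bill reaches the given threshold."""
    months = 1
    while all_hot_monthly_cost(months) < threshold_usd:
        months += 1
    return months


if __name__ == "__main__":
    print(f"hot tier, capped at 7 days: ${tiered_hot_monthly_cost():.2f}/month")
    print(f"all raw data hot, month 12: ${all_hot_monthly_cost(12):,.2f}/month")
    print(f"months to reach $3,500:     {months_until(3500)}")
```

With these inputs the all-hot bill reaches $3,500/month in month 24 and keeps climbing, while the capped hot tier stays flat at about $35/month.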

24.4 FinOps Practices for IoT

Resource tagging: Tag every cloud resource with: tenant_id (for multi-tenant), site, device_type, and environment (prod/staging/dev). This enables cost allocation reports by customer, by site, and by device type — essential for chargeback models and for identifying cost outliers.
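A tagging policy is only useful if it is enforced, for example in CI or in a periodic audit of the resource inventory. The following is a minimal sketch of such a check; it treats all four tags as mandatory, which is an assumption (single-tenant deployments may not need `tenant_id`), and it takes tags as a plain dictionary rather than calling any cloud API.

```python
REQUIRED_TAGS = ("tenant_id", "site", "device_type", "environment")
ALLOWED_ENVIRONMENTS = {"prod", "staging", "dev"}


def tag_problems(tags: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems with a resource's tags."""
    # A tag that is absent or set to an empty string counts as missing.
    problems = [f"missing tag: {key}" for key in REQUIRED_TAGS if not tags.get(key)]

    env = tags.get("environment")
    if env and env not in ALLOWED_ENVIRONMENTS:
        problems.append(f"unexpected environment: {env}")
    return problems


if __name__ == "__main__":
    # Hypothetical resource tags, as they might come from an inventory export.
    resource_tags = {"site": "plant-1", "environment": "qa"}
    for problem in tag_problems(resource_tags):
        print(problem)
```

Returning a list of problems instead of a boolean makes the audit output directly actionable for whoever owns the untagged resource.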

Per-service budget alerts: Set separate budget alerts for each major service component (Kafka cluster, TimescaleDB, broker, egress) rather than a single total budget. A Kafka cost spike usually points to a consumer that is lagging and replaying messages; a storage spike usually points to a device sending faster than expected. A single aggregate alert masks these root causes.

Cost per device KPI: Track total_cloud_cost / active_device_count monthly. Alert if it rises more than 20% month-over-month. Because the KPI is normalised by device count, fleet growth alone should not push it up, so a jump signals a real change in per-device behaviour. Unexpected cost increases are almost always caused by: a device stuck in a fast-publish loop, a consumer replaying a large Kafka partition, or a new device type with a much higher message rate than the fleet average.
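The KPI check is simple enough to run as a scheduled job against billing exports. A minimal sketch, assuming monthly totals are already available; the `MonthlySnapshot` type and `kpi_alert` function are illustrative names, not part of any billing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlySnapshot:
    total_cloud_cost: float     # USD for the month
    active_device_count: int    # devices that reported during the month

    @property
    def cost_per_device(self) -> float:
        if self.active_device_count <= 0:
            raise ValueError("active_device_count must be positive")
        return self.total_cloud_cost / self.active_device_count


def kpi_alert(previous: MonthlySnapshot,
              current: MonthlySnapshot,
              threshold: float = 0.20) -> bool:
    """True when cost per device rose by more than `threshold` month-over-month."""
    change = current.cost_per_device / previous.cost_per_device - 1.0
    return change > threshold


if __name__ == "__main__":
    # Hypothetical figures: cost grows faster than the fleet.
    last_month = MonthlySnapshot(total_cloud_cost=4_200.0, active_device_count=10_000)
    this_month = MonthlySnapshot(total_cloud_cost=5_500.0, active_device_count=10_500)
    print("alert" if kpi_alert(last_month, this_month) else "ok")
```

Dividing by the active device count first means a month in which cost and fleet size grow in step does not raise an alert.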

Message loop detection: A device republishing its own messages is a common firmware bug that can generate millions of messages within hours. Set a broker-side rate limit per client_id: maximum 1,000 messages/minute is generous for any industrial sensor — anything above this is either a bug or a security incident. Log and alert when the rate limit is hit; do not silently drop — the device team needs to know.
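The per-client limit can be illustrated with a sliding-window counter. This is a sketch of the logic only: production brokers generally ship their own rate-limiting settings, which should be preferred, and the class and logger names here are hypothetical. The caller passes in the current time (for example from `time.monotonic()`), which keeps the example deterministic.

```python
import logging
from collections import defaultdict, deque

logger = logging.getLogger("broker.ratelimit")  # hypothetical logger name


class PublishRateMonitor:
    """Sliding-window publish limit per client_id, logging every rejection."""

    def __init__(self, max_messages: int = 1000, window_seconds: float = 60.0):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        # client_id -> timestamps of accepted messages inside the window
        self._accepted = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        """Return True if the message is within the limit, False otherwise."""
        window = self._accepted[client_id]

        # Drop timestamps that have aged out of the window.
        cutoff = now - self.window_seconds
        while window and window[0] <= cutoff:
            window.popleft()

        if len(window) >= self.max_messages:
            # Never drop silently: the device team needs to see this.
            logger.warning(
                "rate limit hit: client_id=%s exceeded %d messages in %.0fs",
                client_id, self.max_messages, self.window_seconds,
            )
            return False

        window.append(now)
        return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    monitor = PublishRateMonitor(max_messages=3, window_seconds=60.0)
    for second in range(5):
        print(second, monitor.allow("sensor-17", now=float(second)))
```

Each client holds at most `max_messages` timestamps, so memory stays bounded even when a device is stuck in a publish loop.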