In this final article, I connect the embedded AI device to the real world. I describe how inference results leave the MCU, travel through MQTT and HTTP, land in a Rust backend, and become visible through a web dashboard.

This is the layer where a TinyML prototype turns into an actual end-to-end product.


    From inference to communication: Wi-Fi, MQTT, and reliability

Once inference works reliably on-device, the next challenge is not accuracy but communication.

The MCU operates in an unstable environment:

  • Wi-Fi may drop;
  • power can fluctuate;
  • the backend may be temporarily unavailable.

Because of that, the device communication layer is built around a few principles:

  • stateless inference, stateful delivery;
  • retry-safe messaging;
  • decoupling inference from transport.
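These principles can be sketched as a bounded hand-off between the inference loop and a transport task. The following std-only Rust sketch is illustrative: the struct fields, queue depth, and drop-newest policy are assumptions, not the firmware's actual code.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};
use std::thread;

// A compact inference result, independent of any transport format.
#[derive(Debug, Clone)]
struct Inference {
    device_id: u32,
    timestamp_ms: u64,
    score: f32,
}

/// Hand a result to the transport queue without ever blocking the
/// inference loop; on overflow (network down or slow) the newest
/// sample is dropped rather than stalling real-time work.
fn offer(tx: &SyncSender<Inference>, msg: Inference) -> bool {
    match tx.try_send(msg) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
    }
}

fn main() {
    // Bounded queue: inference stays stateless, delivery stays stateful.
    let (tx, rx) = sync_channel::<Inference>(16);

    // Transport task: in real firmware this would publish over MQTT,
    // retrying across reconnects; here it just drains and counts.
    let transport = thread::spawn(move || rx.iter().count());

    let mut accepted = 0;
    for i in 0..100u64 {
        if offer(&tx, Inference { device_id: 1, timestamp_ms: i, score: 0.9 }) {
            accepted += 1;
        }
    }
    drop(tx);
    let published = transport.join().unwrap();
    assert_eq!(accepted, published);
    println!("accepted {accepted}, published {published}");
}
```

The key property is that `offer` returns immediately in all cases: the inference loop never learns (or cares) whether the network is up.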

Why MQTT

MQTT is a natural fit for embedded AI devices:

  • low overhead;
  • persistent sessions;
  • QoS guarantees;
  • simple reconnect semantics.

The device publishes inference results as compact messages:

  • device ID;
  • timestamp;
  • prediction score;
  • optional debug metadata.
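A minimal serializer for such a message might look as follows. The field names are illustrative, not a fixed schema, and a real implementation would use a proper JSON library (e.g. serde_json) rather than string formatting, which does not escape values:

```rust
/// Build the compact JSON payload published over MQTT.
/// Field names ("device", "ts", "score", "debug") are assumptions.
fn payload(device_id: &str, timestamp_ms: u64, score: f32, debug: Option<&str>) -> String {
    let mut json = format!(
        r#"{{"device":"{}","ts":{},"score":{:.3}"#,
        device_id, timestamp_ms, score
    );
    if let Some(d) = debug {
        // Debug metadata is optional and omitted entirely when absent,
        // keeping the common-case message as small as possible.
        json.push_str(&format!(r#","debug":"{}""#, d));
    }
    json.push('}');
    json
}

fn main() {
    println!("{}", payload("esp32-01", 1_700_000_000_000, 0.87, None));
}
```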

If Wi-Fi drops, the MCU buffers messages and reconnects automatically. Inference never blocks on network I/O; this separation is critical for real-time behavior.
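Reconnection is usually paced with capped exponential backoff so a flapping network does not cause a retry storm. A small sketch (the base delay and cap are assumed values, not the firmware's constants):

```rust
use std::time::Duration;

/// Capped exponential backoff: attempt 0 waits `base`, attempt 1 waits
/// 2*base, and so on, never exceeding `max`. Overflow at high attempt
/// counts saturates to `max`.
fn backoff(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

fn main() {
    let base = Duration::from_millis(250);
    let max = Duration::from_secs(30);
    for attempt in 0..8 {
        println!("attempt {attempt}: retry in {:?}", backoff(attempt, base, max));
    }
}
```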


    Backend architecture: Rust, Actix Web, and data ingestion

On the server side, the backend acts as the central nervous system of the product.

Its responsibilities:

  • ingest data from devices via MQTT;
  • expose an HTTP API for clients;
  • store structured data in PostgreSQL;
  • handle authentication and authorization;
  • serve the frontend SPA.

Rust + Actix Web was a deliberate choice:

  • predictable performance;
  • strong typing across the entire stack;
  • explicit async behavior.

MQTT ingestion as a background task

The MQTT listener runs as a dedicated async task inside the backend process:

  • subscribes to device topics;
  • validates incoming payloads;
  • writes inference data into the database.
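That loop can be sketched with std threads, using a channel in place of the MQTT subscription and an in-memory vector in place of PostgreSQL. The real backend would run this as an async task with an MQTT client crate; the payload format ("device:score") and names here are purely illustrative:

```rust
use std::sync::mpsc;
use std::thread;

struct Prediction {
    device: String,
    score: f32,
}

/// Validate a raw payload of the (assumed) form "device:score".
/// Malformed or out-of-range messages are rejected, not stored.
fn validate(raw: &str) -> Option<Prediction> {
    let (device, score) = raw.split_once(':')?;
    let score: f32 = score.parse().ok()?;
    if device.is_empty() || !(0.0..=1.0).contains(&score) {
        return None;
    }
    Some(Prediction { device: device.to_string(), score })
}

fn main() {
    let (tx, rx) = mpsc::channel::<String>();

    // Ingestion runs independently of any HTTP client: it keeps
    // writing rows whether or not a user is connected.
    let ingest = thread::spawn(move || {
        let mut db: Vec<Prediction> = Vec::new();
        for raw in rx {
            if let Some(p) = validate(&raw) {
                db.push(p);
            }
        }
        db.len()
    });

    for raw in ["cam-01:0.93", "garbage", "cam-02:0.40"] {
        tx.send(raw.to_string()).unwrap();
    }
    drop(tx);
    println!("stored {} rows", ingest.join().unwrap());
}
```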

Because MQTT and HTTP are decoupled, ingestion continues even if no users are connected.

This pattern avoids the classic IoT anti-pattern of "HTTP from devices".


    API layer and authorization model

The HTTP API exposes structured access to collected data.

Key design decisions:

  • JWT-based authentication;
  • protected routes via middleware;
  • strict separation between public and private endpoints.

In Actix Web this maps cleanly to scoped routes:

  • /api/auth/*: login, profile, token refresh;
  • /api/predictions/*: protected inference data access.

Middleware enforces authorization centrally rather than per-handler, which keeps business logic clean and auditable.
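The decision logic behind such middleware can be reduced to one pure function, which is what makes it auditable. This sketch is not Actix Web's middleware API itself; the route prefixes mirror the layout above, and the token check stands in for real JWT verification:

```rust
/// Decide, in one place, whether a request path requires an
/// authenticated user. Only /api/predictions/* is protected here.
fn requires_auth(path: &str) -> bool {
    path.starts_with("/api/predictions/")
}

/// Central authorization check: public routes pass through, protected
/// routes require a bearer token. Returns an HTTP status on rejection.
fn authorize(path: &str, bearer_token: Option<&str>) -> Result<(), u16> {
    if !requires_auth(path) {
        return Ok(());
    }
    match bearer_token {
        // Stand-in for JWT signature and expiry verification.
        Some(t) if !t.is_empty() => Ok(()),
        _ => Err(401),
    }
}

fn main() {
    assert_eq!(authorize("/api/auth/login", None), Ok(()));
    assert_eq!(authorize("/api/predictions/latest", None), Err(401));
    assert_eq!(authorize("/api/predictions/latest", Some("token")), Ok(()));
    println!("authorization rules hold");
}
```

Because every handler sits behind this single check, adding a new protected route means extending one function, not sprinkling checks through business logic.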


    Backend as a deployment unit

The backend is built and deployed as a minimal container.

The Docker setup follows a multi-stage pattern:

  • Rust build stage with cached dependencies;
  • slim Debian runtime with only OpenSSL and certificates.
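A Dockerfile following that pattern might look like this. The binary name, base-image tags, and the dummy-main dependency-caching trick are illustrative assumptions, not the project's exact file:

```dockerfile
# Build stage: compile with cached dependencies.
FROM rust:1.79 AS builder
WORKDIR /app
# Build dependencies against a dummy main so this layer is cached
# until Cargo.toml/Cargo.lock change.
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs && cargo build --release
COPY src ./src
RUN touch src/main.rs && cargo build --release

# Runtime stage: only the binary, OpenSSL, and CA certificates.
FROM debian:bookworm-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends libssl3 ca-certificates \
    && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/backend /usr/local/bin/backend
CMD ["backend"]
```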

This yields:

  • fast builds;
  • small images;
  • reproducible deployments.

Docker Compose ties everything together:

  • PostgreSQL for persistence;
  • Mosquitto as the MQTT broker;
  • backend API;
  • frontend static server.
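Those four services translate into a small Compose file along these lines (image tags, ports, and paths are illustrative; credentials belong in secrets, not inline):

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # placeholder; use secrets in production
    volumes:
      - pgdata:/var/lib/postgresql/data
  mqtt:
    image: eclipse-mosquitto:2
    ports:
      - "1883:1883"
  backend:
    build: ./backend
    depends_on: [db, mqtt]
    ports:
      - "8080:8080"
  frontend:
    image: nginx:alpine
    volumes:
      - ./frontend/dist:/usr/share/nginx/html:ro
    ports:
      - "80:80"

volumes:
  pgdata:
```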

At this point, the system can be deployed locally or on a single VPS without modification.


    Frontend dashboard: making data visible

Raw inference data is useless without interpretation.

The dashboard provides:

  • a login-protected UI;
  • real-time and historical predictions;
  • device-level visibility;
  • documentation and onboarding pages.

The frontend is a React SPA served as static assets via Nginx.

Routing is handled client-side:

  • core pages (login, dashboard, purchase);
  • static legal pages;
  • documentation sections;
  • graceful SPA fallback for unknown routes.
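The graceful fallback is a few lines of Nginx configuration (the root path is an assumption): unknown routes are rewritten to `index.html`, letting the client-side router take over.

```nginx
location / {
    root /usr/share/nginx/html;
    # Serve real files first; otherwise fall back to the SPA shell
    # so client-side routing can resolve the path.
    try_files $uri $uri/ /index.html;
}
```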

Because the frontend assets are served from the same deployment as the API, the system behaves as a single cohesive application from the user’s perspective.


    System boundaries and responsibilities

At this stage, the architecture has clear boundaries:

  • MCU

    • real-time inference;
    • sensor interaction;
    • unreliable network handling.
  • MQTT

    • transport layer;
    • buffering and delivery guarantees.
  • Backend

    • data ingestion;
    • persistence;
    • authorization;
    • API surface.
  • Frontend

    • visualization;
    • analysis;
    • user interaction.

Each layer can evolve independently without breaking the others; this is what turns a demo into a maintainable product.


    Closing thoughts

This project deliberately spans:

  • TinyML and quantization;
  • low-level firmware concerns;
  • async backend design;
  • frontend product thinking.

The key insight is simple but often missed:

An embedded AI model is only valuable when it is part of a system.

By treating communication, backend, and UI as first-class engineering problems, the device stops being "just a model on a chip" and becomes a real, deployable AI product.

This concludes the end-to-end pipeline: from camera frames and INT8 tensors to dashboards and users.