
Architecture

System Overview

Inbound Flow

[External MTA] ──► [SES Inbound] ──► [S3: raw/] ──► [SQS: inbound]
                                            │               │
                                            │               ▼
                                            │      [Worker Service]
                                            │               │
                                            │               ▼
                                            └───►  [Postgres]
                                                       │
                                                       ▼
                                               [Webhook dispatch]
                                                       │
                                                       ▼
                                               [External systems]

Outbound Flow

[Client] ──► [API Service] ──► [Postgres] ──► [SQS: outbound]
                  │                                  │
                  │                                  ▼
                  │                          [Worker Service]
                  │                                  │
                  │                                  ▼
                  │                          [SES Outbound]
                  │                                  │
                  │                                  ▼
                  │                          [External MTA]
                  │
                  │         [SES Notifications] ──► [SQS: telemetry]
                  │                                       │
                  │                                       ▼
                  │                               [Worker Service]
                  │                                       │
                  └──────────────────────────────────────► [Postgres]
                                                             │
                                                             ▼
                                               [Webhook dispatch]

Deployment Architecture

                       ┌──────────────────────────────────────────┐
                       │                 AWS                       │
                       │                                           │
  Callers ────HTTPS───▶│  ALB ──▶ ECS (API) ──▶ RDS (Postgres)   │
                       │                   ╲                       │
                       │                    ──▶ SQS (outbound)     │
                       │                          │                │
  Internet  ──email───▶│  SES ──▶ S3 ──▶ SQS (inbound) ──▶ ECS (Worker)
                       │                    ╲──▶ SQS (telemetry)  │
                       │                                           │
                       │  S3 (raw + attachments)                   │
                       │  ECR  (container images)                  │
                       │  ACM  (TLS certificate)                   │
                       │  CloudWatch (logs + alarms)               │
                       └──────────────────────────────────────────┘

NOTE

The ALB terminates HTTPS using an ACM certificate. See Infrastructure for details.

Two ECS Fargate services run from a shared Docker image built with multi-stage targets:

Service         Target  Role
mailman-api     api     Stateless HTTP server, handles REST requests
mailman-worker  worker  Background processor, polls SQS, processes inbound email and delivery events

Both services share the same PostgreSQL database, S3 bucket, and SES configuration. They scale independently: the API on CPU/memory, the worker on SQS queue depth.
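As a rough illustration of scaling on queue depth, the sketch below derives a worker task count from the number of visible messages. The function name, the per-task backlog target, and the min/max bounds are all hypothetical; the actual autoscaling policy is defined in the infrastructure configuration and may work differently.

```rust
/// Hypothetical sizing rule: one worker task per `per_task` queued messages,
/// clamped to the range [min, max]. All parameters are illustrative assumptions.
pub fn desired_workers(queue_depth: u64, per_task: u64, min: u64, max: u64) -> u64 {
    assert!(per_task > 0, "per_task must be positive");
    assert!(min <= max, "min must not exceed max");
    // Ceiling division without risking overflow on large depths.
    let needed = queue_depth / per_task + u64::from(queue_depth % per_task != 0);
    needed.clamp(min, max)
}
```

With a target of 100 messages per task and bounds of 1 to 10, a backlog of 250 messages would ask for 3 tasks, while an empty queue stays at the minimum of 1.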

API Service

Stateless Axum HTTP server. Listens on port 8080. Handles REST API requests: validates input, stores message records, queues outbound email on SQS for asynchronous delivery, and returns immediately. Does not poll any queues.

Middleware Stack

Applied outermost to innermost:

  1. Trace: assigns a UUID trace ID to every request and attaches it to logs and response headers. Outermost so that every request, including auth failures, gets a trace ID for debugging.
  2. Auth: validates the API key from the Authorization: Bearer <token> header. Runs before rate limiting so that limits are bound to a verified identity, not to spoofable headers.
  3. Rate limit: per-key rate limiting via governor (token bucket, 100 req/min by default). Innermost, so each limit applies to an authenticated caller.
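The ordering above can be modelled without any web framework. The following is a minimal sketch, not the actual Axum/tower wiring: `Pipeline`, `Outcome`, the integer trace ID (standing in for a UUID), and the fixed per-caller budget (standing in for governor's limiter) are all hypothetical names and simplifications.

```rust
use std::collections::HashMap;

/// Result of pushing one request through the sketch pipeline.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Ok { trace_id: u64, caller: String },
    Unauthorized { trace_id: u64 },
    RateLimited { trace_id: u64, caller: String },
}

/// Hypothetical stand-in for the stack: Trace, then Auth, then Rate limit.
pub struct Pipeline {
    next_trace_id: u64,              // counter used in place of a UUID generator
    keys: HashMap<String, String>,   // bearer token -> caller identity
    remaining: HashMap<String, u32>, // caller -> requests left in the current window
    limit: u32,
}

impl Pipeline {
    pub fn new(keys: &[(&str, &str)], limit: u32) -> Self {
        Pipeline {
            next_trace_id: 0,
            keys: keys.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect(),
            remaining: HashMap::new(),
            limit,
        }
    }

    pub fn handle(&mut self, authorization: Option<&str>) -> Outcome {
        // 1. Trace: runs first, so even rejected requests carry a trace ID.
        self.next_trace_id += 1;
        let trace_id = self.next_trace_id;

        // 2. Auth: resolve the bearer token to a verified caller identity.
        let caller = match authorization
            .and_then(|header| header.strip_prefix("Bearer "))
            .and_then(|token| self.keys.get(token))
        {
            Some(identity) => identity.clone(),
            None => return Outcome::Unauthorized { trace_id },
        };

        // 3. Rate limit: keyed by the verified identity, never by raw headers.
        let limit = self.limit;
        let left = self.remaining.entry(caller.clone()).or_insert(limit);
        if *left == 0 {
            return Outcome::RateLimited { trace_id, caller };
        }
        *left -= 1;
        Outcome::Ok { trace_id, caller }
    }
}
```

In this model an unauthenticated request still receives a trace ID but never consumes any caller's budget, which is the property the ordering is meant to guarantee.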

Endpoints

High-level categories:

  • Email operations — sending (queued via SQS), messages, threads, attachments
  • Configuration — domains, inboxes, auth keys, webhook endpoints
  • Observability — deliverability reports (DMARC), health checks (unauthenticated)

See openapi.yaml for the complete API surface and crates/http/src/routes/ for handler implementations.

Worker Service

Long-running background processor. Runs five concurrent polling loops, driven by tokio::select!:

Loop                 Source                   Action
Inbound              SQS inbound queue        Parse MIME, scan for malware, resolve thread, store message, emit message.received webhook.
Outbound             SQS outbound queue       Fetch attachments from S3, build MIME, send via SES, record delivery status.
Telemetry            SQS telemetry queue      Process SES delivery/bounce/complaint notifications, update delivery status, manage suppression list.
Webhook              SQS webhooks.fifo queue  Dispatch pending webhook deliveries with exponential backoff retries.
Domain Verification  60-second interval       Poll SES for domain identity verification status updates (MX, SPF, DKIM, DMARC).
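The Webhook loop retries with exponential backoff. A minimal sketch of such a schedule follows; the function name, the doubling factor, and the base and cap values used in the example are assumptions, since the real retry parameters are not stated here.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`, capped at `max`.
/// Hypothetical helper; the worker's real schedule may differ (for example, by adding jitter).
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate the multiplier instead of overflowing on large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // If the multiplication itself overflows, fall back to the cap.
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}
```

With a 1-second base and a 60-second cap, attempts 0 through 3 wait 1, 2, 4, and 8 seconds, and every attempt from 6 onward waits the full 60 seconds.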

Characteristics:

  • Stateless (all state in Postgres/S3)
  • Horizontally scalable (SQS handles distribution)
  • Runs on ECS Fargate
  • No exposed ports

Data Stores

PostgreSQL (RDS)

Primary persistent store for message metadata, thread state, delivery tracking, webhook configurations, suppression list, and auth keys.

See Schema Reference for full DDL.

S3

Blob storage for raw MIME messages (raw/ prefix, 365-day retention) and extracted attachments (attachments/ prefix, 730-day retention).

SQS

Message queues for async processing:

Queue          Purpose                                             DLQ
inbound        Incoming email notifications from SES               inbound-dlq
outbound       Pending outbound messages for SES delivery          outbound-dlq
telemetry      SES delivery notifications                          telemetry-dlq
webhooks.fifo  Webhook event delivery (FIFO, grouped by thread_id) webhooks-dlq.fifo
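Each queue is paired with a dead-letter queue. The sketch below models what happens to a message after one processing attempt, assuming the standard SQS redrive behaviour in which a message moves to the DLQ once its receive count exceeds the queue's maxReceiveCount. In practice SQS performs the move itself; the consumer only chooses whether to delete the message. The `Disposition` type and the function are hypothetical.

```rust
/// What happens to a message after one processing attempt (illustrative model).
#[derive(Debug, PartialEq)]
pub enum Disposition {
    /// Processing succeeded: the consumer deletes the message.
    Delete,
    /// Processing failed: the message reappears after the visibility timeout.
    RetryAfterTimeout,
    /// Processing failed on the last allowed receive: the next receive
    /// would exceed maxReceiveCount, so the message goes to the DLQ.
    DeadLetter,
}

pub fn disposition(succeeded: bool, receive_count: u32, max_receive_count: u32) -> Disposition {
    if succeeded {
        Disposition::Delete
    } else if receive_count >= max_receive_count {
        Disposition::DeadLetter
    } else {
        Disposition::RetryAfterTimeout
    }
}
```

For example, with a maxReceiveCount of 3 (an assumed value), a message that fails on its first and second receive is retried, and a failure on the third receive sends it to the DLQ.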

Crate Organization

Mailman uses a hexagonal (ports & adapters) architecture. Domain logic lives in core with no external dependencies. Adapters implement traits defined in core. See Crate Map for the full breakdown.

bin-api ──► http ──► core
                 ──► adapters-aws
                 ──► adapters-postgres
                 ──► config

bin-worker ──► core
           ──► adapters-aws
           ──► adapters-postgres
           ──► adapters-scan
           ──► config
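To make the ports-and-adapters split concrete, here is a minimal sketch of a port defined alongside domain logic and an adapter that implements it. The names `Message`, `MessageStore`, `store_message`, and `InMemoryStore` are hypothetical and do not reflect the actual traits in core; an in-memory map stands in for an adapter such as adapters-postgres.

```rust
use std::collections::HashMap;

// core side: domain types and the port (a trait), with no external dependencies.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub id: u64,
    pub subject: String,
}

pub trait MessageStore {
    fn save(&mut self, message: Message);
    fn find(&self, id: u64) -> Option<Message>;
}

/// Domain logic depends only on the trait, never on a concrete adapter.
pub fn store_message<S: MessageStore>(store: &mut S, id: u64, subject: &str) -> Result<(), String> {
    if subject.trim().is_empty() {
        return Err("subject must not be empty".to_string());
    }
    store.save(Message { id, subject: subject.to_string() });
    Ok(())
}

// adapter side: one implementation of the port. A real adapter would talk to
// Postgres; this one keeps rows in memory for illustration.
#[derive(Default)]
pub struct InMemoryStore {
    rows: HashMap<u64, Message>,
}

impl MessageStore for InMemoryStore {
    fn save(&mut self, message: Message) {
        self.rows.insert(message.id, message);
    }

    fn find(&self, id: u64) -> Option<Message> {
        self.rows.get(&id).cloned()
    }
}
```

Because `store_message` is generic over the trait, the same domain function can run against an in-memory store in tests and a database-backed store in the binaries.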

Error Handling Strategy

Domain Errors

Defined in core::error::DomainError, mapped to HTTP in crates/http/src/error.rs:

Error               Status Code                Description
Validation          400 Bad Request            Invalid input ({ field, reason })
NotFound            404 Not Found              Resource doesn't exist ({ resource, id })
AttachmentTooLarge  413 Payload Too Large      Exceeds size limit ({ size_bytes, limit_bytes })
Storage             500 Internal Server Error  S3/database failures
Delivery            500 Internal Server Error  SES/SMTP failures

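Infrastructure errors are mapped at adapter boundaries (AWS SDK → DomainError::Storage/Delivery, SQLx → Storage/NotFound). Queue failures rely on the SQS visibility timeout: a message that fails processing is not deleted, becomes visible again after the timeout, and escalates to the DLQ after repeated failures. See Conventions for the full error handling pattern and JSON envelope format.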
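The mapping in the table can be sketched as a plain enum with a status-code method. Variant and field names follow the table; the field types and the `String` payloads on Storage and Delivery are assumptions, and the real definition in core::error may differ.

```rust
/// Illustrative shape of the domain error; field types are assumed.
#[derive(Debug)]
pub enum DomainError {
    Validation { field: String, reason: String },
    NotFound { resource: String, id: String },
    AttachmentTooLarge { size_bytes: u64, limit_bytes: u64 },
    Storage(String),  // assumed payload: a description of the S3/database failure
    Delivery(String), // assumed payload: a description of the SES/SMTP failure
}

impl DomainError {
    /// HTTP status code for each variant, as listed in the table.
    pub fn status_code(&self) -> u16 {
        match self {
            DomainError::Validation { .. } => 400,
            DomainError::NotFound { .. } => 404,
            DomainError::AttachmentTooLarge { .. } => 413,
            DomainError::Storage(_) | DomainError::Delivery(_) => 500,
        }
    }
}
```

Keeping the match exhaustive means that adding a variant forces a decision about its status code at compile time.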

Security Boundaries

┌──────────────────────────────────────────────────────────────┐
│                         PUBLIC                                │
│  [Internet] ──► [ALB] ──► [API Service]                      │
└──────────────────────────────────────────────────────────────┘


┌──────────────────────────────────────────────────────────────┐
│                    PRIVATE (VPC)                              │
│  [API Service] ──► [Postgres]                                │
│  [Worker Service] ──► [Postgres]                             │
│  [Worker Service] ──► [SQS]                                  │
│  [Worker Service] ──► [S3]                                   │
└──────────────────────────────────────────────────────────────┘


┌──────────────────────────────────────────────────────────────┐
│                    AWS MANAGED                                │
│  [SES] ◄──► [Internet]                                       │
│  [S3]  ◄──  [SES] (inbound storage)                         │
│  [SQS] ◄──  [SES] (notifications)                           │
└──────────────────────────────────────────────────────────────┘

Key constraints:

  • All API traffic over HTTPS via ALB with ACM certificate (HTTP→HTTPS redirect).
  • Authentication via customer-signed JWTs (ES256, RS256) or API keys.
  • JWTs are bound to specific inboxes and scopes.
  • Webhook payloads signed with HMAC-SHA256.
  • ClamAV scans all inbound content before storage.
  • Suppression list prevents sending to bounced/complained addresses.
  • Soft-delete pattern across all entities (no hard deletes via API).
  • No public S3 buckets.
  • RDS in private subnet only.
  • VPC endpoints for S3/SQS (no internet egress for data).
  • Secrets injected via ECS task definition from AWS Secrets Manager.
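As an illustration of the suppression-list constraint, the sketch below splits a recipient list into addresses that may be sent to and addresses that are suppressed. The `SuppressionList` type and the case-insensitive, whitespace-trimmed matching are assumptions; the real list lives in Postgres and its matching rules are not described here.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory suppression list.
pub struct SuppressionList {
    addresses: HashSet<String>,
}

/// Assumed normalization: trim whitespace and compare case-insensitively.
fn normalize(address: &str) -> String {
    address.trim().to_ascii_lowercase()
}

impl SuppressionList {
    pub fn new() -> Self {
        SuppressionList { addresses: HashSet::new() }
    }

    /// Record an address after a bounce or complaint.
    pub fn suppress(&mut self, address: &str) {
        self.addresses.insert(normalize(address));
    }

    /// Split recipients into (allowed, suppressed), preserving input order.
    pub fn partition<'a>(&self, recipients: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
        recipients
            .iter()
            .copied()
            .partition(|recipient| !self.addresses.contains(&normalize(recipient)))
    }
}
```

A sender would call `partition` before building the outbound message and skip, or report, the suppressed half.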