Message Lifecycle

Mailman handles email in two directions: inbound (receiving from the internet) and outbound (sending on behalf of clients). Both directions flow through SQS queues for durable async processing, with the worker service handling the heavy lifting.

Inbound Processing

Inbound email flows through SES → S3 → SQS → Worker. The worker parses, validates, threads, and stores each message, then dispatches webhook notifications.

Pipeline

[External MTA]
      │
      ▼
[AWS SES Inbound]
      │
      ├──► [S3: raw/{date}/{hash}.eml]  (store raw MIME)
      │
      └──► [SQS: inbound]  (notification with S3 key)
                │
                ▼
        [Worker Service]
                │
                ├── 1. Parse SES notification (extract S3 bucket/key)
                ├── 2. Fetch raw email from S3
                ├── 3. Parse MIME (headers, body, attachments)
                ├── 4. Validate DKIM/SPF/DMARC (from SES verdicts)
                ├── 5. ClamAV malware scan
                │     └── Malware detected → reject message
                ├── 6. Extract and upload attachments to S3
                ├── 7. Route to inbox
                ├── 8. Resolve thread (see Threading)
                ├── 9. Store message in Postgres
                ├── 10. Delete SQS message on success
                └── 11. Emit message.received webhook

SES Inbound Configuration

SES receipt rules store the raw MIME in S3 and publish a notification to SNS, which forwards to the SQS inbound queue:

hcl
resource "aws_ses_receipt_rule" "inbound" {
  name          = "mailman-inbound"
  rule_set_name = aws_ses_receipt_rule_set.main.rule_set_name
  enabled       = true
  scan_enabled  = true

  recipients = ["example.com"] # a bare domain matches every address at that domain

  s3_action {
    bucket_name       = aws_s3_bucket.raw_email.bucket
    object_key_prefix = "raw/"
    position          = 1
  }

  sns_action {
    topic_arn = aws_sns_topic.inbound_notification.arn
    position  = 2
  }
}

The SQS notification contains the S3 location and SES authentication verdicts:

json
{
  "notificationType": "Received",
  "mail": {
    "messageId": "ses-message-id",
    "source": "sender@external.com",
    "destination": ["recipient@example.com"]
  },
  "receipt": {
    "action": {
      "type": "S3",
      "bucketName": "mailman-raw",
      "objectKey": "raw/2024/01/15/abc123.eml"
    },
    "spfVerdict": { "status": "PASS" },
    "dkimVerdict": { "status": "PASS" },
    "dmarcVerdict": { "status": "PASS" }
  }
}

Worker Processing Steps

MIME Parsing

The worker parses messages with the mail-parser crate, which provides RFC 5322-compliant parsing. Extracted fields: Message-ID (required), From (required), To/Cc/Bcc, Subject, Date, In-Reply-To and References (for threading), body (text/plain and text/html parts), and attachments (MIME parts with Content-Disposition: attachment).

Authentication Validation

SES-provided verdicts are checked from the notification:

| DMARC | Action |
|-------|--------|
| PASS | Accept |
| FAIL | Reject (configurable: quarantine or accept with flag) |
| NONE | Accept with warning |

Failed authentication is logged for DMARC aggregate reports.
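
The verdict table above can be sketched as a small mapping function. This is an illustrative sketch rather than Mailman's actual code: the type names (`DmarcVerdict`, `FailPolicy`, `AuthAction`) and the shape of the configurable fail policy are assumptions, and any status string outside the table is treated as unrecognized.

```rust
/// DMARC verdict read from `receipt.dmarcVerdict.status` (values from the table above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DmarcVerdict {
    Pass,
    Fail,
    NoPolicy, // the "NONE" row
}

/// Hypothetical configuration knob for what to do on FAIL.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailPolicy {
    Reject,
    Quarantine,
    AcceptWithFlag,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthAction {
    Accept,
    AcceptWithWarning,
    AcceptWithFlag,
    Quarantine,
    Reject,
}

/// Parse the status string; anything not listed in the table yields `None`.
pub fn parse_verdict(status: &str) -> Option<DmarcVerdict> {
    match status {
        "PASS" => Some(DmarcVerdict::Pass),
        "FAIL" => Some(DmarcVerdict::Fail),
        "NONE" => Some(DmarcVerdict::NoPolicy),
        _ => None,
    }
}

/// Map a verdict to an action, applying the configured policy on FAIL.
pub fn action_for(verdict: DmarcVerdict, on_fail: FailPolicy) -> AuthAction {
    match verdict {
        DmarcVerdict::Pass => AuthAction::Accept,
        DmarcVerdict::NoPolicy => AuthAction::AcceptWithWarning,
        DmarcVerdict::Fail => match on_fail {
            FailPolicy::Reject => AuthAction::Reject,
            FailPolicy::Quarantine => AuthAction::Quarantine,
            FailPolicy::AcceptWithFlag => AuthAction::AcceptWithFlag,
        },
    }
}
```

With the default policy, `action_for(DmarcVerdict::Fail, FailPolicy::Reject)` yields `AuthAction::Reject`.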

Malware Scanning

Mailman scans all inbound content with ClamAV over a TCP socket, using the INSTREAM command:

  • Both the raw email body and each individual attachment are scanned
  • If malware is detected, the message is rejected and not stored
  • ClamAV is optional — if no host is configured, scanning is skipped (noop scanner)

NOTE

ClamAV is fully implemented. The adapters-scan crate provides a working TCP scanner. Spam scoring (Rspamd) is not implemented and is not in scope.
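
For readers unfamiliar with INSTREAM, the wire framing is simple: the command `zINSTREAM\0`, then a series of chunks each prefixed by a 4-byte big-endian length, then a zero-length chunk as terminator. The sketch below shows that framing against any `Write` sink. It is not the adapters-scan implementation: the function names, the `ScanResult` type, and the 8 KiB chunk size are assumptions made for illustration.

```rust
use std::io::{self, Write};

/// Chunk size for streaming; an arbitrary choice for this sketch.
const CHUNK_SIZE: usize = 8192;

/// Write `payload` to `out` using clamd's INSTREAM framing.
pub fn write_instream<W: Write>(mut out: W, payload: &[u8]) -> io::Result<()> {
    // Null-terminated command form (the leading 'z' selects it).
    out.write_all(b"zINSTREAM\0")?;
    for chunk in payload.chunks(CHUNK_SIZE) {
        // Each chunk: 4-byte big-endian length, then the data.
        out.write_all(&(chunk.len() as u32).to_be_bytes())?;
        out.write_all(chunk)?;
    }
    // A zero-length chunk ends the stream.
    out.write_all(&0u32.to_be_bytes())?;
    out.flush()
}

#[derive(Debug, PartialEq, Eq)]
pub enum ScanResult {
    Clean,
    Infected(String),
    Error(String),
}

/// Interpret clamd's reply, e.g. "stream: OK" or "stream: <signature> FOUND".
pub fn parse_reply(reply: &str) -> ScanResult {
    let reply = reply.trim_end_matches('\0').trim();
    let body = reply.strip_prefix("stream: ").unwrap_or(reply);
    if body == "OK" {
        ScanResult::Clean
    } else if let Some(signature) = body.strip_suffix(" FOUND") {
        ScanResult::Infected(signature.to_string())
    } else {
        ScanResult::Error(body.to_string())
    }
}
```

In a real scanner `out` would be a `std::net::TcpStream` connected to the configured ClamAV host, and the reply would be read back from the same socket.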

Attachment Extraction

Each attachment is extracted from the MIME structure and uploaded to S3 with a storage key based on date partitioning and a hashed Message-ID.

Inbox Routing

When an inbound email arrives, Mailman determines which inbox it belongs to:

  1. Exact match — the To: address local part matches an inbox's local_part
  2. Catch-all — if no exact match, route to the domain's catch-all inbox (if one exists)
  3. Reject — if no match and no catch-all, the message is not processed
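
The three routing rules above can be sketched as a single lookup. The `Inbox` struct, its fields other than `local_part`, and the `Route` enum are hypothetical; case-insensitive matching of the address is also an assumption of this sketch, not something the rules above specify.

```rust
/// Hypothetical inbox record; only `local_part` is named in the routing rules.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Inbox {
    pub id: u32,
    pub domain: String,
    pub local_part: String,
    pub catch_all: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    Exact(u32),
    CatchAll(u32),
    Reject,
}

/// Resolve one recipient address to an inbox: exact match, then catch-all, then reject.
pub fn route(address: &str, inboxes: &[Inbox]) -> Route {
    let (local, domain) = match address.rsplit_once('@') {
        Some(parts) => parts,
        None => return Route::Reject, // not a usable address
    };
    let mut catch_all = None;
    for inbox in inboxes
        .iter()
        .filter(|i| i.domain.eq_ignore_ascii_case(domain))
    {
        // Rule 1: an exact local-part match wins immediately.
        if inbox.local_part.eq_ignore_ascii_case(local) {
            return Route::Exact(inbox.id);
        }
        // Remember the first catch-all for this domain in case nothing matches.
        if inbox.catch_all && catch_all.is_none() {
            catch_all = Some(inbox.id);
        }
    }
    // Rule 2: fall back to the catch-all; rule 3: otherwise reject.
    catch_all.map_or(Route::Reject, Route::CatchAll)
}
```

Calling `route` once per recipient address gives the per-inbox results that the fan-out then acts on.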

If addressed to multiple inboxes, the message fans out into independent records per inbox (separate DB rows, separate S3 copies, separate thread chains).

Reply routing: When a reply's In-Reply-To resolves to a thread in a different inbox than the addressed To:, the message fans out to both the addressed inbox and the original thread's inbox.

Inbound Idempotency

Messages are deduplicated by Message-ID:

sql
INSERT INTO messages (id, ...)
VALUES ($1, ...)
ON CONFLICT (id) DO NOTHING
RETURNING id;

If a message with the same ID already exists, the INSERT returns no row and the worker skips further processing without raising an error.

Inbound Error Handling

Retryable errors:

  • S3 fetch timeout → Retry with backoff
  • Database connection error → Retry with backoff
  • Webhook delivery failure → Retry (separate from main processing)

Non-retryable errors:

  • Invalid MIME format → Move to DLQ, log error
  • Missing Message-ID → Move to DLQ, log error
  • Malware detected → Reject, alert, do not retry

Dead Letter Queue: Messages that fail after max retries (default: 3) are moved to inbound-dlq. DLQ messages trigger CloudWatch alarms for manual review.
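
The retry and DLQ rules above can be sketched as a classification function. The enum and function names are hypothetical; the error categories and the default of 3 retries come from the lists above. Webhook delivery failures are left out because they are retried separately from main processing.

```rust
/// Inbound failure categories taken from the lists above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InboundError {
    S3FetchTimeout,
    DatabaseConnection,
    InvalidMime,
    MissingMessageId,
    MalwareDetected,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Disposition {
    /// Leave the job for another attempt (with backoff).
    Retry,
    /// Move the job to `inbound-dlq`.
    DeadLetter,
    /// Reject and alert; never retried.
    Reject,
}

/// Default maximum number of retries before a job is dead-lettered.
pub const MAX_RETRIES: u32 = 3;

/// Decide what to do with a failed job, given how many retries it has already had.
pub fn disposition(err: InboundError, retries_so_far: u32) -> Disposition {
    match err {
        InboundError::S3FetchTimeout | InboundError::DatabaseConnection => {
            if retries_so_far < MAX_RETRIES {
                Disposition::Retry
            } else {
                Disposition::DeadLetter
            }
        }
        InboundError::InvalidMime | InboundError::MissingMessageId => Disposition::DeadLetter,
        InboundError::MalwareDetected => Disposition::Reject,
    }
}
```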

Inbound Performance Targets

| Metric | Target |
|--------|--------|
| Processing latency (p50) | < 500ms |
| Processing latency (p99) | < 2s |
| Throughput | 100 messages/second/worker |
| Attachment size limit | 25MB per message |

Outbound Delivery

Outbound emails are submitted via the REST API, queued in SQS for async delivery, and sent by the worker through SES. Delivery status is then tracked through SES notifications.

WARNING

The send endpoint currently drops attachments silently (TB-543). Attachment content is accepted in the request but not included in the outbound MIME message.

Pipeline

[Client]
    │
    ▼
[POST /send]
    │
    ├── 1. Authenticate (JWT or API key)
    ├── 2. Validate request
    │     ├── At least one recipient
    │     ├── At least one of body_text or body_html
    │     └── Total attachment size ≤ 10MB
    ├── 3. Check suppression list (reject if any recipient suppressed)
    ├── 4. Check volume rate limit (Redis-backed, recipients/minute)
    ├── 5. Upload attachments to S3
    ├── 6. Generate Message-ID
    ├── 7. Resolve/create thread (reply_to_message_id or create new)
    ├── 8. Store message record (status: pending)
    ├── 9. Enqueue delivery job to SQS outbound queue
    └── 10. Return 202 Accepted

[SQS: outbound]
    │
    ▼
[Worker Service]
    │
    ├── 1. Build MIME message (with threading headers)
    ├── 2. DKIM signed by SES automatically
    ├── 3. Submit to SES (raw email)
    ├── 4. Record SES message ID for delivery tracking
    ├── 5. Update delivery status → Sent
    └── 6. Delete SQS message on success

[SES]
    │
    ├──► [Recipient MTA]
    │
    └──► [SNS: delivery notifications]
              │
              ▼
         [SQS: telemetry]
              │
              ▼
         [Worker Service]
              │
              ├── Update delivery status
              └── Dispatch webhook
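
The request validation in step 2 of the API pipeline above can be sketched as follows. The `SendRequest` struct and `ValidationError` enum are hypothetical; only `body_text`, `body_html`, and the three rules come from the pipeline. Treating "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```rust
/// Total attachment size limit; interpreting "10MB" as MiB is an assumption.
pub const MAX_ATTACHMENT_BYTES: u64 = 10 * 1024 * 1024;

/// Hypothetical shape of a send request, reduced to what validation needs.
#[derive(Debug, Default)]
pub struct SendRequest {
    pub to: Vec<String>,
    pub cc: Vec<String>,
    pub bcc: Vec<String>,
    pub body_text: Option<String>,
    pub body_html: Option<String>,
    /// Size in bytes of each attachment.
    pub attachment_sizes: Vec<u64>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    NoRecipients,
    NoBody,
    AttachmentsTooLarge { total: u64 },
}

/// Apply the three validation rules in order, returning the first violation.
pub fn validate(req: &SendRequest) -> Result<(), ValidationError> {
    if req.to.is_empty() && req.cc.is_empty() && req.bcc.is_empty() {
        return Err(ValidationError::NoRecipients);
    }
    if req.body_text.is_none() && req.body_html.is_none() {
        return Err(ValidationError::NoBody);
    }
    let total: u64 = req.attachment_sizes.iter().sum();
    if total > MAX_ATTACHMENT_BYTES {
        return Err(ValidationError::AttachmentsTooLarge { total });
    }
    Ok(())
}
```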

Message-ID Generation

Format: <uuid.timestamp@domain> — UUID ensures uniqueness, timestamp aids debugging, domain matches DKIM signing domain.
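
A minimal sketch of that format is shown below. The function names are hypothetical, the UUID is passed in as a string (generating one would need a crate outside the standard library), and encoding the timestamp as Unix seconds is an assumption, since the format above does not say which representation is used.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Build a Message-ID of the form `<uuid.timestamp@domain>`.
/// `unix_secs` as the timestamp encoding is an assumption of this sketch.
pub fn format_message_id(uuid: &str, unix_secs: u64, domain: &str) -> String {
    format!("<{}.{}@{}>", uuid, unix_secs, domain)
}

/// Current time as Unix seconds (0 if the clock is set before the epoch).
pub fn now_unix_secs() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}
```

For example, `format_message_id("550e8400-e29b-41d4-a716-446655440000", 1705314600, "example.com")` produces `<550e8400-e29b-41d4-a716-446655440000.1705314600@example.com>`.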

Queue Message Format

The API uploads attachment content to S3 before queueing. The SQS message contains only S3 storage keys, keeping the payload well under the 256KB SQS limit. The OutboundJob struct contains the message envelope, optional body text/HTML, and a Vec<EmailAttachment> (the same attachment metadata type used elsewhere — no separate AttachmentRef type).

Source: crates/core/src/lib.rs (delivery::OutboundJob)

MIME Building

The worker builds an RFC 5322 message with MIME structure: proper threading headers (In-Reply-To, References) and multipart bodies (text/plain + text/html alternatives, plus attachments).
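
The threading headers follow the usual RFC 5322 convention: In-Reply-To carries the parent's Message-ID, and References carries the parent's References list with the parent's Message-ID appended. The sketch below illustrates that rule only; `ParentMessage` and `threading_headers` are hypothetical names, not Mailman's types.

```rust
/// Hypothetical view of the message being replied to.
pub struct ParentMessage {
    pub message_id: String,
    /// The parent's own References header, already split into IDs.
    pub references: Vec<String>,
}

/// Build the threading headers for a reply; a new thread gets none.
pub fn threading_headers(parent: Option<&ParentMessage>) -> Vec<(&'static str, String)> {
    let parent = match parent {
        Some(p) => p,
        None => return Vec::new(),
    };
    // References = parent's References followed by the parent's Message-ID.
    let mut references = parent.references.clone();
    if !references.contains(&parent.message_id) {
        references.push(parent.message_id.clone());
    }
    vec![
        ("In-Reply-To", parent.message_id.clone()),
        ("References", references.join(" ")),
    ]
}
```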

DKIM signing is handled automatically by SES when the domain identity is verified and DKIM is enabled.

Delivery Notifications

SES publishes delivery events to SNS, which forwards them to the SQS telemetry queue.

SES Configuration

hcl
resource "aws_ses_event_destination" "sns" {
  name                   = "delivery-events"
  configuration_set_name = aws_ses_configuration_set.main.name
  enabled                = true
  matching_types         = ["send", "delivery", "bounce", "complaint"]

  sns_destination {
    topic_arn = aws_sns_topic.delivery_events.arn
  }
}

Notification Examples

Delivery:

json
{
  "eventType": "Delivery",
  "mail": { "messageId": "ses-message-id" },
  "delivery": {
    "timestamp": "2024-01-15T10:31:00Z",
    "recipients": ["recipient@example.com"],
    "smtpResponse": "250 OK"
  }
}

Bounce:

json
{
  "eventType": "Bounce",
  "mail": { "messageId": "ses-message-id" },
  "bounce": {
    "bounceType": "Permanent",
    "bounceSubType": "General",
    "bouncedRecipients": [
      { "emailAddress": "invalid@example.com", "diagnosticCode": "550 User unknown" }
    ]
  }
}

Complaint:

json
{
  "eventType": "Complaint",
  "mail": { "messageId": "ses-message-id" },
  "complaint": {
    "complainedRecipients": [{ "emailAddress": "annoyed@example.com" }],
    "complaintFeedbackType": "abuse"
  }
}

Delivery Tracking

After SES accepts a message, delivery status is tracked via SES event notifications:

| SES Event | Mailman Status | Action |
|-----------|----------------|--------|
| Delivery | Delivered | Update status |
| Bounce | Bounced | Update status, add to suppression list |
| Complaint | Complained | Update status, add to suppression list |

The telemetry worker loop processes these events by matching SES message IDs back to internal message records.

Bounce Handling

| Type | SubType | Action |
|------|---------|--------|
| Permanent | General | Suppress address |
| Permanent | NoEmail | Suppress address |
| Permanent | Suppressed | Already suppressed |
| Transient | General | Retry later |
| Transient | MailboxFull | Retry later |
| Transient | ContentRejected | Review content |
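
The bounce table above maps directly onto a match over the `bounceType` and `bounceSubType` strings from the SES notification. The `BounceAction` enum and function name are hypothetical, and the `Unknown` fallback for combinations not listed in the table is an assumption of this sketch.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum BounceAction {
    Suppress,
    AlreadySuppressed,
    RetryLater,
    ReviewContent,
    /// Combination not covered by the table; handling is left to the caller.
    Unknown,
}

/// Map SES `bounceType` / `bounceSubType` to the action from the table.
pub fn bounce_action(bounce_type: &str, sub_type: &str) -> BounceAction {
    match (bounce_type, sub_type) {
        ("Permanent", "General") | ("Permanent", "NoEmail") => BounceAction::Suppress,
        ("Permanent", "Suppressed") => BounceAction::AlreadySuppressed,
        ("Transient", "General") | ("Transient", "MailboxFull") => BounceAction::RetryLater,
        ("Transient", "ContentRejected") => BounceAction::ReviewContent,
        _ => BounceAction::Unknown,
    }
}
```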

Suppression List

Permanently bounced addresses are added to the suppression list. Suppression is checked at API time (synchronous, before queueing). If any recipient is suppressed, the API returns 422 Unprocessable Entity with the suppressed addresses listed. The message is never queued.

Volume Rate Limiting

In addition to per-token request rate limiting (governor, 100 req/min default), outbound email has a separate volume rate limit:

  • Tracked per API key in Redis
  • Counts total recipients (to + cc + bcc) per minute
  • A single request with 100 BCC recipients counts as 100 toward the limit
  • Returns 429 Too Many Requests when exceeded
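
The recipient-counting behavior can be illustrated with an in-memory stand-in for the Redis counter. This is a sketch under several assumptions: `VolumeLimiter` is a hypothetical type, the window is a fixed calendar minute (the source does not say whether the real window is fixed or sliding), and a rejected request does not consume any of the budget.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the Redis-backed counter, keyed by API key.
pub struct VolumeLimiter {
    limit_per_minute: u32,
    /// api key -> (minute index, recipients counted in that minute)
    windows: HashMap<String, (u64, u32)>,
}

impl VolumeLimiter {
    pub fn new(limit_per_minute: u32) -> Self {
        Self { limit_per_minute, windows: HashMap::new() }
    }

    /// Count `recipients` (to + cc + bcc) against the key's current minute.
    /// Returns false when the request should get 429 Too Many Requests.
    pub fn try_consume(&mut self, api_key: &str, recipients: u32, now_secs: u64) -> bool {
        let limit = self.limit_per_minute;
        let minute = now_secs / 60;
        let window = self.windows.entry(api_key.to_string()).or_insert((minute, 0));
        if window.0 != minute {
            // A new minute started: reset the counter.
            *window = (minute, 0);
        }
        match window.1.checked_add(recipients) {
            Some(total) if total <= limit => {
                window.1 = total;
                true
            }
            _ => false,
        }
    }
}
```

With a limit of 100, a single request carrying 100 BCC recipients uses the whole budget for that minute, so the next request from the same key is refused until the window rolls over.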

Outbound Retry Strategy

Transient failures:

  • SES throttling → Exponential backoff (1s, 2s, 4s, 8s, max 60s)
  • Network errors → Retry up to 3 times
  • Transient bounces → Re-queue with delay (1 hour)

Permanent failures:

  • Hard bounce → Mark failed, no retry
  • Complaint → Mark complained, no retry
  • Invalid address → Mark failed, no retry
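
The exponential backoff schedule listed for SES throttling (1s, 2s, 4s, 8s, capped at 60s) can be computed as below. The function name is hypothetical, and the sketch omits jitter, which the schedule above does not mention.

```rust
use std::time::Duration;

/// Upper bound on the delay between attempts.
const MAX_BACKOFF_SECS: u64 = 60;

/// Delay before retry number `attempt` (0-based): 1s, 2s, 4s, 8s, ... capped at 60s.
pub fn backoff_delay(attempt: u32) -> Duration {
    // `checked_shl` avoids overflow for very large attempt counts.
    let secs = 1u64
        .checked_shl(attempt)
        .unwrap_or(u64::MAX)
        .min(MAX_BACKOFF_SECS);
    Duration::from_secs(secs)
}
```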

Outbound Performance Targets

| Metric | Target |
|--------|--------|
| API response time (p50) | < 100ms |
| API response time (p99) | < 500ms |
| Time to SES submission (p50) | < 2s |
| Time to SES submission (p99) | < 10s |
| Throughput | 50 messages/second (scaling with workers) |