Conventions

Project Structure

mailman/
├── crates/               # Workspace crates (see Crate Map)
├── tests/e2e/            # Black-box E2E tests
├── terraform/            # AWS infrastructure
├── openapi.yaml          # OpenAPI 3.1 spec (source of truth for API surface)
└── docs/                 # This documentation site (Scalar, canonical)

What Goes Where

| Location | Purpose | Source of Truth? |
| --- | --- | --- |
| docs/ (this site) | Canonical documentation for users and developers | Yes |
| openapi.yaml | API surface contract | Yes for API shape |

When you add a feature, update docs/ and openapi.yaml.

Error Handling

Errors flow through three layers, each with a clear responsibility:

1. Domain Errors (core)

Domain errors are enum variants with structured fields (e.g., Validation { field, reason }, NotFound { resource, id }). Match on the variant and its fields, never on error message strings. Use thiserror to derive the Error and Display implementations.

See the DomainError enum in crates/core/src/lib.rs for the full definition.
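
As a rough illustration of why structured variants matter, the sketch below defines a two-variant stand-in for the enum and a helper that branches on the variant instead of the message. The field types, the message wording, and the `is_missing` helper are assumptions made for this example. `Display` is written by hand only to keep the sketch dependency-free; in the codebase thiserror would derive it.

```rust
use std::fmt;

// Illustrative stand-in; the real enum lives in crates/core/src/lib.rs.
// Field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum DomainError {
    Validation { field: String, reason: String },
    NotFound { resource: String, id: String },
}

// Hand-written here; thiserror would generate this from attributes.
impl fmt::Display for DomainError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DomainError::Validation { field, reason } => {
                write!(f, "validation failed for {field}: {reason}")
            }
            DomainError::NotFound { resource, id } => {
                write!(f, "{resource} {id} not found")
            }
        }
    }
}

impl std::error::Error for DomainError {}

/// Hypothetical helper: branches on the variant and its fields,
/// so rewording the Display message cannot break callers.
pub fn is_missing(err: &DomainError, wanted: &str) -> bool {
    matches!(err, DomainError::NotFound { resource, .. } if resource == wanted)
}
```

Because callers inspect `resource` directly, the human-readable message can change freely without affecting control flow.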

2. Error Propagation

  • Use ? to propagate errors up the call chain
  • Library crates (core, adapters-*, http) use typed errors with thiserror
  • Binary crates (bin-api, bin-worker) may use anyhow for top-level error handling
  • Never use anyhow in library crates — it erases type information
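
The propagation rules above can be sketched without any external crates. In this minimal example, `StorageFailure`, `load_row`, and `inbox_name` are hypothetical names, and the `Storage { detail }` field is an assumed shape. What it shows is how a `From` impl lets `?` lift a lower-level error into a typed library error without losing the variant.

```rust
use std::fmt;

// Hypothetical adapter-level error (stand-in for a database or SDK error).
#[derive(Debug)]
pub struct StorageFailure(pub String);

// Trimmed-down, assumed shape of a typed library error.
#[derive(Debug, PartialEq)]
pub enum DomainError {
    NotFound { resource: &'static str, id: String },
    Storage { detail: String },
}

impl fmt::Display for DomainError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DomainError::NotFound { resource, id } => write!(f, "{resource} {id} not found"),
            DomainError::Storage { detail } => write!(f, "storage error: {detail}"),
        }
    }
}

impl std::error::Error for DomainError {}

// This conversion is what `?` uses to turn the adapter error into the typed one.
impl From<StorageFailure> for DomainError {
    fn from(e: StorageFailure) -> Self {
        DomainError::Storage { detail: e.0 }
    }
}

// Hypothetical adapter call: fails when the backend is unavailable.
fn load_row(id: &str, backend_up: bool) -> Result<Option<String>, StorageFailure> {
    if !backend_up {
        return Err(StorageFailure("connection refused".into()));
    }
    Ok((id == "known").then(|| "Support".to_string()))
}

pub fn inbox_name(id: &str, backend_up: bool) -> Result<String, DomainError> {
    // `?` returns early with DomainError::Storage if the adapter fails.
    let row = load_row(id, backend_up)?;
    row.ok_or_else(|| DomainError::NotFound { resource: "inbox", id: id.to_string() })
}
```

A caller still sees exactly which variant occurred, which is the type information that anyhow would erase in a library crate.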

3. HTTP Error Mapping (http)

ApiError maps domain errors to HTTP responses:

| Error | HTTP Status | Response Code |
| --- | --- | --- |
| Validation | 400 | validation_error |
| NotFound | 404 | not_found |
| AttachmentTooLarge | 413 | validation_error |
| Storage | 500 | internal_error |
| Delivery | 500 | delivery_error |
| VolumeLimited (on ApiError) | 429 | volume_limited |

See ApiError::into_response in crates/http/src/error.rs for the full mapping.

Security: Storage and delivery errors mask internal details in the response. The real error is logged server-side before masking.

json
{
  "error": {
    "code": "internal_error",
    "message": "An internal error occurred"
  }
}
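
The mapping and masking behaviour can be approximated with a plain function. This is a dependency-free sketch, not the actual into_response code. The variant payloads, the `to_http` name, and every client-facing message except "An internal error occurred" are assumptions. `eprintln!` stands in for the real server-side logging.

```rust
// Assumed, simplified error type; variant names follow the mapping table.
#[derive(Debug)]
pub enum DomainError {
    Validation { field: String, reason: String },
    NotFound { resource: String, id: String },
    AttachmentTooLarge { size_bytes: u64 },
    Storage(String),
    Delivery(String),
}

/// Returns (HTTP status, response code, client-facing message).
pub fn to_http(err: &DomainError) -> (u16, &'static str, String) {
    match err {
        DomainError::Validation { field, reason } => {
            (400, "validation_error", format!("{field}: {reason}"))
        }
        DomainError::NotFound { resource, id } => {
            (404, "not_found", format!("{resource} {id} not found"))
        }
        DomainError::AttachmentTooLarge { size_bytes } => {
            (413, "validation_error", format!("attachment of {size_bytes} bytes is too large"))
        }
        DomainError::Storage(detail) => {
            // Log the real cause first, then return a masked message.
            eprintln!("storage error: {detail}");
            (500, "internal_error", "An internal error occurred".to_string())
        }
        DomainError::Delivery(detail) => {
            eprintln!("delivery error: {detail}");
            // Masked message text is an assumption.
            (500, "delivery_error", "Message delivery failed".to_string())
        }
    }
}
```

The internal `detail` string never reaches the returned tuple for the two 500 cases, which is the property the security note describes.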

Error Recovery Strategy

| Error Type | Recovery |
| --- | --- |
| Input validation | Reject immediately (400/422) |
| Entity not found | Return 404 |
| Transient I/O (network, timeout) | Retry via SQS visibility timeout |
| Permanent I/O (parse failure, malware) | Delete from queue, may land in DLQ |
| Webhook delivery failure | Exponential backoff retries (up to 7) |
| Configuration error | Fail fast on startup (panic is OK) |
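
For the webhook row, a capped exponential backoff schedule might look like the following sketch. Only the limit of 7 retries comes from the table. The 30-second base delay, the 15-minute cap, and the `backoff_delay` name are assumptions chosen for illustration.

```rust
use std::time::Duration;

/// Retry limit taken from the recovery table.
pub const MAX_RETRIES: u32 = 7;

// Assumed values; the real base delay and cap are not stated in this document.
const BASE_DELAY: Duration = Duration::from_secs(30);
const MAX_DELAY: Duration = Duration::from_secs(900);

/// Delay before retry number `retry` (1-based), or `None` once retries are exhausted.
pub fn backoff_delay(retry: u32) -> Option<Duration> {
    if retry == 0 || retry > MAX_RETRIES {
        return None;
    }
    // The delay doubles on each retry (30s, 60s, 120s, ...) until it hits the cap.
    let factor = 2u32.pow(retry - 1);
    Some((BASE_DELAY * factor).min(MAX_DELAY))
}
```

Returning `None` past the limit gives the caller one place to decide that a delivery has failed permanently.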

Logging

Levels

| Level | When to Use | Example |
| --- | --- | --- |
| error | Unrecoverable failures that require human attention | Database connection lost, SES credentials invalid |
| warn | Degraded but functional: something is wrong but the system continues | ClamAV unreachable (falling back to noop), webhook endpoint failing |
| info | Normal operational events worth tracking | Request lifecycle, message processed, delivery status changed |
| debug | Detailed logic useful during development | SQL queries, SQS message contents, thread resolution steps |
| trace | Full data dumps | Raw MIME content, complete request/response bodies |

Structured Fields

Every log line should carry context as structured fields, either on the event itself or inherited from the enclosing tracing span:

rust
tracing::info!(
    message_id = %msg.message_id,
    thread_id = %thread_id,
    inbox_id = %inbox_id,
    "message processed"
);

Always include: resource IDs (message_id, thread_id, inbox_id, domain_id)
Never include: email body content at info or above, API keys, PEM keys

Request Tracing

Every API request gets a UUID trace ID:

  • Generated in middleware (outermost layer)
  • Respects incoming X-Request-ID header if present
  • Included in all log lines within the request span
  • Returned in the response headers

Use tracing's Instrument::instrument (called as future.instrument(span)) for async safety. Never hold Span::enter() guards across .await points.
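
The "respect the incoming header, otherwise generate" rule can be sketched as a small pure function. The name `resolve_request_id` and the choice to treat a blank header as absent are assumptions. The generator is passed in as a closure so the sketch does not depend on a UUID crate.

```rust
/// Picks the trace ID for a request: reuse a non-empty incoming
/// X-Request-ID value, otherwise fall back to `generate`.
pub fn resolve_request_id(
    incoming: Option<&str>,
    generate: impl FnOnce() -> String,
) -> String {
    match incoming.map(str::trim) {
        // Assumption: a blank header is treated the same as a missing one.
        Some(id) if !id.is_empty() => id.to_string(),
        _ => generate(),
    }
}
```

In middleware, the closure would produce a fresh UUID. The returned value is what gets recorded on the request span and echoed in the response headers.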

Rust Conventions

Handler Signatures

Route handlers follow a consistent pattern. Authentication is handled by middleware (not an extractor) — claims are accessed via request extensions. State is always Arc<AppState>.

See any handler in crates/http/src/routes/ for live examples (e.g., threads.rs, send.rs).

Response DTOs

  • Implement From<&DomainType> for clean conversion from domain to response types
  • Use #[serde(skip_serializing_if = "Option::is_none")] for optional fields
  • Separate *ListParams and *ListResult structs for paginated endpoints
  • Clamp query parameter values to spec limits in the handler (e.g., limit = limit.min(100))
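
A minimal sketch of the first and last points, assuming a hypothetical `Thread` domain type and a default page size of 20 (neither is specified here). The serde derives and attributes are left out so the example has no dependencies.

```rust
// Hypothetical domain type; the real ones live in the core crate.
pub struct Thread {
    pub id: String,
    pub subject: Option<String>,
    pub internal_score: u32, // internal detail, not exposed over the API
}

// Response DTO. The real one would also derive serde::Serialize.
#[derive(Debug, PartialEq)]
pub struct ThreadResponse {
    pub id: String,
    pub subject: Option<String>,
}

// Borrowing conversion: the domain value stays usable after the response is built.
impl From<&Thread> for ThreadResponse {
    fn from(t: &Thread) -> Self {
        ThreadResponse {
            id: t.id.clone(),
            subject: t.subject.clone(),
        }
    }
}

pub const MAX_LIMIT: u32 = 100;
pub const DEFAULT_LIMIT: u32 = 20; // assumed default

/// Clamps a client-supplied `limit` to the spec maximum.
pub fn clamp_limit(requested: Option<u32>) -> u32 {
    requested.unwrap_or(DEFAULT_LIMIT).min(MAX_LIMIT)
}
```

Converting from `&Thread` rather than `Thread` means list handlers can map over a slice without cloning whole domain objects.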

Naming

  • Test names: {unit}_does_{behavior}_when_{condition}
  • Migration files: NNN_description.sql (sequential numbering, no gaps)
  • Route files: named after the resource (e.g., threads.rs, domains.rs)
  • Config env vars: MAILMAN_SECTION__KEY (prefix + double underscore nesting)
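
To make the env var scheme concrete, here is a sketch of how a name such as MAILMAN_DATABASE__URL breaks down into a section and a key. The `parse_env_key` function, the lowercasing, and the example section names are illustrative assumptions, not the project's actual config loader.

```rust
/// Splits `MAILMAN_SECTION__KEY` into lowercase `(section, key)`.
/// Returns `None` if the prefix or the double underscore is missing.
pub fn parse_env_key(name: &str) -> Option<(String, String)> {
    let rest = name.strip_prefix("MAILMAN_")?;
    // The first double underscore separates the section from the key.
    let (section, key) = rest.split_once("__")?;
    if section.is_empty() || key.is_empty() {
        return None;
    }
    Some((section.to_lowercase(), key.to_lowercase()))
}
```

A single underscore stays part of the key, so MAILMAN_DATABASE__MAX_CONNECTIONS maps to section `database` and key `max_connections`.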

Architecture Boundaries

The crate structure enforces architectural boundaries:

  • core has zero I/O dependencies — pure domain logic only
  • Repository traits live in core/src/repository.rs, implementations in adapters-*
  • Route handlers in http call repository traits, never concrete implementations
  • Binary crates (bin-api, bin-worker) are the only place where concrete types are wired together
  • If you find yourself importing sqlx or aws-sdk-* in core, you're violating the architecture
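
The boundary rules can be pictured in a single file. All names here (`Inbox`, `InboxRepository`, `InMemoryInboxes`, `inbox_name`, `wire`) are hypothetical. The trait is synchronous to keep the sketch dependency-free, whereas real repository traits are presumably async.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// --- would live in core: types and traits only, no I/O ---
#[derive(Debug, Clone, PartialEq)]
pub struct Inbox {
    pub id: String,
    pub name: String,
}

pub trait InboxRepository: Send + Sync {
    fn find(&self, id: &str) -> Option<Inbox>;
}

// --- would live in an adapters-* crate: a concrete implementation ---
pub struct InMemoryInboxes(pub HashMap<String, Inbox>);

impl InboxRepository for InMemoryInboxes {
    fn find(&self, id: &str) -> Option<Inbox> {
        self.0.get(id).cloned()
    }
}

// --- would live in http: depends on the trait, never the concrete type ---
pub fn inbox_name(repo: &dyn InboxRepository, id: &str) -> Option<String> {
    repo.find(id).map(|inbox| inbox.name)
}

// --- would live in a binary crate: the only place that names the concrete type ---
pub fn wire() -> Arc<dyn InboxRepository> {
    let mut rows = HashMap::new();
    rows.insert(
        "in_1".to_string(),
        Inbox { id: "in_1".to_string(), name: "Support".to_string() },
    );
    Arc::new(InMemoryInboxes(rows))
}
```

Swapping the in-memory store for a database-backed one would change only `wire`, which is the point of keeping the trait in core.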

Validation

Input validation uses garde with derive macros:

rust
use garde::Validate;
use serde::Deserialize;

#[derive(Deserialize, Validate)]
pub struct CreateInboxRequest {
    #[garde(length(min = 1, max = 255))]
    pub name: String,
    #[garde(length(min = 1, max = 64))]
    pub local_part: String,
}

Soft Deletes

All entities use soft deletes via deleted_at TIMESTAMPTZ. Never hard-delete records through the API. List endpoints filter with WHERE deleted_at IS NULL. We use partial indexes on this column for performance.
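
An in-memory model of the same idea, as a sketch only: `Inbox`, `soft_delete`, and `list_live` are hypothetical names, and a `Vec` stands in for the table.

```rust
use std::time::SystemTime;

#[derive(Debug, Clone, PartialEq)]
pub struct Inbox {
    pub id: String,
    // Mirrors the nullable `deleted_at TIMESTAMPTZ` column.
    pub deleted_at: Option<SystemTime>,
}

/// Marks a live row as deleted instead of removing it.
/// Returns false if no live row matched.
pub fn soft_delete(rows: &mut [Inbox], id: &str, now: SystemTime) -> bool {
    match rows.iter_mut().find(|r| r.id == id && r.deleted_at.is_none()) {
        Some(row) => {
            row.deleted_at = Some(now);
            true
        }
        None => false,
    }
}

/// In-memory equivalent of `WHERE deleted_at IS NULL`.
pub fn list_live(rows: &[Inbox]) -> Vec<&Inbox> {
    rows.iter().filter(|r| r.deleted_at.is_none()).collect()
}
```

The row count never shrinks. Only the set visible to list endpoints does.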

Async

  • Use tokio::select! for concurrent polling loops in the worker
  • Never block the async runtime — use tokio::task::spawn_blocking for CPU-heavy work
  • Use tracing's Instrument::instrument (future.instrument(span)) for tracing across async boundaries
  • Middleware order matters: trace ID (outermost) → auth → rate limit → handler

Git Conventions

  • Linear history on main — squash merge for most PRs, rebase merge only when the PR has meticulous, individually meaningful commits
  • Branch naming: {author}/tb-NNN-short-description (Linear issue linked)
  • Commit messages: imperative mood, conventional commits (feat:, fix:, refactor:, docs:, test:, chore:)

Database Conventions

  • Primary keys are UUIDs (except messages.message_id which is the RFC 5322 Message-ID text)
  • Timestamps are TIMESTAMPTZ (always UTC)
  • All tables have created_at, most have updated_at and deleted_at
  • Use gen_random_uuid() for default UUIDs
  • JSON metadata goes in JSONB columns (e.g., delivery_events.details)
  • Add indexes for any column used in WHERE or JOIN clauses
  • Add partial indexes for soft-delete patterns: CREATE INDEX ... WHERE deleted_at IS NULL

API Conventions

  • All error responses use one consistent JSON envelope: { "error": { "code": "...", "message": "..." } }
  • List endpoints support limit and offset pagination
  • Soft deletes return 204 No Content
  • Resource creation returns 201 Created
  • Async operations (send) return 202 Accepted
  • Axum's Json extractor returns 422 when required JSON fields are missing (deserialization failure); handlers return 400 for validation errors found after parsing. These are different cases, so do not conflate them.
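
The 422 versus 400 distinction is easier to see as two separate stages. The sketch below models it without axum: `RawCreateInbox`, `Rejection`, and `create_inbox` are hypothetical, and `name: None` stands in for a body that is missing a required field. The length rule echoes the 1 to 255 bound from the CreateInboxRequest example, checked by hand here.

```rust
// Hypothetical decoded body; `None` models a missing required field.
pub struct RawCreateInbox {
    pub name: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum Rejection {
    /// The body did not deserialize into the target type (extractor stage).
    Unprocessable,
    /// The body deserialized but broke a validation rule (handler stage).
    Invalid { field: &'static str },
}

impl Rejection {
    pub fn status(&self) -> u16 {
        match self {
            Rejection::Unprocessable => 422,
            Rejection::Invalid { .. } => 400,
        }
    }
}

/// Returns the success status (201 Created) or the stage-specific rejection.
pub fn create_inbox(body: RawCreateInbox) -> Result<u16, Rejection> {
    // Stage 1: shape check, performed by the extractor before the handler runs.
    let name = body.name.ok_or(Rejection::Unprocessable)?;
    // Stage 2: validation of a well-formed body.
    if name.is_empty() || name.len() > 255 {
        return Err(Rejection::Invalid { field: "name" });
    }
    Ok(201)
}
```

A missing `name` never reaches stage 2, which is why the two failures surface with different status codes.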

Dependency Management

  • Avoid adding new dependencies without justification. Every dependency adds attack surface and compile time.
  • Prefer crates from the tokio/tower ecosystem for async compatibility
  • Pin major versions in Cargo.toml (e.g., serde = "1" not serde = "*")
  • Run make check before committing — CI enforces fmt, clippy, test, and sqlx-check