Conventions

Project Structure

mailman/
├── crates/               # Workspace crates (see Crate Map)
├── tests/e2e/            # Black-box E2E tests
├── terraform/            # AWS infrastructure
├── openapi.yaml          # OpenAPI 3.1 spec (source of truth for API surface)
└── docs/                 # This documentation site (Scalar, canonical)

What Goes Where

Location	Purpose	Source of Truth?
`docs/` (this site)	Canonical documentation for users and developers	Yes
`openapi.yaml`	API surface contract	Yes for API shape

When you add a feature, update docs/ and openapi.yaml.

Error Handling

Errors flow through three layers, each with a clear responsibility:

1. Domain Errors (`core`)

Domain errors are typed variants with structured fields (e.g., Validation { field, reason }, NotFound { resource, id }). Match on the variant and its fields, never on error message strings. Use thiserror for derivation.

See crates/core/src/lib.rs — DomainError enum for the full definition.
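A minimal sketch of what such an enum might look like. The variant names come from the examples above, but the field types and the helper `is_missing_inbox` are assumptions made for illustration. `Display` is written by hand so the sketch compiles without dependencies; with thiserror, `#[error("...")]` attributes would generate it.

```rust
use std::fmt;

/// Illustrative stand-in for `DomainError`; field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum DomainError {
    Validation { field: String, reason: String },
    NotFound { resource: &'static str, id: String },
}

// Hand-written here; thiserror's `#[error("...")]` would derive this impl.
impl fmt::Display for DomainError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DomainError::Validation { field, reason } => {
                write!(f, "validation failed for {}: {}", field, reason)
            }
            DomainError::NotFound { resource, id } => {
                write!(f, "{} {} not found", resource, id)
            }
        }
    }
}

impl std::error::Error for DomainError {}

/// Hypothetical helper: branches on the variant and its fields,
/// not on the rendered message.
pub fn is_missing_inbox(err: &DomainError) -> bool {
    matches!(err, DomainError::NotFound { resource, .. } if *resource == "inbox")
}
```

Because callers inspect the variant, the wording of the `Display` message can change without breaking any control flow.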

2. Error Propagation

Use ? to propagate errors up the call chain
Library crates (core, adapters-*, http) use typed errors with thiserror
Binary crates (bin-api, bin-worker) may use anyhow for top-level error handling
Never use anyhow in library crates — it erases type information
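The rules above can be sketched as follows. `ParseLimitError` and `parse_limit` are hypothetical names, not taken from the codebase; the point is that `?` converts the underlying error through a `From` impl, so the caller still receives a typed error it can match on. thiserror's `#[from]` attribute would generate the same `From` impl.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical library-level error type.
#[derive(Debug, PartialEq)]
pub enum ParseLimitError {
    NotANumber(ParseIntError),
    OutOfRange(u32),
}

impl fmt::Display for ParseLimitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseLimitError::NotANumber(e) => write!(f, "limit is not a number: {}", e),
            ParseLimitError::OutOfRange(v) => write!(f, "limit {} is out of range", v),
        }
    }
}

impl std::error::Error for ParseLimitError {}

// `?` relies on this conversion to turn a ParseIntError into our typed error.
impl From<ParseIntError> for ParseLimitError {
    fn from(e: ParseIntError) -> Self {
        ParseLimitError::NotANumber(e)
    }
}

/// Parses a limit in the range 1..=100 (the range is an assumption).
pub fn parse_limit(raw: &str) -> Result<u32, ParseLimitError> {
    let value: u32 = raw.trim().parse()?; // propagated and converted via From
    if value == 0 || value > 100 {
        return Err(ParseLimitError::OutOfRange(value));
    }
    Ok(value)
}
```

A binary crate could wrap this result in anyhow at the top level, but by then the typed information has already done its job in the layers below.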

3. HTTP Error Mapping (`http`)

ApiError maps domain errors to HTTP responses:

Error	HTTP Status	Response Code
`Validation`	400	`validation_error`
`NotFound`	404	`not_found`
`AttachmentTooLarge`	413	`validation_error`
`Storage`	500	`internal_error`
`Delivery`	500	`delivery_error`
`VolumeLimited` (on `ApiError`)	429	`volume_limited`

See crates/http/src/error.rs — ApiError::into_response for the full mapping.

Security: Storage and delivery errors mask internal details in the response. The real error is logged server-side before masking.

```json
{
  "error": {
    "code": "internal_error",
    "message": "An internal error occurred"
  }
}
```
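The mapping table and the masking rule can be sketched as a single `match`. This is an illustration, not the contents of crates/http/src/error.rs: the variant payloads, the `ErrorBody` struct, the unmasked message wording, and the masked delivery message are all assumptions. `eprintln!` stands in for the server-side log call.

```rust
/// Illustrative domain error; payload shapes are assumptions.
#[derive(Debug)]
pub enum DomainError {
    Validation { field: String, reason: String },
    NotFound { resource: String, id: String },
    AttachmentTooLarge { size_bytes: u64, max_bytes: u64 },
    Storage(String),
    Delivery(String),
}

/// Hypothetical flattened form of the JSON error envelope plus status.
pub struct ErrorBody {
    pub status: u16,
    pub code: &'static str,
    pub message: String,
}

pub fn to_error_body(err: &DomainError) -> ErrorBody {
    let (status, code, message) = match err {
        DomainError::Validation { field, reason } => {
            (400, "validation_error", format!("{}: {}", field, reason))
        }
        DomainError::NotFound { resource, id } => {
            (404, "not_found", format!("{} {} not found", resource, id))
        }
        DomainError::AttachmentTooLarge { size_bytes, max_bytes } => (
            413,
            "validation_error",
            format!("attachment is {} bytes, limit is {}", size_bytes, max_bytes),
        ),
        DomainError::Storage(detail) => {
            // Log the real cause first, then return a generic message.
            eprintln!("storage error: {}", detail);
            (500, "internal_error", "An internal error occurred".to_string())
        }
        DomainError::Delivery(detail) => {
            eprintln!("delivery error: {}", detail);
            // Masked wording is an assumption.
            (500, "delivery_error", "Delivery failed".to_string())
        }
    };
    ErrorBody { status, code, message }
}
```

Keeping the match exhaustive (no `_` arm) means a new domain variant fails to compile until someone decides its status code.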

Error Recovery Strategy

Error Type	Recovery
Input validation	Reject immediately (400/422)
Entity not found	Return 404
Transient I/O (network, timeout)	Retry via SQS visibility timeout
Permanent I/O (parse failure, malware)	Delete from queue, may land in DLQ
Webhook delivery failure	Exponential backoff retries (up to 7)
Configuration error	Fail fast on startup (panic is OK)
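For the webhook row, a capped exponential backoff schedule might look like the sketch below. Only the retry count of 7 comes from the table; the `base` and `cap` parameters and the doubling factor are assumptions, and the real schedule may differ or add jitter.

```rust
use std::time::Duration;

/// Maximum number of webhook retries, from the table above.
pub const MAX_RETRIES: u32 = 7;

/// Delay before retry `attempt` (1-based), or `None` once retries are
/// exhausted. `base` and `cap` are assumed tuning parameters.
pub fn retry_delay(attempt: u32, base: Duration, cap: Duration) -> Option<Duration> {
    if attempt == 0 || attempt > MAX_RETRIES {
        return None;
    }
    // base * 2^(attempt - 1), saturating rather than overflowing, then capped.
    let factor = 1u32 << (attempt - 1);
    Some(base.saturating_mul(factor).min(cap))
}
```

With a one-second base and a 30-second cap, the first delays are 1 s, 2 s, 4 s, and 8 s, and later attempts are held at the cap.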

Logging

Levels

Level	When to Use	Example
`error`	Unrecoverable failures that require human attention	Database connection lost, SES credentials invalid
`warn`	Degraded but functional — something is wrong but the system continues	ClamAV unreachable (falling back to noop), webhook endpoint failing
`info`	Normal operational events worth tracking	Request lifecycle, message processed, delivery status changed
`debug`	Detailed logic useful during development	SQL queries, SQS message contents, thread resolution steps
`trace`	Full data dumps	Raw MIME content, complete request/response bodies

Structured Fields

Every log line should carry context as structured fields, either on the event itself or inherited from the enclosing tracing span:

```rust
tracing::info!(
    message_id = %msg.message_id,
    thread_id = %thread_id,
    inbox_id = %inbox_id,
    "message processed"
);
```

Always include: resource IDs (message_id, thread_id, inbox_id, domain_id)
Never include: email body content at info or above, API keys, PEM keys

Request Tracing

Every API request gets a UUID trace ID:

Generated in middleware (outermost layer)
Respects incoming X-Request-ID header if present
Included in all log lines within the request span
Returned in the response headers
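The "respect the incoming header, otherwise generate" rule can be sketched as a small pure function. `resolve_request_id` is a hypothetical name, the generator is passed in as a closure so the sketch needs no UUID crate, and treating a blank header as absent is an assumption.

```rust
/// Returns the trace ID for a request: the incoming `X-Request-ID` value
/// when present and non-blank, otherwise a freshly generated one.
pub fn resolve_request_id<F>(incoming: Option<&str>, generate: F) -> String
where
    F: FnOnce() -> String,
{
    match incoming.map(str::trim).filter(|v| !v.is_empty()) {
        Some(id) => id.to_string(),
        // In the real middleware this would produce a UUID.
        None => generate(),
    }
}
```

The middleware would then record the resulting ID on the request span and echo it in the response headers.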

Use Future::instrument(span) for async safety — never hold Span::enter() guards across .await points.

Rust Conventions

Handler Signatures

Route handlers follow a consistent pattern. Authentication is handled by middleware (not an extractor) — claims are accessed via request extensions. State is always Arc<AppState>.

See any handler in crates/http/src/routes/ for live examples (e.g., threads.rs, send.rs).

Response DTOs

Implement From<&DomainType> for clean conversion from domain to response types
Use #[serde(skip_serializing_if = "Option::is_none")] for optional fields
Separate *ListParams and *ListResult structs for paginated endpoints
Clamp query parameter values to spec limits in the handler (e.g., limit = limit.min(100))
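A dependency-free sketch of the first, third, and fourth points. `Thread`, `ThreadResponse`, `ThreadListParams`, and the default limit of 20 are hypothetical; only the maximum of 100 comes from the example above. The serde attribute is omitted so the sketch compiles with the standard library alone.

```rust
/// Hypothetical domain type with a field the API should not expose.
pub struct Thread {
    pub id: String,
    pub subject: Option<String>,
    pub internal_score: f32,
}

/// Hypothetical response DTO containing only API-facing fields.
#[derive(Debug, PartialEq)]
pub struct ThreadResponse {
    pub id: String,
    pub subject: Option<String>,
}

// Converting from a reference leaves the domain value usable afterwards.
impl From<&Thread> for ThreadResponse {
    fn from(thread: &Thread) -> Self {
        Self {
            id: thread.id.clone(),
            subject: thread.subject.clone(),
        }
    }
}

pub const MAX_LIMIT: u32 = 100;
pub const DEFAULT_LIMIT: u32 = 20; // assumed default

/// Hypothetical query parameters for a paginated list endpoint.
pub struct ThreadListParams {
    pub limit: Option<u32>,
    pub offset: Option<u32>,
}

impl ThreadListParams {
    /// Applies the default, then clamps to the spec limit.
    pub fn effective_limit(&self) -> u32 {
        self.limit.unwrap_or(DEFAULT_LIMIT).min(MAX_LIMIT)
    }
}
```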

Naming

Test names: {unit}_does_{behavior}_when_{condition}
Migration files: NNN_description.sql (sequential numbering, no gaps)
Route files: named after the resource (e.g., threads.rs, domains.rs)
Config env vars: MAILMAN_SECTION__KEY (prefix + double underscore nesting)
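To make the env var convention concrete, the sketch below splits a variable name into a nested config path. It assumes the prefix is stripped and segments are lowercased, which is a common loader behavior but is not confirmed here; `env_var_to_path` and the `DATABASE` keys are hypothetical.

```rust
/// Splits `MAILMAN_SECTION__KEY` into lowercase path segments.
/// Returns `None` when the name does not carry the `MAILMAN_` prefix.
pub fn env_var_to_path(name: &str) -> Option<Vec<String>> {
    let rest = name.strip_prefix("MAILMAN_")?;
    if rest.is_empty() {
        return None;
    }
    // A double underscore separates nesting levels; a single underscore
    // stays part of the key.
    Some(rest.split("__").map(|s| s.to_ascii_lowercase()).collect())
}
```

Under these assumptions, `MAILMAN_DATABASE__MAX_CONNECTIONS` addresses the `max_connections` key inside the `database` section.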

Architecture Boundaries

The crate structure enforces architectural boundaries:

core has zero I/O dependencies — pure domain logic only
Repository traits live in core/src/repository.rs, implementations in adapters-*
Route handlers in http call repository traits, never concrete implementations
Binary crates (bin-api, bin-worker) are the only place where concrete types are wired together
If you find yourself importing sqlx or aws-sdk-* in core, you're violating the architecture
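The boundary can be sketched in one file, with comments marking where each piece would live. All names here (`Inbox`, `InboxRepository`, `InMemoryInboxRepository`, `inbox_name`) are hypothetical, and the trait is synchronous for brevity; the real repository traits are likely async.

```rust
use std::collections::HashMap;

// --- would live in `core`: pure types and traits, no I/O ---
#[derive(Debug, Clone, PartialEq)]
pub struct Inbox {
    pub id: String,
    pub name: String,
}

pub trait InboxRepository {
    fn find(&self, id: &str) -> Option<Inbox>;
}

// --- would live in an `adapters-*` crate: a concrete implementation ---
pub struct InMemoryInboxRepository {
    inboxes: HashMap<String, Inbox>,
}

impl InMemoryInboxRepository {
    pub fn new(items: Vec<Inbox>) -> Self {
        let inboxes = items.into_iter().map(|i| (i.id.clone(), i)).collect();
        Self { inboxes }
    }
}

impl InboxRepository for InMemoryInboxRepository {
    fn find(&self, id: &str) -> Option<Inbox> {
        self.inboxes.get(id).cloned()
    }
}

// --- would live in `http`: depends only on the trait ---
pub fn inbox_name(repo: &dyn InboxRepository, id: &str) -> Option<String> {
    repo.find(id).map(|inbox| inbox.name)
}
```

Only the binary crate would name `InMemoryInboxRepository` (or a database-backed equivalent) and pass it to the handler layer, which also makes handlers easy to test against an in-memory implementation.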

Validation

Input validation uses garde with derive macros:

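```rust
use garde::Validate;
use serde::Deserialize;

#[derive(Deserialize, Validate)]
pub struct CreateInboxRequest {
    // Display name, length 1 to 255.
    #[garde(length(min = 1, max = 255))]
    pub name: String,
    // Local part of the address, length 1 to 64.
    #[garde(length(min = 1, max = 64))]
    pub local_part: String,
}
```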

Soft Deletes

All entities use soft deletes via deleted_at TIMESTAMPTZ. Never hard-delete records through the API. List endpoints filter with WHERE deleted_at IS NULL. We use partial indexes on this column for performance.
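An in-memory Rust analogue of the same rule, where `Option<SystemTime>` plays the role of the nullable deleted_at column. The `Inbox` struct and both functions are hypothetical, and keeping the original timestamp on a repeated delete is an assumption.

```rust
use std::time::SystemTime;

/// Hypothetical row; `deleted_at` mirrors the nullable TIMESTAMPTZ column.
#[derive(Debug, Clone, PartialEq)]
pub struct Inbox {
    pub id: String,
    pub deleted_at: Option<SystemTime>,
}

/// Marks the row as deleted instead of removing it.
pub fn soft_delete(inbox: &mut Inbox, now: SystemTime) {
    // Idempotent: a second delete keeps the original timestamp.
    if inbox.deleted_at.is_none() {
        inbox.deleted_at = Some(now);
    }
}

/// Equivalent of `WHERE deleted_at IS NULL` for a list endpoint.
pub fn list_active(inboxes: &[Inbox]) -> Vec<&Inbox> {
    inboxes.iter().filter(|i| i.deleted_at.is_none()).collect()
}
```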

Async

Use tokio::select! for concurrent polling loops in the worker
Never block the async runtime — use tokio::task::spawn_blocking for CPU-heavy work
Use Future::instrument(span) for tracing across async boundaries
Middleware order matters: trace ID (outermost) → auth → rate limit → handler

Git Conventions

Linear history on main — squash merge for most PRs, rebase merge only when the PR has meticulous, individually meaningful commits
Branch naming: {author}/tb-NNN-short-description (Linear issue linked)
Commit messages: imperative mood, conventional commits (feat:, fix:, refactor:, docs:, test:, chore:)

Database Conventions

Primary keys are UUIDs (except messages.message_id which is the RFC 5322 Message-ID text)
Timestamps are TIMESTAMPTZ (always UTC)
All tables have created_at, most have updated_at and deleted_at
Use gen_random_uuid() for default UUIDs
JSON metadata goes in JSONB columns (e.g., delivery_events.details)
Add indexes for any column used in WHERE or JOIN clauses
Add partial indexes for soft-delete patterns: CREATE INDEX ... WHERE deleted_at IS NULL

API Conventions

All error responses use a consistent JSON envelope: { "error": { "code": "...", "message": "..." } }
List endpoints support limit and offset pagination
Soft deletes return 204 No Content
Resource creation returns 201 Created
Async operations (send) return 202 Accepted
Axum returns 422 when a required JSON field is missing (deserialization failure); validation errors after parsing return 400. Treat these as two distinct cases.
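The 422 versus 400 distinction comes from two separate stages, which the sketch below models without Axum. `RawCreateInbox` and `create_inbox_status` are hypothetical; `None` stands for a field missing from the JSON body, the length limits are taken from the Validation section, and counting characters rather than bytes is an assumption.

```rust
/// Hypothetical view of a request body; `None` models a missing field.
pub struct RawCreateInbox {
    pub name: Option<String>,
    pub local_part: Option<String>,
}

/// Returns the status code a create request would receive.
pub fn create_inbox_status(raw: &RawCreateInbox) -> u16 {
    // Stage 1: deserialization. A missing required field is rejected
    // before the handler body runs.
    let (name, local_part) = match (&raw.name, &raw.local_part) {
        (Some(name), Some(local_part)) => (name, local_part),
        _ => return 422,
    };

    // Stage 2: validation of a well-formed body.
    let name_ok = (1..=255).contains(&name.chars().count());
    let local_ok = (1..=64).contains(&local_part.chars().count());
    if !(name_ok && local_ok) {
        return 400;
    }

    // Resource creation succeeds.
    201
}
```

An empty `name` reaches stage 2 and yields 400, while an absent `name` never gets that far and yields 422.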

Dependency Management

Avoid adding new dependencies without justification. Every dependency adds attack surface and compile time.
Prefer crates from the tokio/tower ecosystem for async compatibility
Pin major versions in Cargo.toml (e.g., serde = "1" not serde = "*")
Run make check before committing — CI enforces fmt, clippy, test, and sqlx-check
