Conventions
Project Structure
```
mailman/
├── crates/        # Workspace crates (see Crate Map)
├── tests/e2e/     # Black-box E2E tests
├── terraform/     # AWS infrastructure
├── openapi.yaml   # OpenAPI 3.1 spec (source of truth for API surface)
└── docs/          # This documentation site (Scalar, canonical)
```
What Goes Where
| Location | Purpose | Source of Truth? |
|---|---|---|
| docs/ (this site) | Canonical documentation for users and developers | Yes |
| openapi.yaml | API surface contract | Yes, for API shape |
When you add a feature, update docs/ and openapi.yaml.
Error Handling
Errors flow through three layers, each with a clear responsibility:
1. Domain Errors (core)
Domain errors are typed variants with structured fields (e.g., Validation { field, reason }, NotFound { resource, id }) — never match on error message strings. Use thiserror for derivation.
See `crates/core/src/lib.rs` (the `DomainError` enum) for the full definition.
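As a rough illustration of "typed variants with structured fields", the sketch below shows a two-variant subset of such an enum and a check that branches on the variant rather than on the message text. The variant fields follow the examples above; the message wording and the `is_missing_inbox` helper are hypothetical, and the `Display` and `Error` impls are hand-written here only so the sketch compiles without `thiserror`.

```rust
use std::fmt;

/// Hypothetical subset of the domain error enum. The real definition lives in
/// crates/core/src/lib.rs and may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum DomainError {
    Validation { field: String, reason: String },
    NotFound { resource: String, id: String },
}

// In the real crate, thiserror would derive these two impls.
impl fmt::Display for DomainError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DomainError::Validation { field, reason } => {
                write!(f, "validation failed for {}: {}", field, reason)
            }
            DomainError::NotFound { resource, id } => {
                write!(f, "{} {} not found", resource, id)
            }
        }
    }
}

impl std::error::Error for DomainError {}

/// Branch on the variant and its fields, never on the rendered message.
pub fn is_missing_inbox(err: &DomainError) -> bool {
    matches!(err, DomainError::NotFound { resource, .. } if resource.as_str() == "inbox")
}
```

Because callers match on `NotFound { resource, .. }`, the message wording can change without breaking any caller.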
2. Error Propagation
- Use `?` to propagate errors up the call chain
- Library crates (`core`, `adapters-*`, `http`) use typed errors with `thiserror`
- Binary crates (`bin-api`, `bin-worker`) may use `anyhow` for top-level error handling
- Never use `anyhow` in library crates; it erases type information
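The propagation rules above can be sketched with a small typed error and a `From` conversion, which is what lets `?` convert a lower-level error on the way up. `ConfigError` and `parse_port` are hypothetical names chosen for illustration; with `thiserror`, the `From` impl would typically come from a `#[from]` attribute rather than being written by hand.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical library-level error type (illustrative only).
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    InvalidPort(ParseIntError),
    PortOutOfRange(u32),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::InvalidPort(e) => write!(f, "invalid port: {}", e),
            ConfigError::PortOutOfRange(p) => write!(f, "port {} is out of range", p),
        }
    }
}

impl std::error::Error for ConfigError {}

// Enables `?` to turn a ParseIntError into a ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(err: ParseIntError) -> Self {
        ConfigError::InvalidPort(err)
    }
}

pub fn parse_port(raw: &str) -> Result<u16, ConfigError> {
    // `?` propagates the parse failure, converting it via the From impl above.
    let value: u32 = raw.trim().parse()?;
    if value > u16::MAX as u32 {
        return Err(ConfigError::PortOutOfRange(value));
    }
    Ok(value as u16)
}
```

The caller still sees which variant failed, which is the type information an `anyhow::Error` in a library crate would hide.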
3. HTTP Error Mapping (http)
ApiError maps domain errors to HTTP responses:
| Error | HTTP Status | Response Code |
|---|---|---|
| Validation | 400 | validation_error |
| NotFound | 404 | not_found |
| AttachmentTooLarge | 413 | validation_error |
| Storage | 500 | internal_error |
| Delivery | 500 | delivery_error |
| VolumeLimited (on ApiError) | 429 | volume_limited |
See `crates/http/src/error.rs` (`ApiError::into_response`) for the full mapping.
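The table can be read as a single `match`. The sketch below mirrors it with a hypothetical, flattened `ApiError` and plain `u16` status codes so that it compiles without axum; the variant payloads are assumptions, and so are the client-facing messages (apart from the `internal_error` text shown in the example response). The real mapping is in `ApiError::into_response`.

```rust
/// Hypothetical mirror of the variants in the mapping table.
#[derive(Debug)]
pub enum ApiError {
    Validation { field: String, reason: String },
    NotFound { resource: String, id: String },
    AttachmentTooLarge,
    Storage(String),
    Delivery(String),
    VolumeLimited,
}

impl ApiError {
    /// HTTP status and response code, row for row from the table.
    pub fn status_and_code(&self) -> (u16, &'static str) {
        match self {
            ApiError::Validation { .. } => (400, "validation_error"),
            ApiError::NotFound { .. } => (404, "not_found"),
            ApiError::AttachmentTooLarge => (413, "validation_error"),
            ApiError::Storage(_) => (500, "internal_error"),
            ApiError::Delivery(_) => (500, "delivery_error"),
            ApiError::VolumeLimited => (429, "volume_limited"),
        }
    }

    /// Message that is safe to return to the client. Storage and delivery
    /// details are masked; the caller is expected to log the real error first.
    pub fn public_message(&self) -> String {
        match self {
            ApiError::Storage(_) | ApiError::Delivery(_) => {
                "An internal error occurred".to_string()
            }
            ApiError::Validation { field, reason } => format!("{}: {}", field, reason),
            ApiError::NotFound { resource, id } => format!("{} {} not found", resource, id),
            ApiError::AttachmentTooLarge => "attachment too large".to_string(),
            ApiError::VolumeLimited => "volume limit reached".to_string(),
        }
    }
}
```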
Security: Storage and delivery errors mask internal details in the response. The real error is logged server-side before masking.
```json
{
  "error": {
    "code": "internal_error",
    "message": "An internal error occurred"
  }
}
```
Error Recovery Strategy
| Error Type | Recovery |
|---|---|
| Input validation | Reject immediately (400/422) |
| Entity not found | Return 404 |
| Transient I/O (network, timeout) | Retry via SQS visibility timeout |
| Permanent I/O (parse failure, malware) | Delete from queue, may land in DLQ |
| Webhook delivery failure | Exponential backoff retries (up to 7) |
| Configuration error | Fail fast on startup (panic is OK) |
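For the webhook row, one possible shape of an exponential backoff schedule capped at 7 retries is sketched below. Only the retry cap comes from the table; the one-second base delay, the doubling factor, and the function name are assumptions, and the actual schedule may differ.

```rust
use std::time::Duration;

/// Retry cap taken from the recovery table.
pub const MAX_WEBHOOK_RETRIES: u32 = 7;

/// Assumed base delay; the real value is not specified in this document.
const BASE_DELAY: Duration = Duration::from_secs(1);

/// Delay before retry number `retry` (1-based), or `None` once the retry
/// budget is exhausted. The delay doubles with every retry: 1s, 2s, 4s, ...
pub fn webhook_retry_delay(retry: u32) -> Option<Duration> {
    if retry == 0 || retry > MAX_WEBHOOK_RETRIES {
        return None;
    }
    Some(BASE_DELAY * 2u32.pow(retry - 1))
}
```

A delivery loop would stop retrying as soon as this returns `None`.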
Logging
Levels
| Level | When to Use | Example |
|---|---|---|
| error | Unrecoverable failures that require human attention | Database connection lost, SES credentials invalid |
| warn | Degraded but functional: something is wrong but the system continues | ClamAV unreachable (falling back to noop), webhook endpoint failing |
| info | Normal operational events worth tracking | Request lifecycle, message processed, delivery status changed |
| debug | Detailed logic useful during development | SQL queries, SQS message contents, thread resolution steps |
| trace | Full data dumps | Raw MIME content, complete request/response bodies |
Structured Fields
Every log line should include context via tracing spans:
```rust
tracing::info!(
    message_id = %msg.message_id,
    thread_id = %thread_id,
    inbox_id = %inbox_id,
    "message processed"
);
```
Always include: resource IDs (message_id, thread_id, inbox_id, domain_id)

Never include: email body content at info or above, API keys, PEM keys
Request Tracing
Every API request gets a UUID trace ID:
- Generated as middleware (outermost layer)
- Respects incoming `X-Request-ID` header if present
- Included in all log lines within the request span
- Returned in the response headers
Use `.instrument(span)` (from the `tracing::Instrument` trait) on futures for async safety; never hold `Span::enter()` guards across `.await` points.
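The "respect the incoming header, otherwise generate" rule can be sketched as a pure function. The name `resolve_request_id` is hypothetical, the ID generator is passed in as a closure so the sketch needs no UUID crate, and treating a blank header as absent is an assumption rather than documented behavior.

```rust
/// Pick the trace ID for a request: reuse a non-empty incoming X-Request-ID
/// value, otherwise call `generate` (a UUID generator in a real middleware).
pub fn resolve_request_id<F>(incoming: Option<&str>, generate: F) -> String
where
    F: FnOnce() -> String,
{
    match incoming.map(str::trim) {
        // Assumption: a blank header is treated the same as a missing one.
        Some(id) if !id.is_empty() => id.to_string(),
        _ => generate(),
    }
}
```

The resolved value would then be recorded on the request span and echoed in the response headers.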
Rust Conventions
Handler Signatures
Route handlers follow a consistent pattern. Authentication is handled by middleware (not an extractor) — claims are accessed via request extensions. State is always Arc<AppState>.
See any handler in `crates/http/src/routes/` for live examples (e.g., `threads.rs`, `send.rs`).
Response DTOs
- Implement `From<&DomainType>` for clean conversion from domain to response types
- Use `#[serde(skip_serializing_if = "Option::is_none")]` for optional fields
- Separate `*ListParams` and `*ListResult` structs for paginated endpoints
- Clamp query parameter values to spec limits in the handler (e.g., `limit = limit.min(100)`)
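A minimal sketch of the conversion and clamping conventions follows. The `Thread` fields, the default limit of 20, and the helper method names are assumptions made for illustration; serde attributes are mentioned in a comment only, so the sketch compiles with the standard library alone.

```rust
/// Hypothetical domain type; field names are illustrative.
pub struct Thread {
    pub id: String,
    pub subject: Option<String>,
    pub message_count: u32,
}

/// Response DTO. With serde, `subject` would carry
/// #[serde(skip_serializing_if = "Option::is_none")].
#[derive(Debug, PartialEq)]
pub struct ThreadResponse {
    pub id: String,
    pub subject: Option<String>,
    pub message_count: u32,
}

impl From<&Thread> for ThreadResponse {
    fn from(thread: &Thread) -> Self {
        ThreadResponse {
            id: thread.id.clone(),
            subject: thread.subject.clone(),
            message_count: thread.message_count,
        }
    }
}

/// Upper bound from the `limit.min(100)` example above.
pub const MAX_LIMIT: u32 = 100;
/// Assumed default when the client sends no limit.
const DEFAULT_LIMIT: u32 = 20;

pub struct ThreadListParams {
    pub limit: Option<u32>,
    pub offset: Option<u32>,
}

impl ThreadListParams {
    /// Clamp the requested limit to the spec maximum.
    pub fn effective_limit(&self) -> u32 {
        self.limit.unwrap_or(DEFAULT_LIMIT).min(MAX_LIMIT)
    }

    pub fn effective_offset(&self) -> u32 {
        self.offset.unwrap_or(0)
    }
}
```

Converting from a reference keeps the domain value usable after the response is built.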
Naming
- Test names: `{unit}_does_{behavior}_when_{condition}`
- Migration files: `NNN_description.sql` (sequential numbering, no gaps)
- Route files: named after the resource (e.g., `threads.rs`, `domains.rs`)
- Config env vars: `MAILMAN_SECTION__KEY` (prefix + double underscore nesting)
Architecture Boundaries
The crate structure enforces architectural boundaries:
- `core` has zero I/O dependencies: pure domain logic only
- Repository traits live in `core/src/repository.rs`, implementations in `adapters-*`
- Route handlers in `http` call repository traits, never concrete implementations
- Binary crates (`bin-api`, `bin-worker`) are the only place where concrete types are wired together
- If you find yourself importing `sqlx` or `aws-sdk-*` in `core`, you're violating the architecture
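The boundary rules above can be sketched in a single file, with comments marking which crate each piece would live in. All names here (`Inbox`, `InboxRepository`, `InMemoryInboxRepository`, `inbox_name`) are hypothetical, the trait is synchronous for brevity where the real repository traits are presumably async, and an in-memory map stands in for a database adapter.

```rust
use std::collections::HashMap;

// --- core: domain types and repository traits, no I/O ---
#[derive(Debug, Clone, PartialEq)]
pub struct Inbox {
    pub id: String,
    pub name: String,
}

pub trait InboxRepository {
    fn find(&self, id: &str) -> Option<Inbox>;
}

// --- adapters-*: concrete implementation (in-memory stand-in) ---
pub struct InMemoryInboxRepository {
    rows: HashMap<String, Inbox>,
}

impl InMemoryInboxRepository {
    pub fn new(inboxes: Vec<Inbox>) -> Self {
        let rows = inboxes
            .into_iter()
            .map(|inbox| (inbox.id.clone(), inbox))
            .collect();
        InMemoryInboxRepository { rows }
    }
}

impl InboxRepository for InMemoryInboxRepository {
    fn find(&self, id: &str) -> Option<Inbox> {
        self.rows.get(id).cloned()
    }
}

// --- http: handler logic depends only on the trait ---
pub fn inbox_name(repo: &dyn InboxRepository, id: &str) -> Option<String> {
    repo.find(id).map(|inbox| inbox.name)
}
```

Only a binary crate would construct `InMemoryInboxRepository` (or a real adapter) and pass it to the handler, which also makes the handler easy to test against a fake.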
Validation
Input validation uses garde with derive macros:
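```rust
use garde::Validate;
use serde::Deserialize;

/// Request body for creating an inbox; garde enforces the length bounds.
#[derive(Deserialize, Validate)]
pub struct CreateInboxRequest {
    #[garde(length(min = 1, max = 255))]
    pub name: String,
    #[garde(length(min = 1, max = 64))]
    pub local_part: String,
}
```
Soft Deletes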
All entities use soft deletes via deleted_at TIMESTAMPTZ. Never hard-delete records through the API. List endpoints filter with WHERE deleted_at IS NULL. We use partial indexes on this column for performance.
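The soft-delete rule can be illustrated in memory: deleting sets `deleted_at` and keeps the row, and listing skips rows where it is set (the analogue of `WHERE deleted_at IS NULL`). `InboxRow` and both function names are hypothetical, and returning `false` for an already-deleted row is an assumption about the desired behavior.

```rust
use std::time::SystemTime;

/// Hypothetical row type; `deleted_at` plays the role of the TIMESTAMPTZ column.
#[derive(Debug, Clone, PartialEq)]
pub struct InboxRow {
    pub id: String,
    pub name: String,
    pub deleted_at: Option<SystemTime>,
}

/// Mark a live row as deleted. Returns false if no live row has this id.
pub fn soft_delete(rows: &mut [InboxRow], id: &str, now: SystemTime) -> bool {
    match rows
        .iter_mut()
        .find(|row| row.id.as_str() == id && row.deleted_at.is_none())
    {
        Some(row) => {
            row.deleted_at = Some(now);
            true
        }
        None => false,
    }
}

/// Equivalent of filtering a list query with `WHERE deleted_at IS NULL`.
pub fn list_active(rows: &[InboxRow]) -> Vec<&InboxRow> {
    rows.iter().filter(|row| row.deleted_at.is_none()).collect()
}
```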
Async
- Use `tokio::select!` for concurrent polling loops in the worker
- Never block the async runtime; use `tokio::task::spawn_blocking` for CPU-heavy work
- Use `.instrument(span)` (`tracing::Instrument`) for tracing across async boundaries
- Middleware order matters: trace ID (outermost) → auth → rate limit → handler
Git Conventions
- Linear history on main — squash merge for most PRs, rebase merge only when the PR has meticulous, individually meaningful commits
- Branch naming: `{author}/tb-NNN-short-description` (Linear issue linked)
- Commit messages: imperative mood, conventional commits (`feat:`, `fix:`, `refactor:`, `docs:`, `test:`, `chore:`)
Database Conventions
- Primary keys are UUIDs (except `messages.message_id`, which is the RFC 5322 Message-ID text)
- Timestamps are `TIMESTAMPTZ` (always UTC)
- All tables have `created_at`, most have `updated_at` and `deleted_at`
- Use `gen_random_uuid()` for default UUIDs
- JSON metadata goes in `JSONB` columns (e.g., `delivery_events.details`)
- Add indexes for any column used in `WHERE` or `JOIN` clauses
- Add partial indexes for soft-delete patterns: `CREATE INDEX ... WHERE deleted_at IS NULL`
API Conventions
- All responses use a consistent JSON error envelope
- Errors always return `{ "error": { "code": "...", "message": "..." } }`
- List endpoints support `limit` and `offset` pagination
- Soft deletes return `204 No Content`
- Resource creation returns `201 Created`
- Async operations (send) return `202 Accepted`
- Axum returns `422` for missing required JSON fields (deserialization failure) and `400` for validation errors after parsing; these are different cases
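The 422 versus 400 distinction comes from two separate stages: deserialization, then validation. The sketch below models them with a hypothetical `Rejection` enum and a `create_inbox_status` function that takes `None` when the JSON field was absent. It does not use axum; the 1 to 255 length bound is borrowed from the validation example, and lengths are counted in bytes here for simplicity.

```rust
/// Two distinct rejection stages for a request body.
#[derive(Debug, PartialEq)]
pub enum Rejection {
    /// The body could not be deserialized (e.g. a required field is missing).
    Deserialize { missing_field: &'static str },
    /// The body parsed, but a value failed validation.
    Validation { field: &'static str, reason: &'static str },
}

impl Rejection {
    pub fn status(&self) -> u16 {
        match self {
            Rejection::Deserialize { .. } => 422,
            Rejection::Validation { .. } => 400,
        }
    }
}

/// Hypothetical create-inbox check. `name` is `None` when the JSON field was
/// absent, which corresponds to a deserialization failure.
pub fn create_inbox_status(name: Option<&str>) -> u16 {
    let outcome = match name {
        None => Err(Rejection::Deserialize { missing_field: "name" }),
        Some(n) if n.is_empty() || n.len() > 255 => Err(Rejection::Validation {
            field: "name",
            reason: "length must be between 1 and 255",
        }),
        Some(_) => Ok(()),
    };
    match outcome {
        Ok(()) => 201, // resource creation returns 201 Created
        Err(rejection) => rejection.status(),
    }
}
```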
Dependency Management
- Avoid adding new dependencies without justification. Every dependency is attack surface and compile time.
- Prefer crates from the tokio/tower ecosystem for async compatibility
- Pin major versions in `Cargo.toml` (e.g., `serde = "1"` not `serde = "*"`)
- Run `make check` before committing; CI enforces fmt, clippy, test, and sqlx-check