# AGENTS
## Mission & Audience
- This document lives at the repository root so every agent knows how to build, run, and reason about the middleware.
- Refer back to `docs/workstation_plan.md` for the architectural story, expected flows, and the canonical payload contract before touching new features.
- Preserve the operational stability that the SQLite queue + delivery worker already provides; avoid accidental schema drift or config leaks.
- Tailor every change to the Node 20+ CommonJS ecosystem and the SQLite-backed persistence layer this repo already embraces.
## Command Reference
### Install & Bootstrapping
- `npm install` populates `node_modules`; the committed `package-lock.json` is the only lockfile, so do not generate others.
- `npm start` is the go-to run command; it migrates the database, primes the instrument cache, spins up connectors, and starts the delivery worker plus health/metrics services.
- `npm run migrate` runs `middleware/src/storage/migrate.js` on demand; use it to apply schema migrations in new environments or CI jobs.
### Maintenance & Database Care
- `npm run maintenance -- backup` copies `middleware/data/workstation.sqlite` to `workstation.sqlite.bak-<timestamp>`; leave the backup file in place, and do not commit or remove it.
- `npm run maintenance -- vacuum` runs SQLite's `VACUUM` via `middleware/src/scripts/maintenance.js` and logs success/failure to stdout/stderr.
- `npm run maintenance -- prune --days=<n>` deletes `delivery_log` entries older than `<n>` days; default is 30 if `--days` is omitted.
### Testing & Single-Test Command
- `npm test` executes `node middleware/test/parsers.test.js` and serves as the accepted smoke check until a richer test harness exists.
- To rerun the single parser suite manually, target `node middleware/test/parsers.test.js` directly; it logs success via `console.log` and exits non-zero on failure.
## Environment & Secrets
- Node 20+ is the assumed runtime. The code uses optional chaining, `String.raw`, and other modern language features; these also run on older Node releases, so treat Node 20 as the project baseline and keep the same runtime for development and CI.
- All ports, DB paths, and CLQMS credentials are wired through `middleware/config/default.js` and its environmental overrides (e.g., `HTTP_JSON_PORT`, `CLQMS_TOKEN`, `WORKER_BATCH_SIZE`).
- Treat `CLQMS_TOKEN`, database files, and other secrets as environment-provided values; never embed them in checked-in files.
- `middleware/data/workstation.sqlite` is the runtime database. Don't delete or reinitialize it from the repository tree unless that is part of an explicit migration/backup operation.
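
The override pattern described above can be sketched as follows. This is a minimal illustration, not the contents of `middleware/config/default.js`: only the environment variable names come from this document, while the key layout, the helper names (`toInt`, `buildConfig`), and the fallback values are assumptions.

```javascript
// Hypothetical excerpt in the spirit of middleware/config/default.js.
// Key layout and fallback values are assumptions for illustration.
function toInt(value, fallback) {
  const parsed = Number.parseInt(value, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

function buildConfig(env = process.env) {
  return {
    connectors: {
      httpJson: { port: toInt(env.HTTP_JSON_PORT, 3001) },
    },
    clqms: {
      // Secrets come from the environment only; there is no checked-in fallback.
      token: env.CLQMS_TOKEN || null,
    },
    worker: { batchSize: toInt(env.WORKER_BATCH_SIZE, 5) },
  };
}
```

Passing `env` as a parameter keeps the function easy to exercise in tests without mutating `process.env`.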
## Observability Endpoints
- `/health` returns connector statuses plus pending/retrying/dead-letter counts from `middleware/src/routes/health.js`.
- `/health/ready` pings the SQLite queue; any failure there should log an error and respond with `503` per the existing route logic.
- `/metrics` exposes Prometheus-style gauges/counters that read straight from `queue/sqliteQueue`; keep the plaintext format exactly as defined so Prometheus scrapers don't break.
- Health and metrics routers are mounted in `middleware/src/index.js` at the ports declared in the config, so any addition should remain consistent with Express middleware ordering.
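
As a rough sketch of the `/health/ready` behavior described above (log an error and answer `503` when the queue check fails), a handler in the Express `(req, res)` style might look like this. The factory name `createReadyHandler`, the `queue.ping` method, and the response bodies are assumptions; the real logic lives in `middleware/src/routes/health.js`.

```javascript
// Sketch of a readiness handler. `queue.ping` and the JSON bodies are
// hypothetical; only the 503-on-failure behavior comes from this document.
function createReadyHandler({ queue, logger }) {
  return async function ready(req, res) {
    try {
      await queue.ping();
      res.status(200).json({ status: 'ready' });
    } catch (err) {
      logger.error({ err: err.message }, 'readiness check failed');
      res.status(503).json({ status: 'unavailable' });
    }
  };
}
```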
## Delivery Runbook & Retry Behavior
- Backoff: `30s -> 2m -> 10m -> 30m -> 2h -> 6h`, max 10 attempts, as defined in `config.retries.schedule`. The worker calls `buildNextAttempt` in `deliveryWorker.js` to honor this array; because the schedule has six entries, later attempts reuse the final `6h` delay.
- Retry transient failures (timeouts, DNS/connection errors, HTTP 5xx). Do not retry HTTP 400/422 responses or validation errors; move those payloads to `dead_letter` immediately, along with the response body.
- After max attempts move the canonical payload to `dead_letter` with the final error message so postmortem tooling can surface the failure.
- `queue.recordDeliveryAttempt` accompanies every outbound delivery, so keep latency, status, and response code logging aligned with this helper.
- Duplicate detection relies on `utils/hash.dedupeKey`; keep `results` sorted and hashed consistently so deduplication stays stable.
- `deliveryWorker` marks `locked_at`/`locked_by` using `queue.claimPending` and always releases them via `queue.markOutboxStatus` to avoid worker starvation.
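
The backoff and retry rules in this runbook can be sketched as below. This is an illustration under stated assumptions, not the code in `deliveryWorker.js`: the delay units (seconds), the `maxAttempts` key, the function signatures, and the `outcome` object shape are all hypothetical.

```javascript
// Assumed shape of config.retries; the delays and attempt cap come from the
// runbook, the units (seconds) and key names are assumptions.
const retries = {
  schedule: [30, 120, 600, 1800, 7200, 21600], // 30s, 2m, 10m, 30m, 2h, 6h
  maxAttempts: 10,
};

// Returns the time of the next attempt, or null once attempts are exhausted.
// `attempts` is the number of deliveries already made (1 after the first failure).
function buildNextAttempt(attempts, now = Date.now()) {
  if (attempts >= retries.maxAttempts) return null;
  // Math.min caps the index, so attempts past the end reuse the last delay.
  const index = Math.min(Math.max(attempts - 1, 0), retries.schedule.length - 1);
  return new Date(now + retries.schedule[index] * 1000);
}

// Classifies a delivery outcome as 'delivered', 'retry', or 'dead_letter'.
// `outcome` is a hypothetical object: { statusCode } for HTTP responses or
// { networkError: true } for timeouts and DNS/connection failures.
function classifyOutcome(outcome) {
  if (outcome.networkError) return 'retry';
  const code = outcome.statusCode;
  if (code >= 200 && code < 300) return 'delivered';
  if (code === 400 || code === 422) return 'dead_letter';
  // 5xx is retried per the runbook; other statuses are not covered there,
  // and this sketch retries them as well (an assumption).
  return 'retry';
}
```

A `null` result from `buildNextAttempt` is the signal to move the payload to `dead_letter` with the final error message.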
## Instrument Configuration Cache
- Instrument configuration is cached in `instrumentConfig/service.js`; reloads happen on init and via `setInterval`, so mutate the cache through `service.upsert` rather than touching `store` directly.
- `service.reload` parses JSON in the `config` column, logs parsing failures with `logger.warn`, and only keeps rows that successfully parse.
- Service helpers expose `list`, `get`, and `byConnector` so connectors can fetch the subset they care about without iterating raw rows.
- Store interactions use `middleware/src/storage/instrumentConfigStore.js`, which leverages `DatabaseClient` and parameterized `ON CONFLICT` upserts; follow that pattern when extending tables.
- `instrumentService.init` must run before connectors start so `processMessage` can enforce instrument-enabled checks and connector matching.
- Always drop payloads with no enabled config or connector mismatch and mark the raw row as `dropped` so operators can trace why a message was ignored.
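
The reload and gating rules above could be sketched roughly as follows. The function names, the use of a `Map`, and the row fields `instrument_id`, `enabled`, and `connector` are assumptions for illustration; only the JSON `config` column and the `logger.warn` behavior come from this document.

```javascript
// Sketch of the reload step: parse the JSON `config` column and keep only
// rows that parse successfully. Row field names are assumptions.
function parseRows(rows, logger) {
  const cache = new Map();
  for (const row of rows) {
    try {
      cache.set(row.instrument_id, { ...row, config: JSON.parse(row.config) });
    } catch (err) {
      logger.warn(
        { err: err.message, instrument_id: row.instrument_id },
        'invalid instrument config'
      );
    }
  }
  return cache;
}

// Returns true only when the instrument is enabled and bound to this connector.
// A false result means the payload should be dropped and the raw row marked `dropped`.
function shouldProcess(cache, instrumentId, connectorName) {
  const entry = cache.get(instrumentId);
  return Boolean(entry && entry.enabled && entry.connector === connectorName);
}
```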
## Metrics & Logging Enhancements
- `metrics.js` builds human-readable Prometheus strings via `formatMetric`; keep the helper intact when adding new metrics so type/help annotations stay formatted correctly.
- Metrics route reports pending, retrying, dead letters, delivery attempts, last success timestamp, and average latency; add new stats only when there is a clear operational need.
- Use `queue` helpers (`pendingCount`, `retryingCount`, `deadLetterCount`, `getLastSuccessTimestamp`, `getAverageLatency`, `getDeliveryAttempts`) rather than running fresh queries in routes.
- Always set the response content type to `text/plain; version=0.0.4; charset=utf-8` before returning metrics so Prometheus scrapers accept the payload.
- Health logs should cite both connectors and queue metrics so failure contexts are actionable and correlate with the operational dashboards referenced in `docs/workstation_plan.md`.
- Mask sensitive fields and avoid dumping raw payloads in logs; connectors and parsers add context objects to errors rather than full payload dumps.
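
For orientation, the Prometheus text format mentioned above pairs each sample with `# HELP` and `# TYPE` lines. The sketch below is hypothetical: the real `formatMetric` signature in `metrics.js` is not shown in this document, and the metric names and `renderMetrics` helper are invented for the example.

```javascript
// Hypothetical formatMetric; the real helper's signature may differ.
function formatMetric(name, type, help, value) {
  return `# HELP ${name} ${help}\n# TYPE ${name} ${type}\n${name} ${value}\n`;
}

// Content type the metrics route sets before responding.
const METRICS_CONTENT_TYPE = 'text/plain; version=0.0.4; charset=utf-8';

// Builds the response body from numbers already fetched through queue helpers.
// Metric names are illustrative only.
function renderMetrics({ pending, deadLetters }) {
  return [
    formatMetric('workstation_queue_pending', 'gauge', 'Pending outbox rows', pending),
    formatMetric('workstation_queue_dead_letter', 'gauge', 'Dead-letter rows', deadLetters),
  ].join('');
}
```

In an Express route this would be sent with `res.set('Content-Type', METRICS_CONTENT_TYPE)` followed by `res.send(renderMetrics(...))`.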
## Maintenance Checklist
- `middleware/src/scripts/maintenance.js` supports the commands `backup`, `vacuum`, and `prune --days=<n>` (default 30); call these from CI or ops scripts when the backlog grows.
- `backup` copies the SQLite file; run it before migrations or schema updates so you can roll back quickly.
- `vacuum` rebuilds the database file to reclaim free space; run it in a maintenance window because the database is locked while it runs.
- `prune` deletes old rows from `delivery_log`; use the same threshold as `docs/workstation_plan.md` (default 30 days) unless stakeholders approve a different retention.
- `maintenance` logging uses `console.log`/`console.error` because the script runs outside the Express app; keep those calls simple and exit with non-zero codes on failure to alert CI.
- Document every manual maintenance action in the repository README or a runbook so second-tier operators know what happened.
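
The command-line contract in this checklist (`backup`, `vacuum`, `prune --days=<n>` with a default of 30) can be sketched as a small argument parser. The function name and return shape are hypothetical and are not taken from `maintenance.js`.

```javascript
// Sketch of argument handling for `npm run maintenance -- <command> [--days=<n>]`.
// The function name and return shape are assumptions.
const COMMANDS = ['backup', 'vacuum', 'prune'];

function parseMaintenanceArgs(argv) {
  const [command, ...rest] = argv;
  if (!COMMANDS.includes(command)) {
    throw new Error(`Unknown maintenance command: ${command}`);
  }
  const daysArg = rest.find((arg) => arg.startsWith('--days='));
  // Default retention is 30 days when --days is omitted.
  const days = daysArg ? Number.parseInt(daysArg.slice('--days='.length), 10) : 30;
  if (!Number.isInteger(days) || days <= 0) {
    throw new Error('--days must be a positive integer');
  }
  return { command, days };
}
```

A script entry point would typically call `parseMaintenanceArgs(process.argv.slice(2))` inside `try/catch`, print the error with `console.error`, and call `process.exit(1)` on failure so CI notices.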
## Data & Schema Source of Truth
- All schema statements live in `middleware/db/migrations/00*_*.sql`; the bootstrapper iterates over these files alphabetically via `fs.readdirSync` and `db.exec`, so keep new migrations in that folder and add them with increasing numeric prefixes.
- Table definitions include: `inbox_raw`, `outbox_result`, `delivery_log`, `instrument_config`, and `dead_letter`. An additional migration adds `locked_at` and `locked_by` to `outbox_result`.
- `middleware/src/storage/migrate.js` applies every `.sql` file in the migrations folder unconditionally on each run, so it is only idempotent if the migrations themselves are safe to re-run. Avoid writing irreversible SQL (DROP, ALTER without fallback) unless you also add compensating migrations.
- `DatabaseClient` in `middleware/src/storage/db.js` wraps sqlite3 callbacks in promises; reuse its `run`, `get`, and `all` helpers to keep SQL parameterization consistent and to centralize `busyTimeout` configuration.
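
The alphabetical ordering described above is what makes numeric prefixes matter. A minimal sketch, assuming zero-padded prefixes and an injected `exec(sql)` function standing in for the database handle (the function names here are hypothetical, not those in `migrate.js`):

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Lists migration files in the order they would be applied.
// Zero-padded prefixes (001_, 002_, ...) make string sort match numeric order.
function listMigrations(dir) {
  return fs
    .readdirSync(dir)
    .filter((file) => file.endsWith('.sql'))
    .sort()
    .map((file) => path.join(dir, file));
}

// Applies each file through an injected `exec(sql)` function (an assumption;
// it stands in for the `db.exec` call mentioned above).
async function applyMigrations(dir, exec) {
  for (const file of listMigrations(dir)) {
    await exec(fs.readFileSync(file, 'utf8'));
  }
}
```

Because every file is applied on each run, statements such as `CREATE TABLE IF NOT EXISTS` are the safest fit for this style of runner.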
## Code Style Guidelines
### Modules, Imports, and Exports
- Prefer CommonJS `const ... = require(...)` at the top of each module; grouping local `require`s by directory depth (config, utils, domain) keeps files predictable.
- Export objects/functions via `module.exports = { ... }` or `module.exports = <function>` depending on whether multiple helpers are exported.
- When a file exposes a factory (connectors, queue), return named methods (`start`, `stop`, `onMessage`, `health`) to keep the bootstrapper happy.
### Formatting & Layout
- Use two spaces for indentation and include semicolons at the end of statements; this matches existing files such as `middleware/src/utils/logger.js` and `index.js`.
- Keep line length reasonable (~100 characters) and break wrapped strings with template literals (see metric formatters) rather than concatenating with `+`.
- Prefer single quotes for strings unless interpolation or escaping makes backticks clearer.
- Keep helper functions (splitters, builders) at the top of parser modules, followed by the main exported parse function.
### Naming Conventions
- Stick to camelCase for functions, methods, and variables (`processMessage`, `buildNextAttempt`, `messageHandler`).
- Use descriptive object properties that mirror domain terms (`instrument_id`, `result_time`, `connector`, `status`).
- Constants for configuration or retry schedules follow the casing already used in `config/default.js` (for example `config.retries.schedule`); keep them grouped in that file.
### Async Flow & Error Handling
- Embrace `async/await` everywhere; existing code rarely uses raw promises (except for wrappers like `new Promise((resolve) => ...)`).
- Wrap I/O boundaries in `try/catch` blocks and log failures with structured data via `logger.error({ err: err.message }, '...')` so Pino hooks can parse them.
- When rethrowing an error, ensure the calling context knows whether the failure is fatal (e.g., `processMessage` rethrows after queue logging).
- For connectors, propagate errors through `onError` hooks so the bootstrapper can log them consistently.
### Logging & Diagnostics
- Always prefer `middleware/src/utils/logger.js` instead of `console.log`/`console.error` inside core services; the exception is low-level scripts like `maintenance.js` and migration runners.
- Use structured objects for context (`{ err: err.message, connector: connector.name() }`), especially around delivery failures and config reloads.
- Log positive states (start listening, health server ready) along with port numbers so the runtime state can be traced during deployment.
### Validation & Canonical Payloads
- Use `zod` for inbound schema checks; validators already live in `middleware/src/routes/instrumentConfig.js` and `middleware/src/normalizers/index.js`.
- Always normalize parser output via `normalize(parsed)` before queue insertion to guarantee `instrument_id`, `sample_id`, `result_time`, and `results` conform to expectations.
- If `normalize` throws, let the caller log the failure, mark the `inbox_raw` row as `failed`, and then drop the payload to avoid partial writes.
### Database & Queue Best Practices
- Use `DatabaseClient` for all SQL interactions; it centralizes `busyTimeout` and promise conversion and prevents sqlite3 callback spaghetti.
- Parameterize every statement with `?` placeholders (see `queue/sqliteQueue.js` and `instrumentConfigStore.js`) to avoid SQL injection hazards.
- Always mark `inbox_raw` rows as `processed`, `failed`, or `dropped` after parsing to keep operators aware of what happened.
- When marking `outbox_result` statuses, clear `locked_at/locked_by` and update `attempts`/`next_attempt_at` in one statement so watchers can rely on atomic semantics.
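
To illustrate the single-statement rule above, the sketch below builds one parameterized `UPDATE` that changes status, retry bookkeeping, and lock columns together. The table and column names come from this document; the `id` key column and the function name are assumptions, and this is not the implementation in `queue/sqliteQueue.js`.

```javascript
// Builds one parameterized UPDATE so status, attempts, next_attempt_at, and
// the lock columns change atomically. The `id` column is an assumption.
function buildMarkOutboxStatus({ id, status, attempts, nextAttemptAt }) {
  const sql = `
    UPDATE outbox_result
       SET status = ?,
           attempts = ?,
           next_attempt_at = ?,
           locked_at = NULL,
           locked_by = NULL
     WHERE id = ?`;
  return { sql, params: [status, attempts, nextAttemptAt, id] };
}
```

The resulting pair can be handed to the `run` helper on `DatabaseClient`, which keeps values out of the SQL string.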
### Connectors & Pipeline Contracts
- Each connector must provide `name`, `type`, `start`, `stop`, `health`, `onMessage`, and `onError` per the current implementation; keep this contract if you add new protocols.
- Keep connector internals event-driven: call `messageHandler(payload)` and attach `.catch(errorHandler)` so downstream failures get logged.
- For TCP connectors, track connections in `Set`s so `stop()` can destroy them before closing the server.
- Do not assume payload framing beyond what the current parser needs; let the parser module handle splitting text and trimming.
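
A minimal sketch of a TCP connector that follows the contract above is shown here. It is illustrative only: the factory name, its options, the return values of `name`, `type`, and `health`, and the default host are assumptions rather than the repo's implementation.

```javascript
const net = require('node:net');

// Sketch of a TCP connector exposing name, type, start, stop, health,
// onMessage, and onError. Names and shapes are assumptions.
function createTcpConnector({ port, host = '127.0.0.1' }) {
  const sockets = new Set();
  let messageHandler = async () => {};
  let errorHandler = () => {};
  let listening = false;

  const server = net.createServer((socket) => {
    sockets.add(socket);
    socket.on('close', () => sockets.delete(socket));
    socket.on('error', (err) => errorHandler(err));
    socket.on('data', (chunk) => {
      // Hand the raw text to the pipeline; framing is left to the parser.
      Promise.resolve()
        .then(() => messageHandler(chunk.toString('utf8')))
        .catch((err) => errorHandler(err));
    });
  });
  server.on('error', (err) => errorHandler(err));

  return {
    name: () => 'tcp-example',
    type: () => 'tcp',
    start: () => new Promise((resolve) => {
      server.listen(port, host, () => { listening = true; resolve(); });
    }),
    stop: () => new Promise((resolve) => {
      // Destroy open sockets first; otherwise server.close() waits for them.
      for (const socket of sockets) socket.destroy();
      server.close(() => { listening = false; resolve(); });
    }),
    health: () => ({ listening, connections: sockets.size }),
    onMessage: (handler) => { messageHandler = handler; },
    onError: (handler) => { errorHandler = handler; },
  };
}
```

Wrapping the handler call in a promise chain means both synchronous throws and rejected promises reach the error hook.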
### Worker & Delivery Guidelines
- The delivery worker polls the queue in batches of `config.worker.batchSize` and records every attempt via `queue.recordDeliveryAttempt`; add retries in the same pattern if you introduce new failure-handling logic.
- Respect the retry schedule defined in `config.retries.schedule`; `buildNextAttempt` uses `Math.min` to cap indexes, so new delays should append to `config.retries.schedule` only.
- Duplicate detection relies on `utils/hash.dedupeKey`; keep `results` sorted and hashed consistently so deduplication stays stable.
- On HTTP 400/422 responses or too many retries, move payloads to `dead_letter` and log the reason to keep operators informed.
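
To show why sorted `results` matter for deduplication, here is a hypothetical `dedupeKey`. The real helper lives in `utils/hash` and may hash different fields; the result shape `{ test_code, value }` and the choice of SHA-256 are assumptions made for this sketch.

```javascript
const crypto = require('node:crypto');

// Hypothetical dedupeKey: hashes a canonical form of the payload so the same
// results in a different order produce the same key.
function dedupeKey(payload) {
  const results = payload.results
    // Rebuild each entry with a fixed key order so JSON.stringify is stable.
    .map(({ test_code, value }) => ({ test_code, value }))
    // Plain comparison avoids locale-dependent ordering.
    .sort((a, b) => (a.test_code < b.test_code ? -1 : a.test_code > b.test_code ? 1 : 0));
  const canonical = JSON.stringify({
    instrument_id: payload.instrument_id,
    sample_id: payload.sample_id,
    result_time: payload.result_time,
    results,
  });
  return crypto.createHash('sha256').update(canonical).digest('hex');
}
```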
### Testing & Coverage Expectations
- Parser tests live in `middleware/test/parsers.test.js`; they rely on `node:assert` and deliberately simple sample payloads to avoid external dependencies.
- Add new tests by mimicking that file's style: plain `assert.strictEqual` checks, no test framework dependencies, and a `console.log` success acknowledgment.
- If you enhance the test surface, keep it runnable via `npm test` so agents and CI scripts can still rely on a single command line.
### Documentation & Storytelling
- Keep `docs/workstation_plan.md` in sync with architectural changes; it surfaces connector flows, phases, retry policies, and maintenance checklists that agents rely on.
- When adding routes/features, document the endpoint, request payload, and expected responses in either `docs/` or inline comments near the route.
## Cursor & Copilot Rules
- No `.cursor/rules/` directory or `.cursorrules` file is present in this repo; therefore there are no Cursor-specific constraints to copy here.
- `.github/copilot-instructions.md` is absent as well, so there are no Copilot instructions to enforce or repeat.
## Final Notes for Agents
- Keep changes isolated to their area of responsibility; the middleware is intentionally minimal, so avoid introducing new bundlers/languages.
- Before opening PRs, rerun `npm run migrate` and `npm test` to verify schema/app coherence.
- Use environment variable overrides from `middleware/config/default.js` when running in staging/production so the same config file can stay committed.
## Additional Notes
- Never revert existing changes you did not make unless explicitly requested, since those changes were made by the user.
- If there are unrelated changes in the working tree, leave them untouched and focus on the files that matter for the ticket.
- Avoid destructive git commands (`git reset --hard`, `git checkout --`) unless the user explicitly requests them.
- If documentation updates were part of your change, add them to `docs/workstation_plan.md` or explain why the doc already covers the behavior.
- When a connector or parser handles a new instrument, double-check `instrument_config` rows to ensure the connector name matches the incoming protocol.
- The `queue` keeps `status`, `attempts`, `next_attempt_at`, and `locked_*` in sync; always update all relevant columns in a single SQL call to avoid race conditions.
- Keep the SQL schema in sync with `middleware/db/migrations`; add new migrations rather than editing existing ones when altering tables.