# AGENTS

## Mission & Audience

- This document lives at the root so every agentic helper knows how to make, run, and reason about the middleware.
- Refer back to `docs/workstation_plan.md` for the architectural story, expected flows, and the canonical payload contract before touching new features.
- Preserve the operational stability that the SQLite queue + delivery worker already provides; avoid accidental schema drift or config leaks.
- Tailor every change to the Node 20+ CommonJS ecosystem and the SQLite-backed persistence layer this repo already embraces.

## Command Reference

### Install & Bootstrapping

- `npm install` populates `node_modules` (no lockfile generation beyond the committed `package-lock.json`).
- `npm start` is the go-to run command; it migrates the database, primes the instrument cache, spins up connectors, and starts the delivery worker plus health/metrics services.
- `npm run migrate` runs `middleware/src/storage/migrate.js` on demand; use it before seeding schema migrations in new environments or CI jobs.

### Maintenance & Database Care

- `npm run maintenance -- backup` copies `middleware/data/workstation.sqlite` to `workstation.sqlite.bak-`; this file should stay in place and not be committed or removed.
- `npm run maintenance -- vacuum` runs SQLite's `VACUUM` via `middleware/src/scripts/maintenance.js` and logs success/failure to stdout/stderr.
- `npm run maintenance -- prune --days=` deletes `delivery_log` entries older than the given number of days; the default is 30 if `--days` is omitted.

### Testing & Single-Test Command

- `npm test` executes `node middleware/test/parsers.test.js` and serves as the accepted smoke check until a richer test harness exists.
- To rerun the single parser suite manually, run `node middleware/test/parsers.test.js` directly; it logs success via `console.log` and exits non-zero on failure.

## Environment & Secrets

- Node 20+ is assumed because the code uses optional chaining, `String.raw`, and other modern primitives; keep the same runtime for development and CI.
- All ports, DB paths, and CLQMS credentials are wired through `middleware/config/default.js` and its environment-variable overrides (e.g., `HTTP_JSON_PORT`, `CLQMS_TOKEN`, `WORKER_BATCH_SIZE`).
- Treat `CLQMS_TOKEN`, database files, and other secrets as environment-provided values; never embed them in checked-in files.
- `middleware/data/workstation.sqlite` is the runtime database. Don’t delete or reinitialize it from the repository tree unless part of an explicit migration/backup operation.

## Observability Endpoints

- `/health` returns connector statuses plus pending/retrying/dead-letter counts from `middleware/src/routes/health.js`.
- `/health/ready` pings the SQLite queue; any failure there should log an error and respond with `503` per the existing route logic.
- `/metrics` exposes Prometheus-style gauges/counters that read straight from `queue/sqliteQueue`; keep the plaintext format exactly as defined so Prometheus scrapers don't break.
- Health and metrics routers are mounted in `middleware/src/index.js` at ports declared in the config, so any addition should remain consistent with Express middleware ordering.

## Delivery Runbook & Retry Behavior

- Backoff: `30s -> 2m -> 10m -> 30m -> 2h -> 6h`, max 10 attempts as defined in `config.retries.schedule`. The worker calls `buildNextAttempt` in `deliveryWorker.js` to honor this array.
- Retry transient failures (timeouts, DNS/connection, HTTP 5xx); do not retry HTTP 400/422 or validation errors, and ship those payloads immediately to `dead_letter` with the response body.
- After max attempts, move the canonical payload to `dead_letter` with the final error message so postmortem tooling can surface the failure.
- `queue.recordDeliveryAttempt` accompanies every outbound delivery, so keep latency, status, and response code logging aligned with this helper.
- Duplicate detection relies on `utils/hash.dedupeKey`; keep `results` sorted and hashed consistently so deduplication stays stable.
- `deliveryWorker` marks `locked_at`/`locked_by` using `queue.claimPending` and always releases them via `queue.markOutboxStatus` to avoid worker starvation.

## Instrument Configuration Cache

- Instrument configuration is cached in `instrumentConfig/service.js`; reloads happen on init and via `setInterval`, so mutate the cache through `service.upsert` rather than touching `store` directly.
- `service.reload` parses JSON in the `config` column, logs parsing failures with `logger.warn`, and only keeps rows that successfully parse.
- Service helpers expose `list`, `get`, and `byConnector` so connectors can fetch the subset they care about without iterating raw rows.
- Store interactions use `middleware/src/storage/instrumentConfigStore.js`, which leverages `DatabaseClient` and parameterized `ON CONFLICT` upserts; follow that pattern when extending tables.
- `instrumentService.init` must run before connectors start so `processMessage` can enforce instrument-enabled checks and connector matching.
- Always drop payloads with no enabled config or connector mismatch and mark the raw row as `dropped` so operators can trace why a message was ignored.

## Metrics & Logging Enhancements

- `metrics.js` builds human-readable Prometheus strings via `formatMetric`; keep the helper intact when adding new metrics so type/help annotations stay formatted correctly.
- Metrics route reports pending, retrying, dead letters, delivery attempts, last success timestamp, and average latency; add new stats only when there is a clear operational need.
- Use `queue` helpers (`pendingCount`, `retryingCount`, `deadLetterCount`, `getLastSuccessTimestamp`, `getAverageLatency`, `getDeliveryAttempts`) rather than running fresh queries in routes.
- Always set the response content type to `text/plain; version=0.0.4; charset=utf-8` before returning metrics so Prometheus scrapers accept the payload.
- Health logs should cite both connectors and queue metrics so failure contexts are actionable and correlate with the operational dashboards referenced in `docs/workstation_plan.md`.
- Mask sensitive fields and avoid dumping raw payloads in logs; connectors and parsers add context objects to errors rather than full payload dumps.

## Maintenance Checklist

- `middleware/src/scripts/maintenance.js` supports the commands `backup`, `vacuum`, and `prune --days=` (default 30); call these from CI or ops scripts when the backlog grows.
- Run `backup` to copy the SQLite file before migrations or schema updates so you can roll back quickly.
- `vacuum` rebuilds the database file to reclaim free space; run it in a maintenance window because it locks the database while it runs.
- `prune` deletes old rows from `delivery_log`; use the same threshold as `docs/workstation_plan.md` (default 30 days) unless stakeholders approve a different retention.
- `maintenance` logging uses `console.log`/`console.error` because the script runs outside the Express app; keep those calls simple and exit with non-zero codes on failure to alert CI.
- Document every manual maintenance action in the repository README or a runbook so second-tier operators know what happened.

## Data & Schema Source of Truth

- All schema statements live in `middleware/db/migrations/00*_*.sql`; the bootstrapper iterates over these files alphabetically via `fs.readdirSync` and `db.exec`, so keep new migrations in that folder and add them with increasing numeric prefixes.
- Table definitions include: `inbox_raw`, `outbox_result`, `delivery_log`, `instrument_config`, and `dead_letter`. An additional migration adds `locked_at` and `locked_by` to `outbox_result`.
- `middleware/src/storage/migrate.js` is meant to be idempotent: it applies every `.sql` file in the migrations folder unconditionally on each run, so every statement must be safe to re-run.
  Avoid writing irreversible SQL (DROP, ALTER without fallback) unless you also add compensating migrations.
- `DatabaseClient` in `middleware/src/storage/db.js` wraps sqlite3 callbacks in promises; reuse its `run`, `get`, and `all` helpers to keep SQL parameterization consistent and to centralize `busyTimeout` configuration.

## Code Style Guidelines

### Modules, Imports, and Exports

- Prefer CommonJS `const ... = require(...)` at the top of each module; grouping local `require`s by directory depth (config, utils, domain) keeps files predictable.
- Export objects/functions via `module.exports = { ... }` or `module.exports = ` depending on whether multiple helpers are exported.
- When a file exposes a factory (connectors, queue), return named methods (`start`, `stop`, `onMessage`, `health`) to keep the bootstrapper happy.

### Formatting & Layout

- Use two spaces for indentation and include semicolons at the end of statements; this matches existing files such as `middleware/src/utils/logger.js` and `index.js`.
- Keep line length reasonable (~100 characters) and break wrapped strings with template literals (see metric formatters) rather than concatenating with `+`.
- Prefer single quotes for strings unless interpolation or escaping makes backticks clearer.
- Keep helper functions (splitters, builders) at the top of parser modules, followed by the main exported parse function.

### Naming Conventions

- Stick to camelCase for functions, methods, and variables (`processMessage`, `buildNextAttempt`, `messageHandler`).
- Use descriptive object properties that mirror domain terms (`instrument_id`, `result_time`, `connector`, `status`).
- Constants for configuration or retry schedules stay uppercase/lowercase as seen in `config.retries.schedule`; keep them grouped inside `config/default.js`.

### Async Flow & Error Handling

- Embrace `async/await` everywhere; existing code rarely uses raw promises (except for wrappers like `new Promise((resolve) => ...)`).
- Wrap I/O boundaries in `try/catch` blocks and log failures with structured data via `logger.error({ err: err.message }, '...')` so Pino hooks can parse them.
- When rethrowing an error, ensure the calling context knows whether the failure is fatal (e.g., `processMessage` rethrows after queue logging).
- For connectors, propagate errors through `onError` hooks so the bootstrapper can log them consistently.

### Logging & Diagnostics

- Always prefer `middleware/src/utils/logger.js` instead of `console.log`/`console.error` inside core services; the exception is low-level scripts like `maintenance.js` and migration runners.
- Use structured objects for context (`{ err: err.message, connector: connector.name() }`), especially around delivery failures and config reloads.
- Log positive states (start listening, health server ready) along with port numbers so the runtime state can be traced during deployment.

### Validation & Canonical Payloads

- Use `zod` for inbound schema checks; validators already live in `middleware/src/routes/instrumentConfig.js` and `middleware/src/normalizers/index.js`.
- Always normalize parser output via `normalize(parsed)` before queue insertion to guarantee `instrument_id`, `sample_id`, `result_time`, and `results` conform to expectations.
- If `normalize` throws, let the caller log the failure and drop the payload silently after marking `inbox_raw` as `failed` to avoid partial writes.

### Database & Queue Best Practices

- Use `DatabaseClient` for all SQL interactions; it centralizes `busyTimeout` and promise conversion and prevents sqlite3 callback spaghetti.
- Parameterize every statement with `?` placeholders (see `queue/sqliteQueue.js` and `instrumentConfigStore.js`) to avoid SQL injection hazards.
- Always mark `inbox_raw` rows as `processed`, `failed`, or `dropped` after parsing to keep operators aware of what happened.
- When marking `outbox_result` statuses, clear `locked_at/locked_by` and update `attempts`/`next_attempt_at` in one statement so watchers can rely on atomic semantics.

### Connectors & Pipeline Contracts

- Each connector must provide `name`, `type`, `start`, `stop`, `health`, `onMessage`, and `onError` per the current implementation; keep this contract if you add new protocols.
- Keep connector internals event-driven: emit `messageHandler(payload)` and handle `.catch(errorHandler)` to ensure downstream failures get logged.
- For TCP connectors, track connections in `Set`s so `stop()` can destroy them before closing the server.
- Do not assume payload framing beyond what the current parser needs; let the parser module handle splitting text and trimming.

### Worker & Delivery Guidelines

- The delivery worker polls the queue (`config.worker.batchSize`) and records every attempt via `queue.recordDeliveryAttempt`; add retries in the same pattern if you introduce new failure-handling logic.
- Respect the retry schedule defined in `config.retries.schedule`; `buildNextAttempt` uses `Math.min` to cap indexes, so new delays should append to `config.retries.schedule` only.
- Duplicate detection relies on `utils/hash.dedupeKey`; keep `results` sorted and hashed consistently so deduplication stays stable.
- On HTTP 400/422 responses or too many retries, move payloads to `dead_letter` and log the reason to keep operators informed.

### Testing & Coverage Expectations

- Parser tests live in `middleware/test/parsers.test.js`; they rely on `node:assert` and deliberately simple sample payloads to avoid external dependencies.
- Add new tests by mimicking that file’s style: plain `assert.strictEqual` checks, no test framework dependencies, and a `console.log` success acknowledgment.
- If you enhance the test surface, keep it runnable via `npm test` so agents and CI scripts can still rely on a single command line.

### Documentation & Storytelling

- Keep `docs/workstation_plan.md` in sync with architectural changes; it surfaces connector flows, phases, retry policies, and maintenance checklists that agents rely on.
- When adding routes/features, document the endpoint, request payload, and expected responses in either `docs/` or inline comments near the route.

## Cursor & Copilot Rules

- No `.cursor/rules/` directory or `.cursorrules` file is present in this repo, so there are no Cursor-specific constraints to copy here.
- `.github/copilot-instructions.md` is absent as well, so there are no Copilot instructions to enforce or repeat.

## Final Notes for Agents

- Keep changes isolated to their area of responsibility; the middleware is intentionally minimal, so avoid introducing new bundlers/languages.
- Before opening PRs, rerun `npm run migrate` and `npm test` to verify schema/app coherence.
- Use environment variable overrides from `middleware/config/default.js` when running in staging/production so the same config file can stay committed.

## Additional Notes

- Never revert existing changes you did not make unless explicitly requested, since those changes were made by the user.
- If there are unrelated changes in the working tree, leave them untouched and focus on the files that matter for the ticket.
- Avoid destructive git commands (`git reset --hard`, `git checkout --`) unless the user explicitly requests them.
- If documentation updates were part of your change, add them to `docs/workstation_plan.md` or explain why the doc already covers the behavior.
- When a connector or parser handles a new instrument, double-check `instrument_config` rows to ensure the connector name matches the incoming protocol.
- The `queue` keeps `status`, `attempts`, `next_attempt_at`, and `locked_*` in sync; always update all relevant columns in a single SQL call to avoid race conditions.
- Keep the SQL schema in sync with `middleware/db/migrations`; add new migrations rather than editing existing ones when altering tables.
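
The single-statement rule for queue columns noted above can be illustrated with a minimal sketch. This is not the repo's `queue.markOutboxStatus`: the helper name `buildOutboxStatusUpdate`, its argument shape, and the `id` key column are assumptions made for illustration; only the table name and the other column names come from this document.

```javascript
'use strict';

// Hypothetical helper: builds ONE parameterized UPDATE that changes status,
// attempts, and next_attempt_at while clearing the lock columns together.
// The `id` column is an assumed primary key.
function buildOutboxStatusUpdate({ id, status, attempts, nextAttemptAt }) {
  const sql = [
    'UPDATE outbox_result',
    'SET status = ?, attempts = ?, next_attempt_at = ?,',
    '    locked_at = NULL, locked_by = NULL',
    'WHERE id = ?',
  ].join('\n');
  // Parameter order must match the `?` placeholders above.
  return { sql, params: [status, attempts, nextAttemptAt, id] };
}

const update = buildOutboxStatusUpdate({
  id: 42,
  status: 'retrying',
  attempts: 3,
  nextAttemptAt: '2024-01-01T00:10:00.000Z',
});

console.log(update.sql);
console.log(update.params);
```

The resulting `sql` and `params` pair could then be handed to a promise-based `run` helper such as the one `DatabaseClient` is described as providing, so the row never appears with a new status but a stale lock.
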