mahdahar dc6cca71cf feat: move instrument onboarding to YAML config

Replace DB-backed instrument upserts with app.yaml-driven config loading, matching, and translator application in the ingestion workflow. Also add serial-port connector support, startup validation tooling, and migration tracking updates to keep runtime behavior and docs aligned.

2026-04-06 16:50:17 +07:00


AGENTS

Mission & Audience

This document lives at the root so every agentic helper knows how to make, run, and reason about the middleware.
Refer back to docs/workstation_plan.md for the architectural story, expected flows, and the canonical payload contract before touching new features.
Preserve the operational stability that the SQLite queue + delivery worker already provides; avoid accidental schema drift or config leaks.
Tailor every change to the Node 20+ CommonJS ecosystem and the SQLite-backed persistence layer this repo already embraces.

Command Reference

Install & Bootstrapping

npm install populates node_modules (no lockfile generation beyond the committed package-lock.json).
npm start is the go-to run command; it migrates the database, primes the instrument cache, spins up connectors, and starts the delivery worker plus health/metrics services.
npm run migrate runs middleware/src/storage/migrate.js on demand; use it before seeding schema migrations in new environments or CI jobs.

Maintenance & Database Care

npm run maintenance -- backup copies middleware/data/workstation.sqlite to workstation.sqlite.bak-<timestamp>; this file should stay in place and not be committed or removed.
npm run maintenance -- vacuum runs SQLite's VACUUM via middleware/src/scripts/maintenance.js and logs success/failure to stdout/stderr.
npm run maintenance -- prune --days=<n> deletes delivery_log entries older than <n> days; default is 30 if --days is omitted.
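
As a rough illustration of how the three maintenance commands and the --days default could be parsed, here is a minimal sketch. The function names parseMaintenanceArgs and pruneCutoff are hypothetical and are not taken from middleware/src/scripts/maintenance.js; only the command names and the 30-day default come from this document.

```javascript
// Hypothetical sketch of argument handling for the maintenance script.
const DEFAULT_PRUNE_DAYS = 30; // default stated in this document

function parseMaintenanceArgs(argv) {
  const [command, ...rest] = argv;
  if (!['backup', 'vacuum', 'prune'].includes(command)) {
    throw new Error(`Unknown maintenance command: ${command}`);
  }
  if (command !== 'prune') {
    return { command };
  }
  // Accept the --days=<n> form only; fall back to the default when omitted.
  const daysArg = rest.find((arg) => arg.startsWith('--days='));
  const days = daysArg
    ? Number.parseInt(daysArg.slice('--days='.length), 10)
    : DEFAULT_PRUNE_DAYS;
  if (!Number.isInteger(days) || days <= 0) {
    throw new Error(`Invalid --days value: ${daysArg}`);
  }
  return { command, days };
}

// ISO timestamp before which delivery_log rows would count as "old".
function pruneCutoff(days, now = Date.now()) {
  return new Date(now - days * 24 * 60 * 60 * 1000).toISOString();
}
```

Called as parseMaintenanceArgs(process.argv.slice(2)), this would map `npm run maintenance -- prune` to { command: 'prune', days: 30 }.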

Testing & Single-Test Command

npm test executes node middleware/test/parsers.test.js and serves as the accepted smoke check until a richer test harness exists.
To rerun the single parser suite manually, target node middleware/test/parsers.test.js directly; it logs success via console.log and exits non-zero on failure.

Environment & Secrets

Node 20+ is assumed because the code uses optional chaining, String.raw, and other modern primitives; keep the same runtime for development and CI.
All ports, DB paths, and CLQMS credentials are sourced from middleware/config/app.yaml (loaded by middleware/config/default.js) as the single runtime config file.
Treat CLQMS_TOKEN, database files, and other secrets as environment-provided values; never embed them in checked-in files.
middleware/data/workstation.sqlite is the runtime database. Don’t delete or reinitialize it from the repository tree unless part of an explicit migration/backup operation.

Observability Endpoints

/health returns connector statuses plus pending/retrying/dead-letter counts from middleware/src/routes/health.js.
/health/ready pings the SQLite queue; any failure there should log an error and respond with 503 per the existing route logic.
/metrics exposes Prometheus-style gauges/counters that read straight from queue/sqliteQueue; keep the plaintext format exactly as defined so Prometheus scrapers don't break.
Health and metrics routers are mounted in middleware/src/index.js at the ports declared in the config, so any addition should remain consistent with Express middleware ordering.

Delivery Runbook & Retry Behavior

Backoff: 30s -> 2m -> 10m -> 30m -> 2h -> 6h, max 10 attempts as defined in config.retries.schedule. The worker calls buildNextAttempt in deliveryWorker.js to honor this array.
Retry transient failures (timeouts, DNS/connection, HTTP 5xx); skip HTTP 400/422 or validation errors and ship those payloads immediately to dead_letter with the response body.
After max attempts move the canonical payload to dead_letter with the final error message so postmortem tooling can surface the failure.
queue.recordDeliveryAttempt accompanies every outbound delivery, so keep latency, status, and response code logging aligned with this helper.
Duplicate detection relies on utils/hash.dedupeKey; keep results sorted and hashed consistently so deduplication stays stable.
deliveryWorker marks locked_at/locked_by using queue.claimPending and always releases them via queue.markOutboxStatus to avoid worker starvation.
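
The backoff and dead-letter rules above can be sketched roughly as follows. This is not the code in deliveryWorker.js: the signature of buildNextAttempt, the use of seconds as the schedule unit, and the classifyFailure helper are all assumptions made for illustration. Only the delay values, the 10-attempt cap, and the 400/422 rule come from this document.

```javascript
// Assumed representation of config.retries.schedule, in seconds:
// 30s -> 2m -> 10m -> 30m -> 2h -> 6h
const RETRY_SCHEDULE_SECONDS = [30, 120, 600, 1800, 7200, 21600];
const MAX_ATTEMPTS = 10;

// attempts = number of deliveries already tried (1 after the first failure).
// Math.min caps the index so later attempts reuse the last delay.
function buildNextAttempt(attempts, now = Date.now(), schedule = RETRY_SCHEDULE_SECONDS) {
  const index = Math.min(Math.max(attempts - 1, 0), schedule.length - 1);
  return new Date(now + schedule[index] * 1000);
}

// Hypothetical helper: decide between another retry and dead_letter.
// statusCode is undefined for timeouts and DNS/connection errors.
function classifyFailure({ statusCode, attempts }) {
  if (statusCode === 400 || statusCode === 422) {
    return 'dead_letter'; // validation-style failures are never retried
  }
  if (attempts >= MAX_ATTEMPTS) {
    return 'dead_letter'; // retry budget exhausted
  }
  return 'retry'; // transient: timeouts, connection errors, HTTP 5xx
}
```

With this shape, attempts 6 through 9 all wait the final 6h delay, which is why new delays would only ever be appended to the schedule.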

Instrument Configuration Cache

Instrument configuration is cached in instrumentConfig/service.js; reloads happen on init and via setInterval, so mutate the cache through service.upsert rather than touching store directly.
service.reload parses JSON in the config column, logs parsing failures with logger.warn, and only keeps rows that successfully parse.
Service helpers expose list, get, and byConnector so connectors can fetch the subset they care about without iterating raw rows.
Store interactions use middleware/src/storage/instrumentConfigStore.js, which leverages DatabaseClient and parameterized ON CONFLICT upserts; follow that pattern when extending tables.
instrumentService.init must run before connectors start so processMessage can enforce instrument-enabled checks and connector matching.
Always drop payloads with no enabled config or connector mismatch and mark the raw row as dropped so operators can trace why a message was ignored.
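
To make the reload and drop rules concrete, here is a small sketch of a cache built from raw rows. The row shape (instrument_id, connector, enabled, config) and the helper names parseConfigRows and shouldDrop are assumptions; the real logic lives in instrumentConfig/service.js and may differ.

```javascript
// Build a cache from raw rows, keeping only rows whose config column parses.
// `warn` stands in for logger.warn so the sketch has no dependencies.
function parseConfigRows(rows, warn = console.warn) {
  const cache = new Map();
  for (const row of rows) {
    try {
      const config = JSON.parse(row.config);
      cache.set(row.instrument_id, { ...row, config });
    } catch (err) {
      warn({ err: err.message, instrument_id: row.instrument_id }, 'skipping instrument config');
    }
  }
  return cache;
}

// Subset of cached entries that belong to one connector.
function byConnector(cache, connector) {
  return [...cache.values()].filter((entry) => entry.connector === connector);
}

// True when a payload should be dropped: unknown instrument,
// disabled config, or a connector that does not match.
function shouldDrop(cache, instrumentId, connector) {
  const entry = cache.get(instrumentId);
  return !entry || !entry.enabled || entry.connector !== connector;
}
```

A caller in processMessage would check shouldDrop first and, when it returns true, mark the raw row as dropped instead of parsing further.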

Metrics & Logging Enhancements

metrics.js builds human-readable Prometheus strings via formatMetric; keep the helper intact when adding new metrics so type/help annotations stay formatted correctly.
Metrics route reports pending, retrying, dead letters, delivery attempts, last success timestamp, and average latency; add new stats only when there is a clear operational need.
Use queue helpers (pendingCount, retryingCount, deadLetterCount, getLastSuccessTimestamp, getAverageLatency, getDeliveryAttempts) rather than running fresh queries in routes.
Always set the response content type to text/plain; version=0.0.4; charset=utf-8 before returning metrics so Prometheus scrapers accept the payload.
Health logs should cite both connectors and queue metrics so failure contexts are actionable and correlate with the operational dashboards referenced in docs/workstation_plan.md.
Mask sensitive fields and avoid dumping raw payloads in logs; connectors and parsers add context objects to errors rather than full payload dumps.
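
For orientation, the Prometheus text format that the metrics route must preserve looks like the output of the sketch below. The formatMetric signature shown here and the metric names are hypothetical; check metrics.js for the real ones before relying on them.

```javascript
// Content type quoted from the guideline above.
const METRICS_CONTENT_TYPE = 'text/plain; version=0.0.4; charset=utf-8';

// Assumed shape of a formatMetric helper: HELP line, TYPE line, sample line.
function formatMetric(name, type, help, value) {
  return [
    `# HELP ${name} ${help}`,
    `# TYPE ${name} ${type}`,
    `${name} ${value}`,
  ].join('\n');
}

// Render a few queue stats; the metric names are illustrative only.
function renderMetrics(stats) {
  const blocks = [
    formatMetric('workstation_pending', 'gauge', 'Pending outbox rows', stats.pending),
    formatMetric('workstation_retrying', 'gauge', 'Rows awaiting retry', stats.retrying),
    formatMetric('workstation_dead_letters', 'gauge', 'Rows in dead_letter', stats.deadLetters),
  ];
  // The exposition format expects a trailing newline after the last sample.
  return `${blocks.join('\n')}\n`;
}
```

In an Express route this would be sent with res.set('Content-Type', METRICS_CONTENT_TYPE) followed by res.send(renderMetrics(stats)).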

Maintenance Checklist

middleware/src/scripts/maintenance.js supports the commands backup, vacuum, and prune --days=<n> (default 30); call these from CI or ops scripts when the backlog grows.
backup copies the SQLite file before running migrations or schema updates so you can roll back quickly.
vacuum rebuilds the database file to reclaim free space; run it in a maintenance window because it locks the database while it runs.
prune deletes old rows from delivery_log; use the same threshold as docs/workstation_plan.md (default 30 days) unless stakeholders approve a different retention.
maintenance logging uses console.log/console.error because the script runs outside the Express app; keep those calls simple and exit with non-zero codes on failure to alert CI.
Document every manual maintenance action in the repository README or a runbook so second-tier operators know what happened.

Data & Schema Source of Truth

All schema statements live in middleware/db/migrations/00*_*.sql; the bootstrapper iterates over these files alphabetically via fs.readdirSync and db.exec, so keep new migrations in that folder and add them with increasing numeric prefixes.
Table definitions include: inbox_raw, outbox_result, delivery_log, instrument_config, and dead_letter. An additional migration adds locked_at and locked_by to outbox_result.
middleware/src/storage/migrate.js applies every .sql file in the migrations folder unconditionally on each run, so it is idempotent only as long as each migration is safe to re-run. Avoid writing irreversible SQL (DROP, ALTER without fallback) unless you also add compensating migrations.
DatabaseClient in middleware/src/storage/db.js wraps sqlite3 callbacks in promises; reuse its run, get, and all helpers to keep SQL parameterization consistent and to centralize busyTimeout configuration.
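
Because the bootstrapper applies migrations in alphabetical filename order, numeric prefixes need consistent zero padding. The sketch below shows the ordering step in isolation; orderMigrations is a hypothetical name, and the real code in migrate.js reads the folder with fs.readdirSync rather than taking an array.

```javascript
// Keep only .sql files and sort them the way a plain alphabetical sort would.
function orderMigrations(fileNames) {
  return fileNames
    .filter((name) => name.endsWith('.sql'))
    .sort(); // default sort compares strings by UTF-16 code units
}

// Zero-padded prefixes sort in numeric order...
const padded = orderMigrations(['002_outbox.sql', '001_init.sql', 'README.md']);

// ...while unpadded prefixes do not: '10_' sorts before '9_'.
const unpadded = orderMigrations(['9_first.sql', '10_second.sql']);
```

Here padded lists 001_init.sql before 002_outbox.sql, while unpadded puts 10_second.sql ahead of 9_first.sql, which is the failure mode that padded prefixes avoid.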

Code Style Guidelines

Modules, Imports, and Exports

Prefer CommonJS const ... = require(...) at the top of each module; grouping local requires by directory depth (config, utils, domain) keeps files predictable.
Export objects/functions via module.exports = { ... } or module.exports = <function> depending on whether multiple helpers are exported.
When a file exposes a factory (connectors, queue), return named methods (start, stop, onMessage, health) to keep the bootstrapper happy.

Formatting & Layout

Use two spaces for indentation and include semicolons at the end of statements; this matches existing files such as middleware/src/utils/logger.js and index.js.
Keep line length reasonable (~100 characters) and break wrapped strings with template literals (see metric formatters) rather than concatenating with +.
Prefer single quotes for strings unless interpolation or escaping makes backticks clearer.
Keep helper functions (splitters, builders) at the top of parser modules, followed by the main exported parse function.

Naming Conventions

Stick to camelCase for functions, methods, and variables (processMessage, buildNextAttempt, messageHandler).
Use descriptive object properties that mirror domain terms (instrument_id, result_time, connector, status).
Constants for configuration or retry schedules keep the casing already used in config/default.js (for example, config.retries.schedule); keep them grouped inside config/default.js.

Async Flow & Error Handling

Embrace async/await everywhere; existing code rarely uses raw promises (except for wrappers like new Promise((resolve) => ...)).
Wrap I/O boundaries in try/catch blocks and log failures with structured data via logger.error({ err: err.message }, '...') so Pino hooks can parse them.
When rethrowing an error, ensure the calling context knows whether the failure is fatal (e.g., processMessage rethrows after queue logging).
For connectors, propagate errors through onError hooks so the bootstrapper can log them consistently.

Logging & Diagnostics

Always prefer middleware/src/utils/logger.js instead of console.log/console.error inside core services; the exception is low-level scripts like maintenance.js and migration runners.
Use structured objects for context ({ err: err.message, connector: connector.name() }), especially around delivery failures and config reloads.
Log positive states (start listening, health server ready) along with port numbers so the runtime state can be traced during deployment.

Validation & Canonical Payloads

Use zod for inbound schema checks; validators already live in middleware/src/routes/instrumentConfig.js and middleware/src/normalizers/index.js.
Always normalize parser output via normalize(parsed) before queue insertion to guarantee instrument_id, sample_id, result_time, and results conform to expectations.
If normalize throws, let the caller log the failure, mark the inbox_raw row as failed, and drop the payload without queueing it, to avoid partial writes.

Database & Queue Best Practices

Use DatabaseClient for all SQL interactions; it centralizes busyTimeout and promise conversion and prevents sqlite3 callback spaghetti.
Parameterize every statement with ? placeholders (see queue/sqliteQueue.js and instrumentConfigStore.js) to avoid SQL injection hazards.
Always mark inbox_raw rows as processed, failed, or dropped after parsing to keep operators aware of what happened.
When marking outbox_result statuses, clear locked_at/locked_by and update attempts/next_attempt_at in one statement so watchers can rely on atomic semantics.
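
One way to picture the single-statement update is the sketch below, which builds a parameterized UPDATE that sets status, attempts, and next_attempt_at while clearing both lock columns. The function name and the id column are assumptions; the actual statement in queue/sqliteQueue.js may be shaped differently.

```javascript
// Build one parameterized UPDATE so status, retry bookkeeping, and lock
// release land together. Assumes outbox_result has an integer `id` column.
function buildMarkOutboxStatus({ id, status, attempts, nextAttemptAt }) {
  const sql = [
    'UPDATE outbox_result',
    'SET status = ?, attempts = ?, next_attempt_at = ?,',
    '    locked_at = NULL, locked_by = NULL',
    'WHERE id = ?',
  ].join('\n');
  // Parameter order must match the ? placeholders above.
  return { sql, params: [status, attempts, nextAttemptAt, id] };
}
```

The resulting { sql, params } pair would then be passed to a DatabaseClient run helper, for example db.run(sql, params), so no value is ever interpolated into the SQL string.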

Connectors & Pipeline Contracts

Each connector must provide name, type, start, stop, health, onMessage, and onError per the current implementation; keep this contract if you add new protocols.
Keep connector internals event-driven: call messageHandler(payload) and attach .catch(errorHandler) so downstream failures get logged.
For TCP connectors, track connections in Sets so stop() can destroy them before closing the server.
Do not assume payload framing beyond what the current parser needs; let the parser module handle splitting text and trimming.

Worker & Delivery Guidelines

The delivery worker polls the queue (config.worker.batchSize) and records every attempt via queue.recordDeliveryAttempt; add retries in the same pattern if you introduce new failure-handling logic.
Respect the retry schedule defined in config.retries.schedule; buildNextAttempt uses Math.min to cap indexes, so new delays should append to config.retries.schedule only.
Duplicate detection relies on utils/hash.dedupeKey; keep results sorted and hashed consistently so deduplication stays stable.
On HTTP 400/422 responses or too many retries, move payloads to dead_letter and log the reason to keep operators informed.
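
To show why sorting matters for deduplication, here is a sketch of a key function that produces the same hash regardless of result order. It is not the implementation in utils/hash: the choice of SHA-256, the JSON canonical form, and the test_code and value field names are assumptions.

```javascript
const crypto = require('node:crypto');

// Hash a canonical form of the payload so reordered results collide.
function dedupeKey({ instrument_id, sample_id, result_time, results }) {
  const sorted = results
    .map((r) => [String(r.test_code), String(r.value)])
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)); // order by test code
  const canonical = JSON.stringify([instrument_id, sample_id, result_time, sorted]);
  return crypto.createHash('sha256').update(canonical).digest('hex');
}

const base = { instrument_id: 'INST1', sample_id: 'S-001', result_time: '2026-01-01T00:00:00Z' };
const keyA = dedupeKey({ ...base, results: [{ test_code: 'GLU', value: 5.1 }, { test_code: 'ALT', value: 30 }] });
const keyB = dedupeKey({ ...base, results: [{ test_code: 'ALT', value: 30 }, { test_code: 'GLU', value: 5.1 }] });
```

In this sketch keyA and keyB are identical because both payloads reduce to the same canonical string; changing any value would produce a different key.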

Testing & Coverage Expectations

Parser tests live in middleware/test/parsers.test.js; they rely on node:assert and deliberately simple sample payloads to avoid external dependencies.
Add new tests by mimicking that file’s style—plain assert.strictEqual checks, no test framework dependencies, and console.log success acknowledgment.
If you enhance the test surface, keep it runnable via npm test so agents and CI scripts can still rely on a single command line.

Documentation & Storytelling

Keep docs/workstation_plan.md in sync with architectural changes; it surfaces connector flows, phases, retry policies, and maintenance checklists that agents rely on.
When adding routes/features, document the endpoint, request payload, and expected responses in either docs/ or inline comments near the route.

Cursor & Copilot Rules

No .cursor/rules/ directory or .cursorrules file is present in this repo; therefore there are no Cursor-specific constraints to copy here.
.github/copilot-instructions.md is absent as well, so there are no Copilot instructions to enforce or repeat.

Final Notes for Agents

Keep changes isolated to their area of responsibility; the middleware is intentionally minimal, so avoid introducing new bundlers/languages.
Before opening PRs, rerun npm run migrate and npm test to verify schema/app coherence.
Use environment variable overrides from middleware/config/default.js when running in staging/production so the same config file can stay committed.

Additional Notes

Never revert existing changes you did not make unless explicitly requested, since those changes were made by the user.
If there are unrelated changes in the working tree, leave them untouched and focus on the files that matter for the ticket.
Avoid destructive git commands (git reset --hard, git checkout --) unless the user explicitly requests them.
If documentation updates were part of your change, add them to docs/workstation_plan.md or explain why the doc already covers the behavior.
When a connector or parser handles a new instrument, double-check instrument_config rows to ensure the connector name matches the incoming protocol.
The queue keeps status, attempts, next_attempt_at, and locked_* in sync; always update all relevant columns in a single SQL call to avoid race conditions.
Keep the SQL schema in sync with middleware/db/migrations; add new migrations rather than editing existing ones when altering tables.
