Plugins, data platform and exports
Helix relies on a plugin system so teams can expose proprietary data sources, transformations and export targets without forking the core agent.
Plugin lifecycle
- Registration: plugins describe capabilities in YAML (datasets, actions, export formats). The control plane validates schemas and stores them in Postgres.
- Provisioning: a plugin runner pulls signed containers from our registry, injects secrets from Vault and mounts a scratch dataset.
- Monitoring: health reports feed Prometheus metrics, which inform autoscaling and circuit breakers.
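As an illustration of the registration step, here is a minimal sketch of the kind of check the control plane might run on a manifest after its YAML has been parsed into a dictionary. The key names (name, datasets, actions, export_formats), the ManifestReport type and the validate_manifest function are assumptions made for this example, not the actual Helix schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List

# Hypothetical capability keys, modelled on the three kinds named above.
CAPABILITY_KEYS = ("datasets", "actions", "export_formats")


@dataclass
class ManifestReport:
    errors: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_manifest(manifest: dict) -> ManifestReport:
    """Validate a parsed plugin manifest (assumed shape, not the real schema)."""
    report = ManifestReport()

    name = manifest.get("name")
    if not isinstance(name, str) or not name:
        report.errors.append("name: required non-empty string")

    declared = 0
    for key in CAPABILITY_KEYS:
        value = manifest.get(key, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            report.errors.append(f"{key}: must be a list of strings")
            continue
        declared += len(value)

    # A plugin that declares nothing has no reason to be registered.
    if declared == 0:
        report.errors.append("manifest declares no capabilities")
    return report


good = {"name": "warehouse-adapter", "datasets": ["orders"], "export_formats": ["csv"]}
bad = {"name": "", "actions": "refresh"}
```

Collecting every error in one report, rather than failing on the first, lets a plugin author fix a manifest in a single pass.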
Data platform integrations
Plugins provide adapters for warehouses, lakes and SaaS APIs. Each adapter exposes three core hooks:
- describe() returns schema metadata to power autocomplete, fuzzy matching and lineage diagrams.
- sample() returns profile statistics and a capped dataframe for the planner.
- execute(plan) materializes data using the plugin's native engine (Spark, Flink, Snowflake, REST pagination, etc.).
Results flow back through our normalization layer so downstream charting code always receives pandas dataframes.
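The three hooks could be expressed as an abstract base class along the following lines. This is a sketch under stated assumptions: the Adapter and InMemoryAdapter classes, the limit parameter and the {"columns": [...]} plan shape are all hypothetical, and plain lists of dictionaries stand in for dataframes so the example runs with the standard library alone.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


class Adapter(ABC):
    """Hypothetical base class mirroring the three hooks described above."""

    @abstractmethod
    def describe(self) -> Dict[str, str]:
        """Return schema metadata as column name -> type name."""

    @abstractmethod
    def sample(self, limit: int = 100) -> Tuple[Dict[str, Any], List[Row]]:
        """Return profile statistics and at most `limit` rows."""

    @abstractmethod
    def execute(self, plan: Dict[str, Any]) -> List[Row]:
        """Materialize the data requested by `plan`."""


class InMemoryAdapter(Adapter):
    """Toy adapter over a list of rows, standing in for a real engine."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = list(rows)

    def describe(self) -> Dict[str, str]:
        # Infer types from the first row; a real adapter would query a catalog.
        first = self._rows[0] if self._rows else {}
        return {col: type(val).__name__ for col, val in first.items()}

    def sample(self, limit: int = 100) -> Tuple[Dict[str, Any], List[Row]]:
        capped = self._rows[:limit]
        stats = {"row_count": len(self._rows), "sampled": len(capped)}
        return stats, capped

    def execute(self, plan: Dict[str, Any]) -> List[Row]:
        # Assumed plan shape: {"columns": [...]}; defaults to all columns.
        columns = plan.get("columns") or list(self.describe())
        return [{c: row[c] for c in columns} for row in self._rows]


adapter = InMemoryAdapter([{"id": 1, "region": "eu"}, {"id": 2, "region": "us"}])
```

Keeping the hooks abstract means the planner can treat a Spark-backed adapter and a REST-backed adapter identically, which is the property the normalization layer depends on.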
Transformation library
Reusable transformations live in a curated library. Plugins can contribute new modules by providing deterministic tests and judge prompts. Approved modules become callable tools inside the planning prompt.
We track usage to retire unused modules and prioritize optimizations on the hot path.
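Usage tracking of this kind can be sketched with a simple counter. The UsageTracker class and the module names below are hypothetical; a production system would presumably persist counts and apply a time window before retiring anything.

```python
from collections import Counter
from typing import Iterable, List, Tuple


class UsageTracker:
    """Hypothetical in-memory call counter for library modules."""

    def __init__(self, registered: Iterable[str]) -> None:
        self._registered = set(registered)
        self._calls: Counter = Counter()

    def record(self, module: str) -> None:
        if module not in self._registered:
            raise KeyError(f"unknown module: {module}")
        self._calls[module] += 1

    def unused(self) -> List[str]:
        # Candidates for retirement: registered but never called.
        return sorted(self._registered - set(self._calls))

    def hottest(self, n: int = 3) -> List[Tuple[str, int]]:
        # Candidates for optimization: the most frequently called modules.
        return self._calls.most_common(n)


tracker = UsageTracker(["dedupe", "pivot", "resample"])
for name in ("dedupe", "pivot", "dedupe"):
    tracker.record(name)
```

With the calls recorded above, tracker.unused() lists only "resample", and tracker.hottest(1) reports "dedupe" with two calls.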
Exports
The export surface mirrors plugin capabilities. Users can push results to CSV, Slack, Google Sheets, S3, or bespoke systems registered through the same interface.
Every export emits an audit event that captures source dataset, filters applied and the destination. These events feed compliance dashboards.
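An audit event carrying those three fields might look like the sketch below. The ExportAuditEvent type, its field names, the emitted_at timestamp and the JSON serialization are assumptions for illustration; the text above only specifies that source dataset, filters and destination are captured.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Tuple


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class ExportAuditEvent:
    """Hypothetical audit record; frozen so it cannot be altered after emission."""

    source_dataset: str
    filters: Tuple[str, ...]
    destination: str
    emitted_at: str = field(default_factory=_utc_now)  # assumed extra field


def to_json(event: ExportAuditEvent) -> str:
    # Sorted keys keep the payload stable for downstream diffing.
    return json.dumps(asdict(event), sort_keys=True)


event = ExportAuditEvent(
    source_dataset="orders",
    filters=("region = 'eu'",),
    destination="s3://example-bucket/orders.csv",  # placeholder destination
    emitted_at="2024-01-01T00:00:00+00:00",
)
payload = json.loads(to_json(event))
```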
Governance
Plugins ship with policy bundles that restrict who can invoke them. Policy decisions are made server-side, so unauthorized actions are denied even if the UI is bypassed.
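A minimal sketch of server-side enforcement, assuming a role-based bundle: the PolicyBundle type, the allowed_roles field and the invoke wrapper are hypothetical, and the real policy format is not described here. The point it shows is that the check sits in the same code path as the action, so no client can skip it.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, TypeVar

T = TypeVar("T")


class PolicyDenied(Exception):
    """Raised when the caller holds none of the roles a plugin allows."""


@dataclass(frozen=True)
class PolicyBundle:
    plugin: str
    allowed_roles: FrozenSet[str]  # assumed role-based model


def authorize(bundle: PolicyBundle, user_roles: Iterable[str], action: str) -> None:
    if not bundle.allowed_roles & set(user_roles):
        raise PolicyDenied(f"{action} on {bundle.plugin} denied")


def invoke(
    bundle: PolicyBundle,
    user_roles: Iterable[str],
    action: str,
    handler: Callable[[], T],
) -> T:
    # Authorization runs before the handler on every call, regardless of
    # which client issued the request.
    authorize(bundle, user_roles, action)
    return handler()


bundle = PolicyBundle("warehouse-adapter", frozenset({"analyst"}))
```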
We maintain a compatibility matrix that tests plugins against supported Helix versions each night. Failures block promotion to production until owners resolve them.
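The promotion gate can be sketched as a pure function over one night's results. The result shape, a mapping from (plugin, Helix version) to pass/fail, and the rule that a missing result counts as a failure are both assumptions made for this example.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed result shape: (plugin name, Helix version) -> passed?
MatrixResults = Dict[Tuple[str, str], bool]


def promotable(results: MatrixResults, supported_versions: Iterable[str]) -> List[str]:
    """Return plugins that passed on every supported Helix version."""
    versions = list(supported_versions)
    plugins = {plugin for plugin, _ in results}
    return sorted(
        plugin
        for plugin in plugins
        # A missing entry is treated as a failure (assumption).
        if all(results.get((plugin, version), False) for version in versions)
    )


nightly = {
    ("sheets-export", "1.0"): True,
    ("sheets-export", "1.1"): True,
    ("lake-adapter", "1.0"): True,
    ("lake-adapter", "1.1"): False,
    ("slack-export", "1.0"): True,  # no result for 1.1
}
```

Here only "sheets-export" would be eligible for promotion against versions 1.0 and 1.1: "lake-adapter" failed on 1.1 and "slack-export" has no 1.1 result.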
Roadmap
Next steps include automatic generation of plugin scaffolding, richer sandboxing for third-party code and incremental exports that minimize data transfer.