Introduction
LakeSync — declare what data goes where. The engine handles the rest.
LakeSync is an open-source TypeScript sync engine. Pluggable adapters connect any readable or writable system — Postgres, BigQuery, S3/Iceberg, Jira, Salesforce, or local SQLite. Declarative sync rules define what data flows between them. Every adapter is both a source and a destination. Local SQLite is one consumer among many — data can also materialise into Postgres, MySQL, or BigQuery destination tables.
How It Works
- Adapters connect to data sources — Postgres, BigQuery, S3/R2, Jira, Salesforce, or anything you implement the interface for
- Sync rules define what data flows to each consumer — bucket-based filtering with `eq`/`in` operators and JWT claim references
- The gateway evaluates rules and routes data between adapters, with real-time WebSocket broadcast to connected clients
- Destinations receive data as queryable tables — local SQLite for offline-capable apps, or Postgres/MySQL/BigQuery via the materialise protocol
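
The rule-evaluation step can be sketched as a pure function over deltas. This is a minimal illustration, not LakeSync's implementation: the `Bucket`, `Filter`, and `RowDelta` shapes, the `filterByRules` name, and the treatment of a `jwt:<claim>` value as a lookup into the client's claims are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; not LakeSync's actual types.
interface Filter {
  column: string;
  op: "eq" | "in";
  // Literal value(s), or a "jwt:<claim>" reference resolved per client.
  value: string | string[];
}

interface Bucket {
  name: string;
  table: string;
  filters: Filter[];
}

interface RowDelta {
  table: string;
  rowId: string;
  columns: Record<string, string>;
}

type Claims = Record<string, string | string[]>;

// Expand literals and "jwt:<claim>" references into a flat list of allowed values.
function resolveValues(value: string | string[], claims: Claims): string[] {
  const raw = Array.isArray(value) ? value : [value];
  const out: string[] = [];
  for (const v of raw) {
    if (v.indexOf("jwt:") !== 0) {
      out.push(v);
      continue;
    }
    const claim = claims[v.slice(4)];
    if (claim === undefined) continue; // A missing claim matches nothing.
    if (Array.isArray(claim)) out.push(...claim);
    else out.push(claim);
  }
  return out;
}

// A delta belongs to a bucket when the table matches and every filter passes.
function matchesBucket(delta: RowDelta, bucket: Bucket, claims: Claims): boolean {
  if (delta.table !== bucket.table) return false;
  return bucket.filters.every((f) => {
    const actual = delta.columns[f.column];
    if (actual === undefined) return false;
    const allowed = resolveValues(f.value, claims);
    if (f.op === "eq") return allowed.length === 1 && allowed[0] === actual;
    return allowed.indexOf(actual) !== -1; // "in"
  });
}

// Pure function: the same deltas, buckets, and claims always give the same result.
function filterByRules(deltas: RowDelta[], buckets: Bucket[], claims: Claims): RowDelta[] {
  return deltas.filter((d) => buckets.some((b) => matchesBucket(d, b, claims)));
}
```

Keeping the evaluation free of I/O means the same function could run in the gateway or in a test without a live adapter.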
Adapters
Two interfaces abstract all data sources. Adapters are both sources and destinations.
| Adapter | Interface | Details |
|---|---|---|
| Postgres / MySQL | DatabaseAdapter | insertDeltas, queryDeltasSince, getLatestState, ensureSchema |
| BigQuery | DatabaseAdapter | Idempotent MERGE inserts, INT64 HLC precision, clustered by table + hlc |
| S3 / R2 (Iceberg) | LakeAdapter | putObject, getObject, listObjects, deleteObject — Parquet + Iceberg table format |
| Custom | Either | Implement the interface for any readable data source. CompositeAdapter routes to multiple backends. |
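
To make the Custom row concrete, here is a rough sketch of an in-memory adapter built around the four `DatabaseAdapter` method names listed in the table. Only the method names come from this page. The parameter types, return types, the `ColumnDelta` shape, and the synchronous signatures are assumptions chosen to keep the example short; a real adapter would almost certainly be asynchronous.

```typescript
// Assumed delta shape: one column-level change stamped with an HLC value.
interface ColumnDelta {
  table: string;
  rowId: string;
  column: string;
  value: unknown;
  hlc: bigint;
}

// Method names match the table above; every signature here is a guess.
interface DatabaseAdapterSketch {
  ensureSchema(table: string, columns: string[]): void;
  insertDeltas(deltas: ColumnDelta[]): void;
  queryDeltasSince(hlc: bigint): ColumnDelta[];
  getLatestState(table: string, rowId: string): Record<string, unknown> | null;
}

class InMemoryDatabaseAdapter implements DatabaseAdapterSketch {
  private schemas: Record<string, string[]> = {};
  private log: ColumnDelta[] = [];

  ensureSchema(table: string, columns: string[]): void {
    // Calling this again for the same table overwrites the column list.
    this.schemas[table] = columns;
  }

  insertDeltas(deltas: ColumnDelta[]): void {
    for (const d of deltas) {
      // Idempotent insert (a choice made for this sketch): replays are ignored.
      const duplicate = this.log.some(
        (x) => x.table === d.table && x.rowId === d.rowId && x.column === d.column && x.hlc === d.hlc,
      );
      if (!duplicate) this.log.push(d);
    }
  }

  queryDeltasSince(hlc: bigint): ColumnDelta[] {
    // Strictly newer deltas, returned in ascending HLC order.
    return this.log
      .filter((d) => d.hlc > hlc)
      .sort((a, b) => (a.hlc < b.hlc ? -1 : a.hlc > b.hlc ? 1 : 0));
  }

  getLatestState(table: string, rowId: string): Record<string, unknown> | null {
    const columns = this.schemas[table];
    if (columns === undefined) return null;
    const state: Record<string, unknown> = {};
    let found = false;
    // Replay in HLC order so the newest write to each declared column wins.
    for (const d of this.queryDeltasSince(-1n)) {
      if (d.table !== table || d.rowId !== rowId) continue;
      if (columns.indexOf(d.column) === -1) continue;
      state[d.column] = d.value;
      found = true;
    }
    return found ? state : null;
  }
}
```

The same object can be written to with `insertDeltas` and read from with `queryDeltasSince`, which is what lets it act as both a source and a destination.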
Source Connectors
Source connectors poll external APIs on an interval and push changes into the sync gateway. They extend BaseSourcePoller in @lakesync/core, which handles lifecycle, chunked push, and memory-managed ingestion with automatic backpressure.
| Connector | Package | Details |
|---|---|---|
| Jira Cloud | @lakesync/connector-jira | Issues, comments, and projects via JQL-filtered polling |
| Salesforce | @lakesync/connector-salesforce | Accounts, contacts, opportunities, and leads via SOQL queries |
| Database (Postgres / MySQL / BigQuery) | @lakesync/core | Cursor-based or diff-based polling via ConnectorIngestConfig |
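
The cursor-based polling and chunked push described above can be sketched roughly as follows. This does not use or reproduce `BaseSourcePoller`: the `CursorPollerSketch` class, the `SourceRow` shape, and both callback types are hypothetical, and the fetch is synchronous only to keep the example compact.

```typescript
interface SourceRow {
  id: string;
  updatedAt: number; // Cursor field, e.g. a last-modified timestamp.
  fields: Record<string, unknown>;
}

// Hypothetical stand-ins for an external API call and the gateway push.
type FetchSince = (cursor: number) => SourceRow[];
type PushChunk = (rows: SourceRow[]) => void;

class CursorPollerSketch {
  private cursor = 0;
  private timer: ReturnType<typeof setInterval> | undefined;
  private fetchSince: FetchSince;
  private push: PushChunk;
  private chunkSize: number;

  constructor(fetchSince: FetchSince, push: PushChunk, chunkSize = 100) {
    this.fetchSince = fetchSince;
    this.push = push;
    this.chunkSize = chunkSize;
  }

  // One pass: fetch rows newer than the cursor and push them in bounded chunks.
  pollOnce(): number {
    const rows = this.fetchSince(this.cursor)
      .slice()
      .sort((a, b) => a.updatedAt - b.updatedAt);
    for (let i = 0; i < rows.length; i += this.chunkSize) {
      const chunk = rows.slice(i, i + this.chunkSize);
      this.push(chunk);
      // Advance only after the push returns, so a throwing push is retried next pass.
      this.cursor = chunk[chunk.length - 1].updatedAt;
    }
    return rows.length;
  }

  start(intervalMs: number): void {
    if (this.timer !== undefined) return; // Already running.
    this.timer = setInterval(() => this.pollOnce(), intervalMs);
  }

  stop(): void {
    if (this.timer !== undefined) clearInterval(this.timer);
    this.timer = undefined;
  }
}
```

Bounding each push to `chunkSize` rows keeps memory use predictable when a poll returns a large backlog. Backpressure and flush handling are left out of this sketch.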
Key Features
- Pluggable adapters — `DatabaseAdapter` for SQL-like sources, `LakeAdapter` for object storage. Both are bidirectional. Cross-backend flows via sync rules.
- Materialise protocol — All three database adapters (Postgres, MySQL, BigQuery) implement `Materialisable`, materialising flushed deltas into queryable destination tables via a generic `SqlDialect` pattern. Hybrid column model (synced columns + extensible `props` JSONB). Supports composite primary keys, soft delete (default), and external ID deduplication. Adding a new destination = 4 SQL dialect methods.
- Source polling — `BaseSourcePoller` provides lifecycle management, chunked push, and memory-managed ingestion with automatic backpressure and flush. Connectors extend it to poll any external API.
- Adapter-sourced pull — Pull data from named source adapters (BigQuery, Postgres, etc.) directly into local SQLite. The gateway queries the adapter and applies sync rules before returning filtered deltas.
- Sync rules DSL — Declarative bucket-based filtering with `eq`/`in`/`neq`/`gt`/`lt`/`gte`/`lte` operators and `jwt:` claim references. Pure function evaluation via `filterDeltas()`.
- Column-level LWW — Conflicts resolved per-column, not per-row. Concurrent edits to different fields never overwrite each other.
- Real-time sync — WebSocket-based server-initiated broadcast. When any client pushes, the others receive the deltas in under 100 ms. Auto-reconnect with exponential backoff. HTTP polling as fallback.
- Offline support — Local SQLite via sql.js WASM. Persistent IndexedDB outbox survives page refreshes and process crashes. Automatic drain on reconnect.
- Hybrid Logical Clocks — Branded `HLCTimestamp` bigint (48-bit wall clock + 16-bit counter). Causal ordering with deterministic `clientId` tiebreaking.
- Result-based error handling — Public APIs return `Result<T, E>` instead of throwing.
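
The Hybrid Logical Clock layout and column-level LWW described in the list above fit together, and a small sketch may help. Several details here are assumptions rather than facts from this page: the wall clock is placed in the high 48 bits, the larger `clientId` wins a tie, and `encodeHLC`, `decodeHLC`, `ColumnVersion`, and `mergeRows` are illustrative names, not LakeSync exports.

```typescript
// Branded bigint so a plain bigint cannot be passed where an HLC is expected.
type HLCTimestamp = bigint & { readonly __brand: "HLCTimestamp" };

const COUNTER_BITS = 16n;
const COUNTER_MASK = (1n << COUNTER_BITS) - 1n;

// Assumed layout: wall-clock milliseconds in the high 48 bits, counter in the low 16.
// wallMs must be an integer; BigInt() throws on fractional numbers.
function encodeHLC(wallMs: number, counter: number): HLCTimestamp {
  return ((BigInt(wallMs) << COUNTER_BITS) | (BigInt(counter) & COUNTER_MASK)) as HLCTimestamp;
}

function decodeHLC(hlc: HLCTimestamp): { wallMs: number; counter: number } {
  return { wallMs: Number(hlc >> COUNTER_BITS), counter: Number(hlc & COUNTER_MASK) };
}

// Each column carries its own version, so conflicts are resolved per column.
interface ColumnVersion {
  value: unknown;
  hlc: HLCTimestamp;
  clientId: string;
}

type VersionedRow = Record<string, ColumnVersion>;

// Higher HLC wins; equal HLCs fall back to clientId (tie direction is an assumption).
function wins(a: ColumnVersion, b: ColumnVersion): boolean {
  if (a.hlc !== b.hlc) return a.hlc > b.hlc;
  return a.clientId > b.clientId;
}

// Merge two versions of a row column by column, returning a new row.
function mergeRows(local: VersionedRow, remote: VersionedRow): VersionedRow {
  const merged: VersionedRow = { ...local };
  for (const column of Object.keys(remote)) {
    const incoming = remote[column];
    const existing = merged[column];
    if (existing === undefined || wins(incoming, existing)) {
      merged[column] = incoming;
    }
  }
  return merged;
}
```

With the wall clock in the high bits, comparing two timestamps as plain bigints orders them by time first and counter second. Because the tie-break is deterministic, merging in either order gives the same row, which is what lets every replica converge.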
Packages
| Package | Description |
|---|---|
| @lakesync/core | HLC, Delta, Result, conflict resolution, sync rules, adapter interfaces (LakeAdapter, DatabaseAdapter, Materialisable), base source poller, connector types |
| @lakesync/client | LocalDB (sql.js), SyncCoordinator, transports, queues |
| @lakesync/gateway | In-memory sync gateway with push/pull protocol |
| @lakesync/gateway-server | Self-hosted HTTP + WebSocket gateway server for Node.js and Bun |
| @lakesync/proto | Protocol Buffer serialisation for the sync wire format |
| @lakesync/adapter | Storage adapters — S3/R2, Postgres, MySQL, BigQuery, Composite, FanOut, Lifecycle |
| @lakesync/connector-jira | Jira Cloud source connector — polls issues, comments, and projects |
| @lakesync/connector-salesforce | Salesforce CRM source connector — polls accounts, contacts, opportunities, and leads |
| @lakesync/parquet | Parquet file encoding/decoding for delta persistence in Iceberg format |
| @lakesync/catalogue | Iceberg REST catalogue client for table metadata and commit operations |
| @lakesync/compactor | Background compaction, maintenance, and checkpoint generation |
| @lakesync/analyst | Analytical query engine powered by DuckDB-WASM |
| @lakesync/react | React hooks — useQuery, useMutation, useSyncStatus, LakeSyncProvider |
| lakesync | Unified package re-exporting all packages |