CLI Reference
All LakeXpress commands and options.
Overview
LakeXpress uses a two-step workflow:
- Create a sync configuration with LakeXpress config create – stores settings in the database
- Execute the sync with LakeXpress sync – runs the export
Additional commands: logdb for database management, status for monitoring.
Basic Usage
LakeXpress [COMMAND] [OPTIONS]
Global Options
Help and Version
| Option | Description |
|---|---|
| -h, --help | Show help message and exit |
| --version, -v | Show version |
| --no_banner | Suppress startup banner |
Database Lifecycle Management
logdb init
Initialize the logging database schema.
./LakeXpress logdb init \
-a credentials.json \
--log_db_auth_id log_db_postgres
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
When to use: Optional – the schema is created automatically on the first sync. Running logdb init explicitly is useful to verify connectivity in advance.
logdb drop
Drop the logging database schema.
./LakeXpress logdb drop \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--confirm
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
| --confirm | Flag | No | Skip safety prompt |
logdb truncate
Clear all data from the logging database, preserving the schema.
./LakeXpress logdb truncate \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--confirm
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
| --confirm | Flag | No | Skip safety prompt |
| --sync_id ID | String | No | Only truncate data for a specific sync |
logdb locks
Display locked tables in the logging database.
Note: Locks only apply to incremental syncs, protecting watermarks from concurrent modifications. Full syncs do not use locks.
./LakeXpress logdb locks \
-a credentials.json \
--log_db_auth_id log_db_postgres
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
| --sync_id ID | String | No | Filter locks for a specific sync |
logdb release-locks
Release stale or stuck locks.
When to use: After a sync crashes or is killed mid-export, stale locks block subsequent runs. Use logdb locks to identify them, then release-locks to clear them.
# View current locks first
./LakeXpress logdb locks \
-a credentials.json \
--log_db_auth_id log_db_postgres
# Release stale locks (requires --confirm)
./LakeXpress logdb release-locks \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--confirm
# Release only locks older than 24 hours
./LakeXpress logdb release-locks \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--max_age_hours 24 \
--confirm
# Release a specific table lock by ID
./LakeXpress logdb release-locks \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--table_id 42 \
--confirm
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
| --confirm | Flag | Yes | Required to confirm the operation |
| --max_age_hours N | Integer | No | Only release locks older than N hours |
| --table_id ID | Integer | No | Release lock for a specific table ID |
Configuration Management
config create
Create a sync configuration stored in the logging database.
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id source_pg \
--source_db_name tpch \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--n_jobs 4 \
--generate_metadata
Authentication Options
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
Sync Identity Options
| Option | Type | Required | Description |
|---|---|---|---|
| --sync_id ID | String | No | Custom sync ID (1-64 chars: alphanumeric, underscores, hyphens). Examples: my_sync, prod-daily-export, sync_2026. Auto-generated if omitted; fails if the ID already exists |
Source Database Options
| Option | Type | Required | Description |
|---|---|---|---|
| --source_db_auth_id ID | String | Yes | Source database identifier in auth file |
| --source_db_name NAME | String | Yes | Source database name (e.g., tpch, northwind) |
| --source_schema_name PATTERN | String | Yes | Source schema name(s); supports SQL patterns (e.g., public, prod_%) |
Table Filtering Options
| Option | Type | Description |
|---|---|---|
| -i, --include PATTERN | String | Include tables matching SQL patterns (comma-separated). Example: orders%, customer% |
| -e, --exclude PATTERN | String | Exclude tables matching SQL patterns (comma-separated). Example: temp%, test% |
| --min_rows INT | Integer | Minimum row count filter |
| --max_rows INT | Integer | Maximum row count filter |
Pattern Matching: Uses SQL LIKE syntax – % matches any characters, _ matches one character.
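The row-count filters combine with the include/exclude patterns; a minimal sketch (the credential IDs, database name, and thresholds are placeholders):
# Export fact_/dim_ tables within a row-count range, skipping temp/test tables
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_pg \
--source_db_name analytics \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
-i "fact_%, dim_%" \
-e "temp%, test%" \
--min_rows 1 \
--max_rows 5000000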
Incremental Sync Options
| Option | Type | Description |
|---|---|---|
| --incremental_table SPEC | String | Define an incremental table (repeatable). Format: schema.table:column:type[:i\|:e][@start][!strategy]. Example: tpch_1.orders:o_orderdate:date |
| --incremental_safety_lag INT | Integer | Safety lag in seconds for late-arriving data (default: 0) |
Note: Tables not configured with --incremental_table are fully exported on each sync.
Incremental Column Types:
- date – YYYY-MM-DD
- datetime – YYYY-MM-DD HH:MM:SS
- timestamp – Timestamp
- integer – Integer sequence
Direction Options:
- :i – Include (default)
- :e – Exclude
Example:
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id source_pg \
--source_db_name tpch \
--source_schema_name tpch_1_incremental \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "tpch_1_incremental.orders:o_orderdate:date" \
--incremental_table "tpch_1_incremental.lineitem:l_shipdate:date" \
--incremental_safety_lag 3600
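The optional direction and start tokens attach to the same spec; a minimal sketch, assuming the @start watermark uses the column's own value format (the strategy token is omitted here):
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id source_pg \
--source_db_name tpch \
--source_schema_name tpch_1_incremental \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "tpch_1_incremental.orders:o_orderdate:date:i@1998-01-01" \
--incremental_safety_lag 3600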
See Incremental Sync Documentation for details.
FastBCP Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| --fastbcp_dir_path PATH | Path | N/A | FastBCP executable directory |
| -p, --fastbcp_p INT | Integer | 1 | Parallel jobs within FastBCP for large-table partitioning |
| --fastbcp_table_config CONFIG | String | N/A | Table-specific FastBCP config. Format: table:method:key_column:p[;table:...]. Example: lineitem:DataDriven:YEAR(l_shipdate):8;orders:Ctid::4 |
| --large_table_threshold INT | Integer | 100000 | Row count threshold for parallel export |
| --compression_type TYPE | String | Zstd | Parquet compression (Zstd, Snappy, Gzip, Lz4, None) |
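A minimal sketch combining the per-table FastBCP settings, reusing the documented config string (credential IDs and thresholds are illustrative):
# Partition lineitem by YEAR(l_shipdate) with 8 processes, orders by Ctid with 4
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_pg \
--source_db_name tpch \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--fastbcp_table_config "lineitem:DataDriven:YEAR(l_shipdate):8;orders:Ctid::4" \
--large_table_threshold 500000 \
--compression_type Snappy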
Parallel Processing Options
| Option | Type | Default | Description |
|---|---|---|---|
| --n_jobs INT | Integer | 1 | Number of parallel table export jobs |
Example: --n_jobs 4 --fastbcp_p 2 exports 4 tables simultaneously, each using 2 parallel processes, for up to 8 concurrent FastBCP processes in total.
Storage Options
Choose either --output_dir (local) or --target_storage_id (cloud).
| Option | Type | Mutually Exclusive With | Description |
|---|---|---|---|
| --output_dir PATH | Path | --target_storage_id | Local directory for exports |
| --target_storage_id ID | String | --output_dir | Cloud storage ID (e.g., s3_01, gcs_01, azure_01) |
| --sub_path SUB_PATH | String | N/A | Sub-path between base path and schema directory. Example: staging/temp creates base/staging/temp/schema/table/ |
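A minimal sketch of a local-filesystem export using --output_dir in place of cloud storage (the paths and credential IDs are placeholders):
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_pg \
--source_db_name sales_db \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--output_dir ./exports \
--sub_path staging/temp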
Publishing Options
| Option | Type | Description |
|---|---|---|
| --publish_target ID | String | Credential ID for publishing target (Snowflake, AWS Glue, Databricks, Fabric, DuckLake) |
| --publish_method METHOD | String | external (default) – data stays in cloud storage; internal – data loaded into target database |
| --publish_database_name NAME | String | Database name for publishing targets (AWS Glue, Databricks) |
| --publish_schema_pattern PATTERN | String | Dynamic schema naming using tokens: {schema}, {table}, {database}, {date}, {timestamp}, {uuid}, {subpath}. Default: EXT_{schema} (external), {schema} (internal) |
| --publish_table_pattern PATTERN | String | Dynamic table naming (same tokens as schema pattern). Default: {table}. Must include the {table} token |
| --no_views | Flag | Skip view creation (Snowflake external tables only) |
| --snowflake_pk_constraints | Flag | Propagate PRIMARY KEY constraints to Snowflake internal tables |
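A minimal sketch of an internal Snowflake load that combines these flags (the snowflake_prod credential ID and other values are placeholders):
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_pg \
--source_db_name sales_db \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--publish_target snowflake_prod \
--publish_method internal \
--publish_schema_pattern "{schema}" \
--publish_table_pattern "{table}" \
--snowflake_pk_constraints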
Metadata Options
| Option | Type | Default | Description |
|---|---|---|---|
| --generate_metadata | Flag | false | Generate CDM metadata (manifest.json and .cdm.json files) |
| --manifest_name NAME | String | Auto | Custom CDM manifest name. Default: schema name (per-schema) or database name (global) |
Behavior Options
| Option | Type | Default | Description |
|---|---|---|---|
| --error_action ACTION | String | fail | fail – stop on first error; continue – skip failed tables; skip – skip errors silently |
| --env_name NAME | String | default | Environment name for configuration isolation |
Logging Options
| Option | Type | Default | Description |
|---|---|---|---|
| --log_level LEVEL | String | INFO | DEBUG, INFO, WARNING, ERROR, CRITICAL |
| --log_dir PATH | Path | Current directory | Log file directory |
| --no_progress | Flag | false | Disable progress bar |
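A minimal sketch showing the behavior and logging flags together (values are illustrative):
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_pg \
--source_db_name sales_db \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--error_action continue \
--env_name prod \
--log_level DEBUG \
--log_dir ./logs \
--no_progress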
config list
List all sync configurations.
./LakeXpress config list \
-a credentials.json \
--log_db_auth_id log_db_postgres
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
| --env_name NAME | String | No | Filter by environment name |
config delete
Delete a sync configuration and all associated data (runs, table metadata, watermarks).
Recommended workflow: Run without --confirm first to preview what will be deleted, then run with --confirm to execute.
# Step 1: Dry run - preview what will be deleted
./LakeXpress config delete \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--sync_id 20251208-a1b2c3d4-e5f6-7890
# Step 2: Confirm deletion
./LakeXpress config delete \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--sync_id 20251208-a1b2c3d4-e5f6-7890 \
--confirm
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
| --sync_id ID | String | Yes | Sync configuration ID to delete |
| --confirm | Flag | No | Execute the deletion (without this flag, only a preview is shown) |
What gets deleted:
- The sync configuration
- All run history for this sync
- Table metadata and watermarks (incremental sync state)
Note: This does not delete exported files in cloud storage or published tables in target systems (Snowflake, Glue, etc.).
Sync Execution
sync
Execute a sync using the most recent configuration or a specified sync ID.
./LakeXpress sync
| Option | Type | Description |
|---|---|---|
| --sync_id ID | String | Sync configuration to use (defaults to most recent) |
| -a, --auth_file PATH | Path | Override credentials file |
| --fastbcp_dir_path PATH | Path | Override FastBCP directory |
| --resume | Flag | Resume from last incomplete run |
| --run_id ID | String | Specific run ID to resume. Format: YYYYMMDD-XXXXXXXX-XXXX-XXXX |
Example:
# Execute most recent configuration
./LakeXpress sync
# Execute specific configuration
./LakeXpress sync --sync_id 20251208-a1b2c3d4-e5f6-7890
# Resume incomplete run
./LakeXpress sync --run_id 20251208-f7g8h9i0-j1k2-l3m4 --resume
sync export
Export data without publishing. Same options as sync.
./LakeXpress sync export
sync publish
Publish previously exported data to Snowflake, AWS Glue, Databricks, Fabric, BigQuery, MotherDuck, or DuckLake. Same options as sync.
./LakeXpress sync publish
Legacy YAML Support
run
Execute an export from a legacy YAML configuration file.
./LakeXpress run \
-c config_20251202_164948.yml \
-a credentials.json
| Option | Type | Required | Description |
|---|---|---|---|
| -c, --config PATH | Path | Yes | YAML configuration file |
| -a, --auth_file PATH | Path | No | Override credentials file |
| --log_db_auth_id ID | String | No | Override log database credential ID |
Note: YAML files are auto-generated by config create but superseded by database-stored configurations.
Status and Monitoring
status
Query sync and run status.
./LakeXpress status \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--sync_id 20251208-a1b2c3d4-e5f6-7890
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
| --sync_id ID | String | No | Filter by sync configuration |
| --run_id ID | String | No | Filter by run |
| -v, --verbose | Flag | No | Show detailed run list |
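A minimal sketch inspecting one run in detail (the run ID is a placeholder):
./LakeXpress status \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--run_id 20251208-f7g8h9i0-j1k2-l3m4 \
--verbose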
Cleanup and Maintenance
cleanup
Remove orphaned or stale runs from the logging database.
./LakeXpress cleanup \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--sync_id my_sync \
--older-than 7d \
--dry-run
| Option | Type | Required | Description |
|---|---|---|---|
| -a, --auth_file PATH | Path | Yes | JSON credentials file |
| --log_db_auth_id ID | String | Yes | Logging database identifier in auth file |
| --sync_id ID | String | Yes | Sync configuration to clean up |
| --older-than DURATION | String | No | Only delete runs older than this (e.g., 7d, 24h, 30m) |
| --status STATUS | String | No | Only delete runs with this status: running or failed (default: both) |
| --dry-run | Flag | No | Preview deletions without executing |
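After reviewing the dry run, rerunning without --dry-run performs the deletion; a minimal sketch targeting only failed runs:
./LakeXpress cleanup \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--sync_id my_sync \
--older-than 7d \
--status failed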
Complete Examples
Basic Export to S3
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name sales_db \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--n_jobs 4 \
--fastbcp_p 2
./LakeXpress sync
Export with Snowflake Publishing
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name sales_db \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--publish_target snowflake_prod \
--publish_schema_pattern "EXT_{subpath}_{date}" \
--publish_table_pattern "{schema}_{table}" \
--sub_path production \
--n_jobs 4
./LakeXpress sync
Incremental Export
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id ds_04_pg \
--source_db_name tpch \
--source_schema_name tpch_1_incremental \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id aws_s3_01 \
--incremental_table "tpch_1_incremental.orders:o_orderdate:date" \
--incremental_table "tpch_1_incremental.lineitem:l_shipdate:date" \
--generate_metadata
# First sync -- exports all data
./LakeXpress sync
# Subsequent syncs -- incremental tables export only new data, others fully exported
./LakeXpress sync
Export with Custom Naming and Table Filtering
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name analytics \
--source_schema_name "sales%, marketing%" \
--include "fact_%, dim_%" \
--exclude "temp%, test%" \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--sub_path data-lake/prod \
--publish_target snowflake_prod \
--publish_schema_pattern "ANALYTICS_{subpath}" \
--publish_table_pattern "{schema}_{table}" \
--n_jobs 8 \
--fastbcp_p 4 \
--generate_metadata
./LakeXpress sync