CLI Reference

All LakeXpress commands and options.

Overview

LakeXpress uses a two-step workflow:

  1. Create a sync configuration with LakeXpress config create – settings are stored in the logging database
  2. Execute the sync with LakeXpress sync – runs the export

Additional commands: logdb for database management, status for monitoring.
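
In practice, the shortest viable workflow is a config create followed by a sync. The sketch below assumes the credential and storage IDs (log_db_postgres, source_postgres, s3_01) match entries in your auth file; the individual options are described under config create.

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name sales_db \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01

./LakeXpress sync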

Basic Usage

LakeXpress [COMMAND] [OPTIONS]

Global Options

Help and Version

Option Description
-h, --help Show help message and exit
-v, --version Show version
--no_banner Suppress startup banner

Database Lifecycle Management

logdb init

Initialize the logging database schema.

./LakeXpress logdb init \
  -a credentials.json \
  --log_db_auth_id log_db_postgres
Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file

When to use: Optional – the schema is created automatically on the first sync. Running logdb init beforehand is useful for verifying connectivity to the logging database.

logdb drop

Drop the logging database schema.

./LakeXpress logdb drop \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --confirm
Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file
--confirm Flag No Skip safety prompt

logdb truncate

Clear all data from the logging database, preserving the schema.

./LakeXpress logdb truncate \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --confirm
Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file
--confirm Flag No Skip safety prompt
--sync_id ID String No Only truncate data for a specific sync

logdb locks

Display locked tables in the logging database.

Note: Locks only apply to incremental syncs, protecting watermarks from concurrent modifications. Full syncs do not use locks.

./LakeXpress logdb locks \
  -a credentials.json \
  --log_db_auth_id log_db_postgres
Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file
--sync_id ID String No Filter locks for a specific sync

logdb release-locks

Release stale or stuck locks.

When to use: After a sync crashes or is killed mid-export, stale locks block subsequent runs. Use logdb locks to identify them, then release-locks to clear them.

# View current locks first
./LakeXpress logdb locks \
  -a credentials.json \
  --log_db_auth_id log_db_postgres

# Release stale locks (requires --confirm)
./LakeXpress logdb release-locks \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --confirm

# Release only locks older than 24 hours
./LakeXpress logdb release-locks \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --max_age_hours 24 \
  --confirm

# Release a specific table lock by ID
./LakeXpress logdb release-locks \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --table_id 42 \
  --confirm
Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file
--confirm Flag Yes Required to confirm the operation
--max_age_hours N Integer No Only release locks older than N hours
--table_id ID Integer No Release lock for a specific table ID

Configuration Management

config create

Create a sync configuration stored in the logging database.

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id source_pg \
  --source_db_name tpch \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --n_jobs 4 \
  --generate_metadata

Authentication Options

Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file

Sync Identity Options

Option Type Required Description
--sync_id ID String No Custom sync ID (1-64 chars, alphanumeric, underscores, hyphens)
Examples: my_sync, prod-daily-export, sync_2026
Auto-generated if omitted; creation fails if the ID already exists
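
For instance, a configuration created with a custom ID can later be run by that same ID. The sketch below reuses the placeholder credential IDs from the examples in this reference; prod-daily-export is just an illustrative name.

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name sales_db \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --sync_id prod-daily-export

./LakeXpress sync --sync_id prod-daily-export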

Source Database Options

Option Type Required Description
--source_db_auth_id ID String Yes Source database identifier in auth file
--source_db_name NAME String Yes Source database name (e.g., tpch, northwind)
--source_schema_name PATTERN String Yes Source schema name(s), supports SQL patterns (e.g., public, prod_%)

Table Filtering Options

Option Type Description
-i, --include PATTERN String Include tables matching SQL patterns (comma-separated)
Example: orders%, customer%
-e, --exclude PATTERN String Exclude tables matching SQL patterns (comma-separated)
Example: temp%, test%
--min_rows INT Integer Minimum row count filter
--max_rows INT Integer Maximum row count filter

Pattern Matching: Uses SQL LIKE syntax – % matches any sequence of characters (including none), _ matches exactly one character.
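
As an illustration, the following configuration keeps fact and dimension tables, drops scratch tables, and skips tables below 1,000 rows (the database name, patterns, and row threshold are placeholders):

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name analytics \
  --source_schema_name public \
  --include "fact_%, dim_%" \
  --exclude "temp%, test%" \
  --min_rows 1000 \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01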

Incremental Sync Options

Option Type Description
--incremental_table SPEC String Define an incremental table (repeatable)
Format: schema.table:column:type[:i|:e][@start][!strategy]
Example: tpch_1.orders:o_orderdate:date
--incremental_safety_lag INT Integer Safety lag in seconds for late-arriving data (default: 0)

Note: Tables not configured with --incremental_table are fully exported on each sync.

Incremental Column Types:

  • date - YYYY-MM-DD
  • datetime - YYYY-MM-DD HH:MM:SS
  • timestamp - Timestamp
  • integer - Integer sequence

Direction Options:

  • :i - Include (default)
  • :e - Exclude

Example:

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id source_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1_incremental \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --incremental_table "tpch_1_incremental.orders:o_orderdate:date" \
  --incremental_table "tpch_1_incremental.lineitem:l_shipdate:date" \
  --incremental_safety_lag 3600
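
The optional spec tokens can be combined. The sketch below assumes @start accepts a value in the column's own format (an ISO date for a date column); see the Incremental Sync Documentation for the exact syntax.

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id source_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1_incremental \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --incremental_table "tpch_1_incremental.orders:o_orderdate:date:i@2024-01-01" \
  --incremental_safety_lag 3600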

See Incremental Sync Documentation for details.

FastBCP Configuration Options

Option Type Default Description
--fastbcp_dir_path PATH Path N/A FastBCP executable directory
-p, --fastbcp_p INT Integer 1 Parallel jobs within FastBCP for large table partitioning
--fastbcp_table_config CONFIG String N/A Table-specific FastBCP config
Format: table:method:key_column:p[;table:...]
Example: lineitem:DataDriven:YEAR(l_shipdate):8;orders:Ctid::4
--large_table_threshold INT Integer 100000 Row count threshold for parallel export
--compression_type TYPE String Zstd Parquet compression (Zstd, Snappy, Gzip, Lz4, None)
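
Putting these together, the following command partitions two large tables and switches Parquet compression to Snappy (the table specs mirror the format example above; the threshold value is arbitrary):

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_pg \
  --source_db_name tpch \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --fastbcp_table_config "lineitem:DataDriven:YEAR(l_shipdate):8;orders:Ctid::4" \
  --large_table_threshold 500000 \
  --compression_type Snappy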

Parallel Processing Options

Option Type Default Description
--n_jobs INT Integer 1 Number of parallel table export jobs

Example: --n_jobs 4 --fastbcp_p 2 exports 4 tables simultaneously, each using 2 parallel processes.

Storage Options

Choose either --output_dir (local) or --target_storage_id (cloud).

Option Type Mutually Exclusive With Description
--output_dir PATH Path --target_storage_id Local directory for exports
--target_storage_id ID String --output_dir Cloud storage ID (e.g., s3_01, gcs_01, azure_01)
--sub_path SUB_PATH String N/A Sub-path between base path and schema directory
Example: staging/temp creates base/staging/temp/schema/table/
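
For a purely local export, --output_dir replaces --target_storage_id; the directory below is illustrative:

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name sales_db \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --output_dir ./exports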

Publishing Options

Option Type Description
--publish_target ID String Credential ID for publishing target (Snowflake, AWS Glue, Databricks, Fabric, DuckLake)
--publish_method METHOD String external (default) – data stays in cloud storage
internal – data loaded into target database
--publish_database_name NAME String Database name for publishing targets (AWS Glue, Databricks)
--publish_schema_pattern PATTERN String Dynamic schema naming using tokens:
{schema}, {table}, {database}, {date}, {timestamp}, {uuid}, {subpath}
Default: EXT_{schema} (external), {schema} (internal)
--publish_table_pattern PATTERN String Dynamic table naming (same tokens as schema pattern)
Default: {table}. Must include {table} token
--no_views Flag Skip view creation (Snowflake external tables only)
--snowflake_pk_constraints Flag Propagate PRIMARY KEY constraints to Snowflake internal tables
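
For example, to load data into the target database instead of exposing it as external tables, combine --publish_method internal with a publish target. The snowflake_prod credential ID below is reused from the complete examples later in this reference:

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name sales_db \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --publish_target snowflake_prod \
  --publish_method internal \
  --snowflake_pk_constraints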

Metadata Options

Option Type Default Description
--generate_metadata Flag false Generate CDM metadata (manifest.json and .cdm.json files)
--manifest_name NAME String Auto Custom CDM manifest name
Default: schema name (per-schema) or database name (global)

Behavior Options

Option Type Default Description
--error_action ACTION String fail fail – stop on first error
continue – skip failed tables
skip – skip errors silently
--env_name NAME String default Environment name for configuration isolation

Logging Options

Option Type Default Description
--log_level LEVEL String INFO DEBUG, INFO, WARNING, ERROR, CRITICAL
--log_dir PATH Path Current directory Log file directory
--no_progress Flag false Disable progress bar
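
For example, a configuration that tolerates individual table failures, isolates itself in a staging environment, and writes debug logs to a dedicated directory might look like this (environment name and log directory are illustrative):

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name sales_db \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --error_action continue \
  --env_name staging \
  --log_level DEBUG \
  --log_dir ./logs \
  --no_progress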

config list

List all sync configurations.

./LakeXpress config list \
  -a credentials.json \
  --log_db_auth_id log_db_postgres
Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file
--env_name NAME String No Filter by environment name

config delete

Delete a sync configuration and all associated data (runs, table metadata, watermarks).

Recommended workflow: Run without --confirm first to preview what will be deleted, then run with --confirm to execute.

# Step 1: Dry run - preview what will be deleted
./LakeXpress config delete \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --sync_id 20251208-a1b2c3d4-e5f6-7890

# Step 2: Confirm deletion
./LakeXpress config delete \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --sync_id 20251208-a1b2c3d4-e5f6-7890 \
  --confirm
Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file
--sync_id ID String Yes Sync configuration ID to delete
--confirm Flag No Execute the deletion (without this flag, shows preview only)

What gets deleted:

  • The sync configuration
  • All run history for this sync
  • Table metadata and watermarks (incremental sync state)

Note: This does not delete exported files in cloud storage or published tables in target systems (Snowflake, Glue, etc.).

Sync Execution

sync

Execute a sync using the most recent configuration or a specified sync ID.

./LakeXpress sync
Option Type Description
--sync_id ID String Sync configuration to use (defaults to most recent)
-a, --auth_file PATH Path Override credentials file
--fastbcp_dir_path PATH Path Override FastBCP directory
--resume Flag Resume from last incomplete run
--run_id ID String Specific run ID to resume
Format: YYYYMMDD-XXXXXXXX-XXXX-XXXX

Example:

# Execute most recent configuration
./LakeXpress sync

# Execute specific configuration
./LakeXpress sync --sync_id 20251208-a1b2c3d4-e5f6-7890

# Resume incomplete run
./LakeXpress sync --run_id 20251208-f7g8h9i0-j1k2-l3m4 --resume

sync export

Export data without publishing. Same options as sync.

./LakeXpress sync export

sync publish

Publish previously exported data to Snowflake, AWS Glue, Databricks, Fabric, BigQuery, MotherDuck, or DuckLake. Same options as sync.

./LakeXpress sync publish
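
The two subcommands support a two-phase workflow: export first, inspect the files in storage, then publish. The sync ID below is the placeholder used in the sync examples above.

# Phase 1: export Parquet files to storage only
./LakeXpress sync export --sync_id 20251208-a1b2c3d4-e5f6-7890

# Phase 2: publish the previously exported data to the configured target
./LakeXpress sync publish --sync_id 20251208-a1b2c3d4-e5f6-7890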

Legacy YAML Support

run

Execute an export from a legacy YAML configuration file.

./LakeXpress run \
  -c config_20251202_164948.yml \
  -a credentials.json
Option Type Required Description
-c, --config PATH Path Yes YAML configuration file
-a, --auth_file PATH Path No Override credentials file
--log_db_auth_id ID String No Override log database credential ID

Note: YAML files are auto-generated by config create but superseded by database-stored configurations.

Status and Monitoring

status

Query sync and run status.

./LakeXpress status \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --sync_id 20251208-a1b2c3d4-e5f6-7890
Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file
--sync_id ID String No Filter by sync configuration
--run_id ID String No Filter by run
-v, --verbose Flag No Show detailed run list
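
To drill into a single run, combine --run_id with the verbose flag; the run ID below is the placeholder from the sync examples above:

./LakeXpress status \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --run_id 20251208-f7g8h9i0-j1k2-l3m4 \
  --verbose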

Cleanup and Maintenance

cleanup

Remove orphaned or stale runs from the logging database.

./LakeXpress cleanup \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --sync_id my_sync \
  --older-than 7d \
  --dry-run
Option Type Required Description
-a, --auth_file PATH Path Yes JSON credentials file
--log_db_auth_id ID String Yes Logging database identifier in auth file
--sync_id ID String Yes Sync configuration to clean up
--older-than DURATION String No Only delete runs older than this (e.g., 7d, 24h, 30m)
--status STATUS String No Only delete runs with this status: running or failed (default: both)
--dry-run Flag No Preview deletions without executing
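
Once the dry-run output looks correct, re-run the command without --dry-run to perform the deletion; for example, removing only failed runs older than a week:

./LakeXpress cleanup \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --sync_id my_sync \
  --older-than 7d \
  --status failed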

Complete Examples

Basic Export to S3

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name sales_db \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --n_jobs 4 \
  --fastbcp_p 2

./LakeXpress sync

Export with Snowflake Publishing

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name sales_db \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --publish_target snowflake_prod \
  --publish_schema_pattern "EXT_{subpath}_{date}" \
  --publish_table_pattern "{schema}_{table}" \
  --sub_path production \
  --n_jobs 4

./LakeXpress sync

Incremental Export

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id ds_04_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1_incremental \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id aws_s3_01 \
  --incremental_table "tpch_1_incremental.orders:o_orderdate:date" \
  --incremental_table "tpch_1_incremental.lineitem:l_shipdate:date" \
  --generate_metadata

# First sync -- exports all data
./LakeXpress sync

# Subsequent syncs -- incremental tables export only new data; the others are fully exported
./LakeXpress sync

Export with Custom Naming and Table Filtering

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name analytics \
  --source_schema_name "sales%, marketing%" \
  --include "fact_%, dim_%" \
  --exclude "temp%, test%" \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --sub_path data-lake/prod \
  --publish_target snowflake_prod \
  --publish_schema_pattern "ANALYTICS_{subpath}" \
  --publish_table_pattern "{schema}_{table}" \
  --n_jobs 8 \
  --fastbcp_p 4 \
  --generate_metadata

./LakeXpress sync

See Also