Quick start guide
Two steps: create a config, then run a sync.
Prerequisites
- LakeXpress and FastBCP binaries (Windows or Linux)
- Source database connection details (read access to tables and information_schema) and a logging database
- A storage destination: local directory or cloud credentials (S3, GCS, Azure)
- Publishing credentials (optional) for Snowflake, Databricks, BigQuery, etc.
Create a credentials file
{
"log_db_postgres": {
"ds_type": "postgres",
"auth_mode": "classic",
"info": {
"username": "postgres",
"password": "${DB_PASSWORD}",
"server": "localhost",
"port": 5432,
"database": "lakexpress_log"
}
},
"source_postgres": {
"ds_type": "postgres",
"auth_mode": "classic",
"info": {
"username": "postgres",
"password": "${DB_PASSWORD}",
"server": "localhost",
"port": 5432,
"database": "production_db"
}
},
"s3_01": {
"ds_type": "s3",
"auth_mode": "profile",
"info": {
"directory": "s3://my-data-lake/exports",
"profile": "your-aws-profile"
}
}
}
Save as credentials.json in a secure location.
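Because the file holds connection secrets, it is worth restricting who can read it. A minimal sketch on Linux (adjust the path to wherever you saved the file):

# Limit read/write access to the current user
chmod 600 credentials.json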
Tip: Environment variables – Use ${VAR_NAME} in any string value. Plain-text passwords also work.

# Linux
export DB_PASSWORD="your_password"

# Windows (cmd)
set DB_PASSWORD=your_password

# Windows (PowerShell)
$env:DB_PASSWORD = "your_password"
For other databases (Oracle, SQL Server, MySQL) and storage backends (GCS, Azure), see Database Configuration and Storage Backends.
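As a rough illustration, an entry for another engine follows the same shape as the Postgres entries above. The ds_type value ("mysql") and the connection fields below are assumptions for illustration only; check Database Configuration for the exact values each engine expects.

{
  "source_mysql": {
    "ds_type": "mysql",
    "auth_mode": "classic",
    "info": {
      "username": "app_user",
      "password": "${MYSQL_PASSWORD}",
      "server": "mysql.internal",
      "port": 3306,
      "database": "production_db"
    }
  }
}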
Initialize the logging database (optional)
The logging database tracks syncs, runs, and table exports. LakeXpress creates the schema automatically on first sync, so this step is optional.
Use logdb init to verify connectivity or pre-create the schema for audit purposes.
Windows (PowerShell)
.\LakeXpress.exe logdb init `
-a credentials.json `
--log_db_auth_id log_db_postgres
Linux
./LakeXpress logdb init \
-a credentials.json \
--log_db_auth_id log_db_postgres
Create a sync configuration
The configuration is stored in the logging database and reused for every sync.
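At any point you can check what is stored by listing configurations (the same command appears in the Reference section below):

./LakeXpress config list \
-a credentials.json \
--log_db_auth_id log_db_postgres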
Export to local filesystem
Windows (PowerShell)
.\LakeXpress.exe config create `
-a credentials.json `
--log_db_auth_id log_db_postgres `
--source_db_auth_id source_postgres `
--source_db_name public `
--source_schema_name public `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--output_dir .\exports `
--n_jobs 4 `
--fastbcp_p 2
Linux
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name public \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--output_dir ./exports \
--n_jobs 4 \
--fastbcp_p 2
Exports all tables from public to ./exports/public/table_name/, 4 tables in parallel with 2-way partitioning.
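For example, with tables named orders and customer (hypothetical names here), the resulting layout would look roughly like this; file naming inside each table directory depends on FastBCP's output settings:

exports/
  public/
    orders/
    customer/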
Run the sync
Windows (PowerShell)
.\LakeXpress.exe sync
Linux
./LakeXpress sync
Loads the config from the logging database, exports tables, and shows real-time progress.
More examples
Export to cloud storage
Export to AWS S3 with CDM metadata:
Windows (PowerShell)
.\LakeXpress.exe config create `
-a credentials.json `
--log_db_auth_id log_db_postgres `
--source_db_auth_id source_postgres `
--source_db_name tpch `
--source_schema_name public `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--target_storage_id s3_01 `
--n_jobs 4 `
--fastbcp_p 2 `
--generate_metadata
.\LakeXpress.exe sync
Linux
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name tpch \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--n_jobs 4 \
--fastbcp_p 2 \
--generate_metadata
./LakeXpress sync
Exports to S3 and generates CDM metadata files.
Filter tables with patterns
Use include/exclude patterns to select specific tables:
Windows (PowerShell)
.\LakeXpress.exe config create `
-a credentials.json `
--log_db_auth_id log_db_postgres `
--source_db_auth_id source_postgres `
--source_db_name public `
--source_schema_name public `
--include "orders%, customer%, product%" `
--exclude "temp%, test%" `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--output_dir .\exports `
--n_jobs 4
.\LakeXpress.exe sync
Linux
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name public \
--source_schema_name public \
--include "orders%, customer%, product%" \
--exclude "temp%, test%" \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--output_dir ./exports \
--n_jobs 4
./LakeXpress sync
Includes tables matching orders%, customer%, or product%; excludes those matching temp% or test%.
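The % wildcard suggests SQL LIKE-style matching; conceptually, the selection above behaves like the following filter on table names (an illustration of the semantics, not a query LakeXpress necessarily runs):

SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
  AND (table_name LIKE 'orders%'
       OR table_name LIKE 'customer%'
       OR table_name LIKE 'product%')
  AND table_name NOT LIKE 'temp%'
  AND table_name NOT LIKE 'test%';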
Incremental sync
Use a watermark column so subsequent syncs only export new rows:
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id source_pg \
--source_db_name tpch \
--source_schema_name tpch_1_incremental \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "tpch_1_incremental.orders:o_orderdate:date" \
--incremental_table "tpch_1_incremental.lineitem:l_shipdate:date" \
--generate_metadata
# First sync: exports everything and records high watermarks
./LakeXpress sync
# Later syncs: only exports rows past the watermark
./LakeXpress sync
Tables not configured as incremental are fully exported each sync. See the Incremental Sync guide for details.
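Conceptually, once the first run has recorded a watermark, a later sync only needs rows newer than it; for the orders table above that amounts to a filter along these lines (a sketch of the semantics, with :last_watermark standing in for the stored value):

SELECT *
FROM tpch_1_incremental.orders
WHERE o_orderdate > :last_watermark;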
Resume failed syncs
./LakeXpress sync --run_id 20251208-f7g8h9i0-j1k2-l3m4 --resume
Skips completed tables and retries only the failed ones.
Snowflake publishing
Export to S3 and create Snowflake external tables in one step:
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name public \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--publish_target snowflake_prod \
--n_jobs 4
./LakeXpress sync
Query the data in Snowflake:
SELECT * FROM PUBLIC.V_CUSTOMER LIMIT 10;
For internal tables with primary key constraints, add --publish_method internal --snowflake_pk_constraints. See the Snowflake Publishing Guide.
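Building on the command above, a sketch of the internal-table variant (the two extra flags come straight from the note above; consult the Snowflake Publishing Guide for the full option set):

./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name public \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--publish_target snowflake_prod \
--publish_method internal \
--snowflake_pk_constraints \
--n_jobs 4
./LakeXpress sync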
Reference
List configurations
./LakeXpress config list \
-a credentials.json \
--log_db_auth_id log_db_postgres
Check sync status
./LakeXpress status -a credentials.json --log_db_auth_id log_db_postgres --sync_id <your-sync-id>
Manage the logging database
# Initialize the schema
./LakeXpress logdb init -a credentials.json --log_db_auth_id log_db_postgres
# Clear run history (keeps schema)
./LakeXpress logdb truncate -a credentials.json --log_db_auth_id log_db_postgres
# Drop the schema
./LakeXpress logdb drop -a credentials.json --log_db_auth_id log_db_postgres --confirm
Next steps
- CLI reference - all available options
- Incremental sync - continuous updates
- Storage backends - S3, GCS, Azure, local
- Database configuration - all supported databases
- Examples - real-world scenarios