Real-World Examples
Each example follows the same two-step workflow: config create, then sync.
Table of Contents
- PostgreSQL to Local Filesystem
- PostgreSQL to S3 Storage
- PostgreSQL to Google Cloud Storage
- PostgreSQL to S3 Storage (OVH)
- SQL Server to Local Filesystem
- SQL Server to S3 Storage
- SQL Server to Azure Storage
- Multi-Schema Export
- Table Filtering Examples
- Incremental Sync Examples
- Resume Failed Exports
- Advanced Examples
- Snowflake Publishing Examples
PostgreSQL to Local Filesystem
- PostgreSQL TPC-H tpch.tpch_1 to /tmp/tpch/
- SQL Server log DB, 4 parallel tables, 2-way parallelism, CDM metadata
Windows (PowerShell)
.\LakeXpress.exe config create `
-a .\credentials.json `
--log_db_auth_id log_db_ms `
--source_db_auth_id ds_04_pg `
--source_db_name tpch `
--source_schema_name tpch_1 `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--fastbcp_p 2 `
--n_jobs 4 `
--output_dir C:\exports\tpch\ `
--generate_metadata
Linux
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id ds_04_pg \
--source_db_name tpch \
--source_schema_name tpch_1 \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--output_dir /tmp/tpch/ \
--generate_metadata
./LakeXpress sync
Exports all tables from tpch_1 to /tmp/tpch/tpch_1/ with a CDM manifest.
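For reference, the on-disk layout after the sync would look roughly like this (the table names are illustrative TPC-H tables, and the exact manifest file name depends on the CDM flavor):
/tmp/tpch/
└── tpch_1/
    ├── customer/
    │   └── *.parquet
    ├── orders/
    │   └── *.parquet
    ├── lineitem/
    │   └── *.parquet
    └── <CDM manifest>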
PostgreSQL to S3 Storage
- PostgreSQL TPC-H tpch.tpch_1 to AWS S3
- SQL Server log DB, 4 parallel tables, 2-way parallelism, CDM metadata
Windows (PowerShell)
.\LakeXpress.exe config create `
-a .\credentials.json `
--log_db_auth_id log_db_ms `
--source_db_auth_id ds_04_pg `
--source_db_name tpch `
--source_schema_name tpch_1 `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--fastbcp_p 2 `
--n_jobs 4 `
--target_storage_id s3_01 `
--generate_metadata
Linux
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id ds_04_pg \
--source_db_name tpch \
--source_schema_name tpch_1 \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--target_storage_id s3_01 \
--generate_metadata
./LakeXpress sync
Uploads Parquet files to s3://your-bucket/base_path/tpch_1/ with CDM metadata.
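For completeness, a minimal s3_01 entry in credentials.json might look like the OVH example further down, assuming profile-based auth; the bucket and profile names here are placeholders:
{
  "s3_01": {
    "ds_type": "s3",
    "auth_mode": "profile",
    "info": {
      "directory": "s3://your-bucket/base_path",
      "profile": "default"
    }
  }
}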
PostgreSQL to Google Cloud Storage
- PostgreSQL TPC-H to GCS, multi-schema via pattern tpch_1%
- SQL Server log DB, 4 parallel tables, 2-way parallelism, CDM metadata
Windows (PowerShell)
.\LakeXpress.exe config create `
-a .\credentials.json `
--log_db_auth_id log_db_ms `
--source_db_auth_id ds_04_pg `
--source_db_name tpch `
--source_schema_name tpch_1% `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--fastbcp_p 2 `
--n_jobs 4 `
--target_storage_id gcs_01 `
--generate_metadata
Linux
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id ds_04_pg \
--source_db_name tpch \
--source_schema_name tpch_1% \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--target_storage_id gcs_01 \
--generate_metadata
./LakeXpress sync
Exports all schemas matching tpch_1% to GCS.
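Since each matching schema gets its own prefix, the resulting layout on GCS would look roughly like this (bucket and base path are placeholders):
gs://your-bucket/base_path/tpch_1/
gs://your-bucket/base_path/tpch_10/
gs://your-bucket/base_path/tpch_100/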
PostgreSQL to S3 Storage (OVH)
- PostgreSQL TPC-H tpch.tpch_1 to OVH S3-compatible storage
- Uses a custom endpoint via an AWS profile
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id ds_04_pg \
--source_db_name tpch \
--source_schema_name tpch_1 \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--target_storage_id s3_02 \
--generate_metadata
./LakeXpress sync
credentials.json:
{
"s3_02": {
"ds_type": "s3",
"auth_mode": "profile",
"info": {
"directory": "s3://my-ovh-bucket/exports",
"profile": "ovh"
}
}
}
~/.aws/config:
[profile ovh]
endpoint_url = https://s3.gra.io.cloud.ovh.net
region = gra
~/.aws/credentials:
[ovh]
aws_access_key_id = YOUR_OVH_ACCESS_KEY
aws_secret_access_key = YOUR_OVH_SECRET_KEY
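Before running the export, the profile can be sanity-checked with a plain AWS CLI call (assuming AWS CLI v2, which honors endpoint_url from ~/.aws/config):
aws s3 ls s3://my-ovh-bucket/exports/ --profile ovh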
SQL Server to Local Filesystem
AdventureWorksDW
- SQL Server adventureworksdw.dbo to local filesystem
- PostgreSQL log DB
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_pg \
--source_db_auth_id ds_10_ms \
--source_db_name adventureworksdw \
--source_schema_name dbo \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--output_dir /tmp/adventureworksdw/ \
--generate_metadata
./LakeXpress sync
AdventureWorks with Sub-Path
Uses --sub_path to export into a specific sub-directory:
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_pg \
--source_db_auth_id ds_11_ms \
--source_db_name adventureworks2019 \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--output_dir /tmp/ \
--sub_path adventureworks2019 \
--generate_metadata
./LakeXpress sync
Creates /tmp/adventureworks2019/schema/table/ structure.
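For example, with two schemas the layout would look roughly like this (schema and table names are illustrative AdventureWorks objects):
/tmp/
└── adventureworks2019/
    ├── humanresources/
    │   └── employee/
    │       └── *.parquet
    └── sales/
        └── salesorderheader/
            └── *.parquet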
SQL Server to S3 Storage
- SQL Server adventureworksdw.dbo to AWS S3
- PostgreSQL log DB, organized by database name via --sub_path
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_pg \
--source_db_auth_id ds_10_ms \
--source_db_name adventureworksdw \
--source_schema_name dbo \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--target_storage_id s3_01 \
--sub_path adventureworksdw \
--generate_metadata
./LakeXpress sync
Uploads to s3://bucket/base_path/adventureworksdw/dbo/table/.
SQL Server to Azure Storage
- SQL Server adventureworks2019.sales to Azure ADLS Gen2
- Service Principal auth, PostgreSQL log DB
- 4 parallel tables, 2-way parallelism, CDM metadata
Windows (PowerShell)
.\LakeXpress.exe config create `
-a .\credentials.json `
--log_db_auth_id log_db_pg `
--source_db_auth_id ds_11_ms `
--source_db_name adventureworks2019 `
--source_schema_name sales `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--fastbcp_p 2 `
--n_jobs 4 `
--target_storage_id azure_01 `
--sub_path adventureworks/sales `
--generate_metadata
Linux
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_pg \
--source_db_auth_id ds_11_ms \
--source_db_name adventureworks2019 \
--source_schema_name sales \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--target_storage_id azure_01 \
--sub_path adventureworks/sales \
--generate_metadata
./LakeXpress sync
credentials.json:
{
"azure_01": {
"ds_type": "azure",
"auth_mode": "service_principal",
"info": {
"directory": "abfss://datalake.dfs.core.windows.net/exports",
"azure_client_id": "your-application-client-id",
"azure_tenant_id": "your-directory-tenant-id",
"azure_client_secret": "your-client-secret"
}
},
"log_db_pg": {
"ds_type": "postgres",
"auth_mode": "classic",
"info": {
"username": "postgres",
"password": "your_password",
"server": "localhost",
"port": 5432,
"database": "lakexpress_log"
}
},
"ds_11_ms": {
"ds_type": "mssql",
"auth_mode": "classic",
"info": {
"username": "sa",
"password": "your_password",
"server": "localhost",
"port": 1433,
"database": "adventureworks2019"
}
}
}
Output path:
abfss://datalake@mystorageaccount.dfs.core.windows.net/exports/adventureworks/sales/schema/table/*.parquet
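One way to verify the upload is the Azure CLI, assuming your identity has read access to the storage account (names match the placeholders above):
az storage blob list \
  --account-name mystorageaccount \
  --container-name datalake \
  --prefix exports/adventureworks/sales/ \
  --auth-mode login \
  --output table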
Multi-Schema Export
- PostgreSQL with multiple TPC-H schemas matching tpch_1%
- Exports to AWS S3 with a custom sub-path
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id ds_04_pg \
--source_db_name tpch \
--source_schema_name tpch_1% \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--target_storage_id s3_01 \
--sub_path titi/tata \
--generate_metadata
./LakeXpress sync
Matches tpch_1, tpch_10, tpch_100, etc. Each schema exported to s3://bucket/base/titi/tata/schema_name/.
Table Filtering Examples
Include Specific Table Patterns
Export only fact and dimension tables:
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_schema_name public \
--include "fact_%, dim_%" \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--output_dir ./exports \
--n_jobs 4
./LakeXpress sync
Includes fact_sales, fact_orders, dim_customer, dim_product, etc.
Exclude Temporary and Test Tables
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_schema_name public \
--exclude "temp_%, test_%, staging_%" \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--output_dir ./exports \
--n_jobs 4
./LakeXpress sync
Excludes temp_data, test_results, staging_import, etc.
Combine Include and Exclude
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_schema_name analytics \
--include "report_%, dashboard_%" \
--exclude "%_test, %_backup" \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--output_dir ./exports \
--n_jobs 4
./LakeXpress sync
Includes report_sales, dashboard_kpi. Excludes report_sales_test, dashboard_kpi_backup.
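The % wildcards behave like SQL LIKE patterns, so as a mental model the combined filter above selects roughly this set (an illustrative query, not what the tool necessarily runs internally):
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'analytics'
  AND (table_name LIKE 'report_%' OR table_name LIKE 'dashboard_%')
  AND table_name NOT LIKE '%_test'
  AND table_name NOT LIKE '%_backup';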
Incremental Sync Examples
Daily Order Updates
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name ecommerce \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "public.orders:created_at:datetime" \
--incremental_table "public.order_items:created_at:datetime" \
--publish_target snowflake_prod \
--n_jobs 4
# Run daily via cron or scheduler
./LakeXpress sync
The first run exports all rows. Subsequent runs export only rows created since the last sync for orders and order_items; all other tables are fully exported each time.
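A minimal crontab entry for the daily run might look like this (install path and log file are placeholders; sync reuses the saved configuration):
# Run every day at 02:00
0 2 * * * cd /opt/lakexpress && ./LakeXpress sync >> /var/log/lakexpress/sync.log 2>&1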
Event Log Ingestion
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_db_name events \
--source_schema_name events \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id aws_s3_01 \
--incremental_table "events.pageviews:event_time:timestamp" \
--incremental_table "events.clicks:event_time:timestamp" \
--incremental_table "events.conversions:event_time:timestamp" \
--incremental_safety_lag 600 \
--sub_path production/events \
--n_jobs 8 \
--fastbcp_p 4
# Run every 10 minutes
./LakeXpress sync
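Conceptually, the safety lag holds back a trailing window so late-arriving events are not missed between runs; with a 600-second lag, each run selects something like the following (an illustrative sketch, not the tool's actual query):
SELECT *
FROM events.pageviews
WHERE event_time >  :last_watermark                    -- high-water mark saved by the previous run
  AND event_time <= NOW() - INTERVAL '600 seconds';    -- hold back the safety-lag window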
See Incremental Sync Documentation for details.
Resume Failed Exports
When a sync starts, note the run ID from its output:
2025-10-31 10:15:23 | INFO | Starting sync run: 20251031-2f73b4d0-8647-11ef-8089-c403a82a4577
Resume with:
./LakeXpress sync --run_id 20251031-2f73b4d0-8647-11ef-8089-c403a82a4577 --resume
This skips completed tables and retries failed ones.
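If the console output is no longer available, the run ID can usually be recovered from the log files (assuming logs were written to a directory such as ./logs via --log_dir):
grep "Starting sync run" ./logs/*.log | tail -n 1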
Query Failed Tables
SELECT
source_schema,
source_table,
status,
error_message,
started_at,
finished_at
FROM jobs
WHERE run_id = '20251031-2f73b4d0-8647-11ef-8089-c403a82a4577'
AND status = 'failed'
ORDER BY source_schema, source_table;
Advanced Examples
Custom FastBCP Configuration
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_schema_name tpch \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_table_config "lineitem:DataDriven:YEAR(l_shipdate):8;orders:Ctid::4;customer:RangeId:c_custkey:2" \
--n_jobs 4 \
--output_dir ./exports
./LakeXpress sync
- lineitem: DataDriven, 8-way, partitioned by year
- orders: Ctid, 4-way
- customer: RangeId, 2-way, partitioned by customer key
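Each semicolon-separated entry appears to follow the pattern table:method:partition_expression:parallelism, with the partition expression left empty when the method does not need one. Annotated, the string above reads:
lineitem:DataDriven:YEAR(l_shipdate):8   # partition on ship-date year, 8-way
orders:Ctid::4                           # ctid-based split, no expression, 4-way
customer:RangeId:c_custkey:2             # key-range split on c_custkey, 2-way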
Export with Debug Logging
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--output_dir ./exports \
--log_level DEBUG \
--log_dir ./logs \
--n_jobs 2
./LakeXpress sync
Publish Previously Exported Data
./LakeXpress sync publish
Triggers only the publishing step to Snowflake/Databricks/Glue/Fabric using existing configuration.
Snowflake Publishing Examples
External Tables with Date-Based Schema
./LakeXpress config create \
-a data/ds_credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id ds_04_pg \
--source_db_name tpch \
--source_schema_name tpch_1 \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--target_storage_id aws_s3_01 \
--generate_metadata \
--sub_path toto \
--publish_target snowflake_pass \
--publish_schema_pattern "EXT_{subpath}_{date}" \
--publish_table_pattern "{schema}_{table}"
./LakeXpress sync
Creates in Snowflake:
- Schema: EXT_toto_20251208
- Tables: tpch_1_customer, tpch_1_orders, tpch_1_lineitem
- Views: V_tpch_1_customer, V_tpch_1_orders, V_tpch_1_lineitem
SELECT COUNT(*) FROM EXT_toto_20251208.V_tpch_1_customer;
SELECT
c.C_NAME,
COUNT(o.O_ORDERKEY) as order_count
FROM EXT_toto_20251208.V_tpch_1_customer c
LEFT JOIN EXT_toto_20251208.V_tpch_1_orders o
ON c.C_CUSTKEY = o.O_CUSTKEY
GROUP BY c.C_NAME
LIMIT 10;
Internal Tables with Primary Keys
./LakeXpress config create \
-a data/ds_credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id ds_04_pg \
--source_db_name tpch \
--source_schema_name tpch_1 \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--target_storage_id aws_s3_01 \
--generate_metadata \
--sub_path toto \
--publish_method internal \
--publish_target snowflake_pat \
--snowflake_pk_constraints
./LakeXpress sync
Creates in Snowflake:
- Schema: tpch_1
- Tables: customer, orders, lineitem, with data loaded and PRIMARY KEY constraints
- No views needed for internal tables
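To confirm the load and the constraints, a couple of standard Snowflake commands can be run against the new schema (database context omitted; adjust to yours):
SHOW PRIMARY KEYS IN TABLE tpch_1.customer;
SELECT COUNT(*) FROM tpch_1.orders;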
Multi-Schema Consolidation
Merge multiple schemas into one Snowflake schema:
./LakeXpress config create \
-a credentials.json \
--log_db_auth_id log_db_postgres \
--source_db_auth_id source_postgres \
--source_schema_name sales,marketing,finance \
--target_storage_id aws_s3_01 \
--sub_path consolidated \
--publish_target snowflake_prod \
--publish_schema_pattern "{subpath}" \
--publish_table_pattern "{schema}_{table}" \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/
./LakeXpress sync
Creates schema consolidated with tables: sales_orders, sales_customers, marketing_campaigns, finance_invoices.
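A quick way to verify the consolidation in Snowflake (table names as listed above):
SHOW TABLES IN SCHEMA consolidated;
SELECT COUNT(*) FROM consolidated.sales_orders;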
Internal Tables with Incremental Sync
./LakeXpress config create \
-a data/ds_credentials.json \
--log_db_auth_id log_db_ms \
--source_db_auth_id ds_04_pg \
--source_db_name sales \
--source_schema_name tpch_1 \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--fastbcp_p 2 \
--n_jobs 4 \
--target_storage_id aws_s3_01 \
--generate_metadata \
--sub_path dev \
--incremental_table "tpch_1.orders:o_orderdate:date" \
--incremental_table "tpch_1.lineitem:l_shipdate:date" \
--publish_method internal \
--publish_target snowflake_pat \
--snowflake_pk_constraints
# First sync: full load
./LakeXpress sync
# Subsequent syncs: incremental for orders/lineitem, full for other tables
./LakeXpress sync
For full Snowflake documentation, see the Snowflake Publishing Guide.