Real-World Examples

All examples follow the same two-step workflow: create a configuration with config create, then run sync.

PostgreSQL to Local Filesystem

  • PostgreSQL TPC-H tpch.tpch_1 to /tmp/tpch/
  • SQL Server log DB; 4 tables exported in parallel, 2-way per-table parallelism, CDM metadata

Windows (PowerShell)

.\LakeXpress.exe config create `
  -a .\credentials.json `
  --log_db_auth_id log_db_ms `
  --source_db_auth_id ds_04_pg `
  --source_db_name tpch `
  --source_schema_name tpch_1 `
  --fastbcp_dir_path .\FastBCP_win-x64\latest\ `
  --fastbcp_p 2 `
  --n_jobs 4 `
  --output_dir C:\exports\tpch\ `
  --generate_metadata

.\LakeXpress.exe sync

Linux

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id ds_04_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1 \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --output_dir /tmp/tpch/ \
  --generate_metadata

./LakeXpress sync

Exports all tables from tpch_1 to /tmp/tpch/tpch_1/ with a CDM manifest.
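
A sketch of the resulting layout, following the schema/table convention shown in later examples (TPC-H table names for illustration):

/tmp/tpch/tpch_1/customer/*.parquet
/tmp/tpch/tpch_1/orders/*.parquet
/tmp/tpch/tpch_1/lineitem/*.parquet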

PostgreSQL to S3 Storage

  • PostgreSQL TPC-H tpch.tpch_1 to AWS S3
  • SQL Server log DB; 4 tables exported in parallel, 2-way per-table parallelism, CDM metadata

Windows (PowerShell)

.\LakeXpress.exe config create `
  -a .\credentials.json `
  --log_db_auth_id log_db_ms `
  --source_db_auth_id ds_04_pg `
  --source_db_name tpch `
  --source_schema_name tpch_1 `
  --fastbcp_dir_path .\FastBCP_win-x64\latest\ `
  --fastbcp_p 2 `
  --n_jobs 4 `
  --target_storage_id s3_01 `
  --generate_metadata

.\LakeXpress.exe sync

Linux

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id ds_04_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1 \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --target_storage_id s3_01 \
  --generate_metadata

./LakeXpress sync

Uploads Parquet files to s3://your-bucket/base_path/tpch_1/ with CDM metadata.
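
For reference, the s3_01 entry in credentials.json might look like the sketch below, assuming the same profile-based layout as the OVH example later on this page (bucket and profile names are placeholders):

{
  "s3_01": {
    "ds_type": "s3",
    "auth_mode": "profile",
    "info": {
      "directory": "s3://your-bucket/base_path",
      "profile": "default"
    }
  }
}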

PostgreSQL to Google Cloud Storage

  • PostgreSQL TPC-H to GCS, multi-schema via pattern tpch_1%
  • SQL Server log DB; 4 tables exported in parallel, 2-way per-table parallelism, CDM metadata

Windows (PowerShell)

.\LakeXpress.exe config create `
  -a .\credentials.json `
  --log_db_auth_id log_db_ms `
  --source_db_auth_id ds_04_pg `
  --source_db_name tpch `
  --source_schema_name tpch_1% `
  --fastbcp_dir_path .\FastBCP_win-x64\latest\ `
  --fastbcp_p 2 `
  --n_jobs 4 `
  --target_storage_id gcs_01 `
  --generate_metadata

.\LakeXpress.exe sync

Linux

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id ds_04_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1% \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --target_storage_id gcs_01 \
  --generate_metadata

./LakeXpress sync

Exports all schemas matching tpch_1% to GCS.
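
To spot-check the upload with the Google Cloud SDK (the bucket name is a placeholder; it must match what gcs_01 points to in credentials.json):

gsutil ls gs://your-bucket/base_path/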

PostgreSQL to S3 Storage (OVH)

  • PostgreSQL TPC-H tpch.tpch_1 to OVH S3-compatible storage
  • Uses custom endpoint via AWS profile

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id ds_04_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1 \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --target_storage_id s3_02 \
  --generate_metadata

./LakeXpress sync

credentials.json:

{
  "s3_02": {
    "ds_type": "s3",
    "auth_mode": "profile",
    "info": {
      "directory": "s3://my-ovh-bucket/exports",
      "profile": "ovh"
    }
  }
}

~/.aws/config:

[profile ovh]
endpoint_url = https://s3.gra.io.cloud.ovh.net
region = gra

~/.aws/credentials:

[ovh]
aws_access_key_id = YOUR_OVH_ACCESS_KEY
aws_secret_access_key = YOUR_OVH_SECRET_KEY
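
To verify the upload, a quick listing with the AWS CLI (recent versions read endpoint_url from the profile; older ones need --endpoint-url passed explicitly):

aws s3 ls s3://my-ovh-bucket/exports/tpch_1/ --profile ovh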

SQL Server to Local Filesystem

AdventureWorksDW

  • SQL Server adventureworksdw.dbo to local filesystem
  • PostgreSQL log DB

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_pg \
  --source_db_auth_id ds_10_ms \
  --source_db_name adventureworksdw \
  --source_schema_name dbo \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --output_dir /tmp/adventureworksdw/ \
  --generate_metadata

./LakeXpress sync

AdventureWorks with Sub-Path

Uses --sub_path to export into a specific sub-directory:

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_pg \
  --source_db_auth_id ds_11_ms \
  --source_db_name adventureworks2019 \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --output_dir /tmp/ \
  --sub_path adventureworks2019 \
  --generate_metadata

./LakeXpress sync

Creates a /tmp/adventureworks2019/schema/table/ directory structure.

SQL Server to S3 Storage

  • SQL Server adventureworksdw.dbo to AWS S3
  • PostgreSQL log DB, organized by database name via --sub_path

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_pg \
  --source_db_auth_id ds_10_ms \
  --source_db_name adventureworksdw \
  --source_schema_name dbo \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --target_storage_id s3_01 \
  --sub_path adventureworksdw \
  --generate_metadata

./LakeXpress sync

Uploads to s3://bucket/base_path/adventureworksdw/dbo/table/.

SQL Server to Azure Storage

  • SQL Server adventureworks2019.sales to Azure ADLS Gen2
  • Service Principal auth, PostgreSQL log DB
  • 4 tables exported in parallel, 2-way per-table parallelism, CDM metadata

Windows (PowerShell)

.\LakeXpress.exe config create `
  -a .\credentials.json `
  --log_db_auth_id log_db_pg `
  --source_db_auth_id ds_11_ms `
  --source_db_name adventureworks2019 `
  --source_schema_name sales `
  --fastbcp_dir_path .\FastBCP_win-x64\latest\ `
  --fastbcp_p 2 `
  --n_jobs 4 `
  --target_storage_id azure_01 `
  --sub_path adventureworks/sales `
  --generate_metadata

.\LakeXpress.exe sync

Linux

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_pg \
  --source_db_auth_id ds_11_ms \
  --source_db_name adventureworks2019 \
  --source_schema_name sales \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --target_storage_id azure_01 \
  --sub_path adventureworks/sales \
  --generate_metadata

./LakeXpress sync

credentials.json:

{
  "azure_01": {
    "ds_type": "azure",
    "auth_mode": "service_principal",
    "info": {
      "directory": "abfss://datalake.dfs.core.windows.net/exports",
      "azure_client_id": "your-application-client-id",
      "azure_tenant_id": "your-directory-tenant-id",
      "azure_client_secret": "your-client-secret"
    }
  },
  "log_db_pg": {
    "ds_type": "postgres",
    "auth_mode": "classic",
    "info": {
      "username": "postgres",
      "password": "your_password",
      "server": "localhost",
      "port": 5432,
      "database": "lakexpress_log"
    }
  },
  "ds_11_ms": {
    "ds_type": "mssql",
    "auth_mode": "classic",
    "info": {
      "username": "sa",
      "password": "your_password",
      "server": "localhost",
      "port": 1433,
      "database": "adventureworks2019"
    }
  }
}

Output path:

abfss://datalake@mystorageaccount.dfs.core.windows.net/exports/adventureworks/sales/schema/table/*.parquet
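
A spot-check with the Azure CLI, assuming it is installed and signed in with data-plane access to the account:

az storage blob list \
  --account-name mystorageaccount \
  --container-name datalake \
  --prefix exports/adventureworks/sales/ \
  --auth-mode login \
  --output table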

Multi-Schema Export

  • PostgreSQL with multiple TPC-H schemas matching tpch_1%
  • Exports to AWS S3 with custom sub-path

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id ds_04_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1% \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --target_storage_id s3_01 \
  --sub_path titi/tata \
  --generate_metadata

./LakeXpress sync

Matches tpch_1, tpch_10, tpch_100, etc. Each schema is exported to s3://bucket/base_path/titi/tata/schema_name/.

Table Filtering Examples

Include Specific Table Patterns

Export only fact and dimension tables:

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_schema_name public \
  --include "fact_%, dim_%" \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --output_dir ./exports \
  --n_jobs 4

./LakeXpress sync

Includes fact_sales, fact_orders, dim_customer, dim_product, etc. Patterns use SQL LIKE syntax, where % matches any character sequence.

Exclude Temporary and Test Tables

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_schema_name public \
  --exclude "temp_%, test_%, staging_%" \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --output_dir ./exports \
  --n_jobs 4

./LakeXpress sync

Excludes temp_data, test_results, staging_import, etc.

Combine Include and Exclude

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_schema_name analytics \
  --include "report_%, dashboard_%" \
  --exclude "%_test, %_backup" \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --output_dir ./exports \
  --n_jobs 4

./LakeXpress sync

Includes report_sales and dashboard_kpi; excludes report_sales_test and dashboard_kpi_backup. When a table matches both lists, the exclude pattern wins.

Incremental Sync Examples

Daily Order Updates

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name ecommerce \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id s3_01 \
  --incremental_table "public.orders:created_at:datetime" \
  --incremental_table "public.order_items:created_at:datetime" \
  --publish_target snowflake_prod \
  --n_jobs 4

# Run daily via cron or scheduler
./LakeXpress sync

The first run exports all rows. Subsequent runs export only new rows from orders and order_items; all other tables are fully exported each time.
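
For example, a crontab entry for a daily 02:00 run (the install path and log file are illustrative):

0 2 * * * cd /opt/lakexpress && ./LakeXpress sync >> sync.log 2>&1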

Event Log Ingestion

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_db_name events \
  --source_schema_name events \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --target_storage_id aws_s3_01 \
  --incremental_table "events.pageviews:event_time:timestamp" \
  --incremental_table "events.clicks:event_time:timestamp" \
  --incremental_table "events.conversions:event_time:timestamp" \
  --incremental_safety_lag 600 \
  --sub_path production/events \
  --n_jobs 8 \
  --fastbcp_p 4

# Run every 10 minutes
./LakeXpress sync
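
The matching crontab entry for a 10-minute cadence (install path illustrative):

*/10 * * * * cd /opt/lakexpress && ./LakeXpress sync >> events_sync.log 2>&1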

See Incremental Sync Documentation for details.

Resume Failed Exports

When a sync starts, note the run ID in the output:

2025-10-31 10:15:23 | INFO | Starting sync run: 20251031-2f73b4d0-8647-11ef-8089-c403a82a4577

Resume with:

./LakeXpress sync --run_id 20251031-2f73b4d0-8647-11ef-8089-c403a82a4577 --resume

This skips completed tables and retries failed ones.
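
If the run ID was not captured at the time, it can usually be recovered from the logs (assuming --log_dir ./logs, as in the debug-logging example below):

grep "Starting sync run" ./logs/*.log | tail -n 1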

Query Failed Tables

To inspect failures, query the jobs table in the log database:

SELECT
    source_schema,
    source_table,
    status,
    error_message,
    started_at,
    finished_at
FROM jobs
WHERE run_id = '20251031-2f73b4d0-8647-11ef-8089-c403a82a4577'
  AND status = 'failed'
ORDER BY source_schema, source_table;

Advanced Examples

Custom FastBCP Configuration

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_schema_name tpch \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_table_config "lineitem:DataDriven:YEAR(l_shipdate):8;orders:Ctid::4;customer:RangeId:c_custkey:2" \
  --n_jobs 4 \
  --output_dir ./exports

./LakeXpress sync

Each --fastbcp_table_config entry has the form table:method:partition_column:parallelism, with entries separated by semicolons:

  • lineitem: DataDriven method, 8-way, partitioned by YEAR(l_shipdate)
  • orders: Ctid method, 4-way (no partition column)
  • customer: RangeId method, 2-way, partitioned by c_custkey

Export with Debug Logging

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_schema_name public \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --output_dir ./exports \
  --log_level DEBUG \
  --log_dir ./logs \
  --n_jobs 2

./LakeXpress sync

Publish Previously Exported Data

./LakeXpress sync publish

Triggers only the publishing step to Snowflake/Databricks/Glue/Fabric, using the existing configuration.

Snowflake Publishing Examples

External Tables with Date-Based Schema

./LakeXpress config create \
  -a data/ds_credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id ds_04_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1 \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --target_storage_id aws_s3_01 \
  --generate_metadata \
  --sub_path toto \
  --publish_target snowflake_pass \
  --publish_schema_pattern "EXT_{subpath}_{date}" \
  --publish_table_pattern "{schema}_{table}"

./LakeXpress sync

Creates in Snowflake:

  • Schema: EXT_toto_20251208
  • Tables: tpch_1_customer, tpch_1_orders, tpch_1_lineitem
  • Views: V_tpch_1_customer, V_tpch_1_orders, V_tpch_1_lineitem

Query the published views:

SELECT COUNT(*) FROM EXT_toto_20251208.V_tpch_1_customer;

SELECT
  c.C_NAME,
  COUNT(o.O_ORDERKEY) as order_count
FROM EXT_toto_20251208.V_tpch_1_customer c
LEFT JOIN EXT_toto_20251208.V_tpch_1_orders o
  ON c.C_CUSTKEY = o.O_CUSTKEY
GROUP BY c.C_NAME
LIMIT 10;

Internal Tables with Primary Keys

./LakeXpress config create \
  -a data/ds_credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id ds_04_pg \
  --source_db_name tpch \
  --source_schema_name tpch_1 \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --target_storage_id aws_s3_01 \
  --generate_metadata \
  --sub_path toto \
  --publish_method internal \
  --publish_target snowflake_pat \
  --snowflake_pk_constraints

./LakeXpress sync

Creates in Snowflake:

  • Schema: tpch_1
  • Tables: customer, orders, lineitem with data loaded and PRIMARY KEY constraints
  • No views needed for internal tables
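
To confirm the load and the constraints, standard Snowflake SQL can be used:

SHOW PRIMARY KEYS IN SCHEMA tpch_1;
SELECT COUNT(*) FROM tpch_1.customer;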

Multi-Schema Consolidation

Merge multiple schemas into one Snowflake schema:

./LakeXpress config create \
  -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --source_schema_name sales,marketing,finance \
  --target_storage_id aws_s3_01 \
  --sub_path consolidated \
  --publish_target snowflake_prod \
  --publish_schema_pattern "{subpath}" \
  --publish_table_pattern "{schema}_{table}" \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/

./LakeXpress sync

Creates the consolidated schema with tables sales_orders, sales_customers, marketing_campaigns, and finance_invoices.

Internal Tables with Incremental Sync

./LakeXpress config create \
  -a data/ds_credentials.json \
  --log_db_auth_id log_db_ms \
  --source_db_auth_id ds_04_pg \
  --source_db_name sales \
  --source_schema_name tpch_1 \
  --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
  --fastbcp_p 2 \
  --n_jobs 4 \
  --target_storage_id aws_s3_01 \
  --generate_metadata \
  --sub_path dev \
  --incremental_table "tpch_1.orders:o_orderdate:date" \
  --incremental_table "tpch_1.lineitem:l_shipdate:date" \
  --publish_method internal \
  --publish_target snowflake_pat \
  --snowflake_pk_constraints

# First sync: full load
./LakeXpress sync

# Subsequent syncs: incremental for orders/lineitem, full for other tables
./LakeXpress sync

For full Snowflake documentation, see the Snowflake Publishing Guide.

See Also