# Database Configuration

LakeXpress supports multiple database types as sources and for export logging.

## Supported Databases

### Source Databases
| Database | Support Status | Notes |
|---|---|---|
| PostgreSQL | Supported | Ctid-based parallel export |
| Oracle | Supported | Thin and thick modes, Rowid-based parallel export |
| SQL Server | Supported | Physloc-based parallel export |
| MySQL | Supported | Partition detection |
| MariaDB | Supported | Dedicated mariadb ds_type |
### Log Databases
| Database | Support Status | Notes |
|---|---|---|
| PostgreSQL | Supported | Recommended for production |
| SQL Server | Supported | |
| MySQL | Supported | |
| SQLite | Supported | Single-user scenarios |
| DuckDB | Supported | Embedded analytical option |
## Authentication File Format

Create a JSON file with your database credentials:

```json
{
  "log_db_postgres": {
    "ds_type": "postgres",
    "auth_mode": "classic",
    "info": {
      "username": "postgres",
      "password": "your_password",
      "server": "localhost",
      "port": 5432,
      "database": "lakexpress_log"
    }
  },
  "source_oracle": {
    "ds_type": "oracle",
    "auth_mode": "classic",
    "info": {
      "username": "oracle_user",
      "password": "oracle_password",
      "server": "oracle-server.com",
      "port": 1521,
      "database": "orclpdb1",
      "lib_dir": "/opt/oracle/instantclient_19_8"
    }
  },
  "source_postgres": {
    "ds_type": "postgres",
    "auth_mode": "classic",
    "info": {
      "username": "postgres",
      "password": "your_password",
      "server": "localhost",
      "port": 5432,
      "database": "production_db"
    }
  },
  "source_mssql": {
    "ds_type": "mssql",
    "auth_mode": "classic",
    "info": {
      "username": "sa",
      "password": "mssql_password",
      "server": "mssql-server.com",
      "port": 1433,
      "database": "sales_db"
    }
  },
  "log_db_sqlite": {
    "ds_type": "sqlite",
    "auth_mode": "filesystem",
    "info": {
      "filepath": "/path/to/lakexpress_log.sqlite"
    }
  }
}
```
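Since the required `info` keys differ by `ds_type`, a small validation pass can catch missing fields before a run. This sketch is not part of LakeXpress; the key sets simply mirror the parameter tables in this document:

```python
# Required "info" keys per ds_type, mirroring the parameter tables below.
REQUIRED_KEYS = {
    "postgres": {"username", "password", "server", "port", "database"},
    "oracle":   {"username", "password", "server", "port", "database"},
    "mssql":    {"username", "password", "server", "port", "database"},
    "mysql":    {"username", "password", "server", "port", "database"},
    "mariadb":  {"username", "password", "server", "port", "database"},
    "sqlite":   {"filepath"},
    "duckdb":   {"filepath"},
}

def validate_auth(entries: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, entry in entries.items():
        required = REQUIRED_KEYS.get(entry.get("ds_type"))
        if required is None:
            continue  # unknown or publish-only ds_type; skip
        missing = required - set(entry.get("info", {}))
        if missing:
            problems.append(f"{name}: missing {sorted(missing)}")
    return problems
```

Run it over the parsed auth file (`json.load(open("auth.json"))`) before launching an export to fail fast on incomplete entries.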
## PostgreSQL Configuration

### Connection Parameters

| Parameter | Required | Description |
|---|---|---|
| `ds_type` | Yes | Must be `"postgres"` |
| `auth_mode` | Yes | Must be `"classic"` |
| `username` | Yes | PostgreSQL username |
| `password` | Yes | PostgreSQL password |
| `server` | Yes | Hostname or IP |
| `port` | Yes | Port (default: 5432) |
| `database` | Yes | Database name |

### Required Permissions

```sql
-- Source database user:
GRANT SELECT ON ALL TABLES IN SCHEMA public TO lakexpress_user;
GRANT USAGE ON SCHEMA public TO lakexpress_user;

-- information_schema access is granted by default
```
### Example

```json
{
  "postgres_prod": {
    "ds_type": "postgres",
    "auth_mode": "classic",
    "info": {
      "username": "dataexport",
      "password": "SecureP@ssw0rd",
      "server": "pg-prod.company.com",
      "port": 5432,
      "database": "analytics"
    }
  }
}
```
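For a manual connectivity check with `psql` before handing the entry to LakeXpress, the same fields map onto a standard libpq URI. A minimal sketch (the helper name is ours, and the quoting covers only the common cases):

```python
from urllib.parse import quote

def libpq_uri(info: dict) -> str:
    """Build a postgresql:// URI from an auth-file "info" block."""
    user = quote(info["username"], safe="")
    pwd = quote(info["password"], safe="")  # %-escape @, /, : and friends
    return (f"postgresql://{user}:{pwd}"
            f"@{info['server']}:{info['port']}/{info['database']}")
```

The resulting URI can be passed straight to `psql` to confirm the credentials before pointing LakeXpress at them.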
## Oracle Configuration

### Connection Parameters

| Parameter | Required | Description |
|---|---|---|
| `ds_type` | Yes | Must be `"oracle"` |
| `auth_mode` | Yes | Must be `"classic"` |
| `username` | Yes | Oracle username |
| `password` | Yes | Oracle password |
| `server` | Yes | Hostname or IP |
| `port` | Yes | Port (default: 1521) |
| `database` | Yes | Service name or SID |
| `lib_dir` | No | Client library directory (thick mode) |
### Thin vs Thick Mode

**Thin Mode (Recommended):**

- No client libraries required
- Oracle 12.1+
- Omit `lib_dir`

**Thick Mode:**

- Required for Oracle 11.2 and earlier
- Requires Oracle Instant Client
- Set `lib_dir` to the Instant Client directory
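If you are scripting around the export in Python, the usual python-oracledb convention applies: the driver runs in thin mode unless `oracledb.init_oracle_client()` is called before the first connection. Whether LakeXpress keys this off `lib_dir` exactly as sketched below is an assumption:

```python
def oracle_mode(info: dict) -> str:
    """Pick the driver mode from an auth entry: thick iff lib_dir is set."""
    return "thick" if info.get("lib_dir") else "thin"

# With python-oracledb, thick mode must be enabled before connecting:
#
#   import oracledb
#   if oracle_mode(info) == "thick":
#       oracledb.init_oracle_client(lib_dir=info["lib_dir"])
#   conn = oracledb.connect(
#       user=info["username"], password=info["password"],
#       dsn=f"{info['server']}:{info['port']}/{info['database']}")
```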
### Required Permissions

For parallel RowID-based exports:

```sql
-- As Oracle DBA (e.g., sys as sysdba)
GRANT SELECT_CATALOG_ROLE TO your_username;
```

This role enables metadata discovery, RowID range partitioning, and automatic export method selection.
### Example: Thin Mode

```json
{
  "oracle_prod": {
    "ds_type": "oracle",
    "auth_mode": "classic",
    "info": {
      "username": "EXPORT_USER",
      "password": "OracleP@ss",
      "server": "oracle-prod.company.com",
      "port": 1521,
      "database": "PRODPDB"
    }
  }
}
```
### Example: Thick Mode

```json
{
  "oracle_legacy": {
    "ds_type": "oracle",
    "auth_mode": "classic",
    "info": {
      "username": "LEGACY_USER",
      "password": "OracleP@ss",
      "server": "oracle-11g.company.com",
      "port": 1521,
      "database": "LEGACY",
      "lib_dir": "/opt/oracle/instantclient_19_8"
    }
  }
}
```
## SQL Server Configuration

### Connection Parameters

| Parameter | Required | Description |
|---|---|---|
| `ds_type` | Yes | Must be `"mssql"` |
| `auth_mode` | Yes | Must be `"classic"` |
| `username` | Yes | SQL Server username |
| `password` | Yes | SQL Server password |
| `server` | Yes | Hostname or IP |
| `port` | Yes | Port (default: 1433) |
| `database` | Yes | Database name |

LakeXpress falls back to PyMSSQL when ODBC drivers are unavailable.

### Required Permissions

```sql
-- Source database user:
GRANT SELECT ON SCHEMA::dbo TO lakexpress_user;
GRANT VIEW DEFINITION ON SCHEMA::dbo TO lakexpress_user;

-- INFORMATION_SCHEMA access is granted by default
```
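The ODBC fallback mentioned above can be pictured as a simple import probe: prefer `pyodbc`, use `pymssql` when no ODBC stack is present. This illustrates the pattern and is not LakeXpress's actual driver-selection code:

```python
import importlib.util

def pick_mssql_driver() -> str:
    """Prefer pyodbc; fall back to pymssql when no ODBC stack is installed."""
    for name in ("pyodbc", "pymssql"):
        # find_spec checks installability without importing the module
        if importlib.util.find_spec(name) is not None:
            return name
    raise RuntimeError("install pyodbc (plus the MS ODBC driver) or pymssql")
```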
### Example

```json
{
  "mssql_prod": {
    "ds_type": "mssql",
    "auth_mode": "classic",
    "info": {
      "username": "sa",
      "password": "StrongP@ssw0rd",
      "server": "mssql-prod.company.com",
      "port": 1433,
      "database": "SalesDB"
    }
  }
}
```
## MySQL Configuration

MySQL works as both a source and log database.

### Connection Parameters

| Parameter | Required | Description |
|---|---|---|
| `ds_type` | Yes | Must be `"mysql"` |
| `auth_mode` | Yes | Must be `"classic"` |
| `server` | Yes | Hostname or IP |
| `port` | Yes | Port (default: 3306) |
| `database` | Yes | Database name |
| `username` | Yes | MySQL username |
| `password` | Yes | MySQL password |

### Required Permissions

```sql
-- Source database user:
GRANT SELECT ON your_database.* TO 'lakexpress_user'@'%';

-- Log database user:
GRANT ALL PRIVILEGES ON lakexpress_log.* TO 'lakexpress_user'@'%';

-- INFORMATION_SCHEMA access is granted by default
```
### Example: Source Database

```json
{
  "source_mysql": {
    "ds_type": "mysql",
    "auth_mode": "classic",
    "info": {
      "server": "localhost",
      "port": 3306,
      "database": "your-database",
      "username": "your-username",
      "password": "your-password"
    }
  }
}
```
### Example: Log Database

```json
{
  "log_db_mysql": {
    "ds_type": "mysql",
    "auth_mode": "classic",
    "info": {
      "server": "localhost",
      "port": 3306,
      "database": "lakexpress_log",
      "username": "your-username",
      "password": "your-password"
    }
  }
}
```
## MariaDB Configuration

MariaDB is supported as a source database with its own `mariadb` `ds_type`.

### Connection Parameters

| Parameter | Required | Description |
|---|---|---|
| `ds_type` | Yes | Must be `"mariadb"` |
| `auth_mode` | Yes | Must be `"classic"` |
| `server` | Yes | Hostname or IP |
| `port` | Yes | Port (default: 3306) |
| `database` | Yes | Database name |
| `username` | Yes | MariaDB username |
| `password` | Yes | MariaDB password |

### Required Permissions

```sql
-- Source database user:
GRANT SELECT ON your_database.* TO 'lakexpress_user'@'%';

-- INFORMATION_SCHEMA access is granted by default
```
### Example: Source Database

```json
{
  "source_mariadb": {
    "ds_type": "mariadb",
    "auth_mode": "classic",
    "info": {
      "server": "localhost",
      "port": 3306,
      "database": "your-database",
      "username": "your-username",
      "password": "your-password"
    }
  }
}
```
## SQLite Configuration (Log Database Only)

### Connection Parameters

| Parameter | Required | Description |
|---|---|---|
| `ds_type` | Yes | Must be `"sqlite"` |
| `auth_mode` | Yes | Must be `"filesystem"` |
| `filepath` | Yes | Path to SQLite database file |
### Example

```json
{
  "log_db_local": {
    "ds_type": "sqlite",
    "auth_mode": "filesystem",
    "info": {
      "filepath": "/var/lib/lakexpress/export_log.sqlite"
    }
  }
}
```
### When to Use SQLite

**Good for:** single-user setups, development, testing, deployments without a database server.

**Avoid for:** multi-user environments, high-concurrency production, distributed systems.
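Because SQLite needs no server, the log-file location can be smoke-tested with nothing but the Python standard library. The table below is a throwaway stand-in, not LakeXpress's real log schema:

```python
import sqlite3

def smoke_test(filepath: str) -> int:
    """Verify SQLite can write at the path; return rows in the scratch table."""
    conn = sqlite3.connect(filepath)       # creates the file if absent
    conn.execute("CREATE TABLE IF NOT EXISTS smoke (ts TEXT)")
    conn.execute("INSERT INTO smoke VALUES (datetime('now'))")
    conn.commit()
    (count,) = conn.execute("SELECT COUNT(*) FROM smoke").fetchone()
    conn.close()
    return count
```

If this raises `sqlite3.OperationalError`, fix the directory permissions or path before pointing the log database at it.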
## DuckDB Configuration (Log Database Only)

File-based databases only; in-memory mode (`:memory:`) is not supported.

### Connection Parameters

| Parameter | Required | Description |
|---|---|---|
| `ds_type` | Yes | Must be `"duckdb"` |
| `auth_mode` | Yes | Must be `"filesystem"` |
| `filepath` | Yes | Path to DuckDB database file |
### Example

```json
{
  "log_db_duckdb": {
    "ds_type": "duckdb",
    "auth_mode": "filesystem",
    "info": {
      "filepath": "/var/lib/lakexpress/export_log.duckdb"
    }
  }
}
```
### When to Use DuckDB

**Good for:** single-user setups, development, testing, fast analytical queries, lightweight embedded needs.

**Avoid for:** concurrent writes, high-concurrency production, distributed systems.
## Connection String Examples

### Using the Auth File

```bash
# PostgreSQL source, PostgreSQL log
./LakeXpress -a auth.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres

# Oracle source, SQL Server log
./LakeXpress -a auth.json \
  --log_db_auth_id log_db_mssql \
  --source_db_auth_id source_oracle

# SQL Server source, SQLite log
./LakeXpress -a auth.json \
  --log_db_auth_id log_db_sqlite \
  --source_db_auth_id source_mssql

# MySQL source, MySQL log
./LakeXpress -a auth.json \
  --log_db_auth_id log_db_mysql \
  --source_db_auth_id source_mysql

# MySQL source, DuckDB log
./LakeXpress -a auth.json \
  --log_db_auth_id log_db_duckdb \
  --source_db_auth_id source_mysql
```
## Troubleshooting

### PostgreSQL

**Connection refused:** check that PostgreSQL is running (`systemctl status postgresql`), verify that `pg_hba.conf` allows your IP, and check the firewall (`sudo ufw allow 5432/tcp`).

**Authentication failed:** verify credentials, check the password encryption method in `pg_hba.conf`, and confirm the user exists: `SELECT * FROM pg_user WHERE usename = 'your_user';`

### Oracle

**ORA-12154: TNS could not resolve connect identifier:** verify the service name, check `tnsnames.ora`, try the IP instead of the hostname, and confirm the listener is running.

**ORA-28000: account is locked:** `ALTER USER your_username ACCOUNT UNLOCK;`

**Thick mode library not found:** verify the `lib_dir` path, confirm the Instant Client is installed, check file permissions, and set `LD_LIBRARY_PATH` on Linux.

### SQL Server

**Login failed for user:** confirm SQL Server authentication is enabled (not Windows-only), verify the user exists with the correct permissions, and check the password and IP restrictions.

**Cannot open database requested by the login:** verify the database name, check user access, and confirm the database is online: `SELECT state_desc FROM sys.databases WHERE name = 'YourDB';`

### SQLite

**Unable to open database file:** check the file path, verify the directory exists with write permissions, use absolute paths, and check disk space.

### MySQL / MariaDB

**Access denied for user:** verify credentials, check host access (`SELECT user, host FROM mysql.user WHERE user = 'your_user';`), and grant if needed: `GRANT ALL PRIVILEGES ON database.* TO 'user'@'%';`

**Unknown database:** verify the name (`SHOW DATABASES;`) and create it if needed: `CREATE DATABASE your_database;`

**Connection refused:** check that MySQL is running (`systemctl status mysql`), verify the port (`netstat -tlnp | grep 3306`), check that `bind-address` allows remote connections, and check the firewall (`sudo ufw allow 3306/tcp`).

### DuckDB

**Unable to open database file:** check the file path, verify the directory exists with write permissions, use absolute paths, and check disk space.

**Database is locked:** ensure no other process is using the file, close existing connections, and check for stale lock files.
## Snowflake Publishing Configuration

LakeXpress can create Snowflake tables after exporting to cloud storage.

### Authentication Modes

Snowflake supports four authentication methods:
#### Password Authentication

```json
{
  "snowflake_password": {
    "ds_type": "snowflake",
    "auth_mode": "password",
    "info": {
      "account": "your-account-identifier",
      "user": "your-username",
      "password": "your-password",
      "warehouse": "your-warehouse",
      "database": "your-database",
      "stage": "your-external-stage"
    }
  }
}
```
#### Programmatic Access Token (PAT), Recommended for Automation

```json
{
  "snowflake_pat": {
    "ds_type": "snowflake",
    "auth_mode": "pat",
    "info": {
      "account": "your-account-identifier",
      "user": "your-username",
      "token": "your-personal-access-token",
      "warehouse": "your-warehouse",
      "database": "your-database",
      "stage": "your-external-stage"
    }
  }
}
```
#### Key-Pair Authentication, Most Secure

```json
{
  "snowflake_keypair": {
    "ds_type": "snowflake",
    "auth_mode": "keypair",
    "info": {
      "account": "your-account-identifier",
      "user": "your-username",
      "private_key_path": "/path/to/rsa_key.p8",
      "private_key_passphrase": "optional-passphrase",
      "warehouse": "your-warehouse",
      "database": "your-database",
      "stage": "your-external-stage"
    }
  }
}
```
#### OAuth Authentication

```json
{
  "snowflake_oauth": {
    "ds_type": "snowflake",
    "auth_mode": "oauth",
    "info": {
      "account": "your-account-identifier",
      "user": "your-username",
      "oauth_token": "your-oauth-token",
      "warehouse": "your-warehouse",
      "database": "your-database",
      "stage": "your-external-stage"
    }
  }
}
```
### Configuration Parameters

| Parameter | Required | Description |
|---|---|---|
| `ds_type` | Yes | Must be `"snowflake"` |
| `auth_mode` | Yes | One of: `password`, `pat`, `keypair`, `oauth` |
| `account` | Yes | Snowflake account identifier (e.g., `tj36405.eu-west-1`) |
| `user` | Yes | Snowflake username (usually all caps) |
| `warehouse` | Yes | Snowflake warehouse name |
| `database` | Yes | Target database name |
| `stage` | Yes | Snowflake stage name pointing to cloud storage |
| `password` | Conditional | Required for `password` auth mode |
| `token` | Conditional | Required for `pat` auth mode |
| `private_key_path` | Conditional | Required for `keypair` auth mode |
| `private_key_passphrase` | Optional | For encrypted private keys |
| `oauth_token` | Conditional | Required for `oauth` auth mode |
### Important: Stage Configuration

The Snowflake stage must point to the same location as your cloud storage configuration.

**Example (S3):**

```json
{
  "aws_s3": {
    "ds_type": "s3",
    "info": {
      "directory": "s3://your-bucket-name/path/to/exports"
    }
  },
  "snowflake_pat": {
    "ds_type": "snowflake",
    "info": {
      "stage": "your-external-stage"
    }
  }
}
```

The stage URL must match the S3 directory path. Verify the stage location in Snowflake:

```sql
DESC STAGE YOUR_DB.PUBLIC.YOUR_STAGE;
-- The URL column must match the S3 directory exactly
```
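A pre-flight check can compare the two values before an export. Treating "match" as equality up to whitespace and a trailing slash is our assumption; adjust if your stage URL differs in other cosmetic ways:

```python
def same_location(storage_dir: str, stage_url: str) -> bool:
    """Compare the configured S3 directory with the stage's URL column.

    S3 keys are case-sensitive, so only whitespace and a trailing
    slash are normalized away.
    """
    norm = lambda u: u.strip().rstrip("/")
    return norm(storage_dir) == norm(stage_url)
```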
### Usage

```bash
# Export to S3 and publish to Snowflake
./LakeXpress -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --target_storage_id aws_s3 \
  --publish_target snowflake_pat \
  --fastbcp_dir_path /path/to/fastbcp
```

For more details, see the Snowflake Publishing Guide.
## Microsoft Fabric Publishing Configuration

LakeXpress can create Fabric Lakehouse tables after exporting to OneLake storage.

### Authentication

Fabric uses Service Principal authentication with Azure AD:

```json
{
  "fabric_lakehouse": {
    "ds_type": "fabric",
    "auth_mode": "service_principal",
    "fabric_target": "lakehouse",
    "info": {
      "workspace_id": "your-workspace-id",
      "lakehouse_id": "your-lakehouse-id",
      "lakehouse_name": "your-lakehouse-name",
      "sql_endpoint": "your-sql-endpoint.datawarehouse.fabric.microsoft.com",
      "azure_client_id": "your-application-client-id",
      "azure_tenant_id": "your-directory-tenant-id",
      "azure_client_secret": "your-client-secret"
    }
  }
}
```
### Configuration Parameters

| Parameter | Required | Description |
|---|---|---|
| `ds_type` | Yes | Must be `"fabric"` |
| `auth_mode` | Yes | Must be `"service_principal"` |
| `fabric_target` | Yes | Must be `"lakehouse"` |
| `workspace_id` | Yes | Fabric workspace GUID |
| `lakehouse_id` | Yes | Lakehouse GUID |
| `lakehouse_name` | Yes | Lakehouse name |
| `sql_endpoint` | Yes | SQL analytics endpoint hostname |
| `azure_client_id` | Yes | Application (client) ID from the Azure AD app registration |
| `azure_tenant_id` | Yes | Directory (tenant) ID from Azure AD |
| `azure_client_secret` | Yes | Client secret from the Azure AD app |
### Finding Fabric Configuration Values

**Workspace ID and Lakehouse ID:** from the Fabric portal URL:
`https://app.fabric.microsoft.com/groups/{workspace_id}/lakehouses/{lakehouse_id}`

**SQL Endpoint:** in the Lakehouse view, click "SQL analytics endpoint" in the bottom pane and copy the connection string hostname.
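Given a portal URL of that shape, both GUIDs can be extracted mechanically. A small sketch (the function name is ours):

```python
import re

FABRIC_URL = re.compile(
    r"app\.fabric\.microsoft\.com/groups/(?P<workspace_id>[0-9a-fA-F-]+)"
    r"/lakehouses/(?P<lakehouse_id>[0-9a-fA-F-]+)")

def parse_fabric_url(url: str) -> dict:
    """Extract workspace_id and lakehouse_id from a Fabric portal URL."""
    m = FABRIC_URL.search(url)
    if m is None:
        raise ValueError("not a Fabric lakehouse URL")
    return m.groupdict()
```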
### Usage

```bash
# Export to OneLake and publish to Fabric Lakehouse
./LakeXpress -a credentials.json \
  --log_db_auth_id log_db_postgres \
  --source_db_auth_id source_postgres \
  --target_storage_id onelake_storage \
  --publish_target fabric_lakehouse \
  --publish_method internal \
  --fastbcp_dir_path /path/to/fastbcp
```
## Security Best Practices

### 1. Credential Storage

- Never commit credentials to version control
- Restrict `auth.json` permissions: `chmod 600 auth.json`
- Use environment-specific auth files (dev, staging, prod)

### 2. Least Privilege

- Create dedicated database users for LakeXpress
- Grant only SELECT on source databases and full privileges on the log database
- Use read-only credentials for source databases

### 3. Network Security

- Use SSL/TLS connections
- Restrict database access by IP
- Use VPN or private networks

### 4. Password Management

- Use strong passwords (12+ characters, mixed case, numbers, symbols)
- Rotate passwords regularly
- Consider secret managers (AWS Secrets Manager, HashiCorp Vault)
### 5. Environment Variables for Passwords

LakeXpress supports `${VAR_NAME}` syntax in any string value in the credentials JSON:

```json
{
  "source_postgres": {
    "ds_type": "postgres",
    "auth_mode": "classic",
    "info": {
      "username": "postgres",
      "password": "${PG_PASSWORD}",
      "server": "localhost",
      "port": 5432,
      "database": "production_db"
    }
  },
  "source_mssql": {
    "ds_type": "mssql",
    "auth_mode": "classic",
    "info": {
      "username": "sa",
      "password": "${MSSQL_PASSWORD}",
      "server": "mssql-server.com",
      "port": 1433,
      "database": "sales_db"
    }
  }
}
```
Set the variables before running LakeXpress:

```bash
# Linux
export PG_PASSWORD="SecureP@ssw0rd"
export MSSQL_PASSWORD="StrongP@ssw0rd"

# Windows (cmd)
set PG_PASSWORD=SecureP@ssw0rd

# Windows (PowerShell)
$env:PG_PASSWORD = "SecureP@ssw0rd"
```

Plain-text passwords still work: values without `$` are returned as-is.
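The substitution described above can be mimicked with Python's `string.Template`, which uses the same `${VAR}` syntax and leaves `$`-free strings untouched. Whether LakeXpress implements it exactly this way is an assumption:

```python
import os
from string import Template

def expand(value):
    """Recursively expand ${VAR} in every string of a credentials structure."""
    if isinstance(value, str):
        return Template(value).substitute(os.environ)
    if isinstance(value, dict):
        return {k: expand(v) for k, v in value.items()}
    if isinstance(value, list):
        return [expand(v) for v in value]
    return value  # ints (ports), booleans, etc. pass through unchanged
```

`Template.substitute` raises `KeyError` for an unset variable, which is usually preferable to silently exporting with a literal `${PG_PASSWORD}` as the password.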
## See Also

- Quick Start Guide - Getting started
- CLI Reference - Command-line options
- Storage Backends - Storage configuration