Logging and Monitoring

LakeXpress uses two logging systems: a database for structured export tracking and log files for operational details.

Database Logging

The export tracking database (SQL Server, PostgreSQL, or SQLite) records all export activity for reporting and troubleshooting.

Database Schema Overview

Export Database Schema
Primary keys are marked with ● symbols and relationships show cardinality (1:N).

Key Tables

export_runs

Top-level record per export run:

  • Unique run identifier
  • Run type (full, incremental, custom)
  • Source database and schema
  • Storage configuration
  • Start/completion timestamps and status
  • Aggregate metrics (total, completed, failed jobs)
  • Total rows exported and storage size
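
A quick run-level health check can be pulled straight from this table. The sketch below assumes a SQLite tracking database file named lakexpress.db, and the column names (run_id, run_type, status, started_at, total_jobs, failed_jobs, total_rows) are inferred from the fields above rather than confirmed; verify both against your actual deployment.

# Sketch: summarize the most recent export runs.
# Assumptions: SQLite tracking DB at ./lakexpress.db; all column names
# below are inferred from the documented field list, not confirmed.
import sqlite3

conn = sqlite3.connect("lakexpress.db")
for run_id, run_type, status, started, total, failed, rows in conn.execute(
    """
    SELECT run_id, run_type, status, started_at,
           total_jobs, failed_jobs, total_rows
    FROM export_runs
    ORDER BY started_at DESC
    LIMIT 10
    """
):
    print(f"{started}  {run_id}  {run_type:<12} {status:<10} "
          f"jobs={total} failed={failed} rows={rows}")
conn.close()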

export_jobs

Per-table export job tracking:

  • Job ID and associated run_id
  • Step ID linking to export_steps
  • Source table details (database, schema, table)
  • Storage settings (type, path, format, compression)
  • Parquet settings (row group size)
  • FastBCP settings (distribution key, method, parallel degree)
  • Results (file count, rows exported)
  • Status (job_status, step_status)
  • Retry count and timestamps (created, started, completed)
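
When a run reports failures, this table identifies which source tables were affected. A minimal sketch under the same assumptions; the column names (source_schema, source_table, retry_count) and the 'FAILED' status literal are inferred, not confirmed:

# Sketch: list failed jobs for one run, with retry counts.
# The run_id reuses the example identifier from the sample log output
# below; column names and the 'FAILED' literal are assumptions.
import sqlite3

RUN_ID = "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d"

conn = sqlite3.connect("lakexpress.db")
for schema, table, retries in conn.execute(
    """
    SELECT source_schema, source_table, retry_count
    FROM export_jobs
    WHERE run_id = ? AND job_status = 'FAILED'
    ORDER BY source_schema, source_table
    """,
    (RUN_ID,),
):
    print(f"{schema}.{table}: failed after {retries} retries")
conn.close()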

export_steps

Workflow step definitions:

  • Step names and descriptions
  • Links to jobs and log entries

export_files

Per-file tracking for generated Parquet files:

  • File ID linked to parent job
  • File name and full path
  • Storage location (local, S3, Azure, GCS)
  • File size, row count, status, creation timestamp
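
This table makes it easy to total output size and row counts per job. A sketch under the same assumptions (inferred column names job_id, row_count, file_size, status, and an assumed 'COMPLETED' status literal):

# Sketch: aggregate Parquet output per job from export_files.
# Column names and the 'COMPLETED' status literal are assumptions.
import sqlite3

conn = sqlite3.connect("lakexpress.db")
for job_id, files, rows, size in conn.execute(
    """
    SELECT job_id, COUNT(*), SUM(row_count), SUM(file_size)
    FROM export_files
    WHERE status = 'COMPLETED'
    GROUP BY job_id
    ORDER BY SUM(file_size) DESC
    """
):
    print(f"job {job_id}: {files} files, {rows} rows, {size / 1_048_576:.1f} MB")
conn.close()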

export_columns

Column metadata per exported table:

  • Job ID + source column name (composite PK)
  • Source database type
  • Parquet column name (post-transformation)
  • Source data type, precision, scale
  • Ordinal position, length, nullability
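
Because the composite key is job ID plus source column name, a single query reconstructs the full column layout of an exported table. In the sketch, JOB_ID is a hypothetical value and the column names (ordinal_position, source_column_name, parquet_column_name, source_data_type, is_nullable) are inferred:

# Sketch: print the column layout recorded for one export job.
# JOB_ID is a hypothetical value; column names are inferred from the
# field list above, not confirmed.
import sqlite3

JOB_ID = 42

conn = sqlite3.connect("lakexpress.db")
for pos, src, parquet, dtype, nullable in conn.execute(
    """
    SELECT ordinal_position, source_column_name, parquet_column_name,
           source_data_type, is_nullable
    FROM export_columns
    WHERE job_id = ?
    ORDER BY ordinal_position
    """,
    (JOB_ID,),
):
    print(f"{pos:>3}  {src} -> {parquet}  ({dtype}, nullable={nullable})")
conn.close()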

export_log

Detailed event log linking runs, jobs, and steps:

  • Log ID and timestamp
  • Links to run_id, job_id, step_id
  • Log level (INFO, WARNING, ERROR, DEBUG)
  • Event status, elapsed time (seconds)
  • Source table context (database, schema, table)
  • Event/error messages and error codes
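
Since every entry links back to its run, job, and step, the event log is the natural starting point for troubleshooting. A sketch that pulls ERROR-level events for one run; the ERROR level is documented above, but the column names (log_timestamp, source_schema, source_table, error_code, error_message) are inferred:

# Sketch: fetch ERROR events for a run from export_log.
# Column names are inferred from the field list above; the run_id
# reuses the example identifier from the sample log output below.
import sqlite3

RUN_ID = "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d"

conn = sqlite3.connect("lakexpress.db")
for ts, schema, table, code, message in conn.execute(
    """
    SELECT log_timestamp, source_schema, source_table,
           error_code, error_message
    FROM export_log
    WHERE run_id = ? AND log_level = 'ERROR'
    ORDER BY log_timestamp
    """,
    (RUN_ID,),
):
    print(f"{ts}  {schema}.{table}  [{code}] {message}")
conn.close()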

partition_columns

FastBCP partitioning configuration:

  • Partition config ID
  • Source table identification (db type, database, schema, table)
  • FastBCP distribution key, method, parallel degree
  • Environment name, enable/disable flag
  • Description and audit timestamps
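
To review which tables currently have partitioning active, the enable/disable flag can be filtered on directly. The column names in this sketch (source_schema, source_table, distribution_key, partition_method, parallel_degree, enabled) are inferred from the fields above and should be checked against the real schema:

# Sketch: list enabled FastBCP partition configurations.
# All column names here are inferred, not confirmed.
import sqlite3

conn = sqlite3.connect("lakexpress.db")
for schema, table, key, method, degree in conn.execute(
    """
    SELECT source_schema, source_table,
           distribution_key, partition_method, parallel_degree
    FROM partition_columns
    WHERE enabled = 1
    ORDER BY source_schema, source_table
    """
):
    print(f"{schema}.{table}: key={key} method={method} parallel={degree}")
conn.close()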

File-Based Logging

Each run creates a log file named: lakexpress_YYYYMMDD_HHMMSS_<run_id>.log
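
Since the file name encodes both the start time and the run identifier, it can be parsed mechanically, which is handy when correlating log files with export_runs rows. A minimal sketch based only on the naming pattern above:

# Sketch: extract start time and run_id from a log file name that
# follows the lakexpress_YYYYMMDD_HHMMSS_<run_id>.log pattern.
import re
from datetime import datetime

NAME = re.compile(r"lakexpress_(\d{8})_(\d{6})_(?P<run_id>[0-9a-f-]+)\.log$")

def parse_log_name(name):
    m = NAME.search(name)
    if m is None:
        return None
    started = datetime.strptime(m.group(1) + m.group(2), "%Y%m%d%H%M%S")
    return started, m.group("run_id")

# File name taken from the sample log output below.
print(parse_log_name(
    "lakexpress_20251105_101523_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d.log"
))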

Log Format

Structured format with timestamp, log level, process ID, and context:

[YYYY-MM-DD HH:MM:SS.fff+TZ :: LEVEL :: PID :: Context] Message

Sample Log Output

[2025-11-05 10:15:23.145+01:00 :: INFO :: 12345 :: MainProcess] Log file path: /var/log/lakexpress/lakexpress_20251105_101523_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d.log
[2025-11-05 10:15:23.146+01:00 :: INFO :: 12345 :: MainProcess] **** Starting export run ****
[2025-11-05 10:15:23.147+01:00 :: INFO :: 12345 :: MainProcess] Source: SQLServer - salesdb.dbo
[2025-11-05 10:15:23.148+01:00 :: INFO :: 12345 :: MainProcess] Target storage: S3 - s3://data-lake/exports/
[2025-11-05 10:15:23.250+01:00 :: INFO :: 12345 :: MainProcess] Discovered 15 tables for export
[2025-11-05 10:15:23.385+01:00 :: INFO :: 12345 :: Worker-1] Exporting table: orders (estimated 1.2M rows)
[2025-11-05 10:15:45.721+01:00 :: INFO :: 12345 :: Worker-1] Export completed: orders - 1,234,567 rows in 8 files (245 MB)
[2025-11-05 10:15:45.722+01:00 :: INFO :: 12345 :: Worker-1] Export elapsed time: 22.337 seconds
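
Because the format is fixed, individual lines are straightforward to parse, for example to feed into an external monitoring system. A sketch using a regular expression derived from the format above, tested against one of the sample lines:

# Sketch: split a log line into timestamp, level, PID, context, and
# message, per the [timestamp :: LEVEL :: PID :: Context] format above.
import re

LINE = re.compile(
    r"^\[(?P<ts>.+?) :: (?P<level>\w+) :: (?P<pid>\d+) :: "
    r"(?P<context>[^\]]+)\] (?P<message>.*)$"
)

sample = ("[2025-11-05 10:15:45.721+01:00 :: INFO :: 12345 :: Worker-1] "
          "Export completed: orders - 1,234,567 rows in 8 files (245 MB)")

m = LINE.match(sample)
if m:
    print(m.group("ts"), m.group("level"), m.group("pid"), m.group("context"))
    print(m.group("message"))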

Log Levels

Set via --log-level:

  • INFO: Progress, row counts, operational messages
  • WARNING: Non-critical issues (schema differences, type conversions)
  • ERROR: Failures preventing a table export
  • DEBUG: SQL queries, file operations, diagnostics
  • CRITICAL: Fatal errors stopping the run
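
These levels form a natural severity order, so an existing log file can be filtered after the fact, for instance to surface only warnings and errors from a verbose DEBUG run. A sketch using the log path from the sample output above:

# Sketch: print only WARNING-and-above lines from an existing log file.
# The path is the example from the sample output; adjust as needed.
import re

SEVERITY = {"DEBUG": 0, "INFO": 1, "WARNING": 2, "ERROR": 3, "CRITICAL": 4}
LEVEL = re.compile(r"^\[.+? :: (\w+) :: ")

path = ("/var/log/lakexpress/"
        "lakexpress_20251105_101523_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d.log")

with open(path) as f:
    for line in f:
        m = LEVEL.match(line)
        if m and SEVERITY.get(m.group(1), 0) >= SEVERITY["WARNING"]:
            print(line, end="")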

Log Location

Defaults to the current working directory. Override with --log-dir:

./lakexpress --log-dir /var/log/lakexpress export ...

Terminal Output

Log messages also appear in the terminal, color-coded by severity:

  • Green: success
  • Yellow: warnings
  • Red: errors
  • Cyan: info