Microsoft Fabric Publishing

LakeXpress creates Microsoft Fabric Lakehouse tables from exported Parquet files. The data can then be queried in Fabric with SQL, Spark, or Power BI, either through managed Delta tables or through SQL analytics endpoint views.

Prerequisites

1. Microsoft Fabric Setup

Required Resources:

  • Microsoft Fabric workspace with a Lakehouse
  • Service Principal with appropriate permissions
  • OneLake storage access

Required Permissions:

  • Workspace: Member or Contributor role
  • Lakehouse: Read and Write access
  • OneLake: Storage Blob Data Contributor

2. Azure AD App Registration

Create a Service Principal:

  1. Azure Portal > Azure Active Directory > App registrations
  2. Click New registration
  3. Name the app (e.g., “LakeXpress-Fabric”)
  4. Select Accounts in this organizational directory only
  5. Click Register
  6. Note the Application (client) ID and Directory (tenant) ID
  7. Go to Certificates & secrets > New client secret
  8. Create a secret and save the Value (shown only once); you can sanity-check these values with the script below
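
Before putting these values into credentials.json, you can sanity-check the Service Principal with a short script, as referenced in step 8. This is a minimal sketch assuming the azure-identity Python package; the placeholder tenant ID, client ID, and secret stand in for the values noted above.

# Sanity-check the Service Principal by requesting a token for Azure Storage
# (the resource OneLake uses). Assumes: pip install azure-identity
from azure.identity import ClientSecretCredential

credential = ClientSecretCredential(
    tenant_id="your-directory-tenant-id",      # Directory (tenant) ID
    client_id="your-application-client-id",    # Application (client) ID
    client_secret="your-client-secret",        # secret Value from step 8
)

# Fails (e.g. AADSTS7000215) if the secret or IDs are wrong
token = credential.get_token("https://storage.azure.com/.default")
print("Token acquired, expires at:", token.expires_on)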

3. Grant Fabric Permissions

Add the Service Principal to your workspace:

  1. Open your Fabric workspace
  2. Click Manage access
  3. Click Add people or groups
  4. Search for your Service Principal
  5. Assign Member or Contributor role
  6. Click Add

4. Find Fabric Configuration Values

Workspace ID and Lakehouse ID:

  1. Open your Lakehouse in the Fabric portal
  2. Extract from the URL:
    https://app.fabric.microsoft.com/groups/{workspace_id}/lakehouses/{lakehouse_id}
    
  3. Copy workspace_id (GUID after /groups/)
  4. Copy lakehouse_id (GUID after /lakehouses/); both GUIDs can also be pulled from the URL with the script below
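
If you script your setup, the two GUIDs can be extracted from the URL shown above. This is only an illustration of the URL layout; the example URL and GUIDs are placeholders.

# Extract workspace_id and lakehouse_id from a Fabric Lakehouse URL (illustrative)
import re

url = (
    "https://app.fabric.microsoft.com/groups/"
    "11111111-2222-3333-4444-555555555555/lakehouses/"
    "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
)

match = re.search(r"/groups/([0-9a-fA-F-]{36})/lakehouses/([0-9a-fA-F-]{36})", url)
if match:
    workspace_id, lakehouse_id = match.groups()
    print("workspace_id:", workspace_id)
    print("lakehouse_id:", lakehouse_id)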

Lakehouse Name:

The display name shown in the Fabric portal.

SQL Analytics Endpoint:

  1. In your Lakehouse, click SQL analytics endpoint in the bottom pane
  2. Click Copy SQL connection string
  3. Extract the hostname (e.g., abc123xyz.datawarehouse.fabric.microsoft.com)

5. Credentials Configuration

Both OneLake storage and Fabric publishing credentials go in credentials.json:

{
    "onelake_storage": {
        "ds_type": "onelake",
        "auth_mode": "service_principal",
        "info": {
            "directory": "onelake://your-workspace-name/your-lakehouse-name/",
            "azure_client_id": "your-application-client-id",
            "azure_tenant_id": "your-directory-tenant-id",
            "azure_client_secret": "your-client-secret"
        }
    },
    "fabric_lakehouse": {
        "ds_type": "fabric",
        "auth_mode": "service_principal",
        "fabric_target": "lakehouse",
        "info": {
            "workspace_id": "your-workspace-id",
            "lakehouse_id": "your-lakehouse-id",
            "lakehouse_name": "your-lakehouse-name",
            "sql_endpoint": "your-sql-endpoint.datawarehouse.fabric.microsoft.com",
            "azure_client_id": "your-application-client-id",
            "azure_tenant_id": "your-directory-tenant-id",
            "azure_client_secret": "your-client-secret"
        }
    }
}
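
As a quick sanity check before running LakeXpress, you can confirm that the fabric_lakehouse entry carries every field listed in the table below. This is an illustrative validation script, not part of the LakeXpress CLI.

# Check credentials.json for the fields the fabric_lakehouse entry needs
# (illustrative only; field names taken from the table below)
import json

REQUIRED_INFO_FIELDS = {
    "workspace_id", "lakehouse_id", "lakehouse_name", "sql_endpoint",
    "azure_client_id", "azure_tenant_id", "azure_client_secret",
}

with open("credentials.json") as f:
    creds = json.load(f)

fabric = creds["fabric_lakehouse"]
assert fabric["ds_type"] == "fabric"
assert fabric["auth_mode"] == "service_principal"
assert fabric["fabric_target"] == "lakehouse"

missing = REQUIRED_INFO_FIELDS - fabric["info"].keys()
print("Missing fields:", ", ".join(sorted(missing)) or "none")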

Configuration Fields:

Field | Description | Required
ds_type | Must be "fabric" | Yes
auth_mode | Must be "service_principal" | Yes
fabric_target | Must be "lakehouse" | Yes
workspace_id | Fabric workspace GUID | Yes
lakehouse_id | Lakehouse GUID | Yes
lakehouse_name | Lakehouse display name | Yes
sql_endpoint | SQL analytics endpoint hostname | Yes
azure_client_id | Application (client) ID | Yes
azure_tenant_id | Directory (tenant) ID | Yes
azure_client_secret | Client secret value | Yes

Authentication Setup

LakeXpress authenticates to Fabric with a Service Principal (Azure AD).

Service Principal Authentication

{
    "fabric_lakehouse": {
        "ds_type": "fabric",
        "auth_mode": "service_principal",
        "fabric_target": "lakehouse",
        "info": {
            "workspace_id": "your-workspace-id",
            "lakehouse_id": "your-lakehouse-id",
            "lakehouse_name": "your-lakehouse-name",
            "sql_endpoint": "your-sql-endpoint.datawarehouse.fabric.microsoft.com",
            "azure_client_id": "your-application-client-id",
            "azure_tenant_id": "your-directory-tenant-id",
            "azure_client_secret": "your-client-secret"
        }
    }
}

OneLake Directory Format

The OneLake directory must match your Fabric workspace and Lakehouse:

onelake://workspace-name/lakehouse-name/

Or using the Files path:

onelake://workspace-name/lakehouse-name.Lakehouse/Files/
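
To confirm the Service Principal can actually reach this location, you can list the Files area through the OneLake DFS endpoint. A minimal sketch assuming the azure-identity and azure-storage-file-datalake packages; the workspace and lakehouse names are placeholders.

# List files under the Lakehouse Files area via the OneLake DFS endpoint
# Assumes: pip install azure-identity azure-storage-file-datalake
from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

credential = ClientSecretCredential(
    tenant_id="your-directory-tenant-id",
    client_id="your-application-client-id",
    client_secret="your-client-secret",
)

service = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=credential,
)

# The workspace is the filesystem; the Lakehouse is addressed as <name>.Lakehouse
fs = service.get_file_system_client("your-workspace-name")
for path in fs.get_paths(path="your-lakehouse-name.Lakehouse/Files"):
    print(path.name)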

Table Types

LakeXpress supports two Fabric Lakehouse table modes.

Managed Tables (Delta Tables)

Delta tables created in the Lakehouse Tables section. Data is converted to Delta Lake format with ACID transactions, time travel, and automatic SQL analytics endpoint availability.

  • Full Delta Lake features (time travel, ACID, versioning)
  • Optimized query performance
  • Native Power BI integration
  • Best for production workloads

./LakeXpress config create \
    -a credentials.json \
    --log_db_auth_id log_db \
    --source_db_auth_id postgres_prod \
    --target_storage_id onelake_storage \
    --publish_target fabric_lakehouse \
    --publish_method internal \
    ...

./LakeXpress sync

External Tables (SQL Analytics Views)

SQL views referencing Parquet files in the Files section. Data stays as Parquet; queries read files directly. Available via the SQL analytics endpoint only.

  • No data conversion overhead
  • Available immediately after export
  • Suited for exploration and ad-hoc analysis
  • Preserves original Parquet format

./LakeXpress config create \
    -a credentials.json \
    --log_db_auth_id log_db \
    --source_db_auth_id postgres_prod \
    --target_storage_id onelake_storage \
    --publish_target fabric_lakehouse \
    --publish_method external \
    ...

./LakeXpress sync

Configuration Options

Option | Description | Default
--publish_target ID | Credential ID for Fabric target (required) | -
--publish_method METHOD | internal (Delta) or external (views) | internal
--publish_schema_pattern PATTERN | Not applicable for Fabric (uses Lakehouse) | -
--publish_table_pattern PATTERN | Table naming pattern | {table}
--n_jobs N | Parallel workers for table creation | 1

Dynamic Naming Patterns

Table names support token-based patterns.

Supported Tokens

Token | Description | Example Output
{schema} | Source schema name | tpch_1
{table} | Source table name | customer
{database} | Source database name | tpch
{date} | Current date (YYYYMMDD) | 20251210
{timestamp} | Current timestamp (YYYYMMDD_HHMMSS) | 20251210_143022
{uuid} | UUID4 (consistent per run) | a1b2c3d4-...
{subpath} | CLI --sub_path parameter | staging
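
The tokens behave like plain string substitution. The snippet below only illustrates how a pattern might expand; it is not LakeXpress's internal implementation.

# Illustrative expansion of a naming pattern (not LakeXpress internals)
from datetime import datetime

pattern = "src_{schema}_{table}_{date}"
print(pattern.format(
    schema="tpch_1",
    table="customer",
    date=datetime.now().strftime("%Y%m%d"),
))
# e.g. src_tpch_1_customer_20251210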

Common Patterns

Prefixed Tables

./LakeXpress config create \
    --publish_table_pattern "src_{schema}_{table}" \
    --publish_target fabric_lakehouse \
    --publish_method internal \
    ...

# Result: src_tpch_1_customer, src_tpch_1_orders

Schema-Based Naming

./LakeXpress config create \
    --publish_table_pattern "{schema}_{table}" \
    --publish_target fabric_lakehouse \
    --publish_method internal \
    ...

# Result: tpch_1_customer, tpch_1_orders

Date-Stamped Tables

./LakeXpress config create \
    --publish_table_pattern "{table}_{date}" \
    --publish_target fabric_lakehouse \
    --publish_method internal \
    ...

# Result: customer_20251210, orders_20251210

SQL Views with Prefix

./LakeXpress config create \
    --publish_table_pattern "VW_{schema}_{table}" \
    --publish_target fabric_lakehouse \
    --publish_method external \
    ...

# Result: VW_tpch_1_customer, VW_tpch_1_orders

Usage Examples

Example 1: PostgreSQL to Fabric Delta Tables

./LakeXpress config create \
    -a credentials.json \
    --log_db_auth_id log_db \
    --source_db_auth_id postgres_prod \
    --source_db_name tpch \
    --source_schema_name tpch_1 \
    --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
    --fastbcp_p 2 \
    --n_jobs 4 \
    --target_storage_id onelake_storage \
    --publish_target fabric_lakehouse \
    --publish_method internal \
    --generate_metadata

./LakeXpress sync

Tables (customer, orders, lineitem, etc.) appear in the Lakehouse Tables section, queryable via the SQL analytics endpoint and Power BI.

Example 2: SQL Server to Fabric with Custom Naming

./LakeXpress config create \
    -a credentials.json \
    --log_db_auth_id log_db \
    --source_db_auth_id mssql_prod \
    --source_db_name adventureworksdw \
    --source_schema_name dbo \
    --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
    --fastbcp_p 2 \
    --n_jobs 4 \
    --target_storage_id onelake_storage \
    --publish_target fabric_lakehouse \
    --publish_method internal \
    --publish_table_pattern "dw_{table}"

./LakeXpress sync

Creates tables dw_DimCustomer, dw_DimProduct, dw_FactSales, etc.

Example 3: SQL Analytics Views (External)

./LakeXpress config create \
    -a credentials.json \
    --log_db_auth_id log_db \
    --source_db_auth_id postgres_prod \
    --source_db_name tpch \
    --source_schema_name tpch_1 \
    --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
    --fastbcp_p 2 \
    --n_jobs 4 \
    --target_storage_id onelake_storage \
    --publish_target fabric_lakehouse \
    --publish_method external \
    --publish_table_pattern "VW_{schema}_{table}"

./LakeXpress sync

Parquet files land in the Files section; views VW_tpch_1_customer, VW_tpch_1_orders, etc. appear in the SQL analytics endpoint.

Example 4: Parallel Table Creation

./LakeXpress config create \
    -a credentials.json \
    --log_db_auth_id log_db \
    --source_db_auth_id postgres_prod \
    --source_db_name tpch \
    --source_schema_name tpch_1 \
    --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
    --fastbcp_p 4 \
    --n_jobs 8 \
    --target_storage_id onelake_storage \
    --publish_target fabric_lakehouse \
    --publish_method internal

./LakeXpress sync

Example 5: MySQL to Fabric

./LakeXpress config create \
    -a credentials.json \
    --log_db_auth_id log_db \
    --source_db_auth_id mysql_prod \
    --source_db_name ecommerce \
    --source_schema_name ecommerce \
    --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
    --fastbcp_p 2 \
    --n_jobs 4 \
    --target_storage_id onelake_storage \
    --publish_target fabric_lakehouse \
    --publish_method internal \
    --publish_table_pattern "mysql_{table}"

./LakeXpress sync

Data Type Mapping

Source types are mapped to Fabric-compatible Delta Lake types.

PostgreSQL to Fabric

PostgreSQL Type | Delta Lake Type
INTEGER, INT4 | INT
BIGINT, INT8 | BIGINT
SMALLINT, INT2 | SMALLINT
NUMERIC(p,s) | DECIMAL(p,s)
REAL, FLOAT4 | FLOAT
DOUBLE PRECISION | DOUBLE
VARCHAR(n), TEXT | STRING
DATE | DATE
TIMESTAMP | TIMESTAMP
TIMESTAMPTZ | TIMESTAMP
BOOLEAN | BOOLEAN
BYTEA | BINARY

SQL Server to Fabric

SQL Server Type | Delta Lake Type
INT | INT
BIGINT | BIGINT
SMALLINT | SMALLINT
TINYINT | TINYINT
DECIMAL(p,s) | DECIMAL(p,s)
FLOAT | DOUBLE
REAL | FLOAT
VARCHAR(n), NVARCHAR(n) | STRING
DATE | DATE
DATETIME, DATETIME2 | TIMESTAMP
DATETIMEOFFSET | TIMESTAMP
BIT | BOOLEAN
VARBINARY | BINARY
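
If a column shows up in Fabric with an unexpected type, inspecting the schema of the exported Parquet file often pinpoints where the mapping happened. A small sketch assuming the pyarrow package and a locally available copy of one exported file (the filename is hypothetical).

# Print the column types of an exported Parquet file
# Assumes: pip install pyarrow; "customer.parquet" is a hypothetical local copy
import pyarrow.parquet as pq

schema = pq.read_schema("customer.parquet")
for field in schema:
    print(field.name, field.type)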

Querying Tables

SQL Analytics Endpoint

-- Query a Delta table (the SQL analytics endpoint uses T-SQL, so TOP rather than LIMIT)
SELECT TOP (10) * FROM your_lakehouse.dbo.customer;

-- Aggregation
SELECT
    c_nationkey,
    COUNT(*) as customer_count,
    SUM(c_acctbal) as total_balance
FROM your_lakehouse.dbo.customer
GROUP BY c_nationkey
ORDER BY customer_count DESC;
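
The endpoint can also be reached programmatically from outside the Fabric portal. The following is a sketch assuming the pyodbc package and Microsoft ODBC Driver 18 for SQL Server, authenticating with the Service Principal configured earlier; the database name is the Lakehouse display name.

# Query the SQL analytics endpoint with the Service Principal
# Assumes: pip install pyodbc, plus ODBC Driver 18 for SQL Server installed
import pyodbc

conn_str = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=your-sql-endpoint.datawarehouse.fabric.microsoft.com;"
    "DATABASE=your-lakehouse-name;"
    "Authentication=ActiveDirectoryServicePrincipal;"
    "UID=your-application-client-id;"
    "PWD=your-client-secret;"
    "Encrypt=yes;"
)

with pyodbc.connect(conn_str) as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT TOP (10) * FROM dbo.customer")
    for row in cursor.fetchall():
        print(row)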

Spark Notebooks

# Read a Delta table
df = spark.read.table("customer")
df.show(10)

# Query with SQL and keep the result
result_df = spark.sql("""
    SELECT c_nationkey, COUNT(*) AS cnt
    FROM customer
    GROUP BY c_nationkey
""")
result_df.show()

# Write the aggregated result back as a new Lakehouse table
result_df.write.mode("overwrite").saveAsTable("customer_summary")

Troubleshooting

Common Issues

Authentication Errors

Error: AADSTS7000215: Invalid client secret provided

Regenerate your client secret in Azure AD and update credentials.json.

Error: The user or service principal does not have access to the workspace

  1. Verify the Service Principal is added to the workspace
  2. Ensure it has Member or Contributor role
  3. Wait a few minutes for permissions to propagate

OneLake Connection Issues

Error: Unable to connect to OneLake storage

  1. Verify the directory format: onelake://workspace-name/lakehouse-name/
  2. Check that the Service Principal has Storage Blob Data Contributor role
  3. Ensure workspace and lakehouse names are correct (case-sensitive)

Table Creation Failures

Error: Failed to create table in Lakehouse

  1. Verify the SQL analytics endpoint is correct
  2. Check that the Lakehouse is active (not paused)
  3. Ensure sufficient capacity units are available
  4. Confirm Parquet files were exported to OneLake

SQL Endpoint Connection Issues

Error: Cannot connect to SQL analytics endpoint

  1. Verify the SQL endpoint hostname
  2. Ensure the SQL analytics endpoint is enabled
  3. Check firewall rules if connecting from on-premises

Verifying Export Success

Check files in OneLake:

  1. Open your Lakehouse in Fabric portal
  2. Go to the Files section
  3. Verify Parquet files exist in the expected folders

Check tables:

  1. Open the SQL analytics endpoint
  2. Expand Tables in the object explorer
  3. Verify your tables appear (they can also be listed programmatically, as sketched below)
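
The published tables can also be enumerated over the SQL analytics endpoint using standard INFORMATION_SCHEMA metadata. This reuses the hedged pyodbc connection sketch from the Querying Tables section.

# List tables exposed by the SQL analytics endpoint (INFORMATION_SCHEMA metadata)
# Assumes the same pyodbc / ODBC Driver 18 setup sketched under "Querying Tables"
import pyodbc

conn_str = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=your-sql-endpoint.datawarehouse.fabric.microsoft.com;"
    "DATABASE=your-lakehouse-name;"
    "Authentication=ActiveDirectoryServicePrincipal;"
    "UID=your-application-client-id;"
    "PWD=your-client-secret;"
)

with pyodbc.connect(conn_str) as conn:
    rows = conn.cursor().execute(
        "SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE "
        "FROM INFORMATION_SCHEMA.TABLES ORDER BY TABLE_NAME"
    ).fetchall()
    for table_schema, name, table_type in rows:
        print(f"{table_schema}.{name} ({table_type})")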

Debug Mode

./LakeXpress sync --log_lev DEBUG

CLI Reference

Fabric Publishing Arguments

Option | Type | Description
--publish_target ID | String | Credential ID for Fabric publishing (required)
--publish_method METHOD | Enum | internal (Delta tables) or external (SQL views)
--publish_table_pattern PATTERN | String | Table naming pattern (default: {table})
--n_jobs N | Integer | Parallel workers for table creation (default: 1)

Full Example

./LakeXpress config create \
    -a credentials.json \
    --log_db_auth_id log_db_postgres \
    --source_db_auth_id source_postgres \
    --source_db_name tpch \
    --source_schema_name tpch_1 \
    --fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
    --fastbcp_p 2 \
    --n_jobs 4 \
    --target_storage_id onelake_storage \
    --publish_target fabric_lakehouse \
    --publish_method internal \
    --publish_table_pattern "{schema}_{table}" \
    --generate_metadata

./LakeXpress sync

See Also