🧰 Step-by-Step Guide to Setting Up Unity Catalog in Databricks
Unity Catalog by Databricks is the key to implementing centralized, secure, and scalable data governance across your Lakehouse environment. It allows you to manage metadata, access control, and auditing across multiple workspaces and clouds from a single location.
In this guide, we’ll walk through a step-by-step setup of Unity Catalog, including identity configuration, linking the metastore, and assigning storage access.

🏗️ Unity Catalog Architecture Overview
Before we dive into the setup, let’s understand the core architecture:
🔹 Key Unity Catalog Components:
- Metastore: Central metadata repository for catalogs, schemas, tables, and views
- User Management: Centralized across workspaces, based on identity federation (e.g., Microsoft Entra ID, formerly Azure AD)
- Compute: Cluster resources across one or more workspaces
- Default Storage Location: Typically an ADLS Gen2 container linked to the metastore
All components are managed centrally and shared across workspaces—a big step forward from traditional isolated governance models.
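To make the metastore hierarchy concrete, here is a minimal Python sketch of how a table is addressed through the catalog, schema, and table levels. The `TableRef` class, the `quote_identifier` helper, and the example names (`finance`, `reporting`, `invoices`) are hypothetical and only illustrate the naming scheme.

```python
from dataclasses import dataclass


def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


@dataclass(frozen=True)
class TableRef:
    """A table addressed as catalog.schema.table inside one metastore."""
    catalog: str
    schema: str
    table: str

    def full_name(self) -> str:
        # Join the three levels into the name you would use in a SQL query.
        parts = (self.catalog, self.schema, self.table)
        return ".".join(quote_identifier(p) for p in parts)


# Hypothetical example names, not taken from a real environment.
ref = TableRef(catalog="finance", schema="reporting", table="invoices")
print(ref.full_name())
```

Because every workspace attached to the metastore resolves the same three-part name, a query written in one workspace can be reused unchanged in another.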
🛠️ Step-by-Step Setup Instructions
🔹 Step 1: Create a Unity Catalog Metastore
- Go to your Databricks account console
- Navigate to Unity Catalog > Metastore
- Click Create Metastore
- Provide:
  - Name
  - Default storage root (e.g., abfss://metastore@yourstorage.dfs.core.windows.net)
  - Region (same as your workspace)
This metastore will serve as your central metadata store.
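A mistyped storage root is an easy mistake at this step, so it can help to sanity-check the URI before pasting it into the form. The sketch below parses an abfss:// URI into its container, storage account, and path. The `parse_storage_root` helper and the regular expression are my own simplified illustration (for example, they do not reject consecutive hyphens in container names), not something Databricks provides.

```python
import re
from typing import NamedTuple


class AbfssPath(NamedTuple):
    container: str
    account: str
    path: str


# Simplified pattern: container is 3-63 chars (lowercase letters, digits, hyphens),
# storage account is 3-24 lowercase letters and digits.
_ABFSS = re.compile(
    r"^abfss://(?P<container>[a-z0-9][a-z0-9-]{1,61}[a-z0-9])"
    r"@(?P<account>[a-z0-9]{3,24})\.dfs\.core\.windows\.net(?P<path>/.*)?$"
)


def parse_storage_root(uri: str) -> AbfssPath:
    """Split an ADLS Gen2 abfss:// URI into container, account, and path."""
    match = _ABFSS.match(uri)
    if match is None:
        raise ValueError(f"not a valid abfss:// storage root: {uri!r}")
    return AbfssPath(match["container"], match["account"], match["path"] or "/")


root = parse_storage_root("abfss://metastore@yourstorage.dfs.core.windows.net")
print(root)
```

A URI with the wrong scheme (for example wasbs://) or a missing container raises `ValueError` instead of failing later during metastore creation.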
🔹 Step 2: Assign Metastore to Workspaces
You can assign the same Unity Catalog metastore to multiple workspaces to allow consistent data access.
📌 Example: Workspace A (Finance), Workspace B (Marketing), and Workspace C (Data Science) can all share the same metastore.
From the UI:
- Go to the Metastore page
- Click Assign to workspace
- Select the desired workspace(s)
- Save
This allows cross-workspace access with consistent governance policies.
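Since the metastore region has to match the workspace region (see Step 1), it is worth checking which workspaces can actually be attached before you start clicking. Here is a small sketch of that check. The `Workspace` class, the `split_by_region` helper, and the region names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Workspace:
    name: str
    region: str


def split_by_region(
    metastore_region: str, workspaces: List[Workspace]
) -> Tuple[List[str], List[str]]:
    """Return (assignable, not_assignable) workspace names for a metastore region."""
    assignable = [w.name for w in workspaces if w.region == metastore_region]
    not_assignable = [w.name for w in workspaces if w.region != metastore_region]
    return assignable, not_assignable


# Hypothetical workspaces mirroring the Finance / Marketing / Data Science example.
workspaces = [
    Workspace("finance", "eastus"),
    Workspace("marketing", "eastus"),
    Workspace("data-science", "westeurope"),
]
assignable, not_assignable = split_by_region("eastus", workspaces)
print("Can share the metastore:", assignable)
print("Different region:", not_assignable)
```

In this made-up layout, the Data Science workspace would need its own metastore in its region, or would have to be recreated in the shared region.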
🔹 Step 3: Set Up Default Storage (ADLS Gen2)
Unity Catalog requires a default storage location to store managed tables and logs.
- Create an ADLS Gen2 container
- Assign the Storage Blob Data Contributor role to one of:
  - A Managed Identity
  - A Service Principal
  - An Access Connector for Databricks
🧠 Best Practice: Use an Access Connector for Databricks for simplified security and rotation.
Example Architecture:
Databricks Unity Catalog
│
Metastore
│
Access Connector ──────> ADLS Gen2 Container
This storage becomes the root location for Delta tables, lineage logs, and other governance artifacts.
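The role assignment in this step can also be scripted. The sketch below only builds the Azure CLI command as a string so you can review it before running it. The subscription ID, principal object ID, and the helper names are placeholders I made up; the resource group and storage account names are the illustrative ones used in this guide.

```python
import shlex

ROLE = "Storage Blob Data Contributor"


def storage_account_scope(subscription_id: str, resource_group: str, account: str) -> str:
    """Build the Azure resource ID used as the scope of the role assignment."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )


def role_assignment_command(principal_object_id: str, scope: str) -> str:
    """Return an `az role assignment create` command line, safely quoted."""
    args = [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_object_id,
        # Managed identities are assigned roles as service principals.
        "--assignee-principal-type", "ServicePrincipal",
        "--role", ROLE,
        "--scope", scope,
    ]
    return shlex.join(args)


# Placeholder IDs, replace with your own values.
scope = storage_account_scope("00000000-0000-0000-0000-000000000000", "my-rg", "yourstorage")
cmd = role_assignment_command("11111111-1111-1111-1111-111111111111", scope)
print(cmd)
```

Scoping the role to the storage account is the simplest option; a narrower scope such as a single container is possible if you want to limit what the identity can touch.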
🔹 Step 4: Configure Access Connector (Recommended)
To securely allow Unity Catalog to write to the storage, configure an Access Connector:
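- Create an Access Connector in Azure. The connector is a standalone Azure resource with its own managed identity, so it is created in a resource group rather than inside a workspace:
  az databricks access-connector create \
    --name unity-access-conn \
    --resource-group my-rg \
    --identity-type SystemAssigned
- Assign the Storage Blob Data Contributor role to the connector on the storage account
- Link the Access Connector to the Unity Catalog metastore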
✅ This avoids the need to manage secrets manually and supports role-based authentication.
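When you link the connector to Unity Catalog, what you hand over is the connector's Azure resource ID. The sketch below builds that ID and a request body for registering it as a storage credential. The field names (`azure_managed_identity`, `access_connector_id`) reflect my understanding of the Unity Catalog storage credentials REST API, so treat them as an assumption and confirm them against the current API reference. The credential name and subscription ID are placeholders.

```python
import json


def access_connector_id(subscription_id: str, resource_group: str, connector_name: str) -> str:
    """Build the Azure resource ID of an Access Connector for Databricks."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Databricks/accessConnectors/{connector_name}"
    )


def storage_credential_payload(name: str, connector_id: str) -> dict:
    """Assumed request body for creating a managed-identity storage credential."""
    return {
        "name": name,
        "azure_managed_identity": {"access_connector_id": connector_id},
    }


# Placeholder subscription ID; connector and resource group names follow Step 4.
connector_id = access_connector_id(
    "00000000-0000-0000-0000-000000000000", "my-rg", "unity-access-conn"
)
payload = storage_credential_payload("metastore-root-credential", connector_id)
print(json.dumps(payload, indent=2))
```

Notice that no secret or key appears anywhere in the payload, which is the main reason the Access Connector route is easier to operate than a service principal with a client secret.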
🔹 Step 5: Set Up Compute (Clusters)
Create clusters in shared mode with Unity Catalog enabled:
- Go to your workspace
- Create or edit a cluster
- Choose:
  - Access mode: Shared
  - Unity Catalog: Enabled
- Ensure the cluster's workspace is assigned to the Unity Catalog metastore and is in the same region as the metastore
This enables secure compute environments that can enforce row-level and column-level security at runtime.
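If you create clusters through the Clusters API or infrastructure-as-code instead of the UI, the shared mode maps to the `data_security_mode` field, where `USER_ISOLATION` corresponds to shared access and `SINGLE_USER` to single-user access. Here is a minimal sketch of such a spec. The cluster name, runtime version, and node type are placeholder values, and the helper functions are hypothetical.

```python
# Access modes that work with Unity Catalog.
UC_ACCESS_MODES = {"USER_ISOLATION", "SINGLE_USER"}


def shared_cluster_spec(
    name: str, spark_version: str, node_type_id: str, num_workers: int = 2
) -> dict:
    """Build a cluster spec that uses the shared access mode."""
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        "data_security_mode": "USER_ISOLATION",
    }


def supports_unity_catalog(spec: dict) -> bool:
    """Return True when the spec uses an access mode that Unity Catalog supports."""
    return spec.get("data_security_mode") in UC_ACCESS_MODES


# Placeholder runtime and node type, pick the ones available in your workspace.
spec = shared_cluster_spec("uc-shared", "13.3.x-scala2.12", "Standard_DS3_v2")
print(spec)
print("Unity Catalog capable:", supports_unity_catalog(spec))
```

A check like `supports_unity_catalog` is handy in a CI pipeline to catch cluster definitions that would silently fall back to a mode without Unity Catalog.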
✅ Final Architecture Recap
With all components correctly configured, your environment will look like this:
Databricks Unity Catalog
├── User Management (Centralized via Azure AD or SCIM)
├── Metastore (shared across workspaces)
├── Linked to:
│   └── Access Connector
│       └── ADLS Gen2 Container
└── Connected to:
    ├── Workspace A
    └── Workspace B
        └── Shared Compute Clusters
This model enables:
- ✅ Fine-grained access control
- ✅ Centralized policy enforcement
- ✅ Streamlined audits and lineage tracking
- ✅ Support for Delta Lake and external data sources
🧠 Real-World Example
A large enterprise with 4 business units (Finance, HR, Engineering, Product) sets up 4 separate workspaces.
Using Unity Catalog:
- They all share one central metastore
- Each business unit can access only the schemas and tables they are authorized for
- Audit logs track every query across workspaces in one location
This reduces compliance overhead, prevents data silos, and enables collaborative analytics securely.
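To show what "authorized for" could look like in practice, here is a sketch that generates Unity Catalog GRANT statements for the four business units. It assumes one shared catalog called `enterprise` with one schema per unit and one group per unit; all of these names are hypothetical, and your own layout may well use a catalog per unit instead.

```python
from typing import Dict, List


def q(name: str) -> str:
    """Backtick-quote an identifier so names with hyphens are valid SQL."""
    return "`" + name.replace("`", "``") + "`"


def grants_for_unit(catalog: str, schema: str, group: str) -> List[str]:
    """Read-only access to a single schema for a single group."""
    return [
        f"GRANT USE CATALOG ON CATALOG {q(catalog)} TO {q(group)}",
        f"GRANT USE SCHEMA, SELECT ON SCHEMA {q(catalog)}.{q(schema)} TO {q(group)}",
    ]


# Hypothetical mapping of schema name to the group allowed to read it.
UNITS: Dict[str, str] = {
    "finance": "finance-analysts",
    "hr": "hr-analysts",
    "engineering": "engineering-analysts",
    "product": "product-analysts",
}

statements = [
    stmt
    for schema, group in UNITS.items()
    for stmt in grants_for_unit("enterprise", schema, group)
]
for stmt in statements:
    print(stmt + ";")
```

Because each group is granted only its own schema, a Finance analyst who queries the HR schema is denied by Unity Catalog itself, regardless of which workspace the query runs in.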