Introduction
Unity Catalog in Databricks provides centralized governance and access control for managed tables and external storage. However, sometimes tables do not reflect updates, leading to stale data, missing records, or slow synchronization with external storage.
🚨 Common issues when data is not updating in Unity Catalog tables:
- INSERT, UPDATE, DELETE operations appear successful but do not change data.
- External tables do not reflect changes from cloud storage (S3, ADLS, GCS).
- MERGE operations fail to update existing records.
- Queries return outdated or incorrect data.
This guide explores possible causes, troubleshooting steps, and solutions for updating Unity Catalog tables in Databricks.
1. Verify That the Table Type Supports Updates
Symptoms:
- INSERT, UPDATE, DELETE commands execute but data does not change.
- MERGE INTO statements do not update existing rows.
- Error: “Operation not supported on external table.”
Causes:
- Unity Catalog manages both managed and external tables.
- External tables that are not in Delta format (for example Parquet, CSV, or JSON) are read-only, so DML statements against them fail.
Fix:
✅ Check if the table is external or managed:
DESCRIBE TABLE EXTENDED my_catalog.my_schema.my_table;
✅ If the table is external and not in Delta format, direct updates are not supported. Copy the data into a managed Delta table instead:
CREATE TABLE my_catalog.my_schema.new_table AS SELECT * FROM my_catalog.my_schema.external_table;
✅ For external tables, modify the underlying data in cloud storage and refresh the cached metadata:
REFRESH TABLE my_catalog.my_schema.external_table;
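If the external table is stored in Delta format, standard DML does work against it; a minimal sketch, assuming hypothetical status and id columns:
UPDATE my_catalog.my_schema.external_table SET status = 'active' WHERE id = 123; -- status and id are placeholder columns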
2. Ensure Delta Lake Optimizations Are Applied
Symptoms:
- MERGE INTO does not update existing records.
- Deleted rows still appear in queries.
- Updates do not reflect in new queries immediately.
Causes:
- Delta Lake tables accumulate small files and old file versions over time, which slows queries and can make results appear stale.
- OPTIMIZE (optionally with ZORDER) and VACUUM keep the table's files compacted and free of stale versions.
Fix:
✅ Run OPTIMIZE to compact small files and improve query performance:
OPTIMIZE my_catalog.my_schema.my_table;
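If queries usually filter on a specific column, you can also co-locate data on it with ZORDER (the id column below is a placeholder):
OPTIMIZE my_catalog.my_schema.my_table ZORDER BY (id); -- id is a hypothetical filter column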
✅ Run VACUUM to remove data files that belong only to old table versions:
VACUUM my_catalog.my_schema.my_table RETAIN 168 HOURS;
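VACUUM respects the table's retention settings; before shortening the retention window, inspect the table properties:
SHOW TBLPROPERTIES my_catalog.my_schema.my_table; -- look for delta.deletedFileRetentionDuration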
✅ Check the last commit version to confirm the table was updated:
DESCRIBE HISTORY my_catalog.my_schema.my_table;
3. Check If Changes Are Being Committed Properly
Symptoms:
- Updates appear in the session but disappear after the job completes.
- Writes succeed but are not visible to other users.
- INSERT operations are not committed permanently.
Causes:
- Delta Lake commits each statement atomically; a write that fails partway through is rolled back entirely and leaves no trace in the table.
- Multiple concurrent writes can cause transaction conflicts.
Fix:
✅ Note that Databricks SQL does not support explicit BEGIN TRANSACTION / COMMIT blocks; each DML statement commits atomically on success, so run updates as single statements:
UPDATE my_catalog.my_schema.my_table SET column_name = 'new_value' WHERE id = 123;
✅ Check whether another process is overwriting changes by inspecting the userName and operation columns of recent commits:
DESCRIBE HISTORY my_catalog.my_schema.my_table;
✅ Use MERGE INTO for upserts instead of separate INSERT and UPDATE statements:
MERGE INTO my_catalog.my_schema.my_table AS target
USING updated_data AS source
ON target.id = source.id
WHEN MATCHED THEN UPDATE SET target.column_name = source.column_name
WHEN NOT MATCHED THEN INSERT (id, column_name) VALUES (source.id, source.column_name);
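Here, updated_data stands in for whatever table or view holds the incoming rows; for illustration, a temporary view with one sample row:
CREATE OR REPLACE TEMPORARY VIEW updated_data AS
SELECT 123 AS id, 'new_value' AS column_name; -- sample row; replace with your real source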
4. Refresh Metadata for External Tables
Symptoms:
- External table does not reflect changes from S3, ADLS, or GCS.
- Query results do not match the actual files in cloud storage.
Causes:
- External tables rely on metadata caching, which may not update automatically.
- Databricks does not automatically detect new or deleted files in cloud storage.
Fix:
✅ Manually refresh the cached table metadata:
REFRESH TABLE my_catalog.my_schema.external_table;
✅ For partitioned tables, repair the partition metadata so it matches the files in storage:
MSCK REPAIR TABLE my_catalog.my_schema.external_table;
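To see which partitions the catalog currently knows about (valid only for partitioned tables):
SHOW PARTITIONS my_catalog.my_schema.external_table;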
✅ Ensure cloud storage permissions allow Databricks to detect updates:
- AWS S3: Enable s3:ListBucket and s3:GetObject permissions.
- Azure ADLS: Assign the Storage Blob Data Reader role.
- GCP Storage: Ensure roles/storage.objectViewer is granted.
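As a quick sanity check that Databricks can actually see the files, list the storage path directly (the URI below is a placeholder and requires READ FILES on the external location):
LIST 's3://my-bucket/path/to/table/';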
5. Check for Time Travel or Snapshot Isolation Issues
Symptoms:
- Queries return old data even after successful updates.
- Time travel queries show outdated versions.
Causes:
- Delta tables store historical versions, and queries might be reading older snapshots from cache.
- A query with a VERSION AS OF or TIMESTAMP AS OF clause is pinned to an older snapshot rather than the latest data.
Fix:
✅ Check the available versions in the Delta table:
DESCRIBE HISTORY my_catalog.my_schema.my_table;
✅ Query without a time travel clause; a plain SELECT always reads the latest committed snapshot:
SELECT * FROM my_catalog.my_schema.my_table;
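To confirm an update landed, compare the latest snapshot with an earlier version taken from DESCRIBE HISTORY (version 5 below is a placeholder):
SELECT count(*) FROM my_catalog.my_schema.my_table;
SELECT count(*) FROM my_catalog.my_schema.my_table VERSION AS OF 5; -- replace 5 with a real version number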
✅ Use REORG TABLE to rewrite data files and purge soft-deleted records:
REORG TABLE my_catalog.my_schema.my_table APPLY (PURGE);
6. Check If Table Permissions Allow Updates
Symptoms:
- UPDATE, DELETE, or INSERT fails with “Permission denied” error.
- Only SELECT queries work, but modifications fail.
Causes:
- Users may not have write permissions on Unity Catalog tables.
- Schema-level restrictions may block updates.
Fix:
✅ Check table permissions:
SHOW GRANTS ON TABLE my_catalog.my_schema.my_table;
✅ Grant necessary privileges:
GRANT MODIFY, SELECT ON TABLE my_catalog.my_schema.my_table TO `user@example.com`;
✅ For schema-level restrictions, grant the USE privileges on the parent catalog and schema:
GRANT USE CATALOG ON CATALOG my_catalog TO `user@example.com`;
GRANT USE SCHEMA ON SCHEMA my_catalog.my_schema TO `user@example.com`;
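To see what a specific principal already holds on the schema (the email is a placeholder), you can also run:
SHOW GRANTS `user@example.com` ON SCHEMA my_catalog.my_schema;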
7. Resolve Cluster and Compute Issues
Symptoms:
- Unity Catalog tables do not update on specific clusters.
- Changes appear in SQL Warehouse but not in all clusters.
Causes:
- Clusters must be Unity Catalog-enabled to modify Unity Catalog tables.
- Older Databricks runtimes may not fully support Unity Catalog.
Fix:
✅ Ensure the cluster supports Unity Catalog:
- Go to Compute → Edit the cluster
- Select an access mode that supports Unity Catalog (Single user or Shared)
- Restart the cluster
✅ Ensure SQL Warehouses are using the correct catalog:
USE CATALOG my_catalog;
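To confirm the session is pointed at the right catalog and schema, check the current context:
SELECT current_catalog(), current_schema();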
✅ Check the runtime's Spark version (each Databricks Runtime maps to a specific Spark version):
SELECT version();
- Upgrade to Databricks Runtime 11.3 LTS or later for full Unity Catalog support.
Step-by-Step Troubleshooting Guide
1. Check the Table Type
DESCRIBE TABLE EXTENDED my_catalog.my_schema.my_table;
- If the table is external and not in Delta format, direct updates are not supported.
2. Verify Table Commit History
DESCRIBE HISTORY my_catalog.my_schema.my_table;
- Check if updates were committed correctly.
3. Refresh Metadata for External Tables
REFRESH TABLE my_catalog.my_schema.external_table;
4. Optimize and Vacuum the Table
OPTIMIZE my_catalog.my_schema.my_table;
VACUUM my_catalog.my_schema.my_table RETAIN 168 HOURS;
5. Check User Permissions
SHOW GRANTS ON TABLE my_catalog.my_schema.my_table;
- If needed, grant MODIFY, SELECT, USE SCHEMA, and USE CATALOG privileges.
6. Restart the Cluster and Ensure It Supports Unity Catalog
- Select a Unity Catalog-capable access mode (Single user or Shared) in the cluster settings.
Conclusion
If Unity Catalog tables are not updating, check:
✅ The table type (Managed vs. External).
✅ If metadata refresh is needed for external tables.
✅ If Delta optimizations (OPTIMIZE, VACUUM) are required.
✅ If permissions allow modifications.
✅ If clusters and SQL Warehouses support Unity Catalog.
By following this guide, you can successfully resolve data update issues in Unity Catalog tables and ensure data consistency.