Introduction
DELTA003: Delta Write Conflict (Concurrent Updates) occurs when multiple write operations attempt to modify the same Delta Lake table simultaneously, causing a conflict in the transaction log. Delta Lake provides ACID guarantees, but concurrent writes without proper isolation can lead to this error.
🚨 Common symptoms of DELTA003 error:
- Job fails with DELTA003 – Delta write conflict.
- Concurrent jobs modifying the same table fail.
- MERGE, DELETE, or UPDATE operations fail intermittently.
- Streaming and batch jobs conflict on the same table.
Why DELTA003 Error Happens
The error typically occurs in these scenarios:
- Multiple jobs write to the same Delta table at the same time.
- Concurrent batch and streaming jobs modify overlapping data.
- MERGE, DELETE, or UPDATE operations conflict with other writes.
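- Long-running jobs commit against a table snapshot that other writers have since changed. Delta Lake does not lock the log, so the longer a job runs, the more likely its commit is rejected.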
Delta Lake’s Transactional Model
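Delta Lake uses an optimistic concurrency control (OCC) model. Each writer reads a snapshot of the table, stages its changes, and checks at commit time whether any commit made since that snapshot conflicts with what it read or wrote. If a conflict is detected, the transaction fails with DELTA003.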
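The commit-time check can be illustrated with a toy, in-memory model. This is only a sketch of the general OCC idea, not Delta Lake's actual implementation: `ToyLog`, `ConflictError`, and the partition strings are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Raised when a commit loses the race (stand-in for a write conflict)."""


@dataclass
class ToyLog:
    """In-memory stand-in for a transaction log: one entry per committed version."""
    commits: list = field(default_factory=list)

    @property
    def version(self) -> int:
        return len(self.commits)

    def commit(self, read_version: int, touched: set) -> int:
        # Check every commit that landed after this writer took its snapshot.
        for other in self.commits[read_version:]:
            if other & touched:
                raise ConflictError(f"overlap on {sorted(other & touched)}")
        self.commits.append(set(touched))
        return self.version


log = ToyLog()
snap_a = log.version  # both writers read version 0
snap_b = log.version
log.commit(snap_a, {"date=2024-01-01"})      # writer A commits first
log.commit(snap_b, {"date=2024-01-02"})      # disjoint data: still commits
try:
    log.commit(snap_b, {"date=2024-01-01"})  # overlaps A's commit: rejected
except ConflictError as exc:
    print("conflict:", exc)
```

The point of the model is that nothing is locked up front: a writer only learns it lost when it tries to commit, which is why the error appears intermittently.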
Common Causes and Fixes
1. Concurrent Batch Jobs Writing to the Same Delta Table
Symptoms:
- Two or more batch jobs write to the same table concurrently.
- DELTA003 error occurs intermittently.
Fix:
✅ Avoid concurrent writes to the same table:
- Ensure only one job writes to a table at a time.
- Use job scheduling tools to coordinate batch jobs.
✅ If concurrent writes are necessary, partition the table and have each job write to a different set of partitions, so the files they touch do not overlap:
df.write.partitionBy("date").format("delta").mode("append").save("/mnt/delta/table")
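Partitioning only helps if the concurrent jobs really target different partitions. A small pre-flight check along these lines could catch overlaps before the jobs are scheduled together. It is a sketch that assumes each job can declare the partition values it intends to write; the job names and the `overlapping_partitions` helper are hypothetical.

```python
from itertools import combinations


def overlapping_partitions(plans: dict) -> dict:
    """Map each pair of job names to the partition values both intend to write."""
    clashes = {}
    for (job_a, parts_a), (job_b, parts_b) in combinations(sorted(plans.items()), 2):
        shared = set(parts_a) & set(parts_b)
        if shared:
            clashes[(job_a, job_b)] = sorted(shared)
    return clashes


# Hypothetical write plans: job name -> partition values for the "date" column.
plans = {
    "daily_load": {"2024-01-01", "2024-01-02"},
    "backfill": {"2024-01-02", "2024-01-03"},
    "late_events": {"2024-01-04"},
}
print(overlapping_partitions(plans))
```

Any pair reported here should either be run one after the other or have its partition ranges adjusted.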
2. Conflict Between Streaming and Batch Writes
Symptoms:
- Streaming job fails due to concurrent batch updates.
- DELTA003 error occurs when batch job modifies the table while the streaming job is running.
Fix:
✅ Use separate tables for streaming ingestion and batch processing:
- Stream into a staging table, then periodically merge the staging table into the main table.
MERGE INTO main_table AS m
USING staging_table AS s
ON m.id = s.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
✅ Ensure streaming and batch jobs update non-overlapping partitions:
- Partition the table by date or another column that reduces conflicts.
3. Long-Running Transactions Overlapping Other Writes
Symptoms:
- Long-running MERGE, UPDATE, or DELETE operations cause write conflicts.
- Subsequent jobs fail with DELTA003.
Fix:
✅ Break large transactions into smaller ones:
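for date in date_list:
    # One MERGE per date keeps each transaction short and limited to one partition.
    # The join key `id` is assumed, as in the staging example above.
    spark.sql(f"""
        MERGE INTO delta_table AS target
        USING (SELECT * FROM new_data WHERE date = '{date}') AS source
        ON target.id = source.id AND target.date = '{date}'
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)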
✅ Avoid overwriting the entire table in one operation.
- Instead, update only the affected partitions or subsets.
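The per-date approach above can also be factored into a helper that builds one statement per date, which makes the generated SQL easy to inspect before it runs. This is a sketch under assumptions: the `id` join key and the `date` column are taken from the examples in this article, and `merge_statement_for` is a hypothetical helper name.

```python
from datetime import date


def merge_statement_for(day: date, target: str = "delta_table", source: str = "new_data") -> str:
    """Build one MERGE that touches only the rows for a single date."""
    d = day.isoformat()  # date objects render as YYYY-MM-DD, so no quoting issues
    return (
        f"MERGE INTO {target} AS target\n"
        f"USING (SELECT * FROM {source} WHERE date = '{d}') AS source\n"
        f"ON target.id = source.id AND target.date = '{d}'\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


date_list = [date(2024, 1, 1), date(2024, 1, 2)]
statements = [merge_statement_for(d) for d in date_list]
print(statements[0])
# With a live session, each statement would run as its own transaction:
# for stmt in statements:
#     spark.sql(stmt)
```

Pinning the date in the ON clause, not only in the source subquery, is what lets the engine see that the merge is confined to one partition of the target.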
4. Overlapping MERGE Operations
Symptoms:
- Multiple jobs running MERGE INTO on the same table fail.
- DELTA003 error occurs intermittently or consistently.
Fix:
✅ Stagger the timing of merge operations to avoid overlaps.
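✅ Compact small files so each MERGE rewrites fewer files and finishes sooner. Schedule compaction outside merge windows, because OPTIMIZE is itself a write and can conflict with a concurrent MERGE: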
OPTIMIZE delta.`/mnt/delta/table/` ZORDER BY (primary_key);
5. Checkpoint Mismanagement in Streaming Jobs
Symptoms:
- Streaming job repeatedly fails with DELTA003.
- The checkpoint contains conflicting state information.
Fix:
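✅ Clear or recreate the checkpoint directory if it is corrupted. Deleting it discards the stream's recorded progress, so the restarted query reprocesses its source and may write duplicates unless the sink is idempotent: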
dbutils.fs.rm("/mnt/checkpoint_dir", True)
✅ Ensure that each streaming job uses a unique checkpoint directory.
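One way to keep checkpoint directories unique is to derive them from the job name and target table, and to audit the configured paths for duplicates. The sketch below assumes checkpoint locations are available as plain strings; the base path, job names, and both helper functions are hypothetical.

```python
from collections import Counter
from posixpath import join


def checkpoint_path(base: str, job_name: str, target_table: str) -> str:
    """Derive one checkpoint directory per (job, target table) pair."""
    return join(base, job_name, target_table)


def shared_checkpoints(paths: dict) -> list:
    """Return checkpoint directories configured for more than one job."""
    counts = Counter(p.rstrip("/") for p in paths.values())
    return sorted(p for p, n in counts.items() if n > 1)


# Hypothetical configuration: job name -> checkpoint location.
configured = {
    "orders_stream": checkpoint_path("/mnt/checkpoints", "orders_stream", "orders"),
    "returns_stream": checkpoint_path("/mnt/checkpoints", "returns_stream", "orders"),
    "legacy_stream": "/mnt/checkpoints/orders_stream/orders/",  # misconfigured duplicate
}
print(shared_checkpoints(configured))
```

A non-empty result means two queries would share state and should be given separate directories before either is restarted.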
Step-by-Step Troubleshooting Guide
1. Check Delta Table History for Recent Modifications
DESCRIBE HISTORY delta.`/mnt/delta/table/`;
- Look for overlapping operations that may have caused conflicts.
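Overlaps can be spotted by comparing each commit's `version` with its `readVersion`, both of which appear in the history output: if other versions were committed between the snapshot a writer read and the version it produced, that writer raced with them. The sketch below assumes the history rows have already been collected into dictionaries (for example with `row.asDict()` in PySpark); the sample rows and the `raced_commits` helper are hypothetical.

```python
def raced_commits(history: list) -> list:
    """Return (version, operation, interleaved ops) for commits other writers overtook."""
    by_version = {row["version"]: row for row in history}
    raced = []
    for row in sorted(history, key=lambda r: r["version"]):
        read = row.get("readVersion")
        if read is None:
            continue  # e.g. the first commit of a table has nothing to read
        # Versions committed by others between this writer's snapshot and its commit.
        between = [v for v in range(read + 1, row["version"]) if v in by_version]
        if between:
            ops = [by_version[v]["operation"] for v in between]
            raced.append((row["version"], row["operation"], ops))
    return raced


# Hypothetical sample rows, trimmed to the fields used here.
history = [
    {"version": 0, "operation": "CREATE TABLE", "readVersion": None},
    {"version": 1, "operation": "WRITE", "readVersion": 0},
    {"version": 2, "operation": "MERGE", "readVersion": 0},
    {"version": 3, "operation": "OPTIMIZE", "readVersion": 2},
]
print(raced_commits(history))
```

Commits listed here succeeded despite the race. A failed commit never appears in the history, so the job that hit DELTA003 has to be matched by timestamp against the operations that did land.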
2. Monitor Active Jobs and Logs
- Ensure that no concurrent writes are happening when your job starts.
- Check Databricks job logs for detailed error messages.
3. Reduce Write Conflicts Using Table Partitioning
df.write.partitionBy("year", "month").format("delta").mode("append").save("/mnt/delta/table")
Best Practices to Prevent DELTA003 Errors
✅ Partition Tables to Avoid Write Conflicts
- Partition tables by columns that reduce overlapping writes.
✅ Avoid Concurrent Writes to the Same Table
- Use job scheduling to avoid overlapping jobs.
✅ Use a Separate Table for Staging and Merge Operations
- Reduce conflicts by merging data in batches.
✅ Compact Small Files Regularly
OPTIMIZE delta.`/mnt/delta/table/` ZORDER BY (id);
✅ Monitor Table History and Active Transactions
- Regularly check table history for long-running transactions.
Conclusion
The DELTA003 – Delta Write Conflict error is caused by concurrent updates or overlapping operations in Delta Lake. By avoiding concurrent writes, partitioning tables, and optimizing transactions, you can reduce conflicts and ensure reliable data ingestion and processing.