📌 What is Shuffle Read & Shuffle Write?
▶️ Shuffle is Spark’s mechanism to redistribute data across partitions, typically during wide transformations like:
`groupBy()`, `join()`, `distinct()`, `reduceByKey()`

🔵 Shuffle Read:
- Amount of data an executor reads from other executors’ shuffle outputs across the network.
- Indicates how much inter-node data transfer was needed.
- High values suggest expensive operations such as large joins or groupBy aggregations.
🔴 Shuffle Write:
- Amount of data an executor writes out for other executors to read during a shuffle.
- Happens when Spark has to rearrange data for joins, aggregations, etc.
❓ When is it GOOD or BAD?
| Metric | Good Sign | Bad Sign / When to Investigate |
|---|---|---|
| Shuffle Read | Low and evenly distributed across executors | One executor has most of the read (data skew) |
| Shuffle Write | Low and balanced | High & unbalanced → possible data skew or large joins |
| GC Time | Low % (GC < 10–15% of task time) | GC > 20% of task time → consider memory tuning |
| Total Tasks | Evenly distributed | One executor does a lot more → load imbalance |
🔧 How to Investigate Shuffle Problems
- Go to the Spark UI → Stages tab:
  - Look for stages with a high “Shuffle Read Size” or long durations.
  - Hover over the task distribution to check for skewed partitions.
- Use `.explain()` or the Spark UI SQL / DAG tab:
  - Identify whether a `join`, `groupBy`, or similar operation triggered the shuffle.
- Apply fixes such as:
  - `broadcast()` small tables
  - `repartition()` or salting for skewed keys
📌 What is stdout and stderr?
| Log Type | Description |
|---|---|
| stdout | Standard output: All print() or log statements written in the notebook or your code (normal logs). |
| stderr | Standard error: Logs for warnings, stack traces, and errors (e.g., Python exceptions, Spark warnings). |
Click these links to view the executor logs for debugging failed or slow tasks.
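The split between the two streams can be demonstrated in plain Python, the same way it shows up in executor logs: `print()` lands in stdout, while tracebacks and warnings land in stderr. (The captured-stream setup below is only for self-contained demonstration.)

```python
import contextlib
import io
import sys
import traceback

out, err = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
    print("progress: stage 1 done")            # normal log line -> stdout
    try:
        1 / 0                                   # simulated task failure
    except ZeroDivisionError:
        traceback.print_exc(file=sys.stderr)    # stack trace -> stderr

print("stdout captured:", out.getvalue().strip())
print("stderr has traceback:", "ZeroDivisionError" in err.getvalue())
```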
📌 From Your Screenshot
| Executor | Input | Shuffle Read | Shuffle Write | GC Time |
|---|---|---|---|---|
| 0 | 80.7 GiB | 48.4 GiB | 46.5 GiB | 17 min |
| 1 | 71.7 GiB | 43.7 GiB | 40.2 GiB | 21 min |
| 2 | 84.3 GiB | 56.4 GiB | 51.8 GiB | 17 min |
✔️ Observation:
- Shuffle is relatively evenly distributed → good sign (no obvious skew).
- GC Time is within limits (~2–3% of task time) → healthy memory use.
- Total Tasks are also fairly balanced.
📌 Summary
| Term | Meaning | Healthy Sign |
|---|---|---|
| Shuffle Read | Data read from other executors | Balanced and minimal |
| Shuffle Write | Data written for shuffle | Balanced and not excessive |
| stdout | Debug / print logs | Used for progress/debug info |
| stderr | Errors and warnings | Should be reviewed if job fails |
Here is a cheat sheet in a normal table format to help you understand and monitor Spark Executor metrics in Databricks:
| Metric | Description | What to Check / Action |
|---|---|---|
| Executor ID | Unique identifier for each executor (driver is separate) | Identify which executor is the driver vs. workers |
| Address | IP and port of the executor | Use for identifying node location or debugging IP-specific issues |
| Status | Executor state (Active/Dead) | Investigate dead executors: possible memory or disk issues |
| RDD Blocks | Number of RDD blocks cached | High number = memory pressure, consider checkpointing or persisting with storage level |
| Storage Memory | Memory used vs. allocated | If usage is close to max, consider increasing executor memory |
| Disk Used | Temporary disk storage used | Investigate high usage, especially with spills or shuffles |
| Cores | Number of cores allocated to executor | Too low = less parallelism; adjust based on workload |
| Active Tasks | Tasks currently running on executor | Uneven distribution = possible skew |
| Failed Tasks | Count of failed tasks | High failure = investigate logs, GC, or data issues |
| Completed Tasks | Number of tasks successfully completed | Use for performance trend analysis |
| Task Time (GC Time) | Time spent on tasks, with GC (Garbage Collection) duration | High GC = memory pressure; consider tuning memory or caching strategy |
| Input / Output | Input size, shuffle read/write, output | Imbalance may indicate skew or inefficient transformations |
| Shuffle Read/Write | Data read/written across nodes during shuffle | High = expensive joins/repartitioning; consider broadcast join or reduce shuffle partitions |
| Logs (stdout/stderr) | Standard output and error logs per executor | Use to debug stack traces, memory errors, etc. |
| Thread Dump | Capture of current threads running | Use to diagnose hanging tasks, driver not responding, etc. |
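All of the metrics in this table are also exposed as JSON by Spark's monitoring REST API (`GET /api/v1/applications/<app-id>/executors`), which is convenient for automated health checks instead of eyeballing the UI. A hedged sketch, assuming a Spark UI reachable at `http://localhost:4040` and the 20% GC threshold from the table above:

```python
import json
import urllib.request


def gc_fraction(executor: dict) -> float:
    """Fraction of total task time this executor spent in garbage collection."""
    duration = executor.get("totalDuration", 0)  # task time in ms
    return executor["totalGCTime"] / duration if duration else 0.0


def check_executors(base_url: str, app_id: str, gc_limit: float = 0.2) -> None:
    """Print shuffle and GC stats per executor, flagging GC-heavy ones."""
    url = f"{base_url}/api/v1/applications/{app_id}/executors"
    with urllib.request.urlopen(url) as resp:
        executors = json.load(resp)
    for e in executors:
        frac = gc_fraction(e)
        flag = "  <-- investigate (GC > 20% of task time)" if frac > gc_limit else ""
        print(f"executor {e['id']}: shuffle read {e['totalShuffleRead']} B, "
              f"shuffle write {e['totalShuffleWrite']} B, GC {frac:.0%}{flag}")


# check_executors("http://localhost:4040", "<app-id>")  # uncomment against a live app
```

The helper name `check_executors` and the URL are illustrative; the endpoint and the `totalShuffleRead` / `totalShuffleWrite` / `totalGCTime` / `totalDuration` fields come from Spark's documented ExecutorSummary payload.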