Create a Job Compute Cluster in Databricks

Creating a Job Compute (also called a Job Cluster) in Databricks allows you to define a dedicated compute environment that is spun up only when your job runs — and then terminated automatically. This is perfect for scheduled tasks like your group_user_mapping refresh.

Here’s a complete step-by-step guide to creating a Job Compute in Databricks 👇


✅ Step-by-Step: Create a Job Compute Cluster in Databricks


🧭 Step 1: Go to Jobs in the Sidebar

  1. In the Databricks UI, click on “Jobs & Pipelines” (left menu)
  2. Click “Create Job”

🧱 Step 2: Define Your Job

  • Name: e.g. Refresh_Group_User_Mapping
  • Click + Add task
  • Task name: refresh_table
  • Type: Notebook
  • Source: Workspace
  • Path: Select your notebook (e.g., User_Grouping)
  • Under Compute, click + New job cluster (an equivalent API call is sketched below)
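
If you prefer to script this instead of clicking through the UI, the same task and job cluster can be created with the Jobs API 2.1. A minimal sketch in Python, assuming your workspace URL is in a `DATABRICKS_HOST` environment variable, a personal access token is in `DATABRICKS_TOKEN`, and an illustrative notebook path (replace it with the path to your own User_Grouping notebook):

```python
import os
import requests

# Assumed environment variables -- set these to your workspace URL and a personal access token.
HOST = os.environ["DATABRICKS_HOST"]    # e.g. "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = os.environ["DATABRICKS_TOKEN"]

job_spec = {
    "name": "Refresh_Group_User_Mapping",
    "tasks": [
        {
            "task_key": "refresh_table",
            "notebook_task": {
                # Hypothetical path -- point this at your own User_Grouping notebook.
                "notebook_path": "/Workspace/Users/you@example.com/User_Grouping"
            },
            # Job cluster definition; the full set of options is covered in Step 3.
            "new_cluster": {
                "spark_version": "11.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "autoscale": {"min_workers": 1, "max_workers": 2},
            },
        }
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print("Created job_id:", resp.json()["job_id"])
```

The response contains a job_id, which the later sketches reuse when triggering runs or editing the cluster.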

⚙️ Step 3: Configure the Job Compute Cluster

You’ll now define your job cluster:

| Setting | Recommended | Notes |
| --- | --- | --- |
| Cluster name | jobcluster-group-refresh | Optional (auto-generated if blank) |
| Databricks Runtime Version | Latest LTS (e.g., 11.3 LTS) | Use a Unity Catalog-compatible version |
| Node Type | Standard_DS3_v2 or default | Depends on your workload size |
| Worker count | Min: 1, Max: 2 (autoscale) | Or fix to 1 for a small job |
| Enable autoscaling | Yes | Saves cost on light jobs |
| Terminate after | 10–30 mins | Job clusters auto-terminate after the job ends |
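
For reference, the settings in this table map onto the new_cluster block of the API payload shown in Step 2. The values below are illustrative only; pick a runtime version and node type that actually exist in your workspace:

```python
# Illustrative new_cluster block mirroring the table above.
new_cluster = {
    "spark_version": "11.3.x-scala2.12",                # an LTS, Unity Catalog-compatible runtime
    "node_type_id": "Standard_DS3_v2",                  # size to your workload
    "autoscale": {"min_workers": 1, "max_workers": 2},  # scale between 1 and 2 workers
    # For a fixed-size cluster instead of autoscaling, drop "autoscale" and use:
    # "num_workers": 1,
}
```

Job clusters terminate on their own once the run finishes, so no separate auto-termination setting is needed here.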

✅ Leave most advanced settings as default unless you need specific libraries.


🎯 Step 4: Save & Run

  • After setting up the cluster, click Create Task
  • Back in the Job overview, click Run Now to test it
  • To schedule it, click “Add Schedule” (e.g., daily at 6 AM)
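
Both actions are also available through the Jobs API, which is handy once the job is wired into automation. A sketch, assuming the same environment variables as before and using a placeholder job_id (replace it with the id returned by jobs/create):

```python
import os
import requests

HOST = os.environ["DATABRICKS_HOST"]
TOKEN = os.environ["DATABRICKS_TOKEN"]
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
JOB_ID = 123456789  # placeholder -- use the job_id returned by jobs/create

# Trigger a one-off test run (equivalent to clicking "Run Now").
run = requests.post(f"{HOST}/api/2.1/jobs/run-now", headers=HEADERS, json={"job_id": JOB_ID})
run.raise_for_status()
print("Started run_id:", run.json()["run_id"])

# Attach a daily 6 AM schedule (equivalent to "Add Schedule" in the UI).
sched = requests.post(f"{HOST}/api/2.1/jobs/update", headers=HEADERS, json={
    "job_id": JOB_ID,
    "new_settings": {
        "schedule": {
            "quartz_cron_expression": "0 0 6 * * ?",  # every day at 06:00
            "timezone_id": "UTC",                     # adjust to your timezone
            "pause_status": "UNPAUSED",
        }
    },
})
sched.raise_for_status()
```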

🧠 Why Use Job Compute?

| Feature | Benefit |
| --- | --- |
| 💸 Auto-termination | Saves cost (no idle billing) |
| 🧼 Clean environment | Each run starts fresh |
| 📈 Scalable | Can scale up/down if autoscaling is enabled |
| 🔒 Isolated | No impact from shared users or other notebooks |

⚠️ Requirements

  • You must be on a Premium or Enterprise Databricks workspace plan
  • Unity Catalog jobs require a UC-compatible runtime
  • You need permissions to create clusters (usually allowed by default)

🔄 Later: Modify Job Cluster Settings

If you want to update the job cluster later:

  • Go to the job → click the task → click Edit compute
  • Update instance size, autoscaling, etc.
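
Scripted updates are possible too. One safe pattern, sketched below, is to fetch the job's current settings, adjust the task's new_cluster, and write everything back with jobs/reset (which overwrites the job's settings wholesale). The task key and new node type are illustrative:

```python
import os
import requests

HOST = os.environ["DATABRICKS_HOST"]
TOKEN = os.environ["DATABRICKS_TOKEN"]
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
JOB_ID = 123456789  # placeholder -- use your job's id

# Fetch the job's current settings.
job = requests.get(f"{HOST}/api/2.1/jobs/get", headers=HEADERS, params={"job_id": JOB_ID})
job.raise_for_status()
settings = job.json()["settings"]

# Adjust the job cluster on the relevant task, e.g. bigger nodes and a wider autoscale range.
for task in settings.get("tasks", []):
    if task.get("task_key") == "refresh_table" and "new_cluster" in task:
        task["new_cluster"]["node_type_id"] = "Standard_DS4_v2"
        task["new_cluster"]["autoscale"] = {"min_workers": 1, "max_workers": 4}

# Write the full settings back (jobs/reset replaces all settings for the job).
requests.post(f"{HOST}/api/2.1/jobs/reset", headers=HEADERS,
              json={"job_id": JOB_ID, "new_settings": settings}).raise_for_status()
```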
