Azure Databricks Architecture
- Azure Databricks is structured to enable secure cross-functional team collaboration while keeping a significant amount of backend services managed by Azure Databricks so you can stay focused on your data science, data analytics, and data engineering tasks.
- Azure Databricks operates out of a control plane and a data plane.
- The control plane includes the backend services that Azure Databricks manages in its own Azure account. Notebook commands and many other workspace configurations are stored in the control plane and encrypted at rest.
- The data plane is managed by your Azure account and is where your data resides. This is also where data is processed.
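As a rough way to keep the split straight, the two planes can be modeled as a small lookup table. This is only an illustrative sketch: the `Plane` class and the `plane_for` helper are hypothetical names, not part of any Azure Databricks API, and the entries simply restate the bullets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    """One side of the architecture (hypothetical model, not a Databricks API)."""
    name: str
    managed_by: str
    holds: tuple


# Entries restate the bullets above; nothing here is queried from Azure.
CONTROL_PLANE = Plane(
    name="control plane",
    managed_by="Azure Databricks (its own Azure account)",
    holds=("backend services", "notebook commands", "workspace configurations"),
)
DATA_PLANE = Plane(
    name="data plane",
    managed_by="your Azure account",
    holds=("your data", "data processing"),
)


def plane_for(item: str) -> Plane:
    """Return the plane that holds `item`, or raise KeyError if it is not listed."""
    for plane in (CONTROL_PLANE, DATA_PLANE):
        if item in plane.holds:
            return plane
    raise KeyError(item)


if __name__ == "__main__":
    for item in ("notebook commands", "your data"):
        plane = plane_for(item)
        print(f"{item}: {plane.name}, managed by {plane.managed_by}")
```

Calling `plane_for("notebook commands")` returns the control plane entry, while `plane_for("your data")` returns the data plane entry.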
How the Azure Databricks Architecture Works
- The control plane runs in the Databricks subscription, and the data plane runs in the customer's subscription.
- The control plane contains the Databricks cluster manager and the Databricks UI. The clusters themselves and the DBFS storage live in the data plane.
- The data plane contains the VNet, the NSG, Azure Blob Storage, and the Databricks workspace.
- DBFS is hosted on Azure Blob Storage.
- Users log in via Azure AD.
- When a logged-in user runs a task, the request goes to the Databricks cluster manager, which forwards it to Azure Resource Manager. Resource Manager creates virtual machines in the VNet, one for each node of the cluster. Once the VMs are created, the task is distributed among them.
- When the task completes, the result is sent back through Azure Resource Manager to the cluster manager, and from there to the user.
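The request flow described above can be sketched as a toy simulation. This is a minimal model under stated assumptions, not how Azure Databricks is implemented: `ClusterManager`, `ResourceManager`, and `VirtualMachine` are hypothetical classes, no Azure API is called, and the round-robin distribution and the upper-casing "work" are placeholders chosen only to make the steps visible.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    """Stand-in for one cluster node created in the VNet."""
    name: str
    tasks: list = field(default_factory=list)


class ResourceManager:
    """Toy stand-in for Azure Resource Manager: creates one VM per node."""

    def provision(self, node_count: int) -> list:
        return [VirtualMachine(name=f"vm-{i}") for i in range(node_count)]


class ClusterManager:
    """Toy stand-in for the Databricks cluster manager."""

    def __init__(self, resource_manager: ResourceManager):
        self.resource_manager = resource_manager

    def run(self, work_items: list, node_count: int) -> dict:
        if node_count < 1:
            raise ValueError("a cluster needs at least one node")
        # Step 1: ask the resource manager for one VM per cluster node.
        vms = self.resource_manager.provision(node_count)
        # Step 2: spread the work across the VMs (round-robin is an assumption).
        for index, item in enumerate(work_items):
            vms[index % node_count].tasks.append(item)
        # Step 3: each VM "processes" its share; here that is just upper-casing.
        # The collected results are what gets returned to the user.
        return {vm.name: [task.upper() for task in vm.tasks] for vm in vms}


if __name__ == "__main__":
    manager = ClusterManager(ResourceManager())
    results = manager.run(["a", "b", "c", "d", "e"], node_count=2)
    print(results)
```

With five work items and two nodes, the first VM receives items 1, 3, and 5 and the second receives items 2 and 4, which mirrors the idea that the number of VMs follows the number of cluster nodes.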