Workspace in Azure Databricks

  • An Azure Databricks workspace is an environment for accessing all your Azure Databricks assets.
  • The workspace organizes the following objects into folders:
    • Notebooks
    • Libraries
    • Experiments
  • You can manage the workspace using the workspace UI, the Databricks CLI, and the Databricks REST API.
  • In the Workspace UI, you can get help by clicking the ? icon at the top right-hand corner.
  • The Shortcuts link displays keyboard shortcuts for working with notebooks.
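Beyond the UI, the workspace can be managed programmatically. As a hedged sketch, the snippet below builds a call to the Workspace API's list endpoint (`GET /api/2.0/workspace/list`) using only Python's standard library; the host, token, and path values are placeholders, not details from this post.

```python
import json
import os
import urllib.parse
import urllib.request

def workspace_list_request(host, token, path="/"):
    """Build a request that lists workspace objects under `path`."""
    query = urllib.parse.urlencode({"path": path})
    url = f"{host}/api/2.0/workspace/list?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})

# Only perform the network call when real credentials are configured.
host = os.environ.get("DATABRICKS_HOST")
token = os.environ.get("DATABRICKS_TOKEN")
if host and token:
    with urllib.request.urlopen(workspace_list_request(host, token, "/Users")) as resp:
        print(json.dumps(json.load(resp), indent=2))
```

The same endpoint is what the Databricks CLI's `workspace` commands call under the hood.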

Below is the workspace interface:

For any support from the Databricks end, use the Help menu.

Workspace assets in Azure Databricks

Workspace Assets

  • Clusters
  • Notebooks
  • Jobs
  • Libraries
  • Data
  • Experiments
  • Cluster
    • A Databricks cluster is a set of computation resources and configurations on which you run data engineering, data science, and data analytics workloads, such as production ETL pipelines, streaming analytics, ad-hoc analytics, and machine learning.
  • Notebook
    • A notebook is a web-based interface to a document containing a series of runnable cells (commands) that operate on files and tables, along with visualizations and narrative text. Cells can be run in sequence, with later cells referring to the output of one or more previously run cells.
  • Jobs
    • A job is a mechanism for running code non-interactively in Azure Databricks, either immediately or on a schedule.
  • Libraries
    • A library makes third-party or locally built code available to notebooks and jobs running on your clusters.
  • Data
    • You can import data into a distributed file system mounted into an Azure Databricks workspace and work with it in Azure Databricks notebooks and clusters. You can also use a wide variety of Apache Spark data sources to access data.
  • Experiments
    • MLflow Experiments let you organize and track MLflow machine-learning model training runs.
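To make the Jobs item above concrete, here is a hedged sketch of a request body for the Jobs API's create endpoint (`POST /api/2.1/jobs/create`); the job name, notebook path, and cluster settings are illustrative placeholders, not values from this post.

```python
import json

# Illustrative Jobs API payload; every value below is a placeholder.
job_spec = {
    "name": "nightly-etl",
    "tasks": [
        {
            "task_key": "ingest",
            # The notebook to run; lives under the user's home folder.
            "notebook_task": {"notebook_path": "/Users/jane.doe@example.com/etl"},
            # A new cluster is provisioned for the run and torn down afterwards.
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 2,
            },
        }
    ],
}
print(json.dumps(job_spec, indent=2))
```

Posting such a payload to the endpoint registers the job; it can then be triggered on demand or on a schedule.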

Working with Workspace Objects in Azure Databricks


  • Folders contain all static assets within a workspace: notebooks, libraries, experiments, and other folders. Click a folder name to open or close the folder and view its contents. To perform an action on a folder, click the down arrow next to its name.

Special Folders

  • An Azure Databricks workspace has three special folders: Workspace, Shared, and Users. You cannot rename or move a special folder.
  • The Workspace root folder is a container for all your organization’s Azure Databricks static assets.
  • Shared is for sharing objects across your organization. All users have full permissions for all objects in Shared.
  • Users contains a folder for each user, referred to as that user's home folder. Objects in this folder are private to that user by default.
  • The workspace ID can be found in the workspace URL (the value of the o= query parameter).
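As a small illustration of the Users convention above, a user's home folder path can be derived from the username; the email below is a placeholder.

```python
def user_home_folder(username):
    # Each user's private home folder sits under the special Users folder.
    return f"/Users/{username}"

print(user_home_folder("jane.doe@example.com"))  # → /Users/jane.doe@example.com
```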

Create and Run Spark Job in Databricks

  • Create Cluster
  • Create Notebook
  • Create Table from CSV file
  • Query Table
  • Visualize Query Results
  • A cluster is a set of computation resources and configurations on which we can run workloads.
  • A notebook is a web-based interface to a document that contains runnable code, visualizations, and narrative text.

Create Cluster
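Clusters are usually created through the UI in this step, but the same settings map onto a request body for the Clusters API's create endpoint (`POST /api/2.0/clusters/create`). The sketch below is hedged: the cluster name, runtime version, and VM size are placeholders, not values from this walkthrough.

```python
import json

# Illustrative Clusters API payload; all values are placeholders.
cluster_spec = {
    "cluster_name": "demo-cluster",
    "spark_version": "13.3.x-scala2.12",   # Databricks Runtime version
    "node_type_id": "Standard_DS3_v2",     # Azure VM size for each node
    "num_workers": 1,
}
print(json.dumps(cluster_spec, indent=2))
```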

Create Notebook

Create Table from CSV file
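The UI step above generates a table definition behind the scenes. As a hedged sketch, the Spark SQL below shows what such a definition can look like for a CSV file uploaded to DBFS; the table and file names are placeholders.

```python
# Files uploaded through the Create Table UI land under dbfs:/FileStore/tables/.
csv_path = "dbfs:/FileStore/tables/sales.csv"

# Spark SQL that registers the CSV file as a table, treating the first row
# as a header and inferring column types.
create_sql = (
    "CREATE TABLE sales USING CSV "
    f"OPTIONS (path '{csv_path}', header 'true', inferSchema 'true')"
)
print(create_sql)
```

Running this statement in a notebook cell (with `%sql` or `spark.sql`) makes the table queryable by name.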

Query Table

The DBFS (Databricks File System) path is where we uploaded the file.

Visualize Query Results
