Securing sensitive data like passwords and API keys is crucial in Azure Databricks. Azure Key Vault offers a centralized and secure solution for managing such secrets. Here’s a step-by-step guide to integrate Key Vault and Databricks for seamless and secure secret access:
Requirements
The following are required to set up an Azure Key Vault-backed secret scope and its secret(s):
- An Azure Subscription
- An Azure Key Vault
- An Azure Databricks workspace
- An Azure Databricks Cluster (Runtime 4.0 or above)
Creating Azure Key Vault
- Create an Azure Key Vault: Go to the Azure portal, create a Key Vault, and configure its settings. Take note of the Key Vault URI.
- Add Secrets: Add secrets to the Key Vault. For example, you can add a secret named “my_secret” with the value “my_secret_value”.
Open a Web Browser. I am using Chrome.
Enter the URL https://portal.azure.com and hit enter.
Click on “Key vaults”. It will open the blade for “Key vaults”.
Click on “Add”. It will open the “Create key vault” blade.
Enter all the information and click the “Create” button. Once the resource is created, refresh the screen and it will show the new “key vault” which we created.
Scroll down and click on the “Properties”.
Save the following properties of the key vault you created. We will use them when we connect to the key vault from Databricks:
- DNS Name
- Resource ID
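Because both saved values are pasted into Databricks later, it can help to check that they refer to the same vault. The sketch below is a minimal illustration, assuming the DNS name has the usual form https://<vault-name>.vault.azure.net/ and the Resource ID follows the standard /subscriptions/.../providers/Microsoft.KeyVault/vaults/<vault-name> layout. The function names and the sample values are hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumed layout of a Key Vault resource ID (sample values below are made up).
RESOURCE_ID_PATTERN = re.compile(
    r"^/subscriptions/(?P<subscription>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.KeyVault/vaults/(?P<vault>[^/]+)$",
    re.IGNORECASE,
)


def vault_name_from_resource_id(resource_id: str) -> str:
    """Extract the vault name from the saved Resource ID."""
    match = RESOURCE_ID_PATTERN.match(resource_id.strip())
    if match is None:
        raise ValueError(f"Not a Key Vault resource ID: {resource_id!r}")
    return match.group("vault")


def vault_name_from_dns_name(dns_name: str) -> str:
    """Extract the vault name from the saved DNS name (a full https URL)."""
    host = urlparse(dns_name.strip()).hostname or ""
    name = host.split(".")[0]
    if not name:
        raise ValueError(f"Not a Key Vault DNS name: {dns_name!r}")
    return name


def properties_match(dns_name: str, resource_id: str) -> bool:
    """Return True when both properties point at the same vault."""
    return (
        vault_name_from_dns_name(dns_name).lower()
        == vault_name_from_resource_id(resource_id).lower()
    )


# Hypothetical sample values, for illustration only.
sample_dns = "https://my-vault.vault.azure.net/"
sample_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/rg-demo/providers/Microsoft.KeyVault/vaults/my-vault"
)
print(properties_match(sample_dns, sample_id))
```

A mismatch here usually means one of the two values was copied from a different vault.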
Creating Secret in Azure Key Vault
Click on “Secrets” on the left-hand side.
Click on “Generate/Import”. We will be creating a secret for the “access key” for the “Azure Blob Storage”.
Enter the required information for creating the “secret”.
After entering all the information click on the “Create” button.
Configure Databricks to Use Key Vault
Access Databricks Workspace: Log in to your Azure Databricks workspace.
Open the Azure Databricks workspace listed in the Requirements section at the top of the article.
Click on Launch Workspace to open Azure Databricks.
Copy the “URL” from the browser window.
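Build the URL for creating the secret scope by appending #secrets/createScope to the workspace URL: https://<Databricks_url>#secrets/createScope. The URL is case sensitive, so “createScope” must be written with an uppercase “S”.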
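As a small illustration of this step, the sketch below derives the scope-creation URL from whatever was copied out of the browser. The function name and the sample workspace URL are hypothetical; the only thing taken from the article is the #secrets/createScope suffix.

```python
from urllib.parse import urlparse


def create_scope_url(workspace_url: str) -> str:
    """Build the secret-scope creation URL from a copied workspace URL."""
    parsed = urlparse(workspace_url.strip())
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"Expected a full https URL, got: {workspace_url!r}")
    # Keep only scheme and host; drop any path, query string, or fragment.
    return f"{parsed.scheme}://{parsed.netloc}#secrets/createScope"


# Hypothetical workspace URL as it might appear in the browser address bar.
copied = "https://adb-1234567890123456.7.azuredatabricks.net/?o=1234567890123456#"
print(create_scope_url(copied))
# prints https://adb-1234567890123456.7.azuredatabricks.net#secrets/createScope
```

Dropping the path and query string avoids pasting a notebook or workspace path in front of the #secrets/createScope fragment by mistake.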
Enter all the required information:
- Scope Name.
- DNS Name (this is the “DNS name” which we saved when we created the “Azure Key Vault”).
- Resource ID (this is the “Resource ID” which we saved when we created the “Azure Key Vault”).
Click the “Create” button.
“Databricks” is now connected with “Azure Key Vault”.
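The same scope can, in principle, be created without the UI. The sketch below only builds the request, assuming the Databricks Secrets REST API endpoint /api/2.0/secrets/scopes/create with an AZURE_KEYVAULT backend. It does not send anything, and authentication is deliberately left out because the token requirements depend on your workspace setup. The function name and all sample values are hypothetical.

```python
import json

# Assumed path of the scope-creation endpoint in the Secrets REST API.
CREATE_SCOPE_PATH = "/api/2.0/secrets/scopes/create"


def build_create_scope_request(workspace_url, scope_name, dns_name, resource_id):
    """Return (endpoint URL, JSON body) for a Key Vault-backed secret scope."""
    endpoint = workspace_url.rstrip("/") + CREATE_SCOPE_PATH
    body = {
        "scope": scope_name,
        "scope_backend_type": "AZURE_KEYVAULT",
        "backend_azure_keyvault": {
            # The two properties saved from the key vault's "Properties" blade.
            "dns_name": dns_name,
            "resource_id": resource_id,
        },
    }
    return endpoint, json.dumps(body, indent=2)


# Hypothetical sample values, for illustration only.
endpoint, body = build_create_scope_request(
    workspace_url="https://adb-1234567890123456.7.azuredatabricks.net/",
    scope_name="azurekeyvault_secret_scope",
    dns_name="https://my-vault.vault.azure.net/",
    resource_id=(
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/rg-demo/providers/Microsoft.KeyVault/vaults/my-vault"
    ),
)
print(endpoint)
print(body)
```

The body carries the same three inputs as the form: the scope name, the DNS name, and the Resource ID.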
Accessing Secrets in Notebooks
# List the available secret scopes (a secret scope does not need to be mounted)
dbutils.secrets.listScopes()
# Access the secret
secret_value = dbutils.secrets.get(scope = "your_scope_name", key = "my_secret")
Using Secrets in Jobs: Access the secrets similarly within your job notebooks.
Example:
- Create a Secret in Azure Key Vault:
- Go to Azure Key Vault.
- Add a secret named “my_secret” with the value “my_secret_value”.
- Access Secret in Databricks Notebook:
# List the available secret scopes (a secret scope does not need to be mounted)
dbutils.secrets.listScopes()
# Access the secret
secret_value = dbutils.secrets.get(scope = "your_scope_name", key = "my_secret")
print(secret_value)  # notebook output shows [REDACTED], not the secret value
Replace "your_scope_name" with your created scope’s name.
Enter the following code in the notebook:
dbutils.secrets.get(scope = "azurekeyvault_secret_scope", key = "BlobStorageAccessKey")
#azurekeyvault_secret_scope --> Azure Key Vault based scope which we created in Databricks
#BlobStorageAccessKey --> Secret name which we created in Azure Key Vault
When you run the above command, the output shows [REDACTED]. Databricks redacts secret values in notebook output, so this confirms that the secret was read from Azure Key Vault.
In the same notebook, add another command cell and use Scala as the language:
%scala
val blob_storage_account_access_key = dbutils.secrets.get(scope = "azurekeyvault_secret_scope", key = "BlobStorageAccessKey")
//azurekeyvault_secret_scope --> Azure Key Vault based scope which we created in Databricks
//BlobStorageAccessKey --> Secret name which we created in Azure Key Vault
When you run the above command, the output again shows [REDACTED], which confirms that the secret was read from Azure Key Vault.
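Once the access key can be read, a typical next step is to hand it to Spark so the notebook can reach the storage account. The sketch below is one possible way to do that, assuming account-key authentication against Blob Storage through the fs.azure.account.key.<account>.blob.core.windows.net Spark setting. The helper names and the storage account name "mystorageacct" are hypothetical; spark and dbutils are the objects a Databricks notebook provides.

```python
def blob_account_key_setting(storage_account: str) -> str:
    """Name of the Spark setting that holds a Blob Storage account key."""
    return f"fs.azure.account.key.{storage_account}.blob.core.windows.net"


def configure_blob_access(spark, dbutils, storage_account, scope, key):
    """Read the access key from the secret scope and pass it to Spark.

    The key is never printed or stored in the notebook source.
    """
    access_key = dbutils.secrets.get(scope=scope, key=key)
    spark.conf.set(blob_account_key_setting(storage_account), access_key)


# In a Databricks notebook (where `spark` and `dbutils` already exist):
# configure_blob_access(
#     spark,
#     dbutils,
#     storage_account="mystorageacct",  # hypothetical account name
#     scope="azurekeyvault_secret_scope",
#     key="BlobStorageAccessKey",
# )
print(blob_account_key_setting("mystorageacct"))
# prints fs.azure.account.key.mystorageacct.blob.core.windows.net
```

Passing spark and dbutils in as arguments keeps the helper usable from a job notebook as well as an interactive one.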
Notes:
- Ensure proper access control and permissions in Azure Key Vault for Databricks.
- Use secrets in notebooks and jobs securely, avoiding hardcoding or exposing them in plain text.
- Update and manage secrets directly in Azure Key Vault for centralized and secure secret management.
This process allows Azure Databricks to securely access and use secrets stored in Azure Key Vault, providing a more secure and centralized approach to managing sensitive information used in your Databricks workflows. Adjust the steps based on your specific Key Vault and Databricks configurations.