How to use Azure Key Vault for Azure Data Factory?

What is Azure Key Vault?

Azure Key Vault is a cloud-based service provided by Microsoft Azure that allows you to securely manage and store sensitive information such as cryptographic keys, secrets, certificates, and application settings. It is designed to help you safeguard your sensitive data and control access to it, all while simplifying the management of encryption keys and secrets used by your applications and services.
Here are some key features and purposes of Azure Key Vault:

  • Secure Key Management: Azure Key Vault provides a secure and centralized way to manage encryption keys for protecting data. It supports both software- and hardware-based key protection, and you can bring your own keys (BYOK) or generate keys within Key Vault.
  • Secrets Management: You can use Azure Key Vault to store and manage application secrets, connection strings, and other sensitive settings. This helps in securing configuration data and simplifies application management.
  • Certificate Management: Key Vault allows you to securely store and manage SSL/TLS certificates and cryptographic keys for applications and services, simplifying the management of certificates and ensuring secure communication.
  • Access Control: You can define fine-grained permissions to restrict who can access the vault and which operations they may perform. Key Vault supports two permission models: Azure role-based access control (RBAC) and vault access policies. This walkthrough uses access policies.
  • Logging and Auditing: Key Vault provides logging and auditing capabilities to track access and actions taken within the vault, helping with compliance and monitoring.
  • Integration: It integrates with various Azure services, including Azure Virtual Machines, Azure Functions, Azure Web Apps, and Azure SQL Database. Client libraries are available for common programming languages such as .NET, Java, Python, and JavaScript.
  • Key Rotation: Azure Key Vault supports key rotation, which is important for maintaining security by regularly updating encryption keys.
  • HSM Support: You can use Hardware Security Modules (HSMs) to give your keys hardware-based protection. HSM-protected keys require the Premium pricing tier.
  • Global Availability: Azure Key Vault is available in multiple Azure regions worldwide, allowing you to choose the location that best suits your needs.

What is Azure Data Factory?

Azure Data Factory is a cloud-based data integration service provided by Microsoft Azure. It enables you to create, schedule, and manage data-driven workflows for moving, transforming, and processing data from various sources to various destinations. Azure Data Factory simplifies the ETL (Extract, Transform, Load) and data integration processes, allowing you to build data pipelines for analytics, reporting, and other data-driven tasks.
Key features and functionalities of Azure Data Factory include:

  • Data Movement: Azure Data Factory allows you to move data from various sources to destinations. It supports a wide range of data sources, including on-premises data stores, cloud-based data sources, and data stored in Azure services like Azure Blob Storage, Azure SQL Database, and more.
  • Data Transformation: You can transform data within the pipelines by using data transformation activities. These activities enable data cleaning, enrichment, aggregation, and transformation tasks.
  • Data Orchestration: You can create data-driven workflows and data pipelines that define the sequence and dependencies of activities. Azure Data Factory provides a visual interface for designing these data pipelines.
  • Integration with Azure Services: Azure Data Factory seamlessly integrates with other Azure services, including Azure Databricks, Azure HDInsight, Azure Machine Learning, Azure Synapse Analytics, and more. This allows you to leverage the power of these services in your data pipelines.
  • Data Monitoring and Logging: It provides monitoring and logging capabilities that allow you to track the execution of your data pipelines, identify issues, and troubleshoot problems.
  • Hybrid Data Integration: Azure Data Factory supports hybrid data integration scenarios, making it possible to connect to on-premises data sources securely using the Azure Data Factory Self-Hosted Integration Runtime.
  • Data Security: It provides features for securing your data at rest and in transit. Data can be encrypted and protected according to your organization’s security and compliance requirements.
  • Scalability: Azure Data Factory can scale to handle large data volumes and complex data processing tasks, ensuring high availability and performance.
  • Data Compliance: It provides compliance with various data regulations and standards, making it suitable for organizations with strict data governance requirements.
  • Data Workflow Automation: You can schedule data pipelines to run at specific intervals or in response to events. This automation simplifies routine data integration tasks.

How to Create Azure Key Vault?

The first step is to create a Key Vault. If you already have one, you can skip this step.

  • Go to the Azure portal and create a new resource
  • Search for key vault
  • Select Key Vault and click on Create
  • Select your Subscription and Resource Group 
  • Choose a descriptive name for the Key Vault (it must be globally unique)
  • Select your Region (the same as your other resources)
  • Choose the Pricing tier. We will use Standard for this demo
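
The vault name also becomes part of the vault's URI, so it is worth checking it before you start the wizard. A minimal sketch of such a pre-check, assuming Azure's documented naming rules (3 to 24 characters, only letters, digits and hyphens, starting with a letter, ending with a letter or digit, no consecutive hyphens) and the Azure public cloud domain. The function names and the sample name kv-adf-demo are hypothetical:

```python
import re

# Assumed Key Vault naming rules: 3-24 characters, letters/digits/hyphens,
# starts with a letter, ends with a letter or digit.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]{1,22}[A-Za-z0-9]")


def is_valid_vault_name(name: str) -> bool:
    """Return True if the name satisfies the assumed naming rules."""
    # fullmatch checks the whole string; consecutive hyphens are rejected separately.
    return _NAME_PATTERN.fullmatch(name) is not None and "--" not in name


def vault_uri(name: str) -> str:
    """Build the vault URI for the Azure public cloud."""
    if not is_valid_vault_name(name):
        raise ValueError(f"Invalid Key Vault name: {name!r}")
    return f"https://{name}.vault.azure.net/"


if __name__ == "__main__":
    print(vault_uri("kv-adf-demo"))
```

The check only covers the format. Whether the name is still available can only be confirmed by Azure when you create the vault.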

How to generate a secret in Key Vault and use it in ADF?

Now that we have a Key Vault, we can add the password of the SQL Server user. The Key Vault stores three types of items: Secrets, Keys and Certificates. For passwords, account keys or connection strings you need a Secret.

  • Go to the newly created Azure Key Vault
  • Go to Secrets in the left menu
  • Click on the Generate/Import button to create a new secret
  • Choose Manual in the upload options
  • Enter a recognizable and descriptive name. You will later on use this name in ADF
  • Next step is to add the secret value which we will retrieve in ADF
  • Leave Content type empty and don’t set an activation or expiration date for this example
  • Make sure the secret is enabled and then click on the Create button
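
The portal steps above can also be done from code. A minimal sketch, assuming the azure-identity and azure-keyvault-secrets packages are installed and that the identity you are signed in with is allowed to set secrets on the vault. The helper names make_secret_client and store_secret are illustrative and not part of the original walkthrough:

```python
def make_secret_client(vault_name: str):
    """Create a SecretClient for the given vault (Azure public cloud assumed)."""
    # Imported here so this module still loads when the Azure SDK is not installed.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    return SecretClient(
        vault_url=f"https://{vault_name}.vault.azure.net",
        credential=DefaultAzureCredential(),
    )


def store_secret(client, secret_name: str, secret_value: str) -> str:
    """Create the secret (or a new version of it) and return its name."""
    # set_secret leaves content type and activation/expiration dates unset by default.
    secret = client.set_secret(secret_name, secret_value)
    return secret.name
```

For example, store_secret(make_secret_client("kv-adf-demo"), "SqlPassword", value) would create a secret called SqlPassword in a hypothetical vault named kv-adf-demo. Read the value from an environment variable or a prompt rather than hard-coding it.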

Go to the Key Vault you created.

Click on Secrets under Settings and then click on Generate/Import to create a secret. Here I am generating a secret to connect to Azure SQL Database, which can later be used in Data Factory to create a linked service for Azure SQL Database.

Copy the connection string of the database, paste it into a Notepad file, fill in the password in the string, and then copy the complete string again.

Select Manual as the upload option, provide a name, paste the connection string that we just completed as the secret value, and click on Create.
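
Filling in the password by hand is easy to get wrong, so the step can be scripted. A minimal sketch, assuming the copied connection string contains the placeholder {your_password} (adjust PLACEHOLDER if your copy uses a different one). The function name and the sample server, database and user names are hypothetical:

```python
# Placeholder text assumed to be present in the copied connection string.
PLACEHOLDER = "{your_password}"


def fill_password(template: str, password: str) -> str:
    """Return the connection string with the placeholder replaced by the password."""
    if PLACEHOLDER not in template:
        raise ValueError("Password placeholder not found in connection string")
    return template.replace(PLACEHOLDER, password)


if __name__ == "__main__":
    template = (
        "Server=tcp:myserver.database.windows.net,1433;"
        "Initial Catalog=mydb;User ID=adfuser;"
        "Password={your_password};Encrypt=True;Connection Timeout=30;"
    )
    # Demo value only; never hard-code a real password.
    print(fill_password(template, "example-password"))
```

The resulting string is the value you paste into the secret.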

How to create Access policies

Now we have to give ADF access to the Key Vault so it can read its content. You can find ADF by its name, so you don’t have to search for the GUID of its managed identity, although using that GUID is still possible.

  1. Go to Access policies in the left menu
  2. Click on the blue + Add Access Policy link
  3. Leave Configure from template empty
  4. Leave Key permissions unselected (we will only use a Secret for this example)
  5. Select Get for Secret permissions
  6. Leave Certificate permissions unselected (we will only use a Secret for this example)
  7. Click on the field of Select principal to find the name of your Azure Data Factory
  8. Leave Authorized application unchanged
  9. Click on Add and a new Application will appear in the list of Current Access Policies
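
The policy created by these steps corresponds to one entry in the vault's accessPolicies list. A minimal sketch of that entry as it would appear in an ARM template, assuming the documented property names tenantId, objectId and permissions. The function name and the two all-zero/all-one GUIDs are placeholders, not real identifiers:

```python
import json


def secret_get_policy(tenant_id: str, object_id: str) -> dict:
    """Access policy entry that grants only the Get permission on secrets."""
    return {
        "tenantId": tenant_id,
        # Object ID of the Data Factory's managed identity.
        "objectId": object_id,
        "permissions": {
            "keys": [],            # left unselected, as in step 4
            "secrets": ["get"],    # step 5
            "certificates": [],    # left unselected, as in step 6
        },
    }


if __name__ == "__main__":
    policy = secret_get_policy(
        tenant_id="00000000-0000-0000-0000-000000000000",
        object_id="11111111-1111-1111-1111-111111111111",
    )
    print(json.dumps(policy, indent=2))
```

Granting only Get keeps the Data Factory from listing or changing secrets it does not need.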

How to Create Linked Service connection to Azure Key Vault

Now we need to let ADF know about your new Azure Key Vault by adding an extra Linked Service connection to your Key Vault.

  1. Go to ADF and open the Author & Monitor editor
  2. Within the new tab go to the Author section (Pencil icon) and click on connections to see all Linked Services
  3. Add a new Linked Service by clicking the + New button
  4. Search for (Azure) Key Vault and click on continue to enter all connection details
  5. First enter a descriptive name and optionally a description
  6. Select your subscription
  7. Select your newly created Key Vault
  8. Test the connection and if successful click on Create
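
Behind the form, ADF stores the Linked Service as a small JSON document. A minimal sketch that builds this document, assuming the documented AzureKeyVault type with a baseUrl property. The function name, the Linked Service name and the vault name are hypothetical:

```python
import json


def key_vault_linked_service(name: str, vault_name: str) -> dict:
    """JSON definition of an ADF Linked Service pointing at a Key Vault."""
    return {
        "name": name,
        "properties": {
            "type": "AzureKeyVault",
            "typeProperties": {
                # Vault URI on the Azure public cloud (assumption).
                "baseUrl": f"https://{vault_name}.vault.azure.net/",
            },
        },
    }


if __name__ == "__main__":
    definition = key_vault_linked_service("AzureKeyVault1", "kv-adf-demo")
    print(json.dumps(definition, indent=4))
```

You can compare the output with the JSON that ADF shows for the Linked Service in its code view.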

How to Create Linked Service connection to Azure SQL Database

Before you start, make sure the Linked Service connection to the Key Vault has been published. Otherwise you will get an error message when you hit the Create button.

  1. Add a new Linked Service by clicking the + New button
  2. Search for (Azure) SQL Database and click on continue to enter all connection details
  3. First enter a descriptive name and a description
  4. Then use Connection string (not Azure Key Vault) to select your server and database
  5. After that use Azure Key Vault (not Password) to retrieve the password
  6. Select the Key Vault from the previous step
  7. Enter the name of the secret (the one you created in the Key Vault earlier)
  8. Test the connection and if successful hit the create button
  9. Now you can use this Linked Services connection in your pipelines
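
The result of these steps is a Linked Service whose password is a reference to the Key Vault secret instead of a literal value. A minimal sketch of that JSON definition, assuming the documented AzureSqlDatabase type with a connectionString and an AzureKeyVaultSecret password. Newer connector versions may expose server and database as separate properties. The function name and all sample names are hypothetical:

```python
import json


def azure_sql_linked_service(
    name: str,
    connection_string: str,
    key_vault_linked_service_name: str,
    secret_name: str,
) -> dict:
    """JSON definition of an Azure SQL Database Linked Service using a Key Vault secret."""
    return {
        "name": name,
        "properties": {
            "type": "AzureSqlDatabase",
            "typeProperties": {
                # Server and database only; the password is not part of this string.
                "connectionString": connection_string,
                "password": {
                    "type": "AzureKeyVaultSecret",
                    "store": {
                        "referenceName": key_vault_linked_service_name,
                        "type": "LinkedServiceReference",
                    },
                    "secretName": secret_name,
                },
            },
        },
    }


if __name__ == "__main__":
    definition = azure_sql_linked_service(
        name="AzureSqlDatabase1",
        connection_string=(
            "Data Source=tcp:myserver.database.windows.net,1433;"
            "Initial Catalog=mydb;User ID=adfuser;Encrypt=True;"
        ),
        key_vault_linked_service_name="AzureKeyVault1",
        secret_name="SqlPassword",
    )
    print(json.dumps(definition, indent=4))
```

Because only the secret name is stored in ADF, you can change the password in Key Vault without editing the Linked Service.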
