How to Set Up a Self-hosted Integration Runtime (IR) in Azure

A Self-hosted Integration Runtime (IR) lets Azure Data Factory securely transfer data between on-premises systems and Azure services, or between different cloud environments. Below is a step-by-step guide with examples to help you set one up.

Step 1: Create an Azure Data Factory (ADF)

Before setting up a self-hosted IR, you need to create an Azure Data Factory in your Azure portal.

1.1. Sign in to the Azure Portal

  • Go to the Azure Portal (https://portal.azure.com) and sign in.
  • In the search bar, type “Data Factory”, select Data factories from the results, and click Create.

1.2. Configure Azure Data Factory

  • Subscription: Choose your subscription.
  • Resource Group: Select an existing resource group or create a new one.
  • Name: Provide a name for your Data Factory (must be globally unique).
  • Region: Choose your region.
  • Git Configuration: Select Configure Git later if you’re not using Git.
  • Click Review + Create, then Create.
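
If you prefer to script this step, the settings above map onto a small ARM resource definition. The sketch below is only an illustration using the Python standard library: it checks a candidate name against the Data Factory naming rules as I understand them (3 to 63 characters; letters, digits, and hyphens; every hyphen sits between two alphanumeric characters) and builds the resource body. The function names, the factory name, and the region are hypothetical, and a local check cannot confirm that the name is globally unique.

```python
import json
import re

# Assumed naming rule: letters, digits and hyphens only, with every hyphen
# surrounded by alphanumerics (no leading, trailing or doubled hyphens).
_NAME_RE = re.compile(r"^[A-Za-z0-9]+(-[A-Za-z0-9]+)*$")


def is_valid_factory_name(name: str) -> bool:
    """Return True if the name passes the local format check (3-63 chars)."""
    return 3 <= len(name) <= 63 and _NAME_RE.fullmatch(name) is not None


def factory_resource(name: str, region: str) -> dict:
    """Build an ARM resource body for a data factory (illustrative only)."""
    if not is_valid_factory_name(name):
        raise ValueError(f"invalid data factory name: {name!r}")
    return {
        "type": "Microsoft.DataFactory/factories",
        "apiVersion": "2018-06-01",
        "name": name,
        "location": region,
        "identity": {"type": "SystemAssigned"},
        "properties": {},
    }


if __name__ == "__main__":
    # "contoso-adf-demo" and "eastus" are placeholder values.
    print(json.dumps(factory_resource("contoso-adf-demo", "eastus"), indent=2))
```

The resulting dictionary could be placed in the resources array of an ARM template; deploying it still requires the Azure CLI, PowerShell, or the portal.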

Step 2: Create a Self-hosted Integration Runtime (IR)

After setting up Azure Data Factory, you’ll need to create a Self-hosted IR.

2.1. Navigate to Integration Runtimes in ADF

  • Go to your newly created Azure Data Factory.
  • On the factory’s Overview page, click Launch studio (labeled Author & Monitor in older versions of the portal).
  • In the ADF UI, click on the Manage tab (gear icon).
  • Under Connections, click Integration Runtimes.

2.2. Set Up a New Integration Runtime

  • Click + New.
  • In the Integration Runtime Setup window, choose Self-hosted and click Continue.
  • Name your integration runtime (for example: “MySelfHostedIR”).
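
Behind the UI, a self-hosted IR is a very small resource definition. As a rough sketch (not the only way to do this), the code below builds the JSON body and the management REST URL that a PUT request would target. The helper names and all identifiers are placeholders, and the request itself would still need an authenticated client, which is left out here.

```python
import json

API_VERSION = "2018-06-01"  # Data Factory REST API version


def self_hosted_ir_definition(description: str = "") -> dict:
    """Request body describing a self-hosted integration runtime."""
    return {"properties": {"type": "SelfHosted", "description": description}}


def ir_resource_url(subscription_id: str, resource_group: str,
                    factory_name: str, ir_name: str) -> str:
    """URL to PUT the definition to (authentication is not handled here)."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name}"
        f"/integrationRuntimes/{ir_name}"
        f"?api-version={API_VERSION}"
    )


if __name__ == "__main__":
    # Every identifier below is a placeholder.
    print("PUT", ir_resource_url("<subscription-id>", "my-rg",
                                 "contoso-adf-demo", "MySelfHostedIR"))
    print(json.dumps(self_hosted_ir_definition("IR for on-premises sources"),
                     indent=2))
```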

2.3. Download the Integration Runtime Installer

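  • After you name the IR and click Create, the next screen offers two options: Express setup, which downloads a package that installs and registers the runtime in one pass, and Manual setup, which gives you a download link for the installer plus the authentication keys.
  • For manual setup, click the download link and copy the installer to the on-premises or virtual machine that will host the IR.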

2.4. Install the Integration Runtime on a Local Machine

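  1. Run the downloaded installer on the Windows machine that will host the IR.
  2. Follow the prompts:
    • Accept the license terms and choose the installation folder.
    • Click Install, then Finish to complete the installation.

If you used the Express setup package instead, it installs and registers the runtime automatically, so you can skip section 2.5.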

2.5. Configure the Integration Runtime Installer

  • Once the installation finishes, the Integration Runtime Configuration Manager opens on its registration page.
  • Go back to Azure Data Factory and copy one of the authentication keys (Key1 or Key2) from the integration runtime setup pane.
  • Paste the authentication key into the configuration window on your local machine and click Register.

Once the IR is successfully registered, you will see its status as Online in the Azure Data Factory portal.
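
The authentication keys can also be fetched programmatically through the `listAuthKeys` action of the management REST API. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the sample response is invented for illustration, and the actual POST call with a bearer token is not shown. Treat the keys as secrets and avoid printing or logging real ones.

```python
API_VERSION = "2018-06-01"  # Data Factory REST API version


def list_auth_keys_url(subscription_id: str, resource_group: str,
                       factory_name: str, ir_name: str) -> str:
    """URL for the POST call that returns the IR's authentication keys."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name}"
        f"/integrationRuntimes/{ir_name}/listAuthKeys"
        f"?api-version={API_VERSION}"
    )


def pick_auth_key(response_body: dict) -> str:
    """Return the first non-empty key from a listAuthKeys-style response."""
    for field in ("authKey1", "authKey2"):
        key = response_body.get(field)
        if key:
            return key
    raise ValueError("response contains no authentication key")


if __name__ == "__main__":
    # Invented sample response; real keys are long opaque strings.
    sample = {"authKey1": "<key-1-placeholder>", "authKey2": "<key-2-placeholder>"}
    print("POST", list_auth_keys_url("<subscription-id>", "my-rg",
                                     "contoso-adf-demo", "MySelfHostedIR"))
    print("Key to paste into the installer:", pick_auth_key(sample))
```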

Step 3: Configure the Self-hosted Integration Runtime

3.1. Set Up High Availability (Optional)

To make your Self-hosted IR highly available, install the integration runtime on additional machines (up to four nodes per IR) and register each one with the same authentication key. Jobs are then distributed across the nodes, and if one machine fails, the others continue processing the data.

3.2. Configure Proxy (If Applicable)

If your organization uses a proxy, you can configure the IR to work behind the proxy:

  • During the installation, you’ll have the option to provide proxy details, or you can go into the settings after installation and update the proxy configuration.

Step 4: Test the Integration Runtime

Once your Self-hosted IR is set up, it’s essential to test it by running a data transfer task to ensure everything works as expected.

4.1. Create a Pipeline in Azure Data Factory

  • Go to the Author tab in ADF Studio.
  • Click + New Pipeline.
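  • Add a Copy data activity and configure its Source and Sink (destination) datasets. The IR is chosen on the linked service behind each dataset, so at least one of them must use a linked service that connects through the Self-hosted IR.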
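
For reference, a pipeline with a single Copy activity looks roughly like this in ADF's JSON format. This is a sketch that builds the definition in Python; the pipeline, activity, and dataset names are hypothetical, and the `SqlServerSource` and `AzureSqlSink` types assume a SQL Server to Azure SQL Database copy, so other stores would need different source and sink types.

```python
import json


def dataset_ref(name: str) -> dict:
    """Reference to a dataset that already exists in the factory."""
    return {"referenceName": name, "type": "DatasetReference"}


def copy_pipeline(pipeline_name: str, source_dataset: str,
                  sink_dataset: str) -> dict:
    """Pipeline definition with one Copy activity (illustrative names)."""
    return {
        "name": pipeline_name,
        "properties": {
            "activities": [
                {
                    "name": "CopyOnPremToAzure",
                    "type": "Copy",
                    "inputs": [dataset_ref(source_dataset)],
                    "outputs": [dataset_ref(sink_dataset)],
                    "typeProperties": {
                        # Assumed store types; adjust to your source and sink.
                        "source": {"type": "SqlServerSource"},
                        "sink": {"type": "AzureSqlSink"},
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(copy_pipeline("TestSelfHostedIR", "OnPremOrders",
                                   "AzureOrders"), indent=2))
```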

4.2. Select the Self-hosted Integration Runtime

When configuring your Linked Services for the pipeline, select the Self-hosted IR as the integration runtime.
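
In the linked service's JSON, this choice appears as a `connectVia` reference. The sketch below shows what a SQL Server linked service bound to the IR might look like; the linked service name and connection string are placeholders, and in practice the credentials would normally come from Azure Key Vault rather than being written inline.

```python
import json


def sql_server_linked_service(name: str, connection_string: str,
                              ir_name: str) -> dict:
    """Linked service definition that routes traffic through a named IR."""
    return {
        "name": name,
        "properties": {
            "type": "SqlServer",
            "typeProperties": {"connectionString": connection_string},
            # connectVia is what ties the linked service to the self-hosted IR.
            "connectVia": {
                "referenceName": ir_name,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


if __name__ == "__main__":
    definition = sql_server_linked_service(
        "OnPremSqlServer",  # hypothetical linked service name
        "Server=<server>;Database=<database>;Integrated Security=True;",
        "MySelfHostedIR",
    )
    print(json.dumps(definition, indent=2))
```

A linked service without a `connectVia` block falls back to the default Azure integration runtime, which cannot reach a database inside a private network.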

4.3. Run the Pipeline

  • Test your pipeline by running it and confirming that the Self-hosted IR is used to transfer data between your on-premises or local environment and Azure services.

Step 5: Monitor the Self-hosted IR

You can monitor the performance and health of your Self-hosted IR:

5.1. Go to the Monitor tab in Azure Data Factory

  • In the Azure Data Factory UI, click on the Monitor tab.
  • Under Pipeline runs, review each run’s status and duration. Under Runtimes & sessions, select Integration runtimes to view the status of your Self-hosted IR and per-node details such as CPU and memory usage.
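
The same health information is available from the `getStatus` action of the management REST API, which is handy for automated checks. The sketch below summarizes a status payload; the payload shape (`properties.state` and `properties.typeProperties.nodes[].status`) is my assumption about the response format, the sample values are invented, and the function name is hypothetical.

```python
def summarize_ir_status(payload: dict) -> dict:
    """Condense a getStatus-style payload into a few health indicators."""
    props = payload.get("properties", {})
    nodes = props.get("typeProperties", {}).get("nodes", [])
    online = [n.get("nodeName", "?") for n in nodes
              if n.get("status") == "Online"]
    return {
        "state": props.get("state", "Unknown"),
        "total_nodes": len(nodes),
        "online_nodes": online,
        # Two or more online nodes means one can fail without an outage.
        "redundant": len(online) >= 2,
    }


if __name__ == "__main__":
    # Invented sample payload for illustration only.
    sample = {
        "name": "MySelfHostedIR",
        "properties": {
            "type": "SelfHosted",
            "state": "Online",
            "typeProperties": {
                "nodes": [
                    {"nodeName": "Node_1", "status": "Online"},
                    {"nodeName": "Node_2", "status": "Offline"},
                ]
            },
        },
    }
    print(summarize_ir_status(sample))
```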

Example Use Case

Let’s say you have an on-premises SQL Server database and need to securely transfer data to an Azure SQL Database using Azure Data Factory.

  1. Create the Self-hosted IR in Azure Data Factory.
  2. Install the Integration Runtime on a server that can reach the on-premises SQL Server database, and register it with the authentication key from the Data Factory.
  3. Create a pipeline that copies data from the on-premises SQL Server to Azure SQL Database, and ensure that the Self-hosted IR is selected on the linked service for the Source connection.
  4. Run the pipeline to validate the setup.

With this setup, the on-premises SQL Server never has to accept inbound connections from the internet: the Self-hosted IR initiates only outbound connections to Azure, and the data is encrypted in transit.
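
A quick way to sanity-check such a setup is to trace which integration runtime each dataset ends up using. The sketch below walks from a dataset to its linked service and then to the `connectVia` reference. All dataset and linked service names are hypothetical, and the definitions are trimmed to the fields the check needs.

```python
DEFAULT_IR = "AutoResolveIntegrationRuntime"  # used when connectVia is absent

# Trimmed, hypothetical definitions for the SQL Server to Azure SQL scenario.
LINKED_SERVICES = {
    "OnPremSqlServer": {"properties": {
        "type": "SqlServer",
        "connectVia": {"referenceName": "MySelfHostedIR",
                       "type": "IntegrationRuntimeReference"},
    }},
    "AzureSqlDb": {"properties": {"type": "AzureSqlDatabase"}},
}
DATASETS = {
    "OnPremOrders": {"properties": {
        "type": "SqlServerTable",
        "linkedServiceName": {"referenceName": "OnPremSqlServer",
                              "type": "LinkedServiceReference"},
    }},
    "AzureOrders": {"properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": {"referenceName": "AzureSqlDb",
                              "type": "LinkedServiceReference"},
    }},
}


def runtime_for_dataset(dataset_name: str, datasets: dict,
                        linked_services: dict) -> str:
    """Return the name of the IR that the dataset's linked service uses."""
    dataset = datasets[dataset_name]
    ls_name = dataset["properties"]["linkedServiceName"]["referenceName"]
    connect_via = linked_services[ls_name]["properties"].get("connectVia")
    return connect_via["referenceName"] if connect_via else DEFAULT_IR


if __name__ == "__main__":
    for name in DATASETS:
        print(name, "->", runtime_for_dataset(name, DATASETS, LINKED_SERVICES))
```

If the source dataset resolves to the default Azure runtime instead of the Self-hosted IR, the copy will not be able to reach the on-premises database.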

Final Notes:

  • The Self-hosted IR machine needs to remain powered on and able to make outbound HTTPS (port 443) connections to Azure.
  • You can configure high availability by registering multiple machines under the same IR.
  • Regularly monitor the IR to ensure it’s running correctly and handling data transfers smoothly.