DBFS002 – DBFS Mount Failure in Databricks

Introduction

The DBFS002: DBFS mount failure error indicates that Databricks File System (DBFS) failed to mount cloud storage. This error can happen when mounting an external storage system like AWS S3, Azure Data Lake Storage (ADLS), or Google Cloud Storage (GCS) to Databricks. When this failure occurs, you cannot read from or write to the mount point.

🚨 Common symptoms of DBFS mount failure:

  • Error: “DBFS002: Cannot mount or access external storage.”
  • Mount points do not appear in dbutils.fs.mounts() output.
  • Cluster jobs fail with file-not-found or connection errors.

This guide will walk you through common causes of DBFS002 errors and their solutions.
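
Before digging into specific causes, a quick sanity check from a notebook shows whether the mount is registered and readable. A minimal sketch (the /mnt/my-mount path is a placeholder):

# List all registered mount points and their sources
display(dbutils.fs.mounts())

# Try listing the mount; a DBFS002 or access error here confirms the mount is broken
display(dbutils.fs.ls("/mnt/my-mount"))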


1. Incorrect Storage Account Credentials or Access Key

Symptoms:

  • DBFS002: Mount failure due to incorrect credentials.
  • Cannot list or access files from the mounted path.
  • HTTP 403 Forbidden errors in logs.

Causes:

  • The provided storage account access key or credentials are incorrect.
  • Expired or rotated keys without updating the mount configuration.

Fix:

Verify your storage credentials:

  • Check if the access key or SAS token is valid.
  • For Azure, list the account keys to confirm the value you are mounting with: az storage account keys list --account-name <storage-account> (regenerate a rotated or compromised key with az storage account keys renew).

Update the mount with the correct credentials:

dbutils.fs.mount(
  source="wasbs://<container>@<storage-account>.blob.core.windows.net",
  mount_point="/mnt/my-mount",
  extra_configs={"fs.azure.account.key.<storage-account>.blob.core.windows.net": "your-access-key"}
)

For AWS S3, use the correct credentials in configuration:

dbutils.fs.mount(
  source="s3a://my-bucket",
  mount_point="/mnt/my-s3",
  extra_configs={"fs.s3a.access.key": "your-access-key", "fs.s3a.secret.key": "your-secret-key"}
)
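
Rather than hardcoding keys in notebook source, you can keep them in a Databricks secret scope and read them at mount time. A minimal sketch, assuming a scope named my-scope already holds the key under storage-access-key:

# Fetch the key from a secret scope so it never appears in notebook source
access_key = dbutils.secrets.get(scope="my-scope", key="storage-access-key")

dbutils.fs.mount(
  source="wasbs://<container>@<storage-account>.blob.core.windows.net",
  mount_point="/mnt/my-mount",
  extra_configs={"fs.azure.account.key.<storage-account>.blob.core.windows.net": access_key}
)

The same pattern works for the S3 access and secret keys above.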

2. Insufficient Permissions on Cloud Storage

Symptoms:

  • DBFS002: Permission denied.
  • Cannot access or list files from the storage mount.
  • HTTP 403 Forbidden or 401 Unauthorized errors.

Causes:

  • Incorrect IAM policies (AWS) or RBAC settings (Azure).
  • The Databricks service principal does not have permissions to access the storage.

Fix:

For AWS S3, ensure the IAM role has the required permissions:

{
  "Effect": "Allow",
  "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
  "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"]
}

For Azure Data Lake Storage, assign the correct role:

az role assignment create --assignee <databricks-service-principal> --role "Storage Blob Data Contributor" --scope <storage-account-scope>

Check permissions using the Azure CLI:

az storage blob list --account-name <storage-account> --container-name <container>
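
Once the assignment has propagated (Azure role assignments can take a few minutes), a quick check from a Databricks notebook confirms access; /mnt/my-mount is a placeholder:

# Listing the mounted path should now succeed instead of returning 403/401
display(dbutils.fs.ls("/mnt/my-mount"))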

3. Network Connectivity Issues

Symptoms:

  • DBFS002: Cannot connect to storage endpoint.
  • Mount command fails with connection timeout.
  • Cluster logs show NoRouteToHostException or Connection Refused.

Causes:

  • Storage account is in a different region than the Databricks cluster.
  • VPC or firewall rules block traffic between Databricks and cloud storage.
  • Private endpoint misconfigurations for Azure or AWS PrivateLink.

Fix:

Ensure the Databricks cluster can reach the storage endpoint:

ping <storage-endpoint>
nc -zv <storage-endpoint> 443
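
Because the mount is created on the cluster driver, run these checks from a notebook %sh cell rather than from your laptop, and note that ping is often blocked by storage endpoints even when HTTPS works. A minimal example, assuming an Azure Blob endpoint:

%sh
# Runs on the driver node; checks TCP reachability of the storage endpoint on port 443
nc -zv <storage-account>.blob.core.windows.net 443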

For Azure, check the firewall and private endpoint settings:

  • Whitelist the Databricks IP range for your storage account.
  • Ensure Private Endpoints are correctly configured.

For AWS S3, verify VPC Endpoint settings:

aws ec2 describe-vpc-endpoints --filters "Name=service-name,Values=com.amazonaws.<region>.s3"
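
If that query returns no S3 endpoint for the cluster's VPC, you can add a gateway endpoint; a sketch with placeholder IDs:

aws ec2 create-vpc-endpoint --vpc-id <vpc-id> --service-name com.amazonaws.<region>.s3 --route-table-ids <route-table-id>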

4. Using Legacy DBFS Mounts

Symptoms:

  • DBFS002: Deprecated mount point error.
  • Mount points are missing or unresponsive.
  • Mounts worked earlier but suddenly stopped.

Causes:

  • Databricks deprecated certain legacy DBFS mounts.
  • Upgraded Databricks Runtime does not support the old mount configuration.

Fix:

Migrate to a new mount using the latest storage APIs. For Azure, OAuth authentication needs the full service-principal configuration, not just the auth type:

dbutils.fs.mount(
  source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
  mount_point="/mnt/new-mount",
  extra_configs={
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope>", key="<client-secret>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token"}
)

Use the ADLS Gen2 endpoint (abfss://) for Azure Data Lake Storage.


5. Misconfigured Spark Configuration

Symptoms:

  • DBFS002: Mount failure with Spark-related errors.
  • Cluster logs show java.lang.IllegalArgumentException or NoSuchMethodError.
  • Mounts fail only on specific clusters.

Causes:

  • Missing or incorrect Spark configuration for cloud storage.
  • Cluster initialization scripts are misconfigured.

Fix:

Check and set the correct Spark configuration for storage:

spark.conf.set("fs.azure.account.key.<storage-account>.blob.core.windows.net", "your-access-key")

Ensure cluster initialization scripts are properly configured.
Restart the cluster after updating configurations.
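
If the setting has to apply to every notebook on the cluster, it can also go in the cluster's Spark config with a secret reference instead of a plaintext key. A sketch, assuming a secret scope named my-scope holds the key (the spark.hadoop. prefix forwards the setting to the Hadoop configuration used for storage access):

spark.hadoop.fs.azure.account.key.<storage-account>.dfs.core.windows.net {{secrets/my-scope/storage-access-key}}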


6. Expired or Revoked OAuth Tokens (Azure Key Vault)

Symptoms:

  • DBFS002: OAuth token expired error.
  • Mount worked previously but fails after token expiration.

Causes:

  • The OAuth token used to authenticate with Azure Key Vault expired.
  • No token refresh mechanism is configured.

Fix:

Renew the token, store the refreshed credential in your Key Vault-backed secret scope, and read it back when updating the mount configuration:

dbutils.secrets.get(scope="my-keyvault-scope", key="my-access-token")

Use Azure Managed Identity for automatic token refresh:

spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ManagedIdentityTokenProvider")

7. Incorrect Mount Point Path

Symptoms:

  • Error: “Mount point already exists.”
  • Mount point path does not appear in dbutils.fs.mounts() output.

Causes:

  • The specified mount point path is incorrect.
  • The mount point already exists but is not properly mounted.

Fix:

List all mount points to verify the path:

dbutils.fs.mounts()

Unmount and remount the path if necessary:

dbutils.fs.unmount("/mnt/my-mount")
dbutils.fs.mount(source="...", mount_point="/mnt/my-mount", extra_configs={...})
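
If you script remounts, a small helper keeps the operation idempotent so "Mount point already exists" errors do not recur. A minimal sketch (the helper name and arguments are illustrative):

def remount(source, mount_point, configs):
  # Drop the existing registration first so mount() does not fail with "already exists"
  if any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
    dbutils.fs.unmount(mount_point)
  dbutils.fs.mount(source=source, mount_point=mount_point, extra_configs=configs)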

Best Practices for Avoiding DBFS Mount Failures

Use OAuth or Managed Identity for Secure Authentication

  • Avoid using static access keys; prefer token-based authentication.

Ensure Correct Storage Permissions

  • Grant only the minimum permissions required to the Databricks service principal or IAM role that accesses the storage.

Monitor and Refresh Credentials Regularly

  • Rotate access keys and OAuth tokens periodically.

Use abfss:// for ADLS Gen2 Storage

  • Ensure you are using the latest Gen2 endpoint for Azure Data Lake Storage.

Conclusion

DBFS002 mount failures in Databricks can occur due to misconfigured credentials, insufficient permissions, network connectivity issues, or deprecated configurations. By following the troubleshooting steps and best practices outlined here, you can quickly diagnose and resolve mount failures to ensure reliable access to external cloud storage in Databricks.
