Mohammad Gufran Jahangir August 17, 2025 0

What is column‑level masking?

Column masks hide or transform sensitive values (PII/PCI/PHI) at query time. Every read of a masked column is replaced by the output of a masking function, so users see only the permitted representation (for example, j***@company.com). Masks are defined with a SQL UDF and attached to a column. One mask per column.

Key behaviors

  • Runs at scan time; filters, joins, and ORDER BY operate on the masked value. Plan accordingly.
  • Mask UDF must return the same data type as the column
  • Works everywhere: notebooks, SQL Warehouses, dashboards. Manage centrally in Unity Catalog.

Minimal setup (copy–paste)

Assume a UC catalog demo, schema curated, table customers(email, phone, ssn, region, ...).

1) Create a masking function (SQL UDF)

USE CATALOG demo; USE SCHEMA util;

-- Mask email: keep first char + domain
CREATE OR REPLACE FUNCTION demo.util.mask_email(s STRING)
RETURNS STRING
RETURN regexp_replace(s, '(^.).*(@.*$)', '$1***$2');

-- Mask phone: show only last 4 digits
CREATE OR REPLACE FUNCTION demo.util.mask_phone(s STRING)
RETURNS STRING
RETURN concat('(***)***-', right(regexp_replace(s,'\\D',''), 4));

Tip: You can make role‑aware masks using predicates like is_account_group_member('auditors') inside the UDF to show full values only to certain groups.

2) Attach masks to columns

USE CATALOG demo; USE SCHEMA curated;

ALTER TABLE customers
  ALTER COLUMN email SET MASK demo.util.mask_email;

ALTER TABLE customers
  ALTER COLUMN phone SET MASK demo.util.mask_phone;

Mask is active immediately for all queries. To remove: ALTER TABLE customers ALTER COLUMN email DROP MASK;.

3) Grants you need

  • Readers need SELECT on the table.
  • Callers of the UDF need EXECUTE on the function (or on the schema containing it).
GRANT USE CATALOG ON CATALOG demo TO `analysts`;
GRANT USE SCHEMA  ON SCHEMA  demo.util TO `analysts`;
GRANT EXECUTE     ON SCHEMA  demo.util TO `analysts`;
GRANT SELECT      ON TABLE   demo.curated.customers TO `analysts`;

Parameterized masks (use other columns)

You can pass other columns into the mask to vary behavior (for example, by region or role).

-- Function that reveals full email for auditors only
CREATE OR REPLACE FUNCTION demo.util.mask_email_role(email STRING, role_hint STRING)
RETURNS STRING
RETURN CASE WHEN role_hint = 'auditor' OR is_account_group_member('auditors')
            THEN email ELSE demo.util.mask_email(email) END;

ALTER TABLE demo.curated.customers
  ALTER COLUMN email SET MASK demo.util.mask_email_role USING COLUMNS (email, role_hint);

When to use masks vs. row filters vs. dynamic views

  • Column masks: hide/transform specific fields, keep same table name.
  • Row filters: restrict which rows a user can see (RLS). Often used alongside masks.
  • Dynamic views: build a curated read‑only projection across one or more tables; good for sharing or cross‑table logic.

Gotchas & performance notes

  • Joins on masked columns use masked values—may break equality joins; join on surrogate keys or unmasked fields instead.
  • One mask per column: if you need multiple behaviors, encode logic inside the UDF using identity/group checks.
  • Auditing/governance: Manage masks in Catalog Explorer or via SQL; document in data catalog. Some tools show an indicator that a column is masked.

End‑to‑end example: email masking with ABAC (tag‑based)

If you use attribute‑based access control, associate a governed tag like sensitivity = PII with the column and bind a policy that applies a mask everywhere the tag appears. (Centralized and scalable.)

High‑level steps:

  1. Create a governed tag and policy that maps sensitivity = PIIdemo.util.mask_email.
  2. Tag columns in Catalog Explorer or via SQL.
  3. Verify queries show masked values for non‑privileged users.

Troubleshooting checklist

  • Mask not applied? Ensure you attached it to the exact column; re‑run ALTER COLUMN ... SET MASK. Confirm you aren’t reading a different table/view.
  • Permission errors? Grant EXECUTE on the UDF (or schema) and SELECT on the table
  • Unexpected query results (joins/order): remember the mask rewrites the expression; test with EXPLAIN and sample queries.

Cheat‑sheet

  • Create function: CREATE OR REPLACE FUNCTION cat.sch.fn(args) RETURNS TYPE RETURN <expr>;
  • Attach mask: ALTER TABLE t ALTER COLUMN c SET MASK cat.sch.fn [USING COLUMNS (...)].
  • Drop mask: ALTER TABLE t ALTER COLUMN c DROP MASK.
  • Grants: EXECUTE on function/schema + SELECT on table; USE CATALOG/SCHEMA to browse.
  • Plan impacts: joins/order/filter on masked values.

Masks are a powerful, low‑friction way to keep sensitive columns safe while preserving the original table names. Start with a simple UDF, attach it to a column, and scale up with tags/ABAC when you’re ready.

Category: 
guest
0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments