Click on the banner to learn about and purchase my database training on Azure

SQL Server 2016 - Data Masking with Dynamic Data Masking (DDM)

Views: 7.647 views
This post is the 3 part of 5 in the series. Data Protection
Reading Time: 7 minutes

Hello people,
All right?

Introduction

In this post I would like to demonstrate a very interesting feature of SQL Server, available from the 2016 release, which is Dynamic Data Masking (DDM) and allows us to quickly and conveniently mask and hide sensitive information from certain users.

Unlike the encryption capabilities of SQL Server Transparent Data Encryption (TDE) and Always Encrypted, which physically encrypt database data and files (even backups and log backups are generated already encrypted, and MDF, LDF, and NDF files are also encrypted). Dynamic Data Masking does not really encrypt the data, it only masks database queries for non-privileged users.

To apply data masking, you must run an ALTER command on the column and you want to apply the mask (data masking is at the column level) or you can add this definition in the table CREATE.

Limitations and Restrictions

You cannot define a masking rule for the following column types:

  • Encrypted Columns (Always Encrypted)
  • FILESTREAM
  • A mask cannot be set on a computed column, but if the computed column depends on a column with a MASK, it will return masked data.
  • A data masked column cannot be a key to a FULLTEXT index.

Mask Types

In Dynamic Data Masking, you can define how data will be masked using some data masking functions, which will be detailed below:

FunctionDescriptionExamples
DefaultMasking according to the data types of the designated fields.

For string data types, the original values ​​will be replaced by the characters XXXX. Applies to data types tank, nchar, varchar, nvarchar, text e ntext

For numeric data types, the original value will be replaced by the number 0 (zero). Applies to data types bigint, bit, decimal, int, money, numeric, smallint, smallmoney, tinyint, float e real.

For date and time data types, the date 1900-01-01 00: 00: 00.0000000 will be used. Applies to data types data, datetime2, datetime, datetimeoffset, smalldatetime e team.

For binary data types, the original value will be replaced by a single ASCII 0 value byte.
Applies to data types binary, varbinary e image.
ALTER TABLE dbo.DDM ALTER COLUMN Name ADD MASKED WITH (FUNCTION = 'default ()')
EmailThe masking method that exposes the first letter of an email address and the constant suffix ".com" in the form of an email address

Ex: [email protected] will be masked as [email protected]
ALTER TABLE dbo.DDM ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email ()')
RandomThe random masking function (start_number, end_number) allows you to replace a numeric value with a random number generated from a predefined interval.

Columns that have data types that accept decimal values ​​can accept ranges in the form of decimal or integer numbers in the random () function. If the range is composed of 2 integers in a column that accepts decimal values, the generated random values ​​will also have decimal places with random values ​​(see examples).
ALTER TABLE dbo.DDM ALTER COLUMN ADD MASKED WITH Salary (FUNCTION = 'random (0.5, 0.99)')

ALTER TABLE dbo.DDM ALTER COLUMN Weight ADD MASKED WITH (FUNCTION = 'random (70, 120)')
PartialThe partial masking function (prefix, fill, suffix) shows the first N letters (N = prefix) and last N letters (N = suffix) and adds a string S of custom fill characters in the middle (S = fill).

Note: If the original value is too short to complete the entire mask, part of the prefix or suffix will not be exposed, ie if the string is "Dirceu" and the masking function is partial (4, "XXXX", 4) , the string will be masked as "XXXX", disregarding the prefix and suffix values ​​that will not be shown.
ALTER TABLE dbo.DDM ALTER COLUMN CPF ADD MASKED WITH (FUNCTION = 'partial (0, "XXXXXXXXX", 2)')

ALTER TABLE dbo.DDM ALTER COLUMN ADD MASKED WITH Position (FUNCTION = 'partial (3, "XXXXXXXXXXX", 4)')

Examples and demos

To better demonstrate the features and functions of Dynamic Data Masking, I have prepared an example below, demonstrating how to create a table with some fields already masked, then we will view these fields with the mask.

I will alter the table to mask more columns, view the masked data (now with all columns) and then view 5x the result to observe the behavior of the random () masking function when executed multiple times.

Dynamic Data Masking Example

Viewing original data (with sysadmin user):

Viewing masked data after table creation (with ordinary user):

Viewing masked data after all changes:

Security and Permissions

To create or alter a table using Dynamic Data Masking, you do not need any special permissions, just the traditional table creation / modification permissions (CREATE TABLE and / or ALTER).

To add, change, or remove a column mask, the user must have the ALTER ANY MASK and ALTER permissions.

Users with SELECT permission on a table will be able to view table data, and masked columns will display masked data. In order for the user to be able to view the original data, ie without masks, he must have the UNMASK TO GRANT UNMASK TO [User] permission.

In addition, the CONTROL permission on the database includes ALTER ANY MASK and UNMASK permissions, ie users of roles db_owner or sysadmin can view the original data from masked columns.

Which columns are masked?

To identify which columns of your database are masked using Dynamic Data Masking, simply execute the query below:

Return example:

Breaking Dynamic Data Masking

Once the features of Dynamic Data Masking are presented, we need to also say that this method should not be the only way to ensure the security of your data. While it helps protect sensitive information, it has some flaws that allow unauthorized users to access that information using brute force techniques.

In the example above, I entered a record in the table where the salary was $ 12.345,67. Using brute force (or trial and error), we can find the approximate value of this salary (with patience or by creating an algorithm for it, up to the exact amount)

Example 1 - Numeric value:
In this example, I will demonstrate how it is possible to identify masked numeric values ​​from simple SELECT's in the table. In the example below, I only made 6 queries and I was able to discover a value very close to the real value. In case I wanted to find out the exact value, a few more queries were enough for that:

Want to find out the exact value? Let's use a simple script for this:

Result: Joãozinho's salary revealed!

Example 2 - Text
Here, I will demonstrate how to return the exact string that was masked using a character table and join letter by letter with that table. Each letter of the masked string will join my character table to find out what the original character is.

That's it folks!
I hope you enjoyed this post and even more!