SQL Server 2016 - How to archive historical tables in Azure with Stretch Database
Views: 449 views
In this article
Reading Time: 14minutes
Introduction
In today's post, I would like to demonstrate to you how to archive historical tables in Azure with Stretch Database, available from SQL Server 2016. Stretch Database is a feature of the DBMS that migrates your cold data (data that is not changed or queried often) automatically, transparently and securely to the Microsoft Azure cloud because data is stored encrypted on the Azure (using the Always Encrypted) and allow you to access cold data (in Azure) and hot data (local) in a transparent way for the application, ie the whole process is done by Database Engine in a totally transparent way, where in the same query, you may be browsing data in the cloud and locations without even realizing it.
Among the advantages of using Stretch Database is reduced local disk space consumption on hot data as historical data (which is often the largest amount of data) has been moved to the cloud. This also ensures a faster query on hot data (fewer pages to process) and facilitates the maintenance of this data as backups will be performed faster as well as many other administrative tasks such as index rebuild, checkdb, etc.
How Stretch Database Works
After enabling Stretch Database for an instance of SQL Server and a database, and selecting at least one table, it silently begins migrating cold data to Azure.
If you store cold data in a separate table, you can migrate the entire table.
If the table contains hot and cold data, you can specify a filter function to select the rows to migrate.
You do not need to change existing queries and client applications.. You still have direct access to local and remote data, even during data migration. There is a small amount of latency for remote queries, but you can only be affected by this latency by querying cold data, which is in Azure.
Stretch Database ensures no data will be lost. if a failure occurs during migration. It also has retry logic to address connection issues that may occur during migration. The sys.dm_db_rda_migration_status DMV provides the current data migration status.
You can pause data migration to troubleshoot the local server or to maximize available network bandwidth.
To make sure your data is actually stored in the cloud, we can use the sp_spaceused To identify how many rows and the amount of data in the table, you can also filter data that is local and remote:
How to implement Stretch Database using SSMS
View content
A very simple way to deploy Stretch Database and start storing your tables in the Microsoft Azure cloud is by using the SQL Server Management Studio (SSMS) interface. I really enjoy using the Wizard to implement Stretch Database, especially in the first table, as it already allows you to create databases and encryption keys (Always Encrypted) without even opening Azure Portal.
The first step is to select the table you would like to store in the cloud using the “Object Explorer”, right-click and select the “Stretch Database” menu and click on the “Enable” option.
The first screen of the Stretch Database Wizard displays a summary of what this feature will look like.
In the case of the SalesOrderHeader table, the limitations that are preventing Stretch Database usage are check, default, and foreign keys constraints. To get around this, I will create a new table (SalesOrderHeader2) with the contents of the original table (SalesOrderHeader), but without the contraints.
Now, let's try to activate Stretch Database on the newly created table. Note that the default option is “Entire Table”, that is, the entire table will be stored in the cloud. If you want to keep a part of the data in the local infrastructure, even to have a better performance in the most frequent queries, click on this link to open the configuration screen for migration to the cloud.
In case you want to customize the way of identifying the “cold” data (data rarely used - stored in the cloud) and “hot” data (data used frequently - stored in the local infrastructure), you can select the option “Choose Rows”, define a name for this filter and choose the column and expression that will be used to filter the data, as shown below:
And this is the Stretch Database configuration screen after creating a filter to select the rows to migrate:
On this screen, we must configure the Azure account that will store the “cold” data from the source table.
Already in this screen, we will set a strong password for the database master key (DMK), which will be used to encrypt the data in the cloud (using Always Encrypted, all transparently to you)
And here, we can configure the Firewall of this database, either using your current IP or providing a range of IP's to allow access. This setting depends a lot on your network, so it can be changed according to your scenario. If you want to change these firewall settings after enabling Stretch Database, this can be done through the Azure Portal.
Summary of actions to be performed:
Data migration successful! Now just wait for the data migration to finish in the background, in a totally transparent way. To track data transfer progress, use the sys.dm_db_rda_migration_status DMV:
Note that after creating Stretch Database in at least one table, the database icon changes in the SSMS Object Explorer screen.
Additionally, new 2 objects are created: 1 External Data Source (Stretch Server) and 1 Sorting Function (fncLast_4_Years)
Video demo
How to implement Stretch Database using Transact-SQL (T-SQL)
View content
After demonstrating how to implement Stretch Database using the SSMS interface, I will now share with you how to do the same thing using only T-SQL commands.
The first step is to enable Stretch Database on the instance:
Transact-SQL
1
2
3
4
5
EXECsp_configure'remote data archive','1';
GO
RECONFIGURE;
GO
Before you can enable Stretch Database on individual tables, you must enable it on the database. Enabling Stretch Database on a database or table requires db_owner and CONTROL DATABASE permissions.
You will need to manually log in to Azure Portal to create a new Stretch Database server (if you don't have one yet). Remember to configure firewall rules to be able to access it.
Now is the time to configure the encryption of data that will be sent to Stretch Database:
It is noteworthy that only deterministic functions can be used in the Stretch Database filter, ie you will not be able to use a GETDATE (), for example, to make a dynamic date filter. If you need this, you have to create new functions by changing the filter, and applying an ALTER TABLE to change the filter function:
Transact-SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
-- Criando uma nova função de filtro para aumentar 1 ano da função criada no exemplo anterior
If you analyze the execution plan of a query before you activate Stretch on that table, you will see that the plan will not have the “Remote Query” operator in reading the data from the tables (unless you use external data, such as Linked Server ), only local data operators such as Index Seek, Table Scan, Index Scan, etc.
However, when we activate Stretch in this table, we see that when consulting the data that is on the remote server, the “Remote Query” operator becomes part of our execution plan. This occurs both when you query the entire table and when you query data that is only on the remote server.
If the result of your query is stored only locally, the “Remote Query” operator will not appear in the execution plan.
In the case of the example above, the column “Discontinued” was used as a filter to determine the data that should be migrated or not (1 = Migrate, 0 = Remains local), that is, if no filter is performed (bringing the entire table ) or some other filter that returns data with this flag with both the values 0 and 1 or just the value 1, the “Remote Query” operator will be present, as part of the data will be on the remote server.
If a query returns only the data with this flag = 0, the “Remote Query” operator will not be part of the execution plan, since only data that is in the local infrastructure will be returned.
Pausing, Disabling, and Resuming Data Migration in Stretch Database
View content
To pause or resume data migration in Azure, choose Stretch to a table in SQL Server Management Studio and choose Pause to pause data migration or Resume to resume data migration. You can also use Transact-SQL to pause or resume data migration.
To disable data migration and bring migrated data to the cloud back to your local infrastructure (copying remote data to the Azure table back to SQL Server generates data transfer costs), use the command below :
To disable data migration and abandon migrated to the cloud, use the command below:
Transact-SQL
1
2
3
4
5
6
7
8
USEAdventureWorks
GO
ALTERTABLESalesOrderHeader2
REMOTE_DATA_ARCHIVE=OFF_WITHOUT_DATA_RECOVERY (
MIGRATION_STATE=PAUSED
)
GO
Remember to pause data migration on individual tables when you want to troubleshoot local server problems or to maximize available network bandwidth. Besides that, Disabling Stretch Database from a Table or Database Does Not Delete the Remote Object. If you want to delete the remote table or remote database, drop it using the Azure Management Portal. Remote objects continue to incur Azure costs until you delete them.
How to disable Stretch Database for a database
To disable Stretch Database for a database, you will first need to disable Stretch for all individual tables that are enabled on this database. Use the sys.remote_data_archive_tables view to know what these tables are.
After disabling Stretch from all tables, use the command below to disable Stretch from the database:
Transact-SQL
1
2
3
ALTERDATABASEAdventureWorks
SETREMOTE_DATA_ARCHIVE=OFF
GO
Remember that Disabling Stretch Database from a Database Does Not Delete the Remote Database. If you want to delete the remote database, drop it using the Azure Management Portal. The remote database continues to incur Azure costs until you delete it.
How to monitor data migration progress in Stretch Database
View content
A simple way to monitor the progress of data migration in Stretch Database is by using SQL Server Management Studio (SSMS), by clicking on the desired database, right-clicking, selecting Tasks> Stretch> Monitor, as shown below:
This will open a report where you can view details of the number of rows eligible for migration and how many have already been migrated, as well as the loads (lots of up to 9.999 records) made:
One way to be able to track the progress of data migration using Transact-SQL (T-SQL) is by looking at the sys.dm_db_rda_migration_status view, which will return the information for each load batch:
For information on which databases have Stretch Database enabled, see the sys.remote_data_archive_databases
or the view sys.remote_data_archive_tables to query the tables that have Stretch Database:
Base Backup and Restore with Stretch Database
View content
If you enable Stretch Database in your environment, you often have questions about how backup / restore routines look for those databases that have local data and also in the cloud. Well, as for backup, the process does not change anything.
Base backup with Stretch Database enabled
The backup of local data (“hot” data), that is, that are not eligible for migration yet and are stored in your local infrastructure, continue to be done normally, using BACKUP DATABASE commands and automated through jobs, along with the databases / tables that are 100% local and in the same way that you back up databases / tables that do not have the Stretch Database, including data that are eligible but have not yet been migrated at the time the backup is generated.
The data that has already been migrated and is in the Microsoft cloud, is part of the Azure automatic backup routine, done every 8 hours (at least) in the Staging tables and with data retention of 7 days (point-in-time recovery ), that is, Azure already backs up the “cold” data for you.
Base restore with Stretch Database enabled
When we talk about restoring data from banks / tables that have Stretch Database enabled, the situation is a little different. If the data that needs to be restored is ineligible data, that is, the data that is in your local infrastructure (“hot” data), you will restore the data in the traditional way, using the RESTORE DATABASE command, in the same way how you restore database data without Stretch enabled.
After completing the restore process, you will need to run the command below to reestablish the connection between base and remote base in Azure:
Transact-SQL
1
2
3
4
5
6
7
USE[AdventureWorks]
GO
EXECsys.sp_rda_reauthorize_db
@credential=N'MeuStretchDatabase',
@with_copy=1;
GO
Note: You can look up the credential name in the sys.database_scoped_credentials view.
To restore data that is in Azure, if the problem has been with data that has already been migrated, you will need to perform the restore database in Azure Portal. If you want to connect to a restored Azure database with a different name or in a different region, you can use SP sys.sp_rda_deauthorize_db (requires db_owner permissions) to remove the authenticated connection between a Stretch-enabled local database and the Azure remote database, also useful when the remote server is inconsistent or unavailable.
After executing sp_rda_deauthorize_db, all queries on Stretch-enabled tables and databases will fail because the query mode is set to DISABLED (disabled). To reuse these queries, you must choose one of the 2 options below:
Use the sys.sp_rda_reauthorize_db to reconnect to the remote Azure database. This operation automatically resets the query mode to LOCAL_AND_REMOTE, which is the default behavior for Stretch Database. That is, queries return local and remote data results.
Run to sp_rda_set_query_mode with the LOCAL_ONLY argument to allow queries to continue running against local data only, ignoring remote data.
Recover an Azure dynamic database
The SQL Server Stretch Database service on Azure takes snapshots of all dynamic data with a minimum frequency of 8 hour intervals using Azure Storage Snapshots. These snapshots are kept for 7 days. This allows you to restore data to at least one of the 21 specific points from the last 7 days until the time the last snapshot was taken.
To restore an Azure dynamic database to a previous specific point through the Azure portal, do the following:
Log in to the Azure portal.
On the left side of the screen, select "BROWSE" and "SQL Databases".
Browse to the database and select it.
At the top of the database sheet, click "Restore".
Specify a new database name, select a "Restore Point" and click "Create".
The database restoration process will be started and can be monitored with “NOTIFICATIONS”.
Recover a deleted Azure database
The SQL Server Stretch Database service on Azure takes a database snapshot before a database is removed and retains it for 7 days. After that, it no longer retains database snapshots. This allows you to restore a deleted database to the point where it was deleted.
To restore a deleted Azure database to the point where it was deleted using the Azure portal, do the following:
Log in to the Azure portal.
On the left side of the screen, select "BROWSE" and "SQL Servers".
Browse to the server and select it.
Scroll down to "Operations" on the server blade and click on the "Deleted Databases" block.
Select the deleted database you want to restore.
Specify a new database name and click "Create".
The database restoration process will be started and can be monitored with “NOTIFICATIONS”.
Stretch Database Limitations
View content
Limitations for Stretch-Enabled Tables
Uniqueness is not enforced for UNIQUE constraints and PRIMARY KEY constraints on an Azure table that contains the migrated data.
You cannot UPDATE or DELETE rows that have been migrated or rows that are eligible for migration in a Stretch-enabled table or in a view that includes Stretch-enabled tables.
You cannot INSERT rows in a Stretch-enabled table on a Linked Server.
You cannot create an index for a view that includes Stretch-enabled tables.
Filters on indexes are not propagated to the remote table.
Limitations that currently prevent you from enabling Stretch for a table.
Tables that have more than 1.023 columns or more than 998 indexes
FileTables or tables containing FILESTREAM data
Tables that are replicated or actively using Change Tracking or Change Data Capture (CDC)
Memory Optimized Tables (In-Memory OLTP)
Text, ntext and image, timestamp, sql_variant, XML, CLR data types, including geometry, geography, hierarchyid, and user-defined CLR data types
Computed columns
Default constraints and check constraints
Foreign key constraints that reference the table. In a parent-child relationship (for example, Order and Order_Detail), you can enable Stretch for the child table (Order_Detail), but not for the parent table (Order).
Full-text, XML, or spatial indexes
Indexed Views Referencing the Table
The Stretch Database Price
View content
When I first started writing this article about Stretch Database, I didn't go into much research about the price of this feature, because in my view it would only be charged for data trafficked and stored in Azure. When you are using the SSMS Wizard, it even shows you a cost forecast of your current Stretch Database usage:
As you can see in the picture, the estimated cost is 61 dollars every Terabyte per month. However, this estimate is only for the transferred data. When you create a Stretch Database server, you're charged for that server as well, and unlike this estimate, it's not cheap at all. I noticed this when I received an email, 4 days after my Stretch Database tests, saying that I had already consumed my actual Azure 1.000 credit (and had not used anything yet).
After this scare (kkkkk) I ended up researching the price of this feature, below to let you know the cost of this service:
Dirceu, good afternoon.
I really liked the post.