
SQL Server - Getting to know In-Memory OLTP (Hekaton): Durable vs Non-durable tables


What's up guys!
Ready for one more tip?

Introduction

In this article I would like to demonstrate some ways to load data into the database quickly and efficiently, using as little logging as possible. This is especially useful in staging scenarios for BI / Data Warehouse processes, where data must be loaded quickly and possible data loss is acceptable (since the process can be redone in the event of a failure).

The purpose of this article is to compare the performance of several forms of data loading, such as temporary tables, table variables, recovery model combinations, compression types, and In-Memory OLTP tables, to show how efficient this feature is.

It is worth noting that the In-Memory OLTP feature has been available since SQL Server 2014 and received significant improvements in SQL Server 2016 and SQL Server 2017.

Testing using disk-based solutions

For the tests below, I will insert 100k records in each test without joins and 10k records in each test with some joins. All databases use the same SSD disk (Samsung 850 EVO).

Base script used:
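The original base script appears as an image in the source post, so here is only a minimal sketch of what an equivalent workload could look like. The column names, the row-generation trick, and the lookup tables mentioned in the comments are assumptions; only the target table names (Staging_Sem_JOIN and Staging_Com_JOIN) come from the article.

```sql
-- Assumed staging table structure (the real one is in the original post's image)
CREATE TABLE dbo.Staging_Sem_JOIN (
    Id           INT NOT NULL PRIMARY KEY,
    Descricao    VARCHAR(100),
    DataCadastro DATETIME
);

-- Test 1: plain INSERT of 100k generated rows, timing the load
DECLARE @Inicio DATETIME2 = SYSDATETIME();

WITH Numeros AS (
    SELECT TOP (100000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Numero
    FROM sys.all_columns AS A
    CROSS JOIN sys.all_columns AS B
)
INSERT INTO dbo.Staging_Sem_JOIN (Id, Descricao, DataCadastro)
SELECT Numero, CONCAT('Registro ', Numero), GETDATE()
FROM Numeros;

SELECT DATEDIFF(MILLISECOND, @Inicio, SYSDATETIME()) AS Tempo_Sem_Join_ms;

-- Test 2 (10k rows with joins) follows the same pattern, but the SELECT
-- joins a few source tables before inserting into dbo.Staging_Com_JOIN.
```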

Using physical table (recovery model FULL)

In this test, I will use a physical table with the Recovery Model set to FULL.
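For reference, switching the recovery model between the variations of this test is a single ALTER DATABASE command per run (the database name below is just an example):

```sql
-- Set the recovery model used by each variation of the disk-based tests
ALTER DATABASE [DB_Teste] SET RECOVERY FULL;
-- ALTER DATABASE [DB_Teste] SET RECOVERY SIMPLE;
-- ALTER DATABASE [DB_Teste] SET RECOVERY BULK_LOGGED;
```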

Result:

Using physical table (recovery model SIMPLE)

In this test, I will use a physical table with the Recovery Model set to SIMPLE, which generates less information in the transaction log and should theoretically deliver a faster load.

Result:

Using physical table (recovery model BULK-LOGGED)

In this test, I will use a physical table with the Recovery Model set to BULK-LOGGED, which is optimized for batch processes and data loads.

Result:

Using physical table and DELAYED_DURABILITY (recovery model BULK-LOGGED)

In this test, I will use a physical table with the Recovery Model set to BULK-LOGGED, which is optimized for batch processes and data loads, combined with the DELAYED_DURABILITY = FORCED option, which causes transaction log writes to be flushed asynchronously (learn more about this feature in this link or this post here).
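Delayed durability is enabled at the database level. A minimal example (the database name is illustrative):

```sql
-- Force delayed durability for every transaction in the database
ALTER DATABASE [DB_Teste] SET DELAYED_DURABILITY = FORCED;

-- Revert to the default behavior after the load
ALTER DATABASE [DB_Teste] SET DELAYED_DURABILITY = DISABLED;
```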

Result:

Using temporary table (recovery model SIMPLE)

Widely used for dynamically generated tables and fast processes, #temporary tables are the next candidate: I will run the test by inserting the data into #temporary tables (#Staging_Sem_JOIN and #Staging_Com_JOIN). My tempdb uses the SIMPLE Recovery Model.

Result:

Using a table variable

Also widely used for dynamically generated tables and fast processes, just like #temporary tables, table variables are next: I will run the test by inserting the data into table variables (@Staging_Sem_JOIN and @Staging_Com_JOIN).
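A sketch of the table variable used in this test; the columns mirror the assumed staging structure from the base script above.

```sql
-- Table variable version of the staging table (assumed columns)
DECLARE @Staging_Sem_JOIN TABLE (
    Id           INT NOT NULL PRIMARY KEY,
    Descricao    VARCHAR(100),
    DataCadastro DATETIME
);

-- The INSERT ... SELECT is the same row-generation logic as the base script,
-- just targeting @Staging_Sem_JOIN instead of the physical table.
```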

Result:

In-Memory OLTP (IMO)

Well, from the tests above, we concluded that the fastest way to load data into SQL Server is by using table variables or temporary tables, right? Well, it depends on the version of your SQL Server.

Starting with SQL Server 2014, Microsoft released a feature called In-Memory OLTP (IMO), also known as Hekaton, which improves transaction processing performance and reduces locking by keeping the data directly in memory.

The In-Memory OLTP engine is designed for extremely high concurrency in OLTP operations. To achieve this, SQL Server uses latch-free, multi-versioned data structures for concurrency control. The result is predictable: elimination of contention, low latency, and high throughput with linear scaling, all with guaranteed data durability. The actual performance gain depends on many factors, but improvements in the order of 5 to 20 times are common.

Most specialized systems, including Complex Event Processing (CEP), DW / BI, and OLTP, optimize data structures and algorithms by focusing on in-memory structures! So it is important to keep up with the technologies we have at our disposal for each of these scenarios.

A very practical way to identify tables or stored procedures that are candidates for In-Memory is to use the Memory Optimization Advisor, which tells you which tables in your database would benefit from being migrated to this feature.

To identify which tables in your database are using In-Memory, you can use this query:
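One simple way to do this is a query against sys.tables, which exposes the is_memory_optimized and durability_desc columns since SQL Server 2014:

```sql
-- Lists memory-optimized tables and their durability setting
SELECT SCHEMA_NAME(t.schema_id) AS [schema_name],
       t.name                   AS [table_name],
       t.durability_desc
FROM sys.tables AS t
WHERE t.is_memory_optimized = 1;
```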

To see all the DMVs used by In-Memory OLTP, visit this link.

To learn more about In-Memory OLTP, I recommend these two links:
- Official documentation
- Post by Érika Madeira, from the SQL Server Product Team

Implementing In-Memory OLTP


Adding In-Memory Support to Your Database

In order to test how fast In-Memory OLTP can be compared to the other methods used, we first need to add a memory-optimized filegroup to our database. This is a prerequisite for using In-Memory.

To add this “special” filegroup, you can use the SQL Server Management Studio interface:

However, when it comes to adding a file to this filegroup, the SSMS interface (version 17.5) does not support it yet: it did not show the In-Memory filegroup I had already created, so I had to add the file using T-SQL commands.

To add the filegroup as well as the files, you can use the following T-SQL commands:
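A minimal sketch (the database name, filegroup name, and path are illustrative):

```sql
-- 1) Create the memory-optimized filegroup
ALTER DATABASE [DB_Teste]
    ADD FILEGROUP [DB_Teste_IMO] CONTAINS MEMORY_OPTIMIZED_DATA;

-- 2) Add a "file" to it (note: FILENAME points to a directory, not a file)
ALTER DATABASE [DB_Teste]
    ADD FILE (NAME = N'DB_Teste_IMO_File', FILENAME = N'E:\SQLData\DB_Teste_IMO')
    TO FILEGROUP [DB_Teste_IMO];
```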

Note that for In-Memory, unlike a regular filegroup, you do not specify a file extension because, in fact, a directory is created with multiple files hosted in it.

IMPORTANT: Up to the current version (2017), it is not possible to remove a filegroup of type MEMORY_OPTIMIZED_DATA, i.e., once created it can only be removed if the entire database is dropped. Therefore, I recommend creating a separate database just for In-Memory tables.

In-Memory OLTP: Durable vs Non-durable

Now that we have created our filegroup and added at least 1 file to it, we can start creating our In-Memory tables. Before we get started, I need to explain that there are 2 types of In-Memory tables:
- Durable (DURABILITY = SCHEMA_AND_DATA): Data and structure are persisted on disk. This means that if the SQL Server instance or service is restarted, the data from your In-Memory table will still be available for querying. This is the default behavior of In-Memory tables.
- Non-durable (DURABILITY = SCHEMA_ONLY): Only the table structure is persisted to disk and no log records are generated. This means that write operations on such tables are much faster. However, if the SQL Server instance or service is restarted, the table will still be available for queries, but it will be empty, because the data lives only in memory and is lost during a crash/restart.
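To make the difference concrete, here is a minimal sketch of the two options (table and column names are illustrative):

```sql
-- Durable: schema and data survive a restart (default)
CREATE TABLE dbo.Exemplo_Durable (
    Id        INT NOT NULL PRIMARY KEY NONCLUSTERED,
    Descricao VARCHAR(100)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- Non-durable: only the schema survives a restart; the data is lost
CREATE TABLE dbo.Exemplo_NonDurable (
    Id        INT NOT NULL PRIMARY KEY NONCLUSTERED,
    Descricao VARCHAR(100)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);
```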

In-Memory OLTP Restrictions

One of the constraints of In-Memory tables is the need for an explicitly defined PRIMARY KEY on durable (SCHEMA_AND_DATA) tables. Otherwise, you will receive this error message:

Msg 41321, Level 16, State 7, Line 5
The memory optimized table 'Staging_Sem_JOIN' with DURABILITY = SCHEMA_AND_DATA must have a primary key.
Msg 1750, Level 16, State 0, Line 5
Could not create constraint or index. See previous errors.

Another obvious restriction of In-Memory OLTP concerns the amount of server memory. For this feature to work well, you need to evaluate very carefully which tables are good candidates to be stored in memory. If you have a 50 GB table and 32 GB of RAM, you will not be able to keep that table in memory, right? Also, if you start storing very large tables in memory, this can degrade the overall performance of the instance and of the server as a whole. The ideal use of In-Memory is for DW loads or small, heavily accessed tables.
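To keep an eye on how much memory each memory-optimized table is consuming, the sys.dm_db_xtp_table_memory_stats DMV can help; a simple example:

```sql
-- Memory consumed by each memory-optimized table in the current database
SELECT OBJECT_SCHEMA_NAME(object_id) AS [schema_name],
       OBJECT_NAME(object_id)        AS [table_name],
       memory_used_by_table_kb,
       memory_used_by_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats
ORDER BY memory_used_by_table_kb DESC;
```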

If you try to allocate very large tables in memory, you may encounter this error message:

The statement has been terminated.
Msg 701, Level 17, State 103, Line 63
There is insufficient system memory in resource pool 'default' to run this query.

Still on the restrictions for using In-Memory tables, we can list DDL triggers. If you created server or database triggers for auditing, as I showed in the post How to create an Audit trigger to log object manipulation in SQL Server, you will receive this error message:

Msg 12332, Level 16, State 111, Line 5
Database and server triggers on DDL statements CREATE, ALTER, and DROP are not supported with memory optimized tables.

The solution in this case is to remove the server triggers and recreate them as database triggers on every database that does not contain In-Memory tables. Preferably, create a database just for your In-Memory tables.

In the 2014 version, all string columns that were part of a nonclustered index had to use a *_BIN2 collation. As of SQL Server 2016, this restriction no longer exists.

Other restrictions:

| Feature / Limitation | SQL Server 2014 | SQL Server 2016 CTP2 |
|---|---|---|
| Maximum size of memory used | Recommendation (not a set limit): 256 GB | Recommendation (not a set limit): 2 TB |
| Collation support | String columns that are part of an index, or of comparisons/sorting in natively-compiled modules, must use a *_BIN2 collation | All collations are supported |
| Changes to memory-optimized tables after creation | Not supported | Supported |
| Changes to natively-compiled stored procedures | Not supported | Supported |
| Parallel plan for operations accessing memory-optimized tables | Not supported | Supported |
| Transparent Data Encryption (TDE) | Not supported | Supported |
| Use of the following constructs in natively-compiled stored procedures: LEFT and RIGHT OUTER JOIN, SELECT DISTINCT, OR and NOT operators, subqueries, nested stored procedure calls, UNION and UNION ALL, mathematical functions | Not supported | Supported |
| DML triggers on memory-optimized tables | Not supported | Supported (AFTER triggers, natively compiled) |
| Multiple Active Result Sets (MARS) | Not supported | Supported |
| Large Objects (LOBs): varchar(max), nvarchar(max), varbinary(max) | Not supported | Supported |
| Offline checkpoint threads | 1 | Multiple threads |
| Natively-compiled, scalar user-defined functions | Not supported | Supported |
| Indexes on NULLable columns | Not supported | Supported |

To view all restrictions on In-Memory tables, take a look at this link here.

Testing with In-Memory OLTP Tables

The script I used to create the tests using the In-Memory OLTP tables is this:
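The full script is shown as an image in the original post; below is only a minimal sketch of the table definition it describes. The column names are assumptions, while the table name and the BUCKET_COUNT follow the text after the block.

```sql
-- Durable In-Memory staging table with a hash primary key sized for 100k rows
CREATE TABLE dbo.Staging_Sem_JOIN (
    Id           INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
    Descricao    VARCHAR(100),
    DataCadastro DATETIME
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- The INSERT workload is the same as in the disk-based tests;
-- only the target table changes.
```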

Note that I specified the Primary Key HASH with a BUCKET_COUNT of the same size as the number of records. This information is important for assessing the amount of memory needed to allocate a table, as you can learn in this link here. The optimal number is equal to the number of distinct records in the original table for temporary load processes (ETL), or from 2x to 5x this number for transactional tables.
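After loading, you can check whether the chosen BUCKET_COUNT is adequate by looking at the hash index statistics; one way to do this:

```sql
-- Empty buckets and chain lengths indicate whether BUCKET_COUNT is well sized
SELECT OBJECT_NAME(hs.object_id) AS [table_name],
       i.name                    AS [index_name],
       hs.total_bucket_count,
       hs.empty_bucket_count,
       hs.avg_chain_length,
       hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS hs
JOIN sys.indexes AS i
  ON i.object_id = hs.object_id
 AND i.index_id  = hs.index_id;
```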

In-Memory OLTP: Durable

In this example, I will create the tables using the Durable type (DURABILITY = SCHEMA_AND_DATA) and check whether the performance gain on data insertion is really that significant.

Result:

The screenshot above shows that the measured time for the insert without join was not very satisfactory, well behind the inserts into the @table variable and the #temporary table, while the time with JOINs was the best measured so far, but nothing surprising.

In-Memory OLTP: Non-Durable

In this example, I will create the tables using the Non-Durable type (DURABILITY = SCHEMA_ONLY) and check whether the performance gain on data insertion is really that significant.

Result:

Here we find a very interesting result: the shortest time with JOIN and the second shortest time without JOIN.

In-Memory OLTP: Non-Durable (All in Memory)

Finally, I will try to further reduce the load time with joins by moving all the tables involved to memory and testing whether this gives us a good performance gain.

Script used:
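The script is an image in the original post; conceptually, the change is to also recreate the source tables used in the joins as non-durable memory-optimized tables. The names, columns, and bucket count below are illustrative:

```sql
-- Example: a lookup/source table moved to memory as well (SCHEMA_ONLY)
CREATE TABLE dbo.Clientes_IMO (
    Id   INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 10000),
    Nome VARCHAR(100)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

-- Repeat for the other tables involved in the joins, then run the same
-- INSERT ... SELECT ... JOIN workload against the In-Memory copies.
```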

Result:

Conclusion

In the tests above, it was clear that, for this scenario, In-Memory OLTP turns out to be the best solution both for the simplest example, just inserting data, and for inserting data with joins.

Test Summary:

* Test 1 = INSERT Only / Test 2 = INSERT with JOINS

If we compare the results with the physical tables, the difference is very expressive and a great incentive for using In-Memory in BI scenarios, because, in the examples presented, the tables held only 100k records in a VM with 4 cores and 8 GB of RAM.

The trend is that the better the hardware and the larger the data volume, the greater the performance difference between physical tables and memory tables. However, the result was not as expressive when compared to the table variable, for example, which makes sense, since both are stored completely in memory.

Of course, an In-Memory table has several advantages over a table variable, especially its lifetime: the table variable is only available during batch execution, while the In-Memory table is available as long as the SQL Server service remains active.

As I was not convinced by the test results, I decided to increase the data volume. Instead of 100k, how about inserting data in batches totaling 1 million records per test, and repeating each form of assessment 2 more times?

Let's get to the results:

In this article, I only demonstrated its potential for writing, but In-Memory OLTP has very good read performance too, especially if the table is heavily accessed.

I hope you enjoyed this post. If you didn't know about In-Memory OLTP yet, I hope I have demonstrated a little of the potential of this great SQL Server feature.

A hug and see you next time!