In this mega expected post, I will comment on the news we can expect from SQL Server 2019, which has long since gone from being a Database Management System (DBMS) to becoming a true Microsoft data platform, joining database. data, BI, Machine Learning and Big Data / Analytics.
Would you like to see news about previous versions?
SQL Server 2019 What's New
The institutional video at the beginning of this post already has some highlights from SQL Server 2019 (which I will comment below) which shows us how Microsoft is looking at the big data market and also bringing significant improvements to its data platform. .
Video Transcript - SQL Server 2019 + Polybase + Spark + Big Data + HDFSView content
With SQL Server 2019, you can create a single virtual data layer accessible to almost every application. Polybase data virtualization deals with the complexity of integrating all your data sources and formats without requiring you to replicate or move them. You can simplify data management by using SQL Server 2019 Big Data Clusters deployed in Kubernetes. Each node in a Big Data Cluster includes the SQL Server relational engine, HDFS storage, and Spark, which let you store and manage your data using the tools of your choice.
SQL Server 2019 makes it easy to build smart applications with big data. You can now perform Spark tasks to analyze structured and unstructured data, train models on data from anywhere with SQL Server Machine Learning Services or Spark ML, and query data from anywhere using a rich notebook experience built into Azure Data Studio. The data stream isn't slowing, but it doesn't have to sink your business. Browse with SQL Server 2019 and shorten the distance between data and action.
Big Data + AnalyticsView content
Here's how we can make it easier to eliminate barriers to the perception of all your data by providing a view of your data across the organization:
- Simplify big data analysis for SQL Server users. SQL Server 2019 makes it easy to manage big data environments. It comes with everything you need to create a data lake, including HDFS and Spark provided by Microsoft and analysis tools, all deeply integrated with SQL Server and fully supported by Microsoft. You can now run applications, analysis and AI on structured and unstructured data - using familiar T-SQL queries or people familiar with Spark can use Python, R, Scala or Java to perform Spark tasks for preparing or analyzing data. the same integrated cluster.
- Offer developers, data analysts and data engineers a single source for all your data - structured and unstructured - using your favorite tools. With SQL Server 2019, data scientists can easily analyze data in SQL Server and HDFS through the works of Spark. Analysts can perform advanced big data analytics using SQL Server Machine Learning Services: training large data sets on Hadoop and operationalizing on SQL Server. Data scientists can leverage a new notebook experience running on the Jupyter notebook engine in a new Azure Data Studio extension to interactively perform advanced data analytics and easily share analysis with their colleagues.
- Divide data silos and provide a visualization of all your data using data virtualization. Starting with SQL Server 2016, PolyBase has allowed you to run a T-SQL query within SQL Server to extract data from your Data Lake and return it in a structured format - all without moving or copying the data. Now, in SQL Server 2019, we are expanding this concept of data virtualization to additional data sources, including Oracle, Teradata, MongoDB, PostgreSQL and others. Using the new PolyBase, you can split data silos and easily combine data from multiple sources using virtualization to avoid the time, effort, security risks and duplicate data created by moving and replicating data. New, extensively scalable “data pools” and “data pools” make querying virtualized data faster, caching data and spreading query execution across many instances of SQL Server.
Performance ImprovementsView content
- The feature family of Intelligent Query Processing is based on SQL Server 2017 Adaptive Query Processing performance tuning features, including Row mode memory lease feedback, approximate COUNT DISTINCT, Batch mode in rowstore
- O persistent memory support has been enhanced in this release with a new optimized I / O path available for interaction with persistent memory storage.
- Lightweight query profiling infrastructure It is now enabled by default to provide per-query operator statistics anytime and anywhere you need it.
- Row estimation changes when using variable tables (deferred table variable compilation). Until SQL Server 2019, the query optimizer ALWAYS estimated 1 rows returned when using table type variables, often generating incorrect operators when using large masses of data on this object type and performing very poorly. This caused many people to use temporary tables or hint OPTION (RECOMPILE) to avoid this kind of behavior. In the 2019 version, the query optimizer will try to estimate a number closer to the real, making results using table-type variables generally better than results in previous versions. To learn more about this news, see this post from Brent Ozar
- Row mode memory grant feedback. SQL Server 2017 introduced Row mode memory grant feedback, which is described in detail here. Essentially, for any memory grant involved with a plan that involves batch mode operators, SQL Server will evaluate the memory used by the query and compare it to the requested memory. If the requested memory is too low or too high, leading to spills or wasted memory, it will adjust the memory lease associated with the execution plan the next time it is executed. This will reduce the lease to allow greater competition or increase it to improve performance.
We now also get this behavior for row mode queries under the 150 compatibility level. If a query is found on a disk, the memory lease is increased for subsequent executions. If the actual memory used by the query is less than half of the leased memory, subsequent lease requests will be smaller. Brent Ozar goes into more detail in his post about Adaptive Memory Grants.
- Rowstore Batch Mode - Since SQL Server 2012, queries against tables with columnstore indexes have benefited from batch mode performance improvements. The improvements are due to the query processor performing batch processing instead of line by line. The lines also arise from the batch storage mechanism and parallelism switch operators can be avoided.
Under the 150 compatibility level, SQL Server 2019 will automatically choose batch mode in certain cases, even when there is no columnstore index, which often cannot be created by various technical and / or conceptual constraints, such as lack of trigger support
- New aggregate function APPROX_COUNT_DISTINCT - This new aggregation function is designed for data warehouse scenarios and is equivalent to COUNT (DISTINCT ()). Rather than performing expensive, separate sorting operations to determine actual counts, it relies on statistics to get something relatively accurate. You should find that the margin of error is within 2% of the accurate count 97% of the time, which is usually sufficient for high-level analysis and the memory savings used are quite considerable.
- Query-scoped compatibility level hints - Do you have a specific query that works best at a certain level of compatibility that is not the same as the current database? You can now do this with the new query hints, supporting six different levels of compatibility and five different cardinality estimation models. The following are the levels of compatibility available, the example syntax and the CE model used in each case. You can see how this can affect estimates, even for system catalog views:
To view all available hints, you can refer to the sys.dm_exec_valid_use_hints DMV.
- Columnstore index compression estimates - In current versions, the sys.sp_estimate_data_compression_savings procedure has a check for the type of compression (NONE, ROW and PAGE). In SQL Server 2019, this check has been changed to allow calculation of columnstore index estimates.
This is great news, as it allows you to predict the impact of adding a columnstore index to a table that has none, or converting a table or partition to a more aggressive columnstore format without having to restore the table to another system and actually apply it in practice. to do this analysis.
- New function to retrieve page information - DBCC PAGE and DBCC IND have been used for a long time to gather information about the pages that make up a partition, index or table. But they are undocumented and unsupported commands and can be very tedious to automate solutions around problems that involve more than one index or page.
Along came sys.dm_db_database_page_allocations, a dynamic management function that returns a set representing all pages in the specified object. Not yet documented, this function exhibits a predicate pushdown issue that can be a real issue in larger tables: even to get information about a single page, it needs to read the entire structure, which can be quite prohibitive.
SQL Server 2019 introduces another DMF, sys.dm_db_page_info. This basically returns all the information on a page without the overhead of DMF allocations. In current versions, however, you may already know the page number you are looking for to use the function. This may be intentional as it may be the only way to guarantee performance. Therefore, if you are trying to determine all pages in an index or table, you still need to use DMF allocations.
Advanced Security - Confidential ComputingView content
An enclave is a protected area of memory in which these calculations and filtering can be delegated (on Windows, this uses virtualization-based security) - the data remains encrypted in the engine, but can be safely encrypted or decrypted within the enclave. Just add the ENCLAVE_COMPUTATIONS option to the master key, which you can do in SSMS by checking the “Allow enclave computations” checkbox when creating a column's Master Key
In SQL Server 2019, there is a new command for this metadata, now available in Azure SQL Database, called ADD SENSITIVITY CLASSIFICATION. This allows you to do the same type of SSMS wizard, but the information is no longer stored as extended property, and any access to this data is automatically displayed in audits in a new XML column called data_sensitivity_information. Contains all types of information that were accessed during the audited event.
Mission critical availability - high uptimeView content
Example of use:
CREATE INDEX Idx01 ON dbo.Teste(Nome)
WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 10 MINUTES);
Best developer experienceView content
In SQL Server 2019, there are new collations to provide native support for UTF-8 datastore in SQL Server. Thus, you can create a char or varchar column to properly store UTF-8 data using new collations with a _SC_UTF8 suffix, such as Latin1_General_100_CI_AI_SC_UTF8. This can help improve compatibility with external applications and other platforms and database systems without paying nvarchar performance and storage penalties.
Multiplatform ImprovementsView content
SQL Server 2019 Support in Azure Data StudioView content
Increasingly, these users want to use the same tools to work seamlessly on-premise and in the cloud, using unstructured relational data, working with OLTP, ETL, analytical, and streaming workloads.
Azure Data Studio (formerly SQL Operations Studio) offers a modern publisher experience with ultra-fast IntelliSense, code snippets, source code integration, and an integrated terminal. It is designed with a data platform user in mind, allowing you to generate graphs and insights from your queries, an integrated notebook, and customizable dashboards. Thus, Microsoft has been focusing on this profile in Azure Data Studio, and keeping SQL Server Management Studio focused on the Database Administrators (DBAs) profile.
Azure Data Studio currently offers built-in support for both on-premise and on-premise SQL Server Azure SQL Database, as well as support for Azure Managed Instance and Azure SQL Data Warehouse.
Azure Data Studio is today introducing a new preview extension to add support for some features of SQL Server 2019. The extension offers connectivity and tools for SQL Server big data clusters, including a preview of the first experience of the annotations feature in the suite SQL Server tools and the new “PolyBase Create External Table” wizard, which facilitates access to data from remote instances of SQL Server and Oracle.
To download Azure Data Studio, click this link here.
New posts on sys.messagesView content
Among the extensive list that changes and news in the messages, I highlight some of them:
- 3911 - Persistent version store is full. New version (s) could not be added. A transaction that needs to access the store version may be rolled back. Please refer to BOL on how to increase database max size.
- 10661 - The refresh operation for all snapshot views failed because there was another refresh operation (either for all or a single snapshot view) in progress.
- 9113 - Warning: Creating and updating statistics will force FULL SCAN in this version of SQL Server. If persisting sample percent, persisted_sample_percent will be 100.
- 12112 - Warning:% ls statement is being forced to run WITH (% S_MSG = ON) because the ELEVATE_% S_MSG database scoped configuration is set to FAIL_UNSUPPORTED. The statement may fail. See the SQL Server error log for more information.
- 5871 - Cannot set the encryption column enclave type to Virtual Secure Mode (VSM) - the operating system does not support VSM.
- 2628 - String or binary data would be truncated in table '%. * Ls', column '%. * Ls'. Truncated value: '%. * Ls'.
Yes, you saw it right! Who has never wasted precious minutes trying to identify which column and which value blew the boundary of a varchar field? In SQL Server 2019 this should not happen anymore (Note: I did a test and it still happens .. let's wait for the final version).
New system objects and DMV'sView content
New Stored Procedures
New Tables / Views
Here's another video with a summary of features featured:
What's up? Did you like everything you saw?
Please download SQL Server 2019 (vNext CTP 2.0 - Preview) and start testing all the new features.
A hug and see you next!