Data virtualization integrates data from disparate sources, locations, and formats into a single virtual data layer, without replicating or moving the data. In SQL Server, PolyBase provides that virtual data layer, allowing users to query data from many sources through a single, unified interface.
- [Instructor] In SQL Server 2016, Microsoft introduced a feature called PolyBase that allowed SQL Server to process Transact-SQL queries against data stored in external HDFS-compatible Hadoop distributions and file systems. Now, in SQL Server 2019, PolyBase has been expanded to connect to data stored in many more kinds of remote sources. With PolyBase external tables, you can read and query data stored in big data providers such as Cloudera, Apache Spark, or Hortonworks. You can also access data in relational database sources such as Teradata, Oracle, and external SQL Server instances, or in NoSQL databases such as MongoDB, Cassandra, and Azure Cosmos DB. Finally, you can access any external source that can be reached through an ODBC connection, such as IBM DB2 and even Microsoft Excel spreadsheets.

SQL Server does this through a process of virtualization that uses external tables to represent the remote data source. This allows you to access the data using the same T-SQL commands that you already use in SQL Server, while leaving the data on the remote source. Basically, almost anywhere that you currently store your data can be accessed and queried directly, without having to first copy it into SQL Server through an extract, transform, and load (ETL) process. Further, these external tables appear in your databases right alongside your locally stored relational data tables. This allows you to write queries that join data from the remote sources to the data that you're storing in your SQL Server databases, which streamlines reporting and analysis across the organization.

Support for PolyBase is added to SQL Server during the initial setup of the server instance, or by going back into the setup application, choosing to add features, and checking PolyBase Query Service for External Data. Once PolyBase is installed, it needs to be enabled on the server using the T-SQL command shown here. You can also check if PolyBase support is installed with this command.
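The commands the instructor refers to are not captured in the transcript, so the following is a minimal sketch of what they likely look like, based on the documented `SERVERPROPERTY` function and the `polybase enabled` configuration option. The second half sketches an external table over a remote SQL Server instance to illustrate the virtualization described above. All object names, the host name, the login, and the passwords (`RemoteSqlCredential`, `RemoteSqlServer`, `remotehost`, `SalesDb`, `dbo.RemoteOrders`, `dbo.Customers`) are hypothetical placeholders, not names from the course.

```sql
-- Check whether the PolyBase feature is installed on this instance.
-- Returns 1 if installed, 0 if not.
SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled;
GO

-- Enable PolyBase on the server (SQL Server 2019 and later).
EXEC sp_configure @configname = 'polybase enabled', @configvalue = 1;
RECONFIGURE;
GO

-- Sketch: expose a table on a remote SQL Server instance as an external table.
-- Every name and secret below is a hypothetical placeholder.

-- A database master key is needed to protect the credential secret.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';
GO

-- Login used to connect to the remote instance (assumed to exist there).
CREATE DATABASE SCOPED CREDENTIAL RemoteSqlCredential
WITH IDENTITY = 'remote_login', SECRET = '<remote login password>';
GO

-- Points PolyBase at the remote server; 'remotehost' is an assumed host name.
CREATE EXTERNAL DATA SOURCE RemoteSqlServer
WITH (
    LOCATION = 'sqlserver://remotehost:1433',
    CREDENTIAL = RemoteSqlCredential
);
GO

-- The external table stores only metadata locally; rows stay on the remote source.
-- Column definitions are assumed to match the remote table.
CREATE EXTERNAL TABLE dbo.RemoteOrders (
    OrderID     INT            NOT NULL,
    CustomerID  INT            NOT NULL,
    OrderTotal  DECIMAL(10, 2) NULL
)
WITH (
    LOCATION = 'SalesDb.dbo.Orders',
    DATA_SOURCE = RemoteSqlServer
);
GO

-- Join remote rows to a local table (dbo.Customers is assumed to exist locally).
SELECT c.CustomerName, SUM(o.OrderTotal) AS TotalSpent
FROM dbo.Customers AS c
JOIN dbo.RemoteOrders AS o
    ON o.CustomerID = c.CustomerID
GROUP BY c.CustomerName;
```

The `GO` lines are batch separators understood by SSMS and sqlcmd, not T-SQL statements. Other source types, such as Oracle, MongoDB, or a generic ODBC driver, would use a different `LOCATION` prefix and possibly additional connection options on the external data source.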
- Intelligent query processing
- Improvements to persistent memory
- Table virtualization with PolyBase
- Creating a unified data cluster
- Training and creating machine learning models
- Running SQL Server in containers
- Updates to SQL Server Management Studio (SSMS)