Join Sheeri Cabral for an in-depth discussion in this video What is the CONNECT storage engine?, part of MariaDB for MySQL Users.
The Connect Storage Engine was built to replicate the functionality of federated or federated x in MySQL. Those storage engines allowed connecting to a remote database however, they were very limited and only allowed you to connect to a remote MySQL database. Maria DB's connect storage engine, allows you to connect to many different types of databases. It also allows you to connect to data stored in files. Like in SQLA data base tech space files and configuration files.
Using remote connection as a foundation, the connect storage engine can also perform other functions, creating easy pivot tables and, inversely, unpivoting data, querying filesystem information as if it were a table, and easily querying remote database metainformation even for remote databases that are not MySQL. We will have examples of all these functions of the CONNECT storage engine, and we'll also show two other advanced features of the CONNECT storage engine.
Manipulating lists of data in one multivalued field, for example comma separated lists. And being able to access multiple similar tables as one table. Why are all these functions grouped together in one storage engine? They all have the underlying goal of being able to display and manipulate data in MariaDB. When the original form might not be in MYSQL format. The CONNECT storage engine was developed to allow you to use the SQL syntax you are accustomed to, once you have created the tables.
They're a special syntax to create the tables, but once they are made there are no special SQL extensions. This means that you can run complicated MySQL queries on any combination of MySQL tables, tables from other data sources, text files, and metadata. Because the Connect storage engine stays close to the SQL you are used to, indexes are supported in Connect tables as well as special fields like dynamic columns. There are optimizations in place to make sure that the Connect storage engine is as performant as possible.
For example, only fields that are used in the query are retrieved so that excessive data scanning of a remote source is not done. As well, filtering is pushed down to where it can be done so that filtering is done as close to the remote source as possible. Partition tables can take advantage of parallel execution. There are many different types of data that you can access with the CONNECT storage engine. If you are so inclined, the CONNECT storage engine also provides an API if you want to use C++ to write your own foreign data wrappers.
Tables that use the CONNECT storage engine are virtual tables. I will use the term Virtual Table to mean that the table itself does not have any data in it. A Virtual Table gets its data from some other source. In that way it's similar to a view in MySQL, another term that is used to describe it, is that the table is not materialized. A materialized table saves the data in the table itself. Like a regular table. Tables using the CONNECT storage engine are not materialized.
I have given a brief introduction to the reasons to use the CONNECT storage engine. Let's start using it.
- Why use dynamic columns?
- Creating, updating, and querying dynamic columns
- Installing the CONNECT storage engine
- Connecting to remote data
- Connecting to CSV, INI, and XML files
- Using PIVOT tables
- Accessing database metadata
- Working with the Cassandra storage engine