Join Sheeri Cabral for an in-depth discussion in this video Using persistent virtual columns, part of Advanced Topics in MySQL and MariaDB.
- View Offline
- Virtual columns are great. If we use data in a virtual column over and over, it would be better performance to calculate the value once and materialize it, storing it in the table. For this, there is another type of virtual column called persistent. This is a bit confusing as persistent means the opposite of virtual. What a persistent virtual column does is calculate the data once and then store it in a table. This has better performance for fields that are used over and over and it also allows indexes.
Let's go to an example. If we wanted to find all of the email addresses from email.com, we could search for SELECT email FROM virt_col WHERE rev_email LIKE moc.liame%. You can see how that's the reverse of email.com. We get foo and email@example.com as our results.
However, if we have a lot of rows, there is no index on the reverse email field, so our query might do a full table scan. No keys are allowed on the virtual virtual columns, but the persistent virtual columns allow indexes including foreign keys. Another limitation of the virtual virtual columns is that they cannot be altered with an alter statement. In order to change an existing virtual column to be a persistent virtual column, we have to drop it and re-add it.
So, let's take a look at how that's done. ALTER TABLE on virt_cols. So, let's first DROP COLUMN called rev_email, because we're going to want to change it to a persistent virtual column so that we can then add an index. So, now let's add it back. ADD COLUMN rev_email VARCHAR (100) AS (REVERSE function on our (email) field.
But instead of putting the virtual keyword here, we're going to put the PERSISTENT keyword here. That's how you define a persistent virtual column. Now, let's add an index on the reverse email field. So, let's ALTER TABLE virt_cols ADD INDEX on (rev-email). We now have an index on the reverse email table so we can run the query we did before and have it use an index. In fact, let's test this out by looking at the index statistics.
If you do not have user statistics turned on, turn them on with SET GLOBAL userstats=1. Actually, the variable name is called userstat, so pardon my typo. Let's get a baseline for how many rows have been read from the index. It should be none because we have not used the index. So, let's do SELECT index name, rows read FROM INFORMATION SCHEMA.index statistics, and put that on another line to make it more easy to read, WHERE TABLE NAME='virt_cals' and let's LIMIT 5 just in case.
As we expected, we have not used any indexes yet. Now, let's run our query again. We'll hit up until we get to it. The one that finds the reverse of the email and hopefully uses the index and that's this one. SELECT email FROM virt_cals WHERE rev_email LIKE this string. Let's look at the index statistics again. We can see that we have in fact read two rows using the index. I have touched upon some of the limitations of virtual columns, but there are a few more.
I mentioned you could not change a virtual column with alter table, but you also cannot change a regular column into a virtual one with alter table. You'll have to drop and add the column as we just did when trying to change a virtual column. Another important limitation is that neither a virtual virtual nor a persistent virtual column can be part of a primary key. Even though persistent virtual columns can be used in foreign keys, there are some limitations. The foreign key cannot be defined as on cascade update, on update set null, nor can it be defined as on delete set null.
The expression used in the virtual column which is what is in the parenthesis following the as keyword, also has limitations. The expression cannot exceed 252 characters, although the field itself can be any size. This means that you cannot have complex logic in a virtual column. Just simple functions like we did, reverse and from Unix time. An expression cannot use information external to the row it's on. So, you cannot use subqueries or do something like depend on the count of the number of rows in the table.
An expression can also not be a constant or a calculation that only has constants like one plus one. Finally, an expression cannot use another virtual column. The last point to make is that most, but not all storage engines in MariaDB support virtual columns. InnoDB, MyISAM, CONNECT and the Aria storage engines, all support virtual columns. So, it is very likely that you will be able to use virtual columns where you need to. I have shown what persistent virtual columns are and how to use them and how to index them.
I've also explained the limitations of virtual columns.
Want more information about MariaDB? Check out Understanding MariaDB for MySQL Users.
- Comparing MariaDB and MySQL
- Creating and using virtual columns in MariaDB
- Building Sphinx tables
- Installing HandlerSocket
- Setting up GTIDs in MariaDB and MySQL
- Audit logging in MariaDB and MySQL
- Using GIS in MariaDB and MySQL