Sometimes data is stored in a format different from what you need to use. Find out how to transform that data with the CAST command.
- [Instructor] Just because data is stored one way in the database doesn't mean we can't transform it into something we need with our query. Some common functions we'll look at are changing the case of a string, converting a value into a different type, trimming a value, and replacing a particular string in a field. Let's start with a basic query like we've seen before. I'll write SELECT first_name, last_name FROM people and I'll run that.
Now let's take a few liberties with the capitalization here. On the first field, I'll add a function called LOWER to tell the database to give me a lowercase representation of the field. And on the last name I'll add an UPPER function, again wrapping the field in parentheses, to transform the text into its uppercase equivalent. When I run this, keep an eye on the results. Those are fairly straightforward, they just take a string and transform it.
We can also chop up pieces of a string, say if you need to get the first five characters of a name for some reason. And for that we'll use the SUBSTR function, which stands for substring, or a smaller piece of a larger string. Let's wrap the substring function around last name, instead of UPPER. This function takes a few arguments. First the field, or a string literal and then a number representing the number at which to start counting and the length after which to stop in characters. I want to start at the first character and then proceed five characters after that to get the first five characters of the field.
I'll run this and I can see here in the result it looks like it did what it says on the 10. We can change the numbers around, too, to get a sense of how it works. I'll start at character two and proceed four characters in. That's a little different segment there. If we leave off the last field, the length, we just get back a string cut at whichever character would represent the start of the range, in this case the o in Howell, the l in Alvarez, and so on.
And if we turn that number negative we get a count coming from the end of the string, instead of the beginning. So negative two gives us the last two letters of the string, negative four gives us the last four and so on. We can also replace text in a string as we return it from the database. And for that we'll use the REPLACE function. I'll write REPLACE here and as the first field in the function I'll pass in the field that I want to work with.
As with the other functions, this could be a string literal, too, a string that isn't a value in the database, but is just specific text. The next item that I'll give the function is the string to replace, in this case I'll say the letter a, but it could be a longer string. And the third value I'll give here is the string to replace that previous string with. I'll use an underscore character, so it's really clear where the substitution's happening. This can also be a longer string, or an empty string if you just want to omit a particular string from your results.
And I'll run that and there we go. Throughout all of our data in the last name column the lowercase a letter has been replaced with an underscore. This replacement is case sensitive, so keep an eye out for that. We can use the CAST keyword to tell the database to interpret one data type as another. This can be useful if we don't have control over the schema of the database and we can't make changes to how the data is stored. To use CAST, we'd write CAST and then put in a field name, say quiz_points and the AS keyword and a data type, say character.
This type could be any of the types we saw before, though in some cases the result won't make sense like casting a date to a Boolean. One way we can see a difference in how data types matter is to take something we know as a number, like the quiz_points in the database and sort them. I'll move down a few lines and write SELECT quiz_points FROM people ORDER BY quiz_points. And I'll select this statement and run it.
Here the quiz points are sorted in numerical order. That makes sense, 10, 20, all the way up through 100. But if I take this quiz_points field and tell the ORDER BY clause to treat it as character data with CAST AS CHAR and select it and run it again, I can see that changes how it works.
Now it treats the values of the text sorting them into what would be called an alphanumeric sort. Now the values as they're sorted go from 10 to 100 and then to 20, 30, 40, 50 and so on. This is an alphanumeric sort. What's happening here is the system is taking a look at these values character by character. 10 and 100 both start with one and then a zero and then 100 has another zero, so it sorts next.
And then along comes two, which comes after one, regardless of what's after one, when sorting as text. We can check this behavior out another way, too, with the MAX function. I'll write SELECT MAX CAST quiz_points AS CHAR FROM people.
I need to make sure my parentheses match. And now the MAX value that's returned from the quiz points is 90, even though I know the points go up to 100. That's because a text string that starts with nine will sort with a higher value than one that starts with a one, so 90 would be considered higher than 100, 1,000, 1,000,000. If I change this to an integer, then I get the result that I expect. If you don't have control over how the data types were set up, you can use CAST to treat fields differently to some degree.
Treating data differently than it's stored in the database is something you can do in SQL, or something you or your developers can do later on down the line, when you're working with the data in a different program, like a Web app or reporting software.
- Name the predicate of the following statement: SELECT EyeColor, Age FROM Student WHERE FirstName = 'Tim' ORDER BY LastName ASC;
- Explain what to use to enforce the order in which an expression must be evaluated if the WHERE clause contains multiple expressions to evaluate.
- Identify the best option to join two tables in a database to be able to display data from both.
- List a data type that is not numeric.
- Determine the result of running the following statement on a table containing columns col_1 and col_2:
- INSERT INTO Box (col_1, col_2) VALUES ('A', 'B'), ('A', 'B'), ('A', 'B'), ('A', 'B');
- Determine the best approach of deleting Jon Ramirez (ID 3452) from a Student table.