Join Gini von Courter for an in-depth discussion in this video Transform data, part of Power BI Pro Essential Training.
- [Voiceover] We're going to continue with the PBIX file that I saved at the end of the last movie. And it is called ReducedStates.pbix, and that file type will automatically load, Power BI Desktop for me. Now, it won't take me into the query, and it doesn't automatically show me the data. If I wish to go back and edit my query, I can drop back into the Query Editor from here. I'd like to show you the Transformation Tools, that are available to you.
First, we have a Subset of Transformation Tools, that allow us for example, to Split a Column in half. If we have for example, city comma state. Then we can split that based on a delimiter, or if we have account numbers, and the first five digits perhaps are the department and the next four are the cost center, and so I can do that by a number of characters. I'm also allowed to Group, based on values, and a selected column.
So I might do this, if for example, I had a region and I had then states, or a territory and states, then I could select Territory and Group by Territory, and it would group the states. I can Modify the Data Type for a column, and here are my Data Types. One of the things that has improved in Power BI, particularly in Power BI Desktop is, it is better at reading numbers as number types, when I imported this in prior versions of Power BI Desktop, this exact data set had all of the numbers represented as text rather than numbers.
And then I needed to transform them, but not hard, that's what this is for. Remember that, when I pulled our data in, one of the choices was to bring in the data set that did not have the headers and had instead, column one, column two, column three and so on. Notice, Use First Row As Headers. Now, if it gets it wrong, I can say you know, Use the Headers as in the first row of my data. And check out what happened here. That's how it would've looked, had I brought the spreadsheet in, rather than the table. Or I can say Use First Row As Headers.
So, not hard to do at all. And then Replace Values, allows us to replace the values in a column, with a specific new value. In addition to these frequently used transformation types, particularly Using First Row As Header, and Modifying the Data Type. We have a whole tab of Transformational Tools. Again, well beyond the scope of this course, but we're going to use a couple of these right now.
The data set that I have here is not unusual. It's essentially a table, and sometimes what we get is a pivot table. Down the left-hand side, I have states, and across the top, I have years, and each of these intersections is a data point. However, often that's not what I want. What I would like, is to have Alabama, 1790, and a number. Alabama, 1800, and the number 1250. Alabama, 1810, 9046, and so on.
In other words, rather than having 43 rows of data, I wanna have about 400 rows of data, each representing a data point, rather than having a table, I want to De-table this. And the way to do that, is to actually to Unpivot it. This is such an amazingly powerful tool. I can either Unpivot all of the other columns that aren't selected, or Unpivot the columns that are selected. So one choice is for example, to choose these two columns, just Hold, Shift, and Select them, and say Unpivot Everything Else.
And look what I get when I'm done. Now, the problem is that, that key column is also interjected here. And I really don't want the key column so, you'll notice that you're not really seeing an Undo here, right? Cuz our Undo's over there, Unpivot Other Columns, bring it back, yayyy. Okay. First, let's get rid of this key column. We don't need it, at all. So, you'll remember that we have the ability, to get rid of columns, we saw that here.
Remove Columns, great. Now my key column's gone. Now, this will be a little bit better. Let's choose the columns that are going to appear in every single record. It might be that I don't care about the date they were admitted either, but assuming that I do, we'll just hang on to it. And now, let's go back to the Transform Tab, and let's Unpivot all the other columns. There's my data, row after row. I love this.
That's incredibly powerful. I have a pivot table, and what I want, is rows of data instead. That rocks. I have some other choices on this Transform Tab, that I'd like to point out to you. First, occasionally I will get a set of data, that is just wrong. It's really wide, and very short. Imagine that, rather than having the names of the states on the left, and having across the top, the years, the years were on the left, and all the states were across the top.
In that case what I'd like to do is to be able to Transpose my data, and it will then turn at 90 degrees. Columns become rows, rows become columns. I can also Reverse the Rows, and this would typically be, that I had a data set that was sorted smallest to largest, or largest to smallest, and I wanted to change that. I can find out the number of rows that are in this table. It's not hard now, and you might wonder well, why would you need to see that, it might be that, all my data isn't loaded for example, but I can pretty easily go see the count of rows, 242 right there.
I can again, look at my, Modify My Data Type, I can Rename a Column. This is so useful, because when we were working with catalog requests, in Power BI Services, it's called Cat, underscore, Request and it's hmmm, I'd like to be able to change that. But if that data in Excel needs to be called that for some reason, and the reason might be as simple as, it's not my data set, I don't get to modify that. Notice, that by bringing that data set in, to Power BI Desktop, I can rename it, and I'm not touching the data in Excel, I'm connecting to it, and I'm modifying it here.
If I wanna change this, so instead of saying Name, it says State. I've renamed the column, but Excel doesn't change at all. I have the ability to Fill, Down or Up columns. I'll typically do that if I have a calculation, and then want to drag it up or down the column. Some other choices, I can Extract data out of a string. I can either find out how long it is, or I can pull out first characters again, and account number, the last four digits from a credit card, the first five digits in a phone number, that includes the parenthesis to give me an area code.
So all examples of things you might like to do with data. And below that, the ability to Parse Data if you wish. To the right we have some statistics that are available to us, only in number columns. So if I choose a value for example, notice this all fires up. Here are my Aggregate Functions, Basic Math, Scientific, Trig, Rounding and so on, and if I had a date/time column, then I would have access to these transformations as well. There's a lot here, and when you're working with data, much of the data that you have access to, may not be as clean or pristine as you would like it to be, it's not unusual that I need to modify data types, that I've had somebody type text in a column that should have only values, and I need to remove everything that's not values, that I need to be able to rename columns and so on.
And this Query Editor Tool does that for us, every single time, and very easily. Even the most complex problems, can be solved here. It's just that you may have to take multiple steps, in order to do it. One more thing. You've seen me Remove Applied Steps. I'm allowed to remove an applied step way back, and this isn't like Undo in Excel, where it's going to undo everything back to there. It will actually only undo that step.
And you'll get a warning that says, this may actually break something else, cuz it's possible that something we did afterwards, depended on this, but it's possible also, that it didn't. I could bring that United States row back right now, it just wouldn't fit any longer. But if I had done it earlier, before I had Unpivoted, after I had Removed columns, but before the Unpivot, it would not have posed any kind of a problem here. So, if you look and say oh, I, I would lose everything I did after that, not necessarily. I'm finished with this data set for now.
I'm going to Close the Query Editor, and it will apply any of my changes if it hasn't. And then I'm going to say File, Save As, and I'm going to save this as ReducedStates-Final, and I'm going to Exit Power BI Desktop, and I'll see you in the next movie.
- Signing up for Power BI Pro
- Connecting to data sources
- Uploading data such as CSV and XLS files
- Creating reports, visualizations, charts, and maps
- Filtering, sorting, copying, and pasting visualizations
- Downloading custom visuals from the gallery
- Modifying existing reports
- Creating and managing data dashboards
- Querying data with Power BI Q&A and Microsoft Cortana
- Sharing report and dashboards
- Using Power BI Desktop and mobile apps