So far we've seen Unix tools that will let you find, translate, and replace. The next useful Unix tool I want to introduce you to is cut. Cut allows you to cut out selected portions of each line of a file. We could probably write a sed command that would do something similar to what cut does, but cut is much simpler and easier to use. The first thing you need to know about cut is that it can cut three things: characters, bytes, or fields, and we always need to pick one of those. We are only going to be looking at characters and fields, and the reason why is that for plain English (ASCII) text, bytes are exactly equivalent to characters, because every character takes up exactly 1 byte.
And really, bytes are there mostly for when you are thinking about raw data, not actual characters, but raw data, and you want to grab a certain amount of bytes out of it. We're working with text files here, so we're really thinking about characters, and most times that's what you will be doing. So characters and fields is where we're going to focus. So let's start with characters. Notice that I am inside my user directory, inside unix_files, and inside there I have this file that I created earlier called dir_content.txt. If we take a look at that file you will see that it's just a directory listing.
It looks very much like the output from ls -la, because that's exactly what I did: I ran ls -la earlier and directed that output into a file. So this is not my current directory listing. It's a snapshot of what it was previously. So here's the scenario. Imagine that we have a file like this and we say, you know what, I like the data here, but I really wish I could just grab a selected portion of it. For example, I really wish that I could grab these permissions out of here, just that part of this line and that part of the next line, all the way down, essentially grabbing a column.
Just wish I could grab that column of these permissions out of it. That's the purpose that cut serves. So we say cut and then we need to specify an option. Always have to specify an option. Either -c for characters, which is what we'll be doing, or -b for bytes, or -f for fields, which we will see in a moment. And then we tell it what characters we want to cut. Well, out of each line we'd like to cut characters 2-10. That's what represents these characters, starting with character 2, going until we get to character 10. And then we need to say the file that we want to do that from, and there it is. It's that easy.
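Here is a minimal sketch of that command. The three sample lines are made up for illustration (they are not the real contents of dir_content.txt), and the temporary file stands in for it.

```shell
# Hypothetical stand-in for dir_content.txt: a few lines in ls -la format.
tmp=$(mktemp)
printf '%s\n' \
  'total 144' \
  '-rw-r--r--   1 user  staff   1024 Jan  5 10:30 notes.txt' \
  'drwxr-xr-x   3 user  staff    102 Jan  5 10:31 projects' > "$tmp"

# -c selects characters; 2-10 keeps characters 2 through 10 of every line.
perms=$(cut -c 2-10 "$tmp")
printf '%s\n' "$perms"
# otal 144
# rw-r--r--
# rwxr-xr-x

rm -f "$tmp"
```

Character 1, the file-type flag, is dropped, and the "total" line gets cut the same way as every other line.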
Now, notice that it also grabbed this up here, total 144. That was the top line here. It went ahead and grabbed that as well. So be aware of that. You actually, I believe, can suppress the output of this from your directory listing if you needed to, but for our purposes, we just really want to illustrate the way that cut works. Now, let's say we wanted to cut something else out of this. We can cut more than one thing. We can grab more data. So in addition to cutting the permissions, let's say that we also wanted to keep the file size. So we wanted the permissions, followed by the file size, and then after that we will grab the file name as well.
So what we'll be left with is permissions, size, name. Everything else will end up being removed. So what we need to do is look at the data here. You can see the last column that we kept, so what we want to find out is how many characters is this next part? How many do I need to skip over, essentially, till I get to more data that I want? So I am going to use Command+C to copy that. Let's just do echo and we'll put that in quotes and pipe it into wc, the word count tool. wc tells me it's 21 characters. So now I know I need to skip over 21 characters, so 2-10, and then we'll do a comma, followed by skipping 21 characters starting with 10, so that would be 31.
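One thing to watch when measuring widths this way, sketched here under the assumption that the count comes from wc -c: echo appends a newline, and wc -c counts it, so the number reported is one more than the visible width. The sample string is made up.

```shell
# Hypothetical chunk of text copied from between two columns (17 characters).
text='  1 user  staff  '

# echo adds a trailing newline, which wc -c counts as one more byte.
with_newline=$(echo "$text" | wc -c | tr -d ' ')

# printf '%s' writes the text with no trailing newline.
without_newline=$(printf '%s' "$text" | wc -c | tr -d ' ')

echo "echo | wc -c:   $with_newline"      # 18
echo "printf | wc -c: $without_newline"   # 17
```

The tr -d ' ' is only there because some versions of wc pad the number with leading spaces.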
Now, I would also like to leave a space between them, so I am actually going to keep the space in there before it. So that would be starting at 30 and then I'll go up to 35. That will give me the space plus the four characters here that represent the size. So there we go, now I have the size. Let's say I also want to get the name, and it's here. I want to leave a space here so I'll keep this space in. So I want to know how wide is that? Copy it. Echo it into wc. That's 14 characters.
So now I go back up here and I add 14 onto this and I come up with 49 to the end. If you want to go all the way to the end, you can just put a dash with nothing after it. You can do the same thing at the beginning too if you knew you wanted to get everything at the beginning. So now, look at that. I have sort of my own custom ls listing. We can take this same thing, let's copy it, and we can actually do that. ll, pipe through this cut statement, and look at that. I can see exactly the data I want. I can leave everything else out of there.
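To make the list syntax concrete, here is a small sketch using the alphabet as input so the positions are easy to check by eye. The ranges are arbitrary and are not the ones from the directory listing.

```shell
line='abcdefghijklmnopqrstuvwxyz'

# Several ranges separated by commas; "20-" means character 20 through the end.
picked=$(printf '%s\n' "$line" | cut -c 2-4,10-12,20-)
echo "$picked"    # bcdjkltuvwxyz

# "-3" means from the start of the line through character 3.
start=$(printf '%s\n' "$line" | cut -c -3)
echo "$start"     # abc
```

The selected pieces are printed back to back with nothing inserted between them, which is why keeping a space inside a range is the way to separate columns.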
You could take, for example, your history file and let's grep that for everything that has fruit in it, and then let's take that and pipe it into cut, and I am going to cut from character 24 to the end. So then I have got just the command itself. I left off everything that was at the beginning of it. Or we could take ps aux, your listing of all your processes, and let's cut 11-15 and 72 to the end. Now I have just the process ID and what's actually running, without everything else that's in ps aux. Let's pipe it so we don't see everything.
Now, all this stuff in the middle, that's all gone. I can really condense it down to just what I want to see, just the process ID and what is running. I leave out all those other stats. So that's how cut works. Cut also has this -f option that is really nice. I have a file here that I created earlier called us_presidents.tsv, tab-separated values. So it's just information about U.S. Presidents, stored as tab-separated values. You could use any other tab-delimited file that you wanted to for this.
The tabs are important, and the reason why I say that is because if we use cut -f, by default what it's going to do is use those tabs to figure out where the columns are. So if I say I want fields 2 and 6 out of us_presidents.tsv, look at that. I get just the column that has the President's name in it and then I get just the state that they're from. Notice that it also kept the tabs as delimiters between these. So it's great! You can have something that's like tab-separated values from a spreadsheet and you can just grab the columns out of it that you want.
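As a rough sketch of the field version: the two rows here are an assumed layout made up for illustration, with the name in field 2 and the state in field 6, and are not the actual contents of us_presidents.tsv.

```shell
# Hypothetical tab-separated rows: number, name, start, end, party, state.
tmp=$(mktemp)
printf '1\tGeorge Washington\t1789\t1797\tNone\tVirginia\n'            > "$tmp"
printf '2\tJohn Adams\t1797\t1801\tFederalist\tMassachusetts\n'       >> "$tmp"

# -f selects fields; with no -d, fields are separated by tabs.
names_states=$(cut -f 2,6 "$tmp")
printf '%s\n' "$names_states"

rm -f "$tmp"
```

Each output line is the name, a tab, and the state, so the result is still a valid tab-separated file.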
We have something here which we can then save as another tab-separated file. So we can save this as presidents_states.tsv. And now we have this file saved that we can then continue to work with. We can now join that with other things, we can use it in sed scripts, whatever we want to do. We've just zeroed in on the data that we want. Now, I want to show you that we have another example of a file. I have us_presidents.csv and that's comma-separated values. We worked with that earlier as well.
It's the same thing, but instead of tabs we have commas between the data. If we tried the same thing that we did before but on the csv file, let me just clear this so you can see, using csv this time, asking for columns 2 and 6, it doesn't work. It gives me everything back, because it was expecting the tab to be the delimiter. If you want to change the delimiter, then you have to use the -d option. So -d and then tell it what the delimiter is. So if it's a comma, now it does it.
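Here is the same idea sketched with commas. Again, the row is a made-up layout for illustration and not the real us_presidents.csv.

```shell
# Hypothetical comma-separated row: number, name, start, end, party, state.
row='1,George Washington,1789,1797,None,Virginia'

# Without -d, cut looks for tabs, finds none, and prints the whole line.
no_d=$(printf '%s\n' "$row" | cut -f 2,6)
echo "$no_d"      # 1,George Washington,1789,1797,None,Virginia

# With -d ',' the comma becomes the field separator.
with_d=$(printf '%s\n' "$row" | cut -d ',' -f 2,6)
echo "$with_d"    # George Washington,Virginia
```

Keep in mind that cut splits on every comma. It does not understand quoted CSV fields, so a value that itself contains a comma would be split in two.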
Notice that it still kept the comma as the delimiter in the output as well, but the main thing is this -d followed by the delimiter it should be looking for. By default it's tab. There are a couple of other options that it offers. There is a -s option, which suppresses lines that don't contain the delimiter. Without -s, cut just lets those lines through unchanged. That's all there is to using cut. It's a fairly simple tool to use, but it's really quite powerful, especially when used in conjunction with a lot of the other Unix commands and techniques that we've learned.
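A quick sketch of what -s does, using two made-up lines, one with commas and one without:

```shell
# Hypothetical input: the first line has no comma in it, the second does.
input=$(printf 'no delimiter here\na,b,c')

# Default: a line with no delimiter is passed through whole.
default_out=$(printf '%s\n' "$input" | cut -d ',' -f 2)
printf '%s\n' "$default_out"
# no delimiter here
# b

# With -s, lines that contain no delimiter are dropped.
s_out=$(printf '%s\n' "$input" | cut -d ',' -s -f 2)
printf '%s\n' "$s_out"
# b
```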