Python makes a distinction between text files and binary files. Even on operating systems with file systems that don't make that distinction, Python still does. So how you read and write binary files, even though it's still very simple, is different in significant ways from how you read and write text files. So let's make a working copy of files.py and call it files-working.py. Open that working file and we see we have our little loop here that reads lines of text and prints them to the screen. Let's just go ahead and take this out and put in olives.jpg, and we will save that and run it. We will see that we get this UnicodeDecodeError, because Python is trying to decode the bytes in that JPEG file as text.
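To see that error outside the video, here is a minimal sketch. It does not use the real olives.jpg; it assumes a small stand-in file (the name sample.jpg and its contents are made up for illustration) that starts with a few non-text bytes, and it assumes UTF-8 as the text encoding.

```python
import os
import tempfile

# Hypothetical stand-in for olives.jpg: bytes that are not valid UTF-8 text.
fake_jpeg = b"\xff\xd8\xff\xe0" + bytes(range(256))

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "sample.jpg")
with open(path, "wb") as f:
    f.write(fake_jpeg)

decode_failed = False
try:
    # Text mode: Python tries to decode the file's bytes as text.
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            print(line, end="")
except UnicodeDecodeError as err:
    decode_failed = True
    print("Could not decode as text:", err.reason)
```

The loop is the same line-oriented loop you would use for a text file; it fails only because the contents are not text.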
That's a JPEG file. If we open that up, we will see there is an image of some grapes on a vine there, and that it is not a text file at all. So what we need to do is open this in binary mode. So we are going to open it in read binary mode, 'rb', like that. And now if we save this and run it, we are going to get this output, which is still not really what we want. What's happening is that the print function is trying to print that in a way that's text readable, and we are not going to want to write that to a file if we are making a copy of this.
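As a rough illustration of what binary mode gives you, the sketch below opens a stand-in file with 'rb' and prints the first few bytes. The file name and contents are assumptions made up for the example, not the file from the video.

```python
import os
import tempfile

# Hypothetical stand-in file with a few non-text bytes at the start.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "sample.jpg")
with open(path, "wb") as f:
    f.write(b"\xff\xd8\xff\xe0" + b"\x00" * 100)

# 'rb' means read binary: no decoding is attempted.
with open(path, "rb") as f:
    chunk = f.read(4)

print(type(chunk))  # <class 'bytes'>
print(chunk)        # b'\xff\xd8\xff\xe0'
```

What print shows is a text-readable representation of the bytes, which is fine for a quick look but is not what you would write to a file when making a copy.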
So what we need to do is we need to open an output file in binary mode and we need to use our buffered I/O. So we will call this infile, and we will open an outfile, and call that new.jpg. Open that in write binary mode, and we are going to do a buffered I/O here. So we will start by giving ourselves a buffersize. Let's see, how big is this file? Properties, it's 142K.
So we will just go ahead and make our buffersize 50000, and that will work fine. And the rest of it is very similar to how we would do this with the text file. We read the first buffer with infile.read(buffersize). infile.read is not an iterable, so we have to use a while loop: while the length of the buffer is not zero, we will outfile.write from our buffer.
And we would go ahead and print a dot. So we see that something is happening on the screen and we will read the next buffer. When we are all done, we will print a blank line and we will print the word Done. You can see that we are done. So what's different here is that by opening the file in binary mode, we are no longer dealing with text.
The rest of it looks pretty much the same as how we do the buffered reading and writing with a text file, but the difference here is that we are not working with text at all. This buffer is now a binary object. It's not a text object at all. We will go ahead and we will save this and we will run it and we have got a few little dots there. And if we refresh our file system, which Eclipse does not do for us, we have this new.jpg. I open that up. There is our JPEG intact. So we know that copy worked.
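The copy loop described above can be sketched roughly as follows. The names infile, outfile, and buffersize and the value 50000 come from the narration; the copy_binary wrapper and the randomly generated stand-in for olives.jpg are assumptions added so the example runs on its own.

```python
import os
import tempfile

def copy_binary(src, dst, buffersize=50000):
    """Copy src to dst in fixed-size chunks; return how many chunks were written."""
    chunks = 0
    with open(src, "rb") as infile, open(dst, "wb") as outfile:
        buffer = infile.read(buffersize)
        while len(buffer):                    # an empty bytes object means end of file
            outfile.write(buffer)
            print(".", end="", flush=True)    # show that something is happening
            chunks += 1
            buffer = infile.read(buffersize)  # read the next buffer
    print()
    print("Done.")
    return chunks

# Hypothetical stand-in for the 142309-byte olives.jpg from the video.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "olives.jpg")
dst = os.path.join(workdir, "new.jpg")
with open(src, "wb") as f:
    f.write(os.urandom(142309))

chunks = copy_binary(src, dst)
```

With a 50000-byte buffer and a file of that size, the loop runs three times, which matches the few dots seen on the screen.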
I will look at the size of it. It's the right number of bytes, 142309. Is that the same as our olives? It is exactly the same. So we have an exact duplicate of that file. So reading and writing binary files, the methodology is very similar, except you have to use the buffered I/O. You are not going to want to use line oriented I/O for a binary file. And most of the time for text files, you are going to use line oriented I/O, although you can use buffered I/O for text files as well.
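Instead of comparing sizes by eye in a file browser, the same check can be done in code. This is only a sketch: it uses os.path.getsize and filecmp.cmp from the standard library, and the two files it compares are small stand-ins created for the example.

```python
import filecmp
import os
import tempfile

# Two hypothetical files standing in for olives.jpg and new.jpg.
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "olives.jpg")
duplicate = os.path.join(workdir, "new.jpg")
data = bytes(range(256)) * 10
for path in (original, duplicate):
    with open(path, "wb") as f:
        f.write(data)

# Same number of bytes?
same_size = os.path.getsize(original) == os.path.getsize(duplicate)
# Same contents, byte for byte? shallow=False forces a real comparison.
same_bytes = filecmp.cmp(original, duplicate, shallow=False)

print(same_size, same_bytes)
```

Matching sizes alone do not prove the copy is exact, so the byte-for-byte comparison is the stronger check.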
As a matter of fact, you can use binary mode for text files if you want to, and deal with them as bytes, and that will certainly work. But the distinction here is that with binary files, you have to use the buffered I/O and you have to use binary data types. So when you read that file, it's going to read it as a bytes object, an array of bytes, and it's not going to read it as text. So this is how you do buffered I/O with binary files in Python. And it's really very simple and it's something that you are going to use now and then as you are dealing with binary files.
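Here is a small sketch of reading a text file in binary mode and dealing with it as bytes. The file name, its two lines of content, and the choice of UTF-8 are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical text file; newline="\n" keeps line endings the same on every platform.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "lines.txt")
with open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write("olives\ngrapes\n")

# Binary mode returns the raw bytes, with no decoding.
with open(path, "rb") as f:
    raw = f.read()

print(raw)                 # b'olives\ngrapes\n'
text = raw.decode("utf-8") # decoding is now an explicit step
print(text, end="")
```

The data comes back as bytes, and turning it into text is something you do yourself with decode.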
One final note. You will notice that the buffered read method is not an iterable. If I was going to be doing this a lot, I would go ahead and I would write a method for an object that is iterable. That would make this easier for me to do. It's certainly a matter of style. It's something that you might think about doing if you are going to be doing this a lot, and there is an example of how to make a generator function that generates an iterable in both the Functions chapter and the Classes chapter in this course. So that's how you do buffered I/O on binary files in Python.
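One way this might look is a generator function that wraps the buffered read so it can be used in a for loop. This is a sketch of the idea, not the example from the Functions or Classes chapter; the name read_chunks and the stand-in file are hypothetical.

```python
import os
import tempfile

def read_chunks(fileobj, buffersize=50000):
    """Yield successive chunks from a file opened in binary mode."""
    while True:
        buffer = fileobj.read(buffersize)
        if not buffer:   # empty bytes object: end of file
            break
        yield buffer

# Hypothetical stand-in for the 142309-byte file from the video.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "olives.jpg")
dst = os.path.join(workdir, "new.jpg")
with open(src, "wb") as f:
    f.write(os.urandom(142309))

sizes = []
with open(src, "rb") as infile, open(dst, "wb") as outfile:
    for buffer in read_chunks(infile):
        outfile.write(buffer)
        sizes.append(len(buffer))

print(sizes)  # [50000, 50000, 42309]
```

The copy loop becomes a plain for loop, and the read-until-empty logic lives in one place.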