From the course: Python: XML, JSON, and the Web

Working with internet data - Python Tutorial

Working with internet data

- [Instructor] Working with internet-based data is a fairly common task for modern applications. In this chapter, we'll take a quick look at two of the most common data formats found on the web and then learn about some of the different modules available in Python for working with these formats. These days, data on the internet usually comes in one of two main formats, XML and JSON. You might see additional formats every now and then, but these two are far more common than the others.

XML stands for Extensible Markup Language. It's also important to note here that XML is a lot more than just a language format. There are many related technologies in the XML family for working with XML content. For this course, however, we're just going to focus on the language itself. XML is very similar to HTML, the language used to create webpages, but with a few changes to help it represent more generic kinds of data than just web content. It's not as compact or lightweight as JSON, but it's very rich and expressive, and there are different ways to process XML depending on what your app needs to do.

JSON stands for JavaScript Object Notation. It is essentially a way of taking arbitrary objects that contain data and representing that data in a very concise text-based format. This data format was originally derived from JavaScript, but is now supported by a wide array of programming languages. It doesn't have the same expressiveness that XML has, but it can be processed quickly, and for scenarios where the dataset is relatively small, it may be easier to work with.

What both of these formats have in common is that they're independent of the platform or the programming language where they're used, and that's true for both ends of the transaction. For example, on the server side, you might have a web service written in Java that generates data. And on the client side, you might have an app, which happens to be written in Python, that consumes that data. Each app can share and consume JSON or XML without having to worry about platform or language details. This platform-independent nature is what makes each of these formats so useful. In the rest of this chapter, we'll learn more about each of these data formats in detail along with relevant Python modules for working with them.
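
To make the comparison concrete, here is a minimal sketch that writes the same small record as JSON and as XML and then reads each one back, using only Python's standard `json` and `xml.etree.ElementTree` modules. The record, its field names, and the XML element names are hypothetical, invented purely for illustration; the transcript does not prescribe these modules or this data layout.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, made up for this illustration.
record = {"name": "Ada", "languages": ["Python", "Java"]}

# JSON: serialize the dictionary to text, then parse it back.
json_text = json.dumps(record)
parsed_json = json.loads(json_text)

# XML: build an equivalent tree by hand (element names are assumptions),
# serialize it to text, then parse it back.
root = ET.Element("person")
ET.SubElement(root, "name").text = record["name"]
languages = ET.SubElement(root, "languages")
for lang in record["languages"]:
    ET.SubElement(languages, "language").text = lang

xml_text = ET.tostring(root, encoding="unicode")
parsed_xml = ET.fromstring(xml_text)
xml_languages = [el.text for el in parsed_xml.iter("language")]

print(json_text)
print(xml_text)
```

For this particular record the JSON text comes out shorter than the XML text, which matches the point above about JSON being more compact. Either string could just as well have been produced by a service written in another language, since both are plain text.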
