From the course: Smarter Cities: Using Data to Drive Urban Innovation

What is open data?

- In the last video, we discussed that cities and their governments are creating, collecting, and storing vast repositories of data. The data is generated in the course of doing normal city business. Sometimes there are requests for the data, perhaps for academic research, media outlet use, litigation, or other uses like analytics that help inform city decisions. On occasion, systems consume the data for a purpose other than its original use. For example, data collected for sending electrical bills to customers can be used later, through another provider, to help homeowners make informed decisions on how they are using power in their homes. Reusing data for other purposes makes it more valuable. In fact, repurposing city data is a remarkable way to inspire and power urban innovation.
Many cities provide apps for their citizens to report issues. This includes reporting such items as potholes, streetlight outages, graffiti, and code violations. Let's look at an example of how visualizing the data collected can help with improving city operations. Now we're able to get a real sense of city challenges. Here, we have the x-axis for the type of report and the y-axis for the number of requests. In Sacramento County, the top priority areas fall in categories such as waste management, animal care, and transportation. In this graph from the city of Dallas, the top items are related to litter and signs in the public right-of-way. Finally, in Palo Alto, the city where I work, the top items span graffiti removal, sidewalk issues, and illegal dumping. These data sets confirm that issues always reflect a local perspective. Looking at this data makes it easier for city decision makers to make data-driven decisions. What makes these graphics particularly valuable, over historic approaches, is the ability for the data to be rendered close to real time.
This is because the capture method is an app and the data can be immediately sent to the cloud for easier visualization and analytics. The decision to provision an open API can also let anyone access and analyze the data.
How might a city make its data easily available to anyone or anything that wants to help make a city smarter? Put another way, how might a city open up its data? For this, there is a global movement called open data. I've created a separate course called Open Data: Unleashing Hidden Value. I suggest taking that course on completion of this one, as I'm only going to lightly cover this important aspect of building smarter cities.
So, open data is about making data sets freely available, without any restriction, to anyone who wants them. Sure, it's about transparency and open government, but it's also about innovation. New ideas and new solutions can emerge when rich data is available. While open data is growing in popularity across the world, it is also being used in the private sector. Open data is defined more clearly by eight core principles. I'll share them here just as I do in my open data course.
We'll start with the importance of open data being complete. Data sets that are provided should not be just a partial set or just one piece of the data. Where possible, open data requires the complete set for a given data set. For example, if a government releases crime statistics for 2015, it should be for the whole year. Not just, say, January, April, and September.
Next is the quality of being primary. This means that the data is from its source and is in its most granular form, without being aggregated or modified. Let's use the example of visitors to a park. To be primary, the data provided should include all data that was collected on visitors. It shouldn't be grouped by, say, age or gender. While data may be processed for its final use for a particular need, our open data should be the raw collected data.
Another quality of open data is being timely.
This one is quite simple. It means that data should be made available as soon as possible. For example, if a government collects information on air quality, that data should be made available as soon as it is collected. With data in general, the more current it is, the more valuable it often is for those who want it.
Next is accessibility. Quite simply, open data should be available via the internet without any restrictions. To make that happen, it should be available in multiple formats and not require any special technology to access. A comma-separated values file, or CSV, where data attributes are separated by commas, is very favorable as it is so broadly accepted by computer systems.
Now, here's an important quality of open data. It must be machine processable. If there's one aspect of open data that really helps define it, it's this one. Machine processable means that the data is easily consumed and processed by other computers and applications. As with accessibility, a CSV file is perfect. (laughs) This contrasts with a file where data is made available as an electronic scan of a physical document.
The nondiscriminatory quality of open data means that it's available to anyone without the requirement of, say, registering for the data. For example, users of a government's open data portal should not have to create a login name and password in order to access the data.
Next, the nonproprietary requirement means that no one has exclusive control over the data. This could happen if the data was only made available in a computer format that requires an expensive piece of software. If data is made available in a special format, other formats, such as a CSV file, should also be made available. In the spirit of open data, it should have the maximum accessibility and usage qualities.
Finally, open data must be license free. This means that data should not be subject to any copyright, patent, trademark, or trade secret regulation.
Of course, reasonable privacy and security restrictions should be allowable. As an example of license free, the open data should not require that the consumer of the data seek permission to use it. It would be assumed that no attribution is required and there are no restrictions on use.
With these eight qualities met, data is said to be open. It's open for use in urban innovation and for use by urban innovators. The big takeaways here are understanding open data and then recognizing that any smart city strategy must have open data as one of its essential components.
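
To make the machine-processable principle concrete, here is a minimal Python sketch of how a program might consume issue reports published as a CSV file and tally them by type, which is roughly the data behind the bar charts described earlier. The column names (request_id, category, reported_on) and every row are invented for illustration; they are not taken from any real city portal, and a real data set would likely be downloaded from the portal rather than embedded in the script.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: the columns and rows are made up for this sketch
# and do not come from any actual city's open data portal.
SAMPLE_CSV = """request_id,category,reported_on
1,Graffiti removal,2016-03-01
2,Illegal dumping,2016-03-01
3,Graffiti removal,2016-03-02
4,Sidewalk issue,2016-03-02
5,Graffiti removal,2016-03-03
6,Illegal dumping,2016-03-04
"""


def count_by_category(csv_text):
    """Return (category, count) pairs, most frequent category first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row["category"] for row in reader)
    return counts.most_common()


if __name__ == "__main__":
    # Print a simple text bar chart: report type on one axis, count on the other.
    for category, total in count_by_category(SAMPLE_CSV):
        print(f"{category:<20} {'#' * total} ({total})")
```

Because the data is plain CSV, the standard library alone is enough to read it. The same tally would be far harder to produce from a scanned image of a paper report, which is the contrast the machine-processable principle draws.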
