This video discusses the decisions that must be made about how the currency of data will be maintained.
- [Instructor] Now that we've briefly discussed the process for publishing data on a government open data portal, let's talk about how often we might update that data, and the different options for keeping data current. There are several reasons why we are concerned with how current data is. First, a mantra of data as an asset is that the more current it is, the more value it has. For example, if you are working on housing planning for a community, it's more useful to know what the current population is for an area than what the population was three years ago.
Another reason we are concerned with how current data is, is that keeping data current has a cost. That cost depends on the frequency of updates. If we just uploaded a data set at a point in time and didn't change it, it would not incur any further effort on the part of the agency. If we agreed to update it each year, that would take some effort, but it wouldn't be that difficult. However, if we agreed to update a certain data set every week or every day, that would be more complex, and would take considerably more effort.
Let's distill this topic into five types of data currency. We'll begin with static data. Static data is data that doesn't change. This could be because the data is simply a snapshot at a point in time, but it could also mean that the data has not been updated for a variety of reasons. One reason is that it's too burdensome to update. Another reason is that it is data that is no longer being collected and stored by an agency. Whatever the reason, many open data portals tend to have static data.
If new data is available, it's important that an agency find a way to update it. We call this data dynamic data. Dynamic data is data that needs to be updated because new data has become available. It's the most common type of data state. What the term dynamic doesn't tell us, however, is how often the data is updated. This may be at the discretion of the agency. As we discussed in the video on managing risk, an agency should communicate the frequency of updates for a given data set. Next, we have near real-time data, and real-time data.
As these terms suggest, the data is not only dynamic, but it is being updated frequently. In the case of near real-time data, this means that the data is being updated fairly close to the time at which new data becomes available. This might mean daily or even hourly. Either of those frequencies would be highly desirable for data users. Real-time data is the best state for data currency. As the name suggests, this means that a real-time data set always reflects the most current data.
Neither near real-time nor real-time data is that common yet on open data portals. As you can imagine, it's complex to set this up and maintain. The setup costs are the main overhead. In my experience, particularly when relying on data from a third party, there may be a significant amount of maintenance to ensure that the data is being updated. In other words, the updating process can often stop. The final currency type I want to mention is archive data. In some sense, this has similarities to static data, but in this instance, the data is usually segmented by date.
For example, an archive data set might include individual years, or groups of years: crime data stored separately for each of the years 2013, 2014, and 2015, for example. The current data set for crime data might be near real-time. Archive data is often thought of as historical reference data. Determining the currency of data for each data set on an open data portal is an essential step in a program. It will determine cost, effort, and data value, and also set expectations for those who use the open data portal.
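To make the five currency types a little more concrete, here is a minimal Python sketch of how a portal team might record the currency of each data set and notice when an update process has stopped. Everything in it is an assumption for illustration and not something described in the video: the `Currency` enum, the `Dataset` record, the `is_overdue` helper, the data set names, and the 1.5x grace factor are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Currency(Enum):
    """The five currency types discussed in the video."""
    STATIC = "static"
    DYNAMIC = "dynamic"
    NEAR_REAL_TIME = "near real-time"
    REAL_TIME = "real-time"
    ARCHIVE = "archive"


@dataclass
class Dataset:
    """Hypothetical catalog record for one data set on a portal."""
    name: str
    currency: Currency
    last_updated: datetime
    # Published update interval; None for static and archive data,
    # which are not expected to change.
    update_interval: Optional[timedelta] = None


def is_overdue(dataset: Dataset, now: datetime, grace: float = 1.5) -> bool:
    """Return True if the data set has missed its published update interval.

    The grace factor (an assumed default) allows some slack before flagging.
    """
    if dataset.update_interval is None:
        return False
    return now - dataset.last_updated > dataset.update_interval * grace


# Illustrative data: a current crime feed updated daily, plus a yearly archive.
now = datetime(2024, 1, 10, 12, 0)
catalog = [
    Dataset("crime-current", Currency.NEAR_REAL_TIME,
            datetime(2024, 1, 8, 12, 0), timedelta(days=1)),
    Dataset("crime-2013", Currency.ARCHIVE, datetime(2014, 1, 15)),
]

for ds in catalog:
    status = "OVERDUE" if is_overdue(ds, now) else "ok"
    print(f"{ds.name}: {ds.currency.value}, {status}")
```

In this sketch the daily feed is flagged because its last update is two days old, while the archive data set is never flagged because it has no published interval. Recording the interval alongside the data set is one possible way to communicate update frequency to users and to catch a feed that has silently stopped.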
Dr. Jonathan Reichental introduces real-world use cases for open data, as well as the steps you need to take to develop and operationalize an open data program. He also explains how data scientists use open data to tell stories and drive data visualizations. Along the way, he provides numerous examples of open data in action: improving government, empowering citizens, creating opportunity, and solving public problems.
- Understanding what open data really is
- Current open data efforts around the globe
- Open data in action
- Designing an open data governance process, including policies
- Monetizing open data
- Storytelling with open data
- Selling the value of open data
- Measuring the value of open data