Join Barton Poulson for an in-depth discussion in this video Data sources for nonprofits, part of The Data Science of Nonprofit Service Organizations, with Barton Poulson.
- [Instructor] If you've ever watched competitive cooking shows like Chopped or Master Chef, then you know how creative people can get trying to make something amazing with the limited or possibly bizarre ingredients they have to work with, not to mention the limited time. It's an exercise in creativity under pressure. And to a certain extent, doing data science in a nonprofit environment has some of the same feel. In addition to the constant time pressure, you often have to get very creative with the ingredients you have to work with. In this case, those ingredients are the data sets that are available to you.
Every nonprofit has certain data already at its disposal, and other data sources may be available with a little work. I want to talk about both kinds and how they can be used for creative data science within nonprofits. The first kind is existing data sources. Now, these include some very obvious things, like your Rolodex, that is, your existing list of donors, people who have already given you money. That's gonna be critical to you. Or email lists of potential donors or current customers.
You may have service records, what you have done, for whom and when and how often. You may have evaluation data where you have gathered some basic metrics about the outreach and impact of your service. And you may have web analytics data, too, that tells you, for instance, how many people have visited your website, or visited or forwarded your social media posts and so on. Now, what you want to do is not just have these lists separately, 'cause usually they're housed in different places, but you want to combine the lists when possible.
So much of the value of data science comes from putting together these various sources. And you can then answer questions like, are your donors on your email list, and have they been patrons? And if you have access to login data or data from tracking pixels as part of your analytics, then maybe you can tell how often each person visits your website, and what sorts of things they click on, and that can be used to predict their involvement with your organization. And so, these are the sources that you already have. They may be scattered around, so get them unified.
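To make the idea of unifying scattered lists concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the record layouts, the field names (`email`, `total_given`, `service`), the sample values, and the choice of email address as the shared key are all invented, and a real organization may need a different key or fuzzier matching.

```python
# Hypothetical records; field names and values are invented for illustration.
donors = [
    {"email": "Ana@example.org", "total_given": 250},
    {"email": "ben@example.org", "total_given": 40},
]
email_list = ["ana@example.org", "cy@example.org"]
service_records = [
    {"email": "ben@example.org", "service": "tutoring"},
    {"email": "cy@example.org", "service": "food pantry"},
    {"email": "ben@example.org ", "service": "tutoring"},
]


def normalize(email):
    """Trim whitespace and lowercase so the same address matches across lists."""
    return email.strip().lower()


def unify(donors, email_list, service_records):
    """Combine the three lists into one record per person, keyed by email."""
    people = {}

    def entry(email):
        # Create a blank profile the first time an address is seen.
        return people.setdefault(
            normalize(email),
            {"total_given": 0, "on_email_list": False, "service_visits": 0},
        )

    for donor in donors:
        entry(donor["email"])["total_given"] += donor["total_given"]
    for address in email_list:
        entry(address)["on_email_list"] = True
    for record in service_records:
        entry(record["email"])["service_visits"] += 1
    return people


people = unify(donors, email_list, service_records)

# Which donors have also used the organization's services?
donors_who_are_patrons = sorted(
    email
    for email, person in people.items()
    if person["total_given"] > 0 and person["service_visits"] > 0
)
print(donors_who_are_patrons)
```

Once the lists share a single record per person, a question like "which donors are also patrons" becomes a one-line filter rather than a manual cross-check between systems.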
And then what you want to do is start looking at ways that you can augment your existing data. One of the easiest ways to do this is with open data, that is, data that is shared by large organizations, usually governments, like the federal government or a state government, and it may have information on economics and demographics. It can also be shared by a company. Google, for instance, has information about searches, and Yahoo has financial data. And there's a lot of other potential sources for data.
So for instance, the Jane Goodall Institute for Wildlife Research, Education, and Conservation uses open data sources to monitor the number of chimpanzees still residing in the wild, as well as patterns of deforestation in chimpanzee habitats. In addition to those, they also use apps that give local residents the ability to track activity on smartphones, tablets, and the cloud. This is using free software from the Open Data Kit, that's at opendatakit.org.
A historic example of non-governmental agencies, that's nonprofits, utilizing open sources of data from the government goes back to when volunteers on the ground used the United States CIA World Factbook, that's the Central Intelligence Agency. This was, for many years, the best way to get demographic information about different countries. And it could then be used to also get cultural information and anticipate some of the things happening in places.
And there were often partnerships between these nonprofits and the open source data from the CIA and other places. Now, this information is much less frequently used today, but it serves as a historical touchpoint. Also, nonprofits working within the United States have leveraged government census data to help drive memberships by seeing where people are located, the density, what the ages of the people are, and other demographic characteristics.
And they can use that to drive membership, they can improve programs based on, for instance, an aging population. This was done, for example, by the YMCA in Austin, Texas, where they were able to target the people whom they would best be able to serve using these existing patterns in readily available census data from the US federal government. You also have the option of gathering new data. Now, you can do this in a lot of different ways. You can do online interactions, and that actually makes things really simple, as opposed to having to do the old paper surveys.
The thing about gathering new data is it allows you to do things like surveys or interviews or other forms of data collection to fill in specific gaps in your existing data. So sometimes you're able to infer things with a certain degree of probability, but if you need more than that, then you can just go out and ask. You can also get data that fits well with your analytical paradigm, the kind of approaches that you're using. Some methods work very well with dichotomous outcomes, where you have either this or that.
Others work much better with quantitative outcomes that might, for instance, be normally distributed like a bell curve. So get the kind of data that's gonna fit well with your analytical goals, the methods you're going to use. But one important thing, no matter how you decide to gather new data, is really to try to keep the methods as simple as possible. In my experience, most data surveys are way, way, way too complicated, and they've got too many questions and too many options, and people are asking for too much detail.
You want to get a very quick impression. You want to get people to engage, so you can't ask them too much. You want to get a very reliable score. So a yes/no is usually the best. But there are other places where you can learn more about the details of constructing surveys and getting the kind of information you need. But the point is, when you are working with data in your nonprofit, you have multiple existing data sources. You have multiple sources like open data from the government and other places.
And you have the opportunity of gathering new data. You get that data, you combine it, and then you have the power that you need to do something amazing in data science.
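As a small illustration of why a simple yes/no survey question is easy to work with, here is a hedged sketch that tallies a dichotomous outcome. The responses are made up, the function name `summarize_yes_no` is hypothetical, and the normal-approximation interval is my own choice of method for the sketch rather than anything prescribed in this talk. It is a rough guide only and behaves poorly with very few responses or shares near 0 or 1.

```python
import math


def summarize_yes_no(responses):
    """Summarize a list of True/False survey answers.

    Returns the count, the share answering yes, and an approximate
    95% interval using the normal approximation (an assumption of
    this sketch, reasonable only for moderately large samples).
    """
    n = len(responses)
    if n == 0:
        raise ValueError("no responses to summarize")
    yes = sum(1 for answer in responses if answer)
    share = yes / n
    # Standard error of a proportion, then a 1.96-standard-error margin.
    margin = 1.96 * math.sqrt(share * (1 - share) / n)
    return {
        "n": n,
        "yes": yes,
        "share_yes": share,
        "low": max(0.0, share - margin),
        "high": min(1.0, share + margin),
    }


# Invented example: 30 of 40 respondents said yes.
responses = [True] * 30 + [False] * 10
summary = summarize_yes_no(responses)
print(summary)
```

A single yes/no item like this gives one number that is easy to track over time and easy to join back onto the other records an organization already holds.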