(upbeat music) - [Interviewer] What can I do? I'm interested in this thing called data science. How can I prep for it? - Yeah, I think the first question is, what everyone should be asking is, hey, how do I get interested in using data? And how could I play with it? And how could I see what data can do? And that's the first set of things. And what are the classes, the courses, and everything that you can roll up into that to make that real? The first one that I would say, which is not the answer that people are probably going to be looking for is you start with the liberal arts.
Because you actually have to first ask what problem are you trying to solve? And how is this problem going to actually have a differentiator? The best education that you can get in that is the humanities. Because it teaches you about what are the specific problems that add value and how to think about the complexity of problems. On top of that, you then need the classes that are gonna teach you the basics of how to actually interact with data as well as how to think about using techniques on top of that data.
And so those are the basic ones, which are some type of programming. Typically those are in computer science. But there's also the mathematics that you need to have. There are a lot of other areas where I think there's a tremendous amount of value added, like physics and other types of sciences. You know, for me, some of the best data scientists come out of programs like oceanography and astronomy. These are places where you work with data because that's the only way we can understand the systems. The only way you can understand the weather out there is by taking lots of observations.
When you're finally getting a chance to work with the data, it's also important not to work alone. You have to work collaboratively with data. Data is a problem where, when you add more people, it gets better, because there's one way of looking at it, there's another approach to looking at it, and it's just this complexity. And many times you have to bring lots of different data sets together. You have to go find those data sets. One of the things I tell college students all the time is, why are you trying to go out and just scrape the web for other data when you could just go down the street to the local, you know, social services center? Maybe it's data on a food pantry, or maybe it's data around traffic in the local city or something. Go work on that data.
Because not only are you gonna have the data, you're gonna have people who have context around that data. If you're working on an abstract data set, sometimes you just don't have that context, and you can't understand it. The other thing that's in there, if you're training to be a data scientist, is how do you start continuously asking what else you can do with the data? What if you brought it in with this other thing? It's augmenting the data, trying other things. But then also the storytelling of data. How do you actually showcase what is out there that you've done? You know, there are three things we teach every data scientist that you must be able to do.
Number one, when somebody sees your data set, or a way you're explaining it, what do you want them to take away? Like, what's the actual thing that they should be able to go say, "this is what they showed me"? Two, what action do you want the person to take? If you're showing that data, you know, maybe just on a blog post, what do you want them to do? Do you want them to share it? Do you want them to print it out? Do you want them to comment on it? If it's a dashboard, you know, do you want the person to call a meeting? What is the thing that is a concrete action? So those first two, for anybody from the liberal arts world, the humanities, you know, that's like Communication 101.
The third one is the one that we most often leave out, which is how do you want the person to feel after seeing that data? Do you want them to feel inspired? Scared, angry, excited? We're humans. We shouldn't leave out this idea of how do we engage with the data? For information to stick, it has to be connected with an emotional response. And we've actually seen this concretely: when data is presented with that emotional response, people remember it.
People carry it forward. People know what to do. They take a different type of incentive. So in an educational system you have to kind of wrap all those things up. And the data science programs are starting to do that, and they're necessary but not sufficient, meaning you can't just say, "I'm just taking data science classes." You gotta do more. You gotta take the additional classes. You've gotta engage with other data sets. You've gotta start going to other programs. (upbeat music)