In this interview with Diana Yoo from Capital One, learn about the particular challenges data visualization practitioners face when working within large organizations, and her concept of democratizing insight, which is at the heart of Capital One's data visualization work.
(upbeat music) - So I'm very happy to have Diana Yoo with me today. She is the head of data visualization design at Capital One. And we're here to talk about data visualization in the industry, so, Diana, thank you very much for joining me here today. - Thanks for having me. - So I do want to talk first off about your title. You are the head of data visualization design, which means that Capital One has a department of data visualization essentially focusing on this, which is fantastic. And I'd love to learn a little bit more about your role and the culture at Capital One that has this focus on this craft that we do. - Absolutely. So I am the head of data visualization design at Capital One. And that means I lead a team of data vis designers, UX designers, and front-end developers whose mission is to create products, tools, and works of art that allow people to ask richer questions of their data and find more humanity in their answers. And Capital One is a company that was born with a reverence for data and the competitive advantage that it offers. Back when Capital One was first founded it employed an information-based strategy rooted heavily in data analysis and modeling, which was a significant key to its success in acquiring new customers early on. And kind of the idea of grounding all decisions in unassailable data is just part of the DNA at Capital One. We're a company with a culture of creating presentations and decks the way you might write a research paper with a thesis statement and supporting evidence. So we take it really seriously. I think when you have an environment where data fuels decision-making, and there's an ever-increasing amount of data, not just in volume but in variety, there's a need to make it consumable for the human beings who are analyzing and investigating that data. And about five years ago, Capital One invested heavily in human-centered design as it looked to develop technology that would transform banking. 
And we recognized that tech adoption is rooted in human experience. So we poured a lot of effort into transforming our digital presence, especially for our customers. So I see Capital One's investment in data visualization as just a natural confluence of our investment in data and our investment in design. - That's great. It's so interesting to hear you even use the word "art." I heard you throw in that word "art." I mean, a bank actually thinking that way is amazing, it's fantastic. So yeah, you just said it's been five years, you've been doing this for a while, you're beyond the hey, we're hiring a few first staff members in this area, we're beyond let's define what this means that we have a data visualization department, so you're really well into the process of operationalizing this. So where are you now? What does that really mean today? - Yeah, that's a great question. I'd say that the operationalizing, I can't even say that, operationalizing of the practice is still very rooted in the individuals on the team. As you know, no two data vis practitioners have the same exact training and skills. On our team we have a mix of data analysts, data engineers, visual designers, user experience designers, and front-end developers. And for the first few years we operated as a bit of an internal agency or a consultancy to support efforts around the organization. We worked heavily with Capital One's Center for Machine Learning, where there were opportunities to create consumable visualizations of the results of the machine learning models they were working on. And we also worked with the team that developed Eno, which is our intelligent digital assistant. And we produced some lightweight visualizations of conversation data that helped them understand conversation patterns and improve the interactions we're having with customers. And those efforts were good pilot opportunities.
But as we've matured, we've looked to have a bigger impact on the enterprise, and we're doing that by embedding our practice in some key enterprise platform initiatives as part of a multidisciplinary group of UX designers, engineers, and product and business stakeholders. They're data-intensive products for a mix of technical and nontechnical users. One of our team mantras is that we create experiences that clear doubt in the minds of our users. And my team is employing both human-centered and data-led design in which we're considering data as a raw material in the products that we build. We're avoiding the trap of building a tool or a platform and then having a section where the charts are. And it's just so amazing how often the way data is stored just somehow naturally shows up in the interface and you end up seeing software with a lot of spreadsheets or data table-like screens. And when we're working, we're questioning why are we showing this? Why does the user, what does the user need to do next? We're surfacing patterns and we're finding ways to help tell the story so that it breaks through the noise. And I think beyond that, again, as a mature part of the organization, we're working on building a broader community of practice around data visualization. Because we're certainly not the only team doing data vis. We're a smallish team, especially considering how many employees there are at Capital One. It's somewhere in the realm of 50,000. So there are thousands of data scientists, engineers, and data analysts producing visualizations of varying types on a regular basis. And nearly every person at Capital One is also a consumer of data vis in some way. So my team runs a monthly meetup for these practitioners and we call it We See Data, where we share work in progress, we invite outside speakers, conduct critiques and design jams. It helps us connect as a community and inspire each other as well as elevate the standard of the practice around the organization.
I think we're also developing training content with the goal of spreading awareness about the fundamentals of cognition and perception that underpin good data visualization design, and there's a huge appetite for it internally. Last month two members of my team presented on storytelling with data at our internal data week conference that we put on every year. And hundreds of people attended. They couldn't fit them in the room that they'd booked. It was beyond capacity. So these are all folks who are skilled at accessing, and processing, and analyzing data, and they're looking for better ways to convey their findings. - Yeah, I think that's a really good point. Even going beyond data scientists and data analysts, these days just about every person in a company is working with data, right? They're all doing PowerPoint decks, they have KPIs they're tracking to tell their managers how great they're doing, or where they need to improve things, et cetera. And particularly for our audience here on LinkedIn, yeah, we have data visualization people and data analysts and data scientists who are going to be seeing this, but we have HR people, and IT people, and all kinds of people in all kinds of roles who all touch PowerPoint and have to create decks based on data. Which is a good segue to my next question. I know that when we spoke a couple months back you talked about this term the democratization of data, which is just as hard to say, by the way, as operationalizing. But the democratization of data, which is this idea that, kind of as you were just alluding to, data is at the core of everything. It's not an afterthought. It's central to everything that we do. And also it's central to the organization and everything that you do. So can you define that term democratization of data, and then explain what that means in terms of your practice and at Capital One generally? - Absolutely. I think about democratizing data as a way of making insights accessible and understandable.
In some cases, answering questions before they're even asked so that we can ask an even richer question. At Capital One, data is so foundational and ubiquitous it's almost like water. In fact, we refer to our primary cloud data storage as a data lake. And if data at the company is like water in a lake, how do we get it in a form that we can use? When you turn on the faucet in your house, for example, there's a whole lot that goes into getting that water from its source. Because of the way the system is designed, I don't need to be a plumber, you don't need to be a plumber to get access to the water and use it. And at Capital One, like many organizations, data scientists, analysts, and data engineers have specialized skills that allow them to query the data, to transform it, and analyze it. And that work they do is critical, but it's easy for those data activities to be hyperlocal and not discoverable by others around the organization, the folks in HR and others that need to use that information. So my team is looking at how do we give the rest of the company the ability to make sense of and glean insights from the vast amount of data that we have and how do we do that in a way that improves the way we do work and provides line of sight for those who need it? And so the way that is going at Capital One is that we're taking both an explanatory and an exploratory approach to it. Explanatory visualizations that we do reveal unexpected patterns and provoke new ways of thinking about data. And explanatory data storytelling can have different levels of zoom depending on the audience. However, we spend much more of our time in the space of building exploratory tools and platforms that offer self-service access to data and customized views to streamline and democratize data analysis. And I think that's what I really think of when I'm referring to democratizing data. - That's great.
As part of that process in trying to bring data to everybody, that's about data access, that's about seeing the insights, it's about so many different things. Are there any particular challenges that you've faced in doing this and interesting solutions that you've come up with that maybe others can benefit from? - Let's see. What can I share? (laughs) I think this kind of gets to a question of building data visualizations that scale, for example. And I think one of the biggest challenges that we have at Capital One, we're probably not unique in this, but because we're part of a highly regulated industry, our data ecosystem has additional complexities. What we store, how we store it, who has access to it all need to be tightly controlled to protect our customers. And all of our digital work must meet a very high standard for security and resiliency, which can limit our ability to build off of, say, existing frameworks or open-source libraries depending on how they were built. So I can't get into the details exactly of how we've solved that, but I think there are maybe some more universal challenges with data visualization at scale that might be worth talking about. The first is performance. We're dealing with huge datasets. And things can really fall down if you don't get the data engineering right. In every other area of our digital lives we have expectations of immediate load times and crisp responsive UI. And then we have batch data pulls that take 24 hours to run because they're so massive. In one example, for an anomaly detection-based visualization tool, we needed to be able to load up to a few thousand anomaly records depending on the parameter set by the user. And our data engineers needed to refactor that data storage and the method of calling it a few times before getting those load times where we wanted them. I can't say much more about that or I'd be giving away some secrets, but that's one of the things that we've tackled.
I think the second challenge is creating more of an integrated experience that goes beyond just the visualization itself. And I think that's kind of where things have started to evolve here. A single chart or a bespoke visualization that lives out of context doesn't scale. To be truly useful in an industry setting, you need an application that serves up the data in a way that you need it and allows you to move seamlessly between filtering and annotating, communicating with colleagues, producing and preserving views, and presenting for executive consumption. I think that's one of the key challenges we're tackling right now in our work. How does the visualization fit into the daily routine of the person producing and consuming it? What actions might they need to take? What other data views might be relevant? And how do we make sure the most important information or insights rise to the surface without requiring any action from the user? We've had to balance the flexibility of the tool and the ease of adoption. The more flexibility a tool has, the steeper the learning curve, especially for those that are more nontechnical users. And I think my team has found some ways to do this well applying UX principles and building some modular components, and I have a few examples that I can go into a little bit more detail about, if that's helpful. - So I'm going to ask you sort of a slightly different question. You just used the word "bespoke." And earlier I made a joke about how you used the word "art," which are surprising terms when we're talking about data visualization at scale in industry, especially at a giant bank. But you brought up the word "bespoke," and it brings up another question which is related to all of this.
So data visualizations that win awards, data visualizations that we see in "The New York Times" and all these wonderful publications that are doing some of the best work on the planet right now, they are highly designed, very custom, beautiful, wonderful, unique, special things. And I'm wondering, you've used the terms, how are you doing that, or if you are doing that, and are you able to figure out how to create these beautiful, bespoke, wonderful, special things at scale and make them repeatable, et cetera? So I'd love to hear your approach to this. - Sure, well, first of all, I would love to think that scalable design will also win awards, 'cause I have an amazing team that deserves to be recognized. I think it also depends on what you mean by scalable. One of the differences that I see is that much of the bespoke visualization I see winning awards and getting visibility is primarily a fixed dataset explanatory visualization, perhaps with some exploratory angle, but always with a fixed, known, and often well-researched dataset. And in the cases of "The New York Times" and other publications, there's a point of view, there's a point to be made. You may be able to find additional points if you were to explore, but there is typically some sort of point. I think building effective design for a streaming or live dataset is much more difficult because there are many unknowns about scales, the minimum and maximum values, not to mention what may turn out to be interesting or relevant in the data to surface as that data changes, whether it's a streaming data source or a more batch, regularly updated source. So I think it's certainly possible to make that type of visualization beautiful. I think Moritz Stefaner's work for the train system in Germany and his seasonal wind prediction visualizations are case studies in both beauty and scalability.
And because much of what we're doing here is exploratory, the beauty and the utility is in the flexibility of the levers that we offer the user. How do we create a visualization that's flexible enough to adapt to changing parameters without creating an inscrutable sea of dropdowns and menus and filters? I think a really elegant and simple solution we've implemented is just a dynamic visualization header that was built upon syntax describing the nature of the parameters the user had selected. And those selection dropdown menus are built into the header itself, so that not only do you know what can be modified about it, but you know exactly what view you're looking at at any given time. And I think some of the work that I'm most excited about scaling in industry is work like the example that I can show where the data and the visualization itself is the interface, the user navigates through the data the way they might in a video game, and if we're approaching it right, the interface should adapt and scale to the data as it changes. - Hmm, that's very interesting. I look forward to seeing that. That sounds fascinating. Yeah, I think that we've all suffered through inscrutable dashboards that offer you a thousand dropdowns and a million filters and buttons because every variable, every field in the database, well, let's make 'em able to filter on that too, why not? And while that sounds like a good idea, it always, from a user experience standpoint, almost never is. And so that is, I would imagine, one of the great challenges you face, 'cause exposing limitless data with limitless fields to people in a way that they can find new patterns and stories is almost an impossible task. So if you're solving that, then I think we will all benefit from that in the future. Speaking of the future, we only have a short little bit of time left. I just wanted to ask you where do we think we're going in the future? 
Do you think that, and especially in your role in data visualization and industry, is there a revolution coming? Is it more of an evolution? Where do you see the opportunities for change, exciting developments, et cetera? - Yeah, I think probably a few places. Certainly I would venture to say our tools are going to evolve. When I think back to some of the clunky data processors we were using just 10 years ago, I have to imagine that the next 10 years will bring the same level of improvement and evolution there. But when I think about the future of data visualization, I think there's two big things on my mind. One may be more evolutionary, and the other may be a bit more revolutionary. And the evolution I'm thinking about is already well underway. Because, Bill, you spoke to this for a minute earlier. For a time, data visualization the way I think about it has been a specialized field. Practitioners who sit at the intersection of expertise in several areas would be data visualization practitioners. I think the evolution will be that data visualization becomes a discipline that is ubiquitous to life in the modern workforce. The same way that office workers are expected today to have a baseline understanding of using a word processor, for example. We'll all be expected to have a baseline data literacy and graphicacy that goes along with that. There won't be data people anymore. We'll all be data people. And we'll all regularly need to process, understand, and communicate insights from data. And in that world, tools that provide a low barrier to entry are going to be tools that will enable that to happen. The other thing that I see for the future is the ways that machine learning will augment and transform how we develop visualization. Again, I think this is already happening. We've reached a point with multidimensional data so big that it's not possible for humans to pore through visualizations of all permutations to find points of interest.
We need those things served up by machines and let them do 95% of that work so that humans can do the 5% of the work that requires that nuanced judgment and consideration of context in a visualization. - Yeah, that's a great point. And in fact, I was just looking at an update to I think it was Power BI recently that now has AI built into it. You can just say "show me interesting things" and it will generate a dozen or however many visualizations of the interesting things that it determines based on the data that it sees. And so, yeah, I think machine learning is going to take over everything and that's clearly one avenue that's going to be very beneficial to all of us. I also have to make the point. So I actually interviewed Francis Gagnon the other day and he had a great line talking about how, you just said, we're all going to be data people. Which just reminds me of it. 'Cause what he said was I asked him about his path and how he became an information designer. And he said, "Well, I did it the way," or, no, what'd he say? He said, "My story is very common in that it's completely unique." Because it's not a discipline. Most of us, we're not trained in this. This is a weird field to be in now. But yeah, in 10 years, maybe even starting now, 'cause I've met one person who was actually trained as an information designer, people will be learning specific skills to do this work. And to your point, people will be, have to be data literate, have to be graphically literate no matter what their actual official role is. So that was a very long-winded response to what you said, but listen, I really appreciate you joining me here today. That was a great conversation. I wish we could go into more detail and hear some more of the stories from Capital One, but we'll have to do that another time. So thank you, Diana, very much for joining me and sharing what you could share with us about Capital One's efforts today. - Absolutely, thanks, Bill. Thanks for having me.