#90 How Data Science is Transforming the Healthcare Industry (with Curren Katz)

The Importance of Long-Term Projects and Quick Wins in Data Science

When it comes to data science, it's essential to strike a balance between long-term projects and quick wins. As Curren Katz, Senior Director for Data Science Portfolio Management at Johnson & Johnson, notes, "You get that mix, and then I think it's important to look at the portfolio you have for data science and go through and see how many of these are really going to be years before we see the value." Knowing which projects will take years to pay off lets data science teams plan their work and judge whether their time and resources are spread sensibly across the portfolio.

In contrast to long-term projects, quick wins are the shorter pieces of work that deliver value on the way to a long-term goal. Katz suggests starting from the ultimate long-term outcome and working backwards: "what are the short pieces and those quick wins, as you say a lot, to get you there." These smaller deliverables show progress early and build momentum that can then be channeled into more substantial projects. By balancing both types of projects, data science teams can create a sustainable workflow that keeps delivering value while longer efforts mature.

Portfolio Management

When it comes to managing a data science portfolio, it's essential to regularly assess the distribution of work across different categories. Katz emphasizes the importance of "looking at how many long-term projects we have, how many short quick wins do I have" and of making sure the percentage of work in each bucket is where it needs to be. This assessment helps data science leaders see where the mix has drifted and reallocate resources accordingly.

Additionally, it's crucial to consider the type of work being done within the portfolio. "It's okay to have purely exploratory, I'm gonna play around with this data, see if I can develop this model," Katz notes. This kind of work allows data scientists to stay curious and experiment with new ideas, which can lead to breakthroughs and innovations.
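
The bucket check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the project names, the three bucket labels, and the target percentages are hypothetical and do not come from the episode, and projects are weighted by count rather than by effort or budget.

```python
from collections import Counter

# Hypothetical portfolio entries: (project name, bucket). All names are illustrative.
PORTFOLIO = [
    ("early-diagnosis model", "long_term"),
    ("trial-site forecasting", "long_term"),
    ("multimodal patient view", "long_term"),
    ("scheduling report", "quick_win"),
    ("new-dataset exploration", "exploratory"),
]

# Assumed target share per bucket; the source does not prescribe any numbers.
TARGET_MIX = {"long_term": 0.4, "quick_win": 0.4, "exploratory": 0.2}


def portfolio_mix(projects, buckets=TARGET_MIX):
    """Return the fraction of projects that falls in each bucket."""
    counts = Counter(bucket for _, bucket in projects)
    total = len(projects)
    if total == 0:
        return {bucket: 0.0 for bucket in buckets}
    return {bucket: counts[bucket] / total for bucket in buckets}


def mix_gaps(projects, target=TARGET_MIX):
    """Return actual share minus target share per bucket (positive = over-weighted)."""
    actual = portfolio_mix(projects, target)
    return {bucket: round(actual[bucket] - target[bucket], 2) for bucket in target}


if __name__ == "__main__":
    for bucket, gap in mix_gaps(PORTFOLIO).items():
        print(f"{bucket}: {gap:+.2f}")
```

A positive gap means the bucket holds more of the portfolio than the assumed target, a negative gap means less. Weighting by headcount or budget instead of project count would only change how the shares are computed.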

The Role of Unmet Needs in Data Science

In any industry, including pharmaceutical R&D, there is often an unmet need that data science can help address. As Katz explains, the guiding question is "where is there an unmet need where we can bring data science in." This question should guide the development of new projects and solutions within a data science portfolio.

In the context of pharmaceutical R&D, identifying unmet needs can have significant implications for the company's overall strategy and direction. By understanding the specific challenges and opportunities facing their industry, data scientists can develop targeted solutions that address these needs and drive meaningful impact.

The Importance of Fairness in Data Science

As Katz notes, fairness is a crucial area of focus in data science, particularly in high-stakes industries like healthcare. "This one comes up a lot and it really affects any kind of high stakes industry," she explains. The concept of fairness encompasses several key aspects, including the detection of bias and unfairness in algorithms, as well as the ability to correct and fix these issues at scale.

The development of fairness capabilities is essential for unlocking the full potential of data science in healthcare. By creating systems that are fair, transparent, and unbiased, data scientists can build trust with patients, clinicians, and other stakeholders. This, in turn, can lead to better health outcomes and a more equitable distribution of care.
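
As one concrete illustration of what "detecting bias" can mean, the sketch below computes a demographic parity difference: the gap between groups in the rate of positive model predictions. This metric is a common choice swapped in for illustration; the episode does not name a specific fairness metric, and the group labels and predictions here are made up.

```python
def selection_rates(predictions, groups):
    """Return the fraction of positive predictions (1) within each group."""
    totals, positives = {}, {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + pred
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups):
    """Return the largest gap in selection rate between any two groups."""
    rates = selection_rates(predictions, groups)
    if not rates:
        return 0.0
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    # Hypothetical binary predictions (1 = flagged for follow-up) and group labels.
    preds = [1, 1, 1, 0, 1, 0, 0, 0]
    groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
    print(selection_rates(preds, groups))
    print(demographic_parity_difference(preds, groups))
```

A value near zero means the groups are flagged at similar rates. A large gap is a prompt to investigate further, not proof of unfairness on its own, since the appropriate metric depends on the clinical context.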

Future Trends in Data Science

Looking ahead, several trends are likely to shape the future of data science. Firstly, scalability will become increasingly important as companies look to leverage advances in AI and machine learning to drive growth and innovation. As Katz notes, "we're in a nice position to leapfrog other industries that have really perfected or made huge advancements in embedding AI into every part of their business."

Secondly, the importance of fairness will only continue to grow as data science becomes more ubiquitous in healthcare. By prioritizing fairness and transparency, data scientists can build trust with patients and clinicians, driving better health outcomes and a more equitable distribution of care.

Conclusion

In conclusion, striking a balance between long-term projects and quick wins is essential for creating a sustainable workflow that drives meaningful impact in data science. Regular portfolio management, identifying unmet needs, and prioritizing fairness are all critical components of this approach. By focusing on these areas and embracing the latest trends and innovations in data science, companies can unlock their full potential and drive significant value in healthcare.

"You're listening to DataFramed, a podcast by DataCamp. In this show, you'll hear all the latest trends and insights in data science. Whether you're just getting started in your data career or you're a data leader looking to scale data-driven decisions in your organization, join us for in-depth discussions with data and analytics leaders at the forefront of the data revolution. Let's dive right in.

Hello everyone, this is Adel, data science educator and evangelist at DataCamp. Two years into the pandemic, the potential for data science and machine learning in healthcare has never been more apparent. Whether it's drug discovery acceleration, operational innovation, virtual assistance, or disease prevention, the margin of opportunity for data science in healthcare is massive. However, that doesn't come without its own set of unique challenges and risks that require unique solutions. This is why I'm excited to have Curren Katz on today's episode of DataFramed. Curren is a senior director for data science portfolio management at Johnson & Johnson. She has decades of experience at the intersection of healthcare and data science and is deeply attuned to the state of data science in healthcare today. Throughout our conversation, we discuss where the landscape of data science in healthcare is today, the unique challenges of applying data science in healthcare, the importance of ethical AI when working on healthcare use cases, how to solve some of the data challenges of the healthcare industry, use cases she's been excited about, how data science was used to tackle COVID-19, and much more. If you enjoyed this podcast, make sure to rate us, subscribe, and add a comment, but only if you enjoyed it. Now let's dive right in.

Curren, it's great to have you on the show.

Yeah, great to be here. Thank you for having me.

I'm excited to talk to you about data science and machine learning in healthcare, your experience leading data teams in complex organizations, and how you've
led R&D at Johnson & Johnson. But before, I'd love to learn more about your background and what got you into the data space.

Yeah, absolutely. So I guess, like most people, I've always loved data, and in my first statistics courses I started to think, oh, this could be really, really fun. Especially when I started applying it to data I had collected as a research assistant, it was pretty addictive. Then, as I moved along in my career: I'm a cognitive neuroscientist by training, but did fMRI research as well as looking at some large epidemiology data sets, and 20 years ago wrote a paper on predictors of suicide attempts. Not exactly an AI/ML approach to it, but it had that interest in how we can predict some event. I had been in neuroscience studying neural networks, all of these things, and actually applying machine learning techniques to fMRI images, which are images taken while someone's doing something, so it's a fairly complex, although clean, data set. That got me really excited. I've always been passionate about healthcare and solving problems in healthcare, and my first corporate data science job was at Highmark Health. So I started on the payer side, building a bunch of models and seeing how those models impacted care, and was hooked. Then I moved to the parent company, an integrated healthcare system, the second-largest integrated payer-provider system in the U.S., and started a data science department at that parent company, looking at the payer (the insurance side), the provider side, and a few other diversified healthcare businesses. Then I came to Johnson & Johnson, where I am now, and it's been a really exciting career where I get to see a lot of impact from data science.

To start off our conversation, I'd love to understand the current state of data science and machine learning in healthcare. Early in my career, about five years ago, and that's not too long ago, healthcare was often, and still is, talked about as an industry with a large margin of opportunity for data science, but it comes
with its own unique sets of challenges, which makes it slower in comparison to other industries. Given your experience as a data leader in healthcare, I'd love to first start off our conversation by understanding how you would describe what the current landscape of data science in healthcare looks like today, and how it has evolved in the past few years.

Oh yeah, that's an exciting question. It has evolved, and different parts of healthcare, I'll say, are probably at different places and evolving at different paces, sometimes out of necessity. You say there's a lot of opportunity in healthcare; there is, and I think it's one of those industries where you have to take a bit of a careful approach to anything new. Practically, there are regulations, and there's a lot of risk of something going wrong, but huge benefits. What I've seen over the last few years is really a couple of things that we're seeing in a lot of industries, but in healthcare as well. The first is scale. As we're moving into, hey, data science can be very, very useful for solving real problems in healthcare, there's a focus on deploying these models and not just having proof of concepts, but really using them to drive core business decisions and core insights. That requires data science at scale, where at first it was a little more experimental, a little more, well, let's just see how this goes alongside what we do today, but we're not going to go all in and really use this to drive our business. We're moving towards that. The other change, I guess, is the problems that we can solve, or just that we're realizing them, right? We're expanding the scope of what data science can do in healthcare. Of course there's diagnostics; there's also operations; there's clinical trials and how those are run, how patients are found. There are so many things we can do. And then a third, really important, I wouldn't say change, but something that just continues to mature and that we think about, and I think it's helped accelerate data
science in healthcare, is just thinking about the ethics of what we're doing, considering it's impacting people and the care they receive. It can be life or death, or it can either help or hurt the disparities we're seeing in care. So really thinking about ethics, which is important in healthcare, and then having tools and ways to address that at scale, has really evolved over the past few years.

That's really great, and I'm excited to unpack these with you even more. You mentioned at the beginning some of the areas of impact that data science and machine learning have in healthcare. Do you mind expanding on these main areas of value where you've seen data science and machine learning push the envelope forward within the healthcare space?

It's hard to pick a few, but one I love to talk about, and this is something my former team did, and I loved the way they approached this, and I saw it impact patients, was looking at operations. Sometimes in healthcare we go at the, we're going to cure this disease, we're going to diagnose this disease. And of course, how do we not say we're going to put every data science tool we have towards cancer? And we should. But a safer way in, a way in that makes a huge impact, can be the operations of healthcare itself, or the operations of a clinical trial. I'll give you an example. When I was at Highmark Health, we built a tool to help schedule patients receiving chemotherapy. A big thing for me is to start with the problem. We heard, hey, we're scheduling patients for chemotherapy, and they have long wait times, which seemed not great. We noticed we're really busy in the mornings, and then things are empty in the afternoon, so our clinicians are either overwhelmed or don't have a lot of patients. We dug in, and there were two things: they didn't know how long a treatment could take, and there could be side effects. Clinicians want to care for their patients and make sure they have plenty of time, so they're blind to how long each patient
might need to stay there in that location. If we're able to predict that, we can start efficiently scheduling, and then just optimizing the scheduling, optimizing the operations: where in the calendar can this go, where location-wise can this go? We had this tool ready when the pandemic started, and it became even more important to space vulnerable patients out. It started with an operational challenge, though: scheduling, a very practical thing to solve, and it made a huge difference. I've heard stories from patients saying, hey, I can get on and back to my life and not wait, I can come at times convenient to me.

Another area where I've seen an impact and a lot of promise is diagnosis or detection: early diagnosis, early detection, to give clinicians some time to intervene. We've heard about this in things like sepsis or acute diseases. We're talking about early detection of things like pulmonary hypertension, which is frequently diagnosed late, and I know that's something we're doing now. These are big, big areas of opportunity where we can treat patients because we can detect these diseases and diagnose them.

And then the third is the patient's own experience. The operational component of course had a patient experience piece, but it's just understanding patients, their journeys, where they're facing challenges, how they're experiencing the healthcare system, and where we're maybe not delivering care in the way we should. Data can help us see that and help us deliver a better experience, a more personalized, tailored experience, on a biological level as well as just an individual level: preferences, ways of interacting, and ways of receiving care.

I love how you frame the operations component here, because whenever we talk about data science and machine learning in healthcare, we always talk about aspirational use cases that I think we're all in agreement are extremely important. For example, I'm very excited to see the impacts of DeepMind's AlphaFold in drug
discovery, but that doesn't mean we cannot create impact on people's lives right now with data science, just by solving operational challenges. When talking about data science in healthcare, we often talk about challenges unique to the healthcare space, such as access to relevant interoperable data, ethics of AI, and a host of other challenges. I'd love it if you can break down what are the main data challenges you think the healthcare industry is facing today.

I talk to my colleagues across industries, everything, manufacturing, automotive, just very different industries, and no one tells me, our data is perfect, clean, haven't really had a problem there or thought about it. Of course, you're not surprised to hear this. In healthcare we face that as well, with interoperability and different formats of data. We're facing the same things, but I think we're realizing that (a) other industries face this and (b) there are solutions that will work here as well. The whole topic of the ethics of AI is a huge one here, and really, really important. This becomes crucial in healthcare. I'm not saying, if you're selling a consumer good, of course you don't want to make a mistake, but if I get a recommendation to buy a toaster oven and I just bought a toaster oven, so I'm probably not going to buy a second one, and this just happened to me, it's not a big deal. It didn't really affect my life. You can experiment with those algorithms, get them out there, and get them out there quickly. In healthcare we've obviously had to think, and other industries face this as well: there's risk, so you have to really think through what you're doing, what could happen, how this algorithm is going to work, how you're going to build this process, and get it right. That's not to say there aren't things we can do. There's a lot, because there are a lot of problems and things we're not doing really well today, so as long as we're not making it worse, we should try some things. But that's
always going to be a pretty big challenge, and an important challenge that we should take on. Relative to other industries, just talking about the data, obviously the sensitivity of the data itself makes it maybe a little harder to get access to data, or to think about how to use it, share it, what kinds of environments that data can be in. And it should be. I mean, that's a challenge we should take on as a good challenge, and one where we say we're never good enough, because this is the most sensitive data in people's lives. We should be continuously improving and thinking about how we protect this data, how we use it, how we make sure we're using it in a way that decreases inequalities in how we deliver care, which I think it can, but we have to use the data responsibly and consider that it is very, very sensitive data. Maybe more so than if there's a leak that I bought a toaster oven: not that exciting. I bought a coffee maker: not that exciting. But this is a pretty big one.

I completely agree here, and let's spark the chat a bit and talk about the ethics of AI in healthcare. When we talk about using machine learning and AI in healthcare, there's this aversion that whatever we develop will end up creating harmful outcomes, or that it could be used irresponsibly, and oftentimes the response is not to leverage machine learning and AI. So I'd love to understand how you evaluate the risk of harmful outcomes of machine learning and AI in healthcare, and how you go about minimizing it.

Well, a great question. One big thing: to understand the potential harmful outcomes, you have to understand the problem that you're solving and be working collaboratively with a cross-functional team, with clinicians, with whoever is using and implementing and acting on your model, with patients. You have to have everyone in the room and involved in this process, and understand that end to end, because that's the only way you're going to find where the risks might lie. You have to understand how
they're going to use this information and make a decision. What mitigations can you build in? Where are the risks at every point in this system? And that is sometimes something that data scientists, especially when they get started, skip over unintentionally, because they're excited to build models. When I read about, you know, resumes from the HR world: the algorithm is going to learn what you feed it, and historically, data reflects our human biases. So the algorithm, if you don't think about it and you don't account for that, is going to learn to do exactly what people have done, which is not really necessarily ethical. But with data and with an algorithm, we have an ability to fix that and to control that a bit more than we do in people. I always think about the end-to-end, how the decision is being made. It can't just be about the algorithm. Another part, it sounds kind of simple, but empathy and the human-centered design thinking approach is very valuable for data science, because you start putting yourself in the shoes of the person who's affected by this, the patient, all of the things they may be facing, and all of the things that may happen based on the algorithm. So you've got to really think about it from that angle. And then it's of course the technology, the data itself: what biases are there, the algorithms you're choosing, the ways you can mitigate and correct it, and can you? That's a technical expertise a data scientist has to have, and it's essential now, especially in healthcare, but everywhere we want to think about that. The other obvious one is really going way back and saying, did we pick the right use case? Like the operations example: there are a lot of problems to solve in healthcare. We should be thinking about all of them, but maybe the easier quick wins are ones where there's a little less opportunity for harm. If, maybe, we're communicating with everyone in the same way today, and maybe if we try to figure
out some preferences and try to customize a bit and learn from there, that may be lower risk than detecting a disease or changing the course of care. And in medicine and healthcare, this doesn't replace a clinician. We want this to enhance the clinician's decision making.

That's awesome, and I love how you draw inspiration from other fields like human-centered design. Given that, do you think healthcare can also draw from risk management and risk analysis to create AI governance frameworks?

I think that is a great question, and absolutely. There is no industry we can't learn from. We have to be looking outside of healthcare all the time, and looking across healthcare to different parts of healthcare, but definitely looking outside. That's why I've very intentionally hired people from other industries on my teams. I've wanted people from manufacturing, and it has worked. They've come in and looked at things and said, this is not an easy, but a pretty easy problem to solve; we deal with this all the time. As someone whose background is mainly in healthcare, I would think, certainly with movement of chemotherapy drugs around to different locations, that's a pretty big challenge. But I knew that other industries had solved it, so I looked to people from those industries to come in and bring some of that thinking to healthcare. Risk management, of course, is something we do. We have risk mitigation plans for everything we do, and we think through everything early. Like every industry, we need to be looking outside all the time in healthcare.

When thinking about some of the other obstacles that are unique to healthcare, such as data access, interoperability, and collection, what needs to change so that data science and healthcare innovation accelerates here? Is it regulatory innovation, industry standards that need to evolve?

The regulatory component is there; it's important. There's collaborative work and discussions going on across healthcare to make sure the regulatory
environment meets the needs of data science. That's an ongoing process. Another one, though, that maybe applies to every industry, but I see it a lot in healthcare: the systems are very complex. We have different EMR systems, and those have a lot of steps and pieces. Data scientists don't always understand how a clinician interacts with that system, yet that may be the place where their solution is delivered and acted on, where the value is realized. But they're very complicated systems, and to get them all to connect, maybe we want to use multimodal data from multiple sources, imaging, devices, everything, to really get a full picture of the patient at different time scales. To really scale that solution and implement it, we need those systems connected. You can do it once: grab all the data, put it together, build a model. But how do you then deploy that model? Seeing some simplification of these systems, and some consideration of, hey, it's very important to use this data to deploy solutions and to seamlessly connect and simplify things, would be great to see, and I think we're probably going to see that. As I said, it probably exists in other industries as well. The other one is experience with data science: data literacy or AI literacy. Clinicians and hospital administrators don't need to be experts in data science, but I think as we all bring up that level of understanding of how data science works and how some of this stuff can be used, and are able to speak a bit of the same language, that would help. We're seeing that again in every industry, but it's one I think we have a good chance of solving in medicine. A lot of people have a scientific background, and data science has the science, so it should be a good place. I've seen a lot of engaged clinicians, and a lot coming in with a lot of knowledge of experimental design, and that's moving along, but we could be better there and we need to keep pushing.

And that data literacy component is huge from a data quality
perspective, because a lot of healthcare professionals are the ones who are inputting this data into these systems, and if they do not recognize the role the data plays in the value chain of data science, then that value chain will end up breaking because no one is paying attention to the data quality, right?

That's a great point, and actually that data literacy is going both ways. It's a business literacy on the data scientist's part, of understanding how a clinician is inputting data and how they're interacting with an EMR system, or how, on the insurance side, maybe a care manager is identifying and reaching out to members of an insurance plan to help them coordinate their care and manage a chronic disease. We have to understand how that data comes in. And conversely, if we show the value of data science, the people delivering care and part of that healthcare ecosystem are going to be able to work with us and say, okay, I can see the value of this distinction, as long as we don't take time away from their interactions with patients and make it harder. We don't want to do that.

That's awesome. Given we're discussing the value of data science in healthcare, I'd like to pivot to discuss your experience as a data and AI leader at Johnson & Johnson. I'd love to understand and dig through some of the most exciting use cases you've seen data teams working on, especially in healthcare at Johnson & Johnson, and especially given what must have been a very interesting time for R&D teams with the release of the J&J COVID-19 vaccine.

Yeah, there are three that really come to mind, and one, we all are so deep in it, it's always a great example. This is something I think is an excellent example of using data science to solve a real problem and make an impact. When clinical trials are planned, as you can imagine, they're complex. There's a lot of planning, and you need to decide where to have those trials. In the case of the vaccine, we needed to find
places where COVID was spreading so that we could see whether this worked quickly and get it out to people. What the teams were able to do using data science was predict where these future hotspots would be and plan the clinical trials in those places. It was effective, and it allowed us to accelerate that and be really targeted in where we were doing clinical trials and where we were seeing high levels of COVID. So I think that's just a very great example, and it shows data science can rise to the challenge and really solve big problems under pressure, when it counts. There is really no bigger pressure in recent times than: the whole world's in this pandemic and we need to do something about it with data science. I'm really proud of that.

The other, I think I mentioned the pulmonary hypertension example, is just one example of how we can bring data together and use AI to diagnose a condition earlier. That's something we're doing and working on that's very, very exciting. This is an underdiagnosed disease, or it's not diagnosed early, when we could treat it and make an impact. So if we can bring together diverse data sources and predict that diagnosis, we can really make a difference in people's lives.

And then the third is just generally using data to accelerate what we're doing and how we're doing it at every part of the process. We could talk about that all day, but: using digital data and digital endpoints to better measure outcomes; using real-world data, claims data, EHR data, to really make sure we understand the patients, we understand their needs, we're developing drugs that are going to make a difference, and we're doing it efficiently and quickly. It always strikes me that every day that this is not out there, a patient's not getting this treatment. So I love that we are always focused on how we get medicines to patients faster, because this matters, and we all either have been, know someone, or will be affected by this.

I absolutely love the
COVID-19 use case here, and it's really exemplary of a data science use case that requires relatively simple data science that can provide value now for patients and healthcare providers. So I'd love it if you can unpack that use case even more and maybe discuss the methodology used here.

I think it's a general process that really is important for solving any data science problem, at a high level, and I've done this setup at multiple companies. It starts with identifying a clear problem. In this case it was clearly: we don't know where to plan to have these clinical trials, and it's not something we can spin up in a day. It takes some time, so how could we know earlier? Finding that problem that can be solved with data science is one piece that was crucial here. Then it's collaborating, working together with the business and clinical areas to design and implement that solution in time. Sometimes with data science, if it gets too exploratory or just experimental, we're not thinking about the urgency and the timelines where we need to deliver. Working closely as a core member across the team: to make something like this happen, you have to do that. Those are just two key things that have to happen in any high-impact data science use case, and I think ones that have served well. And then the third, a piece of advice I got very early and I've always used, and that I've seen as a component of successful projects, is really understanding how the solution you're building is going to be used, and making sure the people who are going to use it are involved in the planning and have bought into this. Because if you don't have adoption, you're not going to solve the problem that you wanted to solve.

So I think one thing that's evident is that there are a lot of different data teams at J&J doing different work. It's one challenge to do data science in healthcare, but it's another challenge to work in large matrix organizations where there are tons of stakeholders and
a lot of different teams working on different problems. I'd love to know how you ensure that you're staying effective despite this complexity, and some of the best practices you can share in managing and working with data teams in large matrix organizations with other data leaders.

I think a big one is coming back to the shared mission, vision, what you're trying to do. In a healthcare organization, or any organization, but definitely in healthcare and at Johnson & Johnson, it is very clear: we are getting medicines to patients; we're saving people's lives. At the end of the day, that cuts through the matrix, the complexity of a large company. Sure, it's there, but the culture and the focus on the patient and what we're doing unifies and brings us all together and breaks down those silos. I think at any company, if you find and focus on that, the problem and what you all care about, how everyone's benefiting, it really helps. The other is something I think is just crucial: bring people in early from across your company. It becomes more complex when data science happens in a silo and then you show up with a solution, and different parts of the business are thinking, oh no, we needed to be involved earlier, this is slightly off here, and it can be harder than it needs to be. Which brings me to the good part of a large matrix organization, and why I keep working for them. I love to be at one; I love to be a leader in a large matrix organization. You have incredible resources. You have experts, you have legal teams, you have supply chain; there are so many experts in the area where you're developing solutions. It is a luxury to have. When you're a startup, and I talk to companies, people that have great ideas, they have to work so hard to just get access to, hey, can you just tell me about some of the problems you have or how this works? They don't have all of these resources surrounding them. At a large company you have so much support, and you can never reach out too
much or too early. Think about, "Hey, you know what, I'm struggling a bit with, maybe, how do you think about marketing?" Oh, we have a marketing team. Everybody loves to get involved and they love to help, and at most companies I think you'll find this. So reach out and use those resources that make a large company great, because otherwise you're going to have all the bad parts of a big company and none of the good parts, and why do that?

That's great. And it must be especially rewarding to have access to healthcare subject matter experts across the value chain, because this will help you develop the empathy to create human-centered data science solutions.

Exactly, absolutely. And we have that easily, just a phone call or quick message away. People are happy to talk, and using that is key. Wonderful to have, great to use.

Awesome. So I'm sure these conversations with subject matter experts also influence your roadmap. Given the importance of R&D in the healthcare space, how do you ensure an adequate split between long-term research and short-term wins that can help you move the needle?

Yeah, absolutely. Right now I'm in this R&D environment developing medicines, and it's a long-term view, which is really interesting to see and to have. That said, there are a lot of short pieces and wins along the way to get to that end goal. So if you're working with the clinical teams, as we do (we really work together), or in any company where you're working with the business area, you talk about what that end-to-end is, what the ultimate long-term outcomes are, and then work backwards: what are the short pieces and those quick wins, as you say a lot, to get you there? You get that mix, and then I think it's important to look at the portfolio you have for data science and go through and see how many of these are really going to be years before we see the value. That's something in data science you need to know, because you have to be careful not to let that timeline and that
pace of technology and changes conflict. You've got to think about it early. But yeah, it's looking at how many long-term projects we have, how many short quick wins I have, and then also, it's okay to have purely exploratory work: "I'm going to play around with this data, see if I can develop this model." That's great to have too. It's just looking across the portfolio and making sure that the percentage of work in each of these buckets is where you want it to be and need it to be.

And how do you determine which areas to research in your R&D agenda?

The good thing is, in an R&D organization that happens at such a high level. But to bring it back to one simple concept, it's unmet need: what do patients need? I think it's something that applies everywhere: where is there an unmet need where we can bring data science in? But of course that goes into the planning of what we develop. It's a pharmaceutical R&D organization, so it's a big process; it's the core of the business. Then there's the data science component: how does data science support, accelerate, and enhance that portfolio and that R&D process? And as we mature and talk to each other, data science grows, which is what we're doing at Johnson & Johnson Janssen R&D, which is the pharmaceutical companies of Johnson & Johnson. The data science team and capabilities are just exceptional. Jacqueline, our Chief Data Science Officer, has built a really incredibly advanced capability, and the company is putting a lot of investment into data science in R&D, in commercial, and across the company. It's great to see, and it shows me that we've had the discussion about how this can impact the R&D portfolio and how this can help you meet your goals, and that conversation has been successful. That's why we're able to grow and really use data science.

Now, Curren, as we close out, I'd love to have a look into the future and what you think are the data trends and innovations that you're
particularly looking forward to seeing within healthcare.

One that is very important, and that I'm very excited about, is the concept of fairness. We talked about the risks and reasons people don't want to use AI in healthcare, and this one comes up a lot. It affects any kind of high-stakes industry. But I'm really excited about the capabilities and the thinking that is evolving around fairness: both being able to detect bias and unfair pieces of the algorithm, and then even fix that on the fly, at scale, and make corrections. I think that has the ability to allow us to really use data science, AI, and machine learning in healthcare, and it really brings a ton of value to people, to patients, and makes sure they're getting care that is fair, that we're considering things that maybe we haven't been great at in the past. Maybe this can make medicine a bit better, or any field a bit better. So fairness is a huge one for me.

As for future trends, of course I think we're going to continue to see scale. We're going to continue to see a bit of, I don't want to say a catch-up, but we're in a nice position to leapfrog other industries that have really perfected, or made a lot of the advancement in, embedding AI into every part of their business. We can take the technical learnings and platforms and pieces and start from there in healthcare. I think we're going to see that continue to grow, because as we start making an impact, we're going to need to consider how this becomes a core part of healthcare.

Curren, it was great to have you on the show. Do you have any final call to action before we wrap up?

You know, it is to focus on the impact. I always encourage data scientists and data science leaders to think through how this data science solution is solving a business problem, how it is making an impact, and how it is doing so the right way. So focus on impact, understand the context, be fair, but really go all in and make a difference, because in data science we're ready for
that.

Thanks for being on DataFramed.

Thank you, thanks for having me.

You've been listening to DataFramed, a podcast by DataCamp. Keep connected with us by subscribing to the show in your favorite podcast player. Please give us a rating, leave a comment, and share episodes you love. That helps us keep delivering insights into all things data. Thanks for listening. Until next time.

You're listening to DataFramed, a podcast by DataCamp. In this show, you'll hear all the latest trends and insights in data science. Whether you're just getting started in your data career or you're a data leader looking to scale data-driven decisions in your organization, join us for in-depth discussions with data and analytics leaders at the forefront of the data revolution. Let's dive right in.

Hello everyone, this is Adel, data science educator and evangelist at DataCamp. Two years into the pandemic, the potential for data science and machine learning in healthcare has never been more apparent. Whether it's drug discovery acceleration, operational innovation, virtual assistance, or disease prevention, the margin of opportunity for data science in healthcare is massive. However, that doesn't come without its own set of unique challenges and risks that require unique solutions. This is why I'm excited to have Curren Katz on today's episode of DataFramed. Curren is a Senior Director for Data Science Portfolio Management at Johnson & Johnson. She has decades of experience at the intersection of healthcare and data science and is deeply attuned to the state of data science in healthcare today. Throughout our conversation, we discuss where the landscape of data science in healthcare is today, the unique challenges of applying data science in healthcare, the importance of ethical AI when working on healthcare use cases, how to solve some of the data challenges of the healthcare industry, use cases she's been excited about, how data science was used to tackle COVID-19, and much more. If you enjoyed this podcast, make
sure to rate us, subscribe, and add a comment, but only if you enjoyed it. Now let's dive right in.

Curren, it's great to have you on the show.

Yeah, great to be here. Thank you for having me.

I'm excited to talk to you about data science and machine learning in healthcare, your experience leading data teams in complex organizations, and how you've led R&D at Johnson & Johnson. But before that, I'd love to learn more about your background and what got you into the data space.

Yeah, absolutely. I guess, like most people, I've always loved data. In my first statistics courses I started to think, "Oh, this could be really, really fun," and especially when I started applying it to data I had collected as a research assistant, it was pretty addictive. I'm a cognitive neuroscientist by training, but as I moved along in my career I did fMRI research as well as looking at some large epidemiology data sets, and 20 years ago I wrote a paper on predictors of suicide attempts. It was not exactly an AI/ML approach, but it had that interest in how we can predict some event. I had been in neuroscience studying neural networks, all of these things, and actually applying machine learning techniques to fMRI images, which are images taken while someone's doing something, so it's a fairly complex, although clean, data set. That got me really excited. I've always been passionate about healthcare and solving problems in healthcare, and my first corporate data science job was at Highmark Health. So I started on the payer side, building a bunch of models and seeing how those models impacted care, and I was hooked. Then I moved to the parent company. It's an integrated healthcare system, the second largest integrated payer-provider system in the U.S., and I started a data science department at that parent company, looking at the payer (the insurance) side, the provider side, and a few other diversified healthcare businesses. Then I came to Johnson & Johnson, where I am now. It's been a really exciting career where I get to see a
lot of impact from data science.

To start off our conversation, I'd love to understand the current state of data science and machine learning in healthcare. Early in my career, about five years ago, and that's not too long ago, healthcare was often (and still is) talked about as an industry with a large margin of opportunity for data science, but one that comes with its own unique set of challenges, which makes it slower in comparison to other industries. Given your experience as a data leader in healthcare, I'd love to start by understanding how you would describe the current landscape of data science in healthcare, and how it has evolved in the past few years.

Oh yeah, that's an exciting question. It has evolved, and different parts of healthcare, I'll say, are probably at different places and evolving at different paces, sometimes out of necessity. You say there's a lot of opportunity in healthcare. There is, and I think it's one of those industries where you have to take a bit of a careful approach to anything new. Practically, there are regulations, and there's a lot of risk of something going wrong, but huge benefits. What I've seen over the last few years is really a couple of things that we're seeing in a lot of industries, but in healthcare as well. One is scale. As we're moving into "hey, data science can be very, very useful for solving real problems in healthcare," there's a focus on deploying these models, not just having proofs of concept, but really using them to drive core business decisions and core insights. That requires data science at scale. At first it was a little more experimental, a little more "well, let's just see how this goes alongside what we do today, but we're not going to go all in and really use this to drive our business." But we're moving towards that. The other change, I guess, is the problems that we can solve, or we're just realizing them, right? We're expanding the scope of what data science
can do in healthcare. Of course there's diagnostics, but there's also operations, there's clinical trials and how those are run and how patients are found. There are so many things we can do. And then a third, really important one, I wouldn't say a change, but something that just continues to mature, and I think it's helped accelerate data science in healthcare, is thinking about the ethics of what we're doing. Consider that it's impacting people and the care they receive, and it can be life or death, or it can either help or hurt the disparities we're seeing in care. So thinking about ethics, which is important in healthcare, and then having tools and ways to address that at scale, has really evolved over the past few years.

That's really great, and I'm excited to unpack these with you even more. You mentioned at the beginning some of the areas of impact that data science and machine learning have in healthcare. Do you mind expanding on these main areas of value, where you've seen data science and machine learning push the envelope forward within the healthcare space?

It's hard to pick a few, but one I love to talk about, and this is something my former team did, and I loved the way they approached it and saw it impact patients, is operations. Sometimes in healthcare we go at it as "we're going to cure this disease, we're going to diagnose this disease," and of course, how do we not say we're going to put every data science tool we have towards cancer? And we should. But a safer way in, and a way in that makes a huge impact, can be the operations of healthcare itself, or the operations of a clinical trial. I'll give you an example. When I was at Highmark Health, we built a tool to help schedule patients receiving chemotherapy. A big thing for me is to start with the problem. We heard about it: hey, we're scheduling patients for chemotherapy and they have long wait times, which seemed not great, and we notice we're really busy in the mornings and then things are
empty in the afternoon, so our clinicians are either overwhelmed or don't have a lot of patients. We dug in, and it came down to two things: they didn't know how long a treatment could take, and there could be side effects. Clinicians want to care for their patients and make sure they have plenty of time, so they're blind to how long each patient might need to stay in that location. If we're able to predict that, we can start scheduling efficiently, and then optimize the scheduling and the operations: where in the calendar can this go, where location-wise can this go? We had this tool ready when the pandemic started, and it became even more important to space vulnerable patients out. It started with an operational challenge, though. Scheduling is a very practical thing to solve, and it made a huge difference. I've heard stories from patients saying, "Hey, I can get in and get back to my life and not wait. I can come at times convenient to me."

Another area where I've seen an impact and a lot of promise is diagnosis or detection: early diagnosis, early detection, to give clinicians some time to intervene. We've heard about this in things like sepsis or acute diseases. We're talking about early detection of things like pulmonary hypertension, which is frequently diagnosed late, and I know that's something we're working on now. These are big areas of opportunity where we can treat patients because we can detect these diseases and diagnose them.

And then the third is the patient's own experience. The operational component of course had a patient experience piece, but this is about understanding patients, their journeys, where they're facing challenges, how they're experiencing the healthcare system, and where we're maybe not delivering care in the way we should. Data can help us see that and help us deliver a better experience, a more personalized, tailored experience, on a biological level as well as on an individual level: preferences, ways of interacting, and ways
of receiving care.

I love how you frame the operations component here, because whenever we talk about data science and machine learning in healthcare, we always talk about aspirational use cases that I think we all agree are extremely important. For example, I'm very excited to see the impact of DeepMind's AlphaFold in drug discovery. But that doesn't mean we cannot create impact on people's lives right now with data science, just by solving operational challenges. When talking about data science in healthcare, we often talk about challenges unique to the healthcare space, such as access to relevant, interoperable data, the ethics of AI, and a host of other challenges. I'd love it if you could break down the main data challenges you think the healthcare industry is facing today.

I talk to my colleagues across industries, everything from manufacturing to automotive, just very different industries, and no one tells me, "Our data is perfect and clean, we haven't really had a problem there or thought about it." Of course, you're not surprised to hear this. In healthcare we face that as well: interoperability and different formats of data. We're facing the same things, but I think we're realizing that (a) other industries face this, and (b) there are solutions that will work here as well. Then there's the whole topic of the ethics of AI, which is a huge one here and really, really important. This becomes crucial in healthcare. I'm not saying it doesn't matter if you're selling a consumer good; of course you don't want to make a mistake. But if I get a recommendation to buy a toaster oven and I just bought a toaster oven, so I'm probably not going to buy a second one (and this just happened to me), it's not a big deal. It didn't really affect my life. You can experiment with those algorithms and get them out there quickly. In healthcare we've obviously had to think, and other industries face this as well: there's risk, so you have to really think through what you're doing and what
could happen, how this algorithm is going to work, and how you're going to build this process and get it right. That's not to say there aren't things we can do. There's a lot, because there are a lot of problems and things we're not doing really well today. So as long as we're not making it worse, we should try some things. But that's always going to be a pretty big challenge, and an important challenge that we should take on.

Relative to other industries, just talking about the data, obviously the sensitivity of the data itself makes it maybe a little harder to get access to data, or to think about how to use it, share it, and what kinds of environments that data can be in. And it should be. That's a challenge we should take on as a good challenge, and one where we say we're never good enough, because this is the most sensitive data in people's lives. We should be continuously improving and thinking about how we protect this data, how we use it, and how we make sure we're using it in a way that decreases inequalities in how we deliver care, which I think it can. But we have to use the data responsibly and consider that it is very, very sensitive data. If there's a leak that I bought a toaster oven, that's not that exciting. I bought a coffee maker, not that exciting. But this is a pretty big one.

I completely agree here. Let's spark the chat a bit and talk about the ethics of AI in healthcare. When we talk about using machine learning and AI in healthcare, there's this aversion, a worry that whatever we develop will end up creating harmful outcomes or could be used irresponsibly, and oftentimes the response is not to leverage machine learning and AI at all. So I'd love to understand how you evaluate the risk of harmful outcomes of machine learning and AI in healthcare, and how you go about minimizing it.

Well, a great question. One big thing: to understand the potential harmful outcomes, you have to understand the problem that you're solving and be working collaboratively
with a cross-functional team: with clinicians, with whoever is using, implementing, and acting on your model, and with patients. You have to have everyone in the room and involved in this process, and understand it end to end, because that's the only way you're going to find where the risks might lie. You have to understand how they're going to use this information and make a decision. What mitigations can you build in? Where are the risks at every point in this system? That is sometimes something data scientists, especially when they get started, skip over unintentionally, because they're excited to build models. Think of what you read about resumes in the HR world: the algorithm is going to learn what you feed it, and historical data reflects our human biases. So the algorithm, if you don't think about it and you don't account for that, is going to learn to do exactly what people have done, which is not necessarily ethical. But with data and with an algorithm, we have an ability to fix that and to control that a bit more than we do in people. I always think about the end-to-end, how the decision is being made. It can't just be about the algorithm.

Another part, and it sounds kind of simple, is empathy. The human-centered design thinking approach is very valuable for data science, because you start putting yourself in the shoes of the person who's affected by this, the patient: all of the things they may be facing and all of the things that may happen based on the algorithm. So you've got to really think about it from that angle. And then of course there's the technology: the data itself and what biases are there, the algorithms you're choosing, and the ways you can mitigate and correct for that, if you can. That's a technical expertise a data scientist has to have, and it's essential now, especially in healthcare, but everywhere we want to think about that. The other obvious one is really going way back and asking whether we picked the right use case, and
like the operations example, there are a lot of problems to solve in healthcare. We should be thinking about all of them, but maybe the easier quick wins are ones where there's a little less opportunity for harm. Maybe we're communicating with everyone in the same way today, and if we try to figure out some preferences, customize a bit, and learn from there, that may be lower risk than detecting a disease or changing the course of care. And in medicine and healthcare, this doesn't replace a clinician. We want this to enhance the clinician's decision-making.

That's awesome, and I love how you draw inspiration from other fields like human-centered design. Given that, do you think healthcare can also draw from risk management and risk analysis to create AI governance frameworks?

I think that is a great question, and absolutely. There is no industry we can't learn from. We have to be looking outside of healthcare all the time, and looking across healthcare to different parts of healthcare, but definitely looking outside. That's why I've very intentionally hired people from other industries onto my teams. I've wanted people from manufacturing, and it has worked. They've come in, looked at things, and said, "This is, not an easy, but a pretty easy problem to solve. We deal with this all the time." It's something that someone like me, whose background is mainly in healthcare, would think is hard. Take the movement of chemotherapy drugs around to different locations: I thought of that as a pretty big challenge, but I knew that other industries had solved it, so I looked to people from those industries to come in and bring some of that thinking to healthcare. Risk management, of course, is something we do. We have risk mitigation plans for everything we do, and we think through everything early. Like every industry, we need to be looking outside all the time in healthcare.

When thinking about some of the other obstacles that are unique to healthcare, such as data access,
interoperability, and collection, what needs to change so that data science innovation in healthcare accelerates? Is it regulatory innovation, or industry standards that need to evolve?

The regulatory component is there, and it's important. There's collaborative work and discussion going on across healthcare to make sure the regulatory environment meets the needs of data science. That's an ongoing process. Another one, though, that maybe applies to every industry but that I see a lot in healthcare: the systems are very complex. We have different EMR systems, and those have a lot of steps and pieces. Data scientists don't always understand how a clinician interacts with that system, yet that may be the place where their solution is delivered and acted on, where the value is realized. They're very complicated systems, and it's hard to get them all to connect. Maybe we want to use multimodal data from multiple sources, imaging, devices, everything, to really get a full picture of the patient at different time scales. To really scale that solution and implement it, we need those systems connected. You can do it once: grab all the data, put it together, build a model. But how do you then deploy that model? Seeing some simplification of these systems, and some consideration of "hey, it's very important to use this data, to deploy solutions, and to seamlessly connect and simplify things," would be great, and I think we're probably going to see that. As I said, it probably exists in other industries as well.

The other one is experience with data science: data literacy or AI literacy. Clinicians and hospital administrators don't need to be experts in data science, but I think as we all bring up that level of understanding of how data science works and how some of this stuff can be used, and are able to speak a bit of the same language, that would help. We're seeing that again in every industry, but it's one I think we have a good chance of solving. In medicine, a lot of people have a
scientific background, and data science has the science in it, so it should be a good place. I've seen a lot of engaged clinicians, and a lot coming in with a lot of knowledge of experimental design. That's moving along, but we could be better there, and we need to keep pushing.

And that data literacy component is huge from a data quality perspective, because a lot of healthcare professionals are the ones inputting this data into these systems. If they do not recognize the role the data plays in the value chain of data science, then that value chain will end up breaking, because no one is paying attention to the data quality, right?

That's a great point. And actually, that data literacy then goes both ways. It's a business literacy on the data scientist's part: understanding how a clinician is inputting data and how they're interacting with an EMR system, or how, on the insurance side, maybe a care manager is identifying and reaching out to members of an insurance plan to help them coordinate their care and manage a chronic disease. We have to understand how that data comes in. Conversely, if we show the value of data science, the people delivering care, who are part of that healthcare ecosystem, are going to be able to work with us and say, "Okay, I can see the value of this distinction," as long as we don't take time away from their interactions with patients and make it harder. We don't want to do that.

That's awesome. Given we're discussing the value of data science in healthcare, I'd like to pivot to discuss your experience as a data and AI leader at Johnson & Johnson. I'd love to understand and dig through some of the most exciting use cases you've seen data teams working on, especially in healthcare at Johnson & Johnson, and especially given what must have been a very interesting time for R&D teams with the release of the J&J COVID-19 vaccine.

Yeah, there are three that really come to mind, and one, we all are so deep in it, it's
always a great example. This is something I think is an excellent example of using data science to solve a real problem and make an impact. When clinical trials are planned, as you can imagine, they're complex. There's a lot of planning, and you need to decide where to hold those trials. In the case of the vaccine, we needed to find places where COVID was spreading so that we could see quickly whether it worked and get it out to people. What the teams were able to do, using data science, was predict where these future hotspots would be and plan the clinical trials in those places. It was effective, and it allowed us to accelerate that and be really targeted in where we were doing clinical trials and where we were seeing high levels of COVID. So I think that's just a great example, and it shows data science can rise to the challenge and really solve big problems under pressure, when it counts. There is really no bigger pressure in recent times than "the whole world is in this pandemic and we need to do something about it with data science." I'm really proud of that.

The other, I think I mentioned, is the pulmonary hypertension example: just one example of how we can bring data together and use AI to diagnose a condition earlier. That's something we're doing and working on, and it's very, very exciting. This is an underdiagnosed disease, or one that's not diagnosed early, when we could treat it and make an impact. If we can bring together diverse data sources and predict that diagnosis, we can really make a difference in people's lives.

And then the third is just generally using data to accelerate what we're doing and how we're doing it at every part of the process. We could talk about that all day: using digital data and digital endpoints to better measure outcomes, and using real-world data, claims data, and EHR data to really make sure we understand the patients, we understand their needs, we're developing drugs that are going to make a difference, and we're
doing it efficiently and quickly. It always strikes me that every day this is not out there, a patient's not getting this treatment. So I love that we are always focused on how we get medicines to patients faster, because this matters, and we all either have been, know someone who has been, or will be affected by this.

I absolutely love the COVID-19 use case here. It's really exemplary of a use case that requires relatively simple data science and can provide value now for patients and healthcare providers. So I'd love it if you could unpack that use case even more and maybe discuss the methodology used here.

I think it's a general process that is really important for solving any data science problem. At a high level, and I've set this up at multiple companies, it starts with identifying a clear problem. In this case it was clear: we don't know where to plan to have these clinical trials, and it's not something we can spin up in a day. It takes some time, so how could we know earlier? Finding that problem that can be solved with data science is one piece that was crucial here. Then it's collaborating, working together with the business and clinical areas to design and implement that solution in time. Sometimes in data science, if it gets too exploratory or just experimental, we're not thinking about the urgency and the timelines where we need to deliver. Working closely as a core member of the team is what you have to do to make something like this happen. Those are two key things that have to happen in any high-impact data science use case, and I think ones that have served me well. And then the third, a piece of advice I got very early, that I've always used and have seen as a component of successful projects, is really understanding how the solution you're building is going to be used, and making sure the people who are going to use it are involved in the planning and have bought into it. Because if you don't have adoption, you're
not going to solve the problem that you wanted to solve.

So I think one thing that's evident is that there are a lot of different data teams at J&J doing different work. It's one challenge to do data science in healthcare, but it's another challenge to work in a large matrix organization where there are tons of stakeholders and a lot of different teams working on different problems. I'd love to know how you ensure that you're staying effective despite this complexity, and some of the best practices you can share in managing and working with data teams in large matrix organizations with other data leaders.

I think a big one is coming back to the shared mission and vision, what you're trying to do. In a healthcare organization, or any organization, but definitely in healthcare and at Johnson & Johnson, it is very clear: we are getting medicines to patients, we're saving people's lives. At the end of the day, that cuts through the matrix and the complexity of a large company. Sure, it's there, but the culture and the focus on the patient and what we're doing unifies us, brings us all together, and breaks down those silos. I think at any company, if you find and focus on the problem, what you all care about, and how everyone's benefiting, it really helps.

The other is something I think is just crucial: bring people in early from across your company. It becomes more complex when data science happens in a silo and then you show up with a solution, and different parts of the business are thinking, "Oh no, we needed to be involved earlier, this is slightly off here," and it can be harder than it needs to be. Which brings me to the good part of a large matrix organization, and why I keep working for them. I love to be at one, and I love to be a leader in one. In a large matrix organization you have incredible resources: you have experts, you have legal teams, you have supply chain. There are so many experts in the area where you're developing solutions. It is a luxury to
have. When you're a startup, and I talk to companies and people that have great ideas, they have to work so hard just to get access to, "Hey, can you just tell me about some of the problems you have, or how this works?" They don't have all of these resources surrounding them. At a large company you have so much support, and you can never reach out too much or too early. Think about, "Hey, you know what, I'm struggling a bit with, maybe, how do you think about marketing?" Oh, we have a marketing team. Everybody loves to get involved and they love to help, and at most companies I think you'll find this. So reach out and use those resources that make a large company great, because otherwise you're going to have all the bad parts of a big company and none of the good parts, and why do that?

That's great. And it must be especially rewarding to have access to healthcare subject matter experts across the value chain, because this will help you develop the empathy to create human-centered data science solutions.

Exactly, absolutely. And we have that easily, just a phone call or a quick message away. People are happy to talk, and using that is key. Yes, wonderful to have, great to use.

Awesome. So I'm sure these conversations with subject matter experts also influence your roadmap. Given the importance of R&D in the healthcare space, how do you ensure an adequate split between long-term research and short-term wins that can help you move the needle?

Yeah, absolutely. Right now I'm in this R&D environment developing medicines, and it's a long-term view, which is really interesting to see and to have. That said, there are a lot of short pieces and wins along the way to get to that end goal. So if you're working with the clinical teams, as we do (we really work together), or in any company you're working with the business area, talk about what that end-to-end is, what the ultimate long-term outcomes are, and then work backwards: what are the short pieces and those quick wins, as you say a
lot, to get you there? You get that mix, and then I think it's important to look at the portfolio you have for data science and go through and see how many of these are really going to be years before we see the value. That's something in data science you need to know, because you have to be careful not to let that timeline and the pace of technology and change conflict. You've got to think about it early. But yeah, it's looking at how many long-term projects we have, how many short quick wins do I have. And then it's also okay to have purely exploratory work: "I'm going to play around with this data, see if I can develop this model." That's great to have too. It's just looking across the portfolio and making sure that the percentage of work that's in each of these buckets is where you want it to be and need it to be.

And how do you determine which areas to research in your R&D agenda?

The good thing is, in an R&D organization that happens at such a high level. But to bring it back to one simple concept, it's unmet need: what do patients need? I think it's something that applies everywhere: where is there an unmet need where we can bring data science in? But of course that goes into the planning of what we develop, and in a pharmaceutical R&D organization it's a big process. It's the core of the business. And then there's the data science component: how does data science support, accelerate, and enhance that portfolio and that R&D process?

As we mature and talk to each other, data science grows, which is what we're doing at Johnson & Johnson Janssen R&D, which is the pharmaceutical companies of Johnson & Johnson. The data science team and capabilities are just exceptional. Jacqueline, our Chief Data Science Officer, has built a really incredibly advanced capability, and the company is putting a lot of investment into data science in R&D, in commercial, and across the company. It's great to see, and that shows me that there it is, right? We've had the discussion
about how this can impact the R&D portfolio, how this can help you meet your goals. We've had that conversation, the conversation's been successful, and that's why we're able to grow and really use data science.

Now, Curren, as we close out, I'd love to have a look into the future and at what you think are the data trends and innovations that you're particularly looking forward to seeing within healthcare.

One that is very important, and that I'm very excited about, is the concept of fairness. We talked about the risks and the reasons people don't want to use AI in healthcare, and this one comes up a lot. It affects any kind of high-stakes industry. But I'm really excited about the capabilities and the thinking that are evolving around fairness: both being able to detect bias and unfair pieces of the algorithm, and then even fix that on the fly, at scale, and make corrections. I think that has the ability to allow us to really use data science, AI, and machine learning in healthcare in a way that really brings a ton of value to people, to patients, and makes sure they're getting care that is fair, that we're considering things that maybe we haven't been great at in the past. Maybe this can make medicine a bit better, or any field a bit better. So fairness is a huge one for me.

Future trends: of course, I think we're going to continue to see scale. We're going to continue to see a bit of, I don't want to say a catch-up, but we're in a nice position to leapfrog other industries that have really perfected, or made a lot of the advancement in, embedding AI into every part of their business. We can take the technical learnings and platforms and pieces and start from there in healthcare. I think we're going to see that continue to grow, because as we start making an impact, we're going to need to consider how this becomes a core part of healthcare.

Curren, it was great to have you on the show. Do you have any final call to action before we wrap up?

You know, it is to focus on the
impact. I just always encourage data scientists and data science leaders to think through: how is this data science solution solving a business problem, how is it making an impact, and how is it doing so the right way? So focus on impact, understand the context, be fair, but really go all in and make a difference, because data science, we're ready for that.

Thanks for being on DataFramed.

No, thank you. Thanks for having me.

You've been listening to DataFramed, a podcast by DataCamp. Keep connected with us by subscribing to the show in your favorite podcast player. Please give us a rating, leave a comment, and share episodes you love. That helps us keep delivering insights into all things data. Thanks for listening. Until next time.