Machine Learning for Equitable Healthcare Outcomes with Irene Chen - #479

The Importance of Open Medical Data Sets in Advancing Clinical Machine Learning

As I reflect on my journey as a PhD student, I am reminded of the vast resources available to us in the field of clinical machine learning. One of the most significant is the availability of open medical datasets, which have been meticulously curated and made accessible to researchers, clinicians, and students alike. Beth Israel Deaconess Medical Center has played a crucial role in providing access to such data: the MIMIC dataset, built from its de-identified clinical records, allows researchers to analyze and learn from real-world clinical data.

The addition of emergency department data to this dataset has opened up new avenues for research and collaboration. The data is made available through a light credentialing process that verifies that users are legitimate researchers. With it, researchers and students can build predictive models that, for example, identify patients at risk of dying in the ICU based on the first 48 hours of clinical notes. This is a critical application of machine learning in healthcare, and I am thrilled to be part of a community that is working towards advancing our understanding of patient outcomes.
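As a concrete illustration of the kind of model students build on this data, here is a minimal sketch of a 48-hour mortality classifier over clinical notes. The file and column names (`icu_stays.csv`, `notes_first_48h`, `died_in_hospital`) are hypothetical placeholders rather than the actual MIMIC schema, which requires credentialed access:

```python
# Minimal sketch of a 48-hour ICU mortality model trained on clinical notes.
# The file and column names below are hypothetical placeholders; the real
# MIMIC tables are more involved and require credentialed access.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("icu_stays.csv")  # hypothetical extract: one row per ICU stay
X_train, X_test, y_train, y_test = train_test_split(
    df["notes_first_48h"], df["died_in_hospital"], test_size=0.2, random_state=0
)

vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)

probs = clf.predict_proba(vectorizer.transform(X_test))[:, 1]
print(f"held-out AUROC: {roc_auc_score(y_test, probs):.3f}")
```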

One of the most significant benefits of open medical data sets is their collaborative nature. As a PhD student, I have had the opportunity to engage with researchers from diverse backgrounds and disciplines. From clinicians to ethicists, machine learning experts to anthropologists, this ecosystem of collaborators has been invaluable in shaping my research and ideas. We share our work, we discuss our findings, and we learn from each other's perspectives. This collaborative approach is essential for advancing the field of clinical machine learning.

The experience of working with open medical data sets has also taught me the importance of community engagement. When I started my PhD, I was not aware of the vast resources available to us in the field of machine learning and healthcare. However, through my interactions with colleagues, peers, and mentors, I have come to understand that this field is not just about individual brilliance but about collective effort. We learn from each other's work, we build on each other's ideas, and we push the boundaries of what is possible.

The impact of open medical data sets cannot be overstated. They have revolutionized the way we approach clinical machine learning, enabling us to develop more accurate predictive models and improve patient outcomes. These datasets are not just a resource; they are a gateway to new discoveries, new collaborations, and new frontiers in healthcare. As I look to the future, I am excited about the prospect of contributing to this ecosystem of researchers, clinicians, and collaborators who are working together to advance our understanding of human health.

In fact, I have recently discovered a treasure trove of open medical datasets available online. Andrew Beam's webpage is an excellent starting point for anyone interested in accessing them. From mammography to colon cancer to PCOS, there are numerous datasets available that can inform research and improve healthcare outcomes. My advice to those who are new to this field is to dive right in, explore the datasets, and get hands-on experience with data cleaning and analysis.

As I near the end of my PhD, I am reminded of the power of community and collaboration in advancing our understanding of human health. This field rewards collective effort over individual brilliance, and I am excited to remain part of the ecosystem of researchers, clinicians, and collaborators working together toward that goal.

In recent years, the field of machine learning and healthcare has undergone a significant transformation. What was once considered a niche area has grown into a thriving community with numerous conferences, workshops, and research institutions dedicated to advancing our understanding of human health through data-driven approaches. Machine learning for health, for example, began as a tiny workshop at NeurIPS and is now among its largest, while fairness research has grown into its own set of conferences.

I have had the privilege of being part of this journey, from those humble workshop beginnings to today's full-fledged research community. It has been an incredible experience, and I am grateful for the opportunities I have had to engage with colleagues, peers, and mentors who have shaped my research and ideas.

As I reflect on my time at MIT, I am reminded of the importance of collaboration in advancing our understanding of human health. We are not alone in this journey; we are part of a larger community of researchers, clinicians, and collaborators who are working together to advance our knowledge and improve healthcare outcomes. This collaborative approach is essential for pushing the boundaries of what is possible and creating meaningful change in the world.

In conclusion, open medical datasets are not just a resource; they are a gateway to new discoveries, new collaborations, and new frontiers in healthcare, and the community that has grown around them is what makes progress possible.

"WEBVTTKind: captionsLanguage: enall right everyone i am here with irene chen irene is a phd student at mit irene welcome to the podcast uh thank you so much i'm a huge fan of the podcast so i'm absolutely thrilled to be here today awesome awesome i'm really looking forward to digging into our conversation as a listener you know that we're going to start with a little bit about your background how did you come to work at the intersection of machine learning and healthcare uh it's a great question so my training is in applied math i did my undergrad at harvard and the applied math program there is essentially for people who can't pick just one field so technically my my training is applied math with an application in computer science and economics which just meant that i got to take uh different classes and a bunch of different fields and get a feel for you know how quantitative sciences can be applied pretty broadly um after i graduated i went to dropbox and i worked for two years and i think the combination of sort of the research focus i took in undergrad and then seeing algorithms at scale seeing how technology develops seeing how companies make pretty big decisions about when to deploy something and when not to deploy something made me realize that i really wanted to study how machine learning how ai can influence uh sort of the toughest problems things we hadn't even decided we haven't even figured out yet so i went back to school i got now i'm in my phd at mit focusing on healthcare which is a new part of my training but luckily mit has tremendous amount of classes and i've been able to take classes at harvard medical school and really dig into how to bridge the gap between machine learning and these questions of healthcare and is the degree program that you're in uh now still applied math or computer science or is it a what a healthcare degree whatever that might be oh so it's uh it's a eecs electrical engineering computer science and then there's a great certificate program through hst so that's harvard science technologies that allows you to get sort of a like a four class certificate um through basically harvard medical school where you can take i've taken pathology physiology and currently i'm doing a well remote clinical preceptorship where you get to hear from all these clinicians about sort of the questions that keep them up at night so it's a really good bridge between the computer science the math the technical stuff that i love and i'm very familiar with and then the medical field which is a new area for me and deciding how to best bridge that gap and make sure that collaborators and anyone we work with is engaged and are really wanting to work with us nice nice so what are the questions that keep uh practitioners up at night there are a lot of questions certainly i think from the machine learning side um machine learning on working on healthcare data is like everything you know about machine learning but then you add a question mark to it so the very classic machine learning scenario maybe the one of the first problem sets you do is you get a bunch of pictures of cats and dogs and you want to classify them this picture is either cat or a dog when you think about health care it's not so easy to say oh this person has diabetes or doesn't have diabetes and now we want to classify who has diabetes or not it turns out you know whether or not someone has diabetes that label can be wrong right maybe that person hasn't been diagnosed with diabetes yet maybe that person doesn't 
like doctors and hasn't had contact that much yeah so all of a sudden the labels are in question and then also the picture you get you know you don't get a clear picture of a cat you don't get a clear picture of a dog instead you get all of the longitudinal visits that person has ever had with one healthcare system but oh wait they moved so actually we don't know what happened to them or we have people who have lots of tests lots of information from them but we are not sure why there's huge fluctuations of what's going on maybe there are medications that make them have different readings than we would expect so the data we get is really noisy and confounded by a bunch of different things and the labels we have are also confounded by different things and then also you know to make everything even worse healthcare is a very high-stakes field so if you misclassify a cat or a dog maybe you insult someone's beloved pet but if you misclassify someone getting diabetes that could be really detrimental later on as they you know contemplate treatment plans and figure out how to manage a chronic disease so the uncertainties in in healthcare coupled with the stakes make it uh really confusing for practitioners to try to figure out that's exactly right sam and then i think on a technical level you know we think of our data as maybe a huge matrix where each row is a patient and then all the columns are you know for one visit all the data we've collected or the next visit what's going on and already you can start to imagine some things might be wonky one of which is that not everyone has all of the data measured not everyone comes in once a year or once every six months and gets the full slate of blood tests and everything so all of a sudden you have very sparse data data where you know most people have zeros for all of these fields and then also maybe you don't have a ton of data for all the talk about big data in healthcare a lot of times when you dice it down to one chronic disease there may be only a few thousand people who fit your inclusion criteria so all of a sudden the data we're working with is like pretty dicey uh pretty sparse and pretty hard to do you know classic machine learning on so a lot of the work that i do is developing machine learning methods that can handle these sort of more longitudinal long-term analyses and figure out how we make these best set predictions for stratification algorithms that clinicians are interested in
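A toy sketch of the sparse patient-by-feature matrix described above, with synthetic data standing in for an EHR extract (the sizes and density are made up for illustration):

```python
# Toy illustration of the sparse patient-by-feature matrix described above:
# most tests are never ordered for most patients, so most entries are empty.
# Sizes and density are illustrative, not drawn from any real EHR.
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)
n_patients, n_features = 5_000, 800               # e.g., labs x visits, flattened
n_observed = int(n_patients * n_features * 0.02)  # ~2% of cells actually measured

rows = rng.integers(0, n_patients, n_observed)
cols = rng.integers(0, n_features, n_observed)
vals = rng.normal(size=n_observed)
X = csr_matrix((vals, (rows, cols)), shape=(n_patients, n_features))

print(f"{X.nnz / (n_patients * n_features):.1%} of entries observed")
# Caveat: a structural zero here means "not measured", not "measured as zero";
# conflating the two is one of the pitfalls described in this conversation.
```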
yeah it sounds like from the examples that you're giving your focus is more on like population health and and healthcare delivery from a systems perspective as opposed to um you know a particular diagnostic approach medical images or or something like that um maybe let's take a step back and have you talk broadly about your research and you know what are some of the problems that that you're thinking about yeah uh so my research focuses on uh developing new machine learning methods specifically for healthcare and then through the lens of questions of equity and inclusion so a great example of this would be you know you're working in a hospital and you want to build a really good risk stratification algorithm so you mentioned that you know right now there's really a lot of work in scoped acute tasks that you could deploy at a hospital for example when someone comes into the icu if we could predict the patient mortality during the hospital stay that would be really beneficial to clinicians because they can allocate resources they can you know talk to the patients they can develop a treatment plan the problem is that when you develop a healthcare algorithm like this there are questions about fairness and bias that might arise because the data that we're training on may have systemic health disparities baked in so all of a sudden and this is something i found fairly early in my phd you might develop a you know a supervised learning algorithm that tries to predict based on the first 48 hours of a patient's stay who's going to live or die in the hospital during the rest of their hospital stay and what i found is that the algorithm that i developed was less accurate for some racial groups than for others and this is not great um this is problematic for a bunch of different reasons not least of which it makes the engineer the person who developed the model the practitioner the person who developed the algorithm very confused and they don't know why what's going on so a lot of my research looks at how can we think about these machine learning for healthcare algorithms from the risk stratification sort of the very scoped acute tasks all the way to thinking about like longitudinal chronic disease work and thinking about how can we ask questions about how these algorithms are affecting all populations and how can we design new models that work across the entire patient population not just the people that maybe are overrepresented in the data or have already benefited from the health care system and maybe we want to focus this on people for whom there have already been health disparities enacted in the case of the model that you described what was happening that caused your model to um have such disparate results based on the ethnicity of the patient yeah so for about a year of my phd it became the mystery of what's you know what's going on with this model certainly i wasn't there sprinkling in bias being like ah my my turn to make this evil algorithm and if you talk to a bunch of different people a lot of people especially when they think about questions of fairness have different hypotheses so some people thought oh it's because certain racial groups are smaller compared to other racial groups and therefore there's not enough data we need to go out and collect more data for say the asian population which had two percent um in this data set compared to the white population which was 70 percent that would explain why the asian population is having higher errors or other people would say actually it's because the data we collect is just noisier for some groups some groups might have say historical mistrust of the healthcare system maybe we don't collect as many measurements as often and therefore we can't get any better they're sort of already baked in issues that we as algorithm makers aren't to blame for and therefore we should just sort of say like actually this is what it is and we can't blame this person as it turns out um this algorithm was a little bit of both of those two so one of them is that the data set could have been bigger the data set for certain populations we're not measuring in the same way this patient population was too small for some groups that explained some of it and there are tools that we produced to be able to estimate what was going on there and the other half is that actually there are some groups and there are some conditions that we don't know as much or we're not collecting the right information or we're not able to differentiate for those patients who lives and dies in the hospital based on the information we collect and that's sort of a like i wish sam that i had a very neat answer for you but the truth is that we only right now have a set of tools to be able to cross out hypotheses suggest new ones and then go back to the clinical collaborators and say like okay we're thinking of deploying this tool here are all the caveats do you think this would still be useful or not let's discuss
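One common first step in this kind of debugging is a per-group performance audit. A minimal sketch, assuming a held-out predictions table with hypothetical columns `y_true` (0/1 outcome), `y_prob` (model score), and `group` (e.g., self-reported race):

```python
# A minimal per-group audit in the spirit of the analysis described above.
# Column names are hypothetical placeholders for a held-out predictions table.
import pandas as pd
from sklearn.metrics import roc_auc_score

def audit_by_group(preds: pd.DataFrame) -> pd.DataFrame:
    rows = []
    for group, sub in preds.groupby("group"):
        rows.append({
            "group": group,
            "n": len(sub),
            "prevalence": sub["y_true"].mean(),
            # AUROC is undefined when a group's outcomes are all one class
            "auroc": roc_auc_score(sub["y_true"], sub["y_prob"])
                     if sub["y_true"].nunique() == 2 else float("nan"),
        })
    return pd.DataFrame(rows).sort_values("auroc")
```

Gaps surfaced this way are a starting point for the hypothesis-testing Chen describes (sample size vs. measurement noise vs. model choice), not a verdict on their own.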
you mentioned earlier uh i believe you said risk stratification can you elaborate on on that and what you mean is this stratification in the sense of you know we might stratify a data set to address imbalances and class imbalances in the data set or is it something else another use of the term risk stratification here applies to the idea that you um if you have some sort of adverse event say patient mortality you can stratify patients by their risks so you can predict essentially we're talking about essentially a binary classification problem but you can have the probabilities for example and say if one person is 98 percent likely to die versus two percent likely to die um maybe we should be allocating resources or um focusing more attention on the higher risk patient versus the lower risk patient already you might think well are these scores calibrated meaning does 98 really mean 98 or are things being a little weird here and that's also an important question especially when we're not just looking at classification we're not just looking at zeros and ones we're looking at probabilities as well and being able to see if we're ranking the patients inappropriately and if 98 is correct or maybe it should have been closer to 70
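A minimal sketch of the calibration check being described, using synthetic scores with miscalibration simulated on purpose:

```python
# Sketch of a calibration check: does a predicted risk of 0.98 correspond to
# ~98% observed outcomes? Scores here are synthetic and deliberately skewed.
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(0)
y_prob = rng.uniform(0, 1, 10_000)              # model's predicted risks
y_true = rng.uniform(0, 1, 10_000) < y_prob**2  # true risk lower than predicted

obs, pred = calibration_curve(y_true, y_prob, n_bins=10)
for p, o in zip(pred, obs):
    flag = "" if abs(p - o) < 0.05 else "  <-- miscalibrated"
    print(f"predicted {p:.2f}  observed {o:.2f}{flag}")
# Systematic gaps like these are what recalibration methods (Platt scaling,
# isotonic regression) are meant to repair before a score is used clinically.
```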
right right and then even if you've got uh correct numbers good numbers you've got this totally different question of where do you draw the lines which may or may not be a machine learning problem right yeah and i think that's maybe the thing that i have learned the most in the phd is that yes machine learning is fantastic and also terrible and also you know has all of these complexities but in fact it's sort of one piece that fits into this entire clinical care pipeline and then ultimately you need to figure out how this risk stratification tool for example would be used by clinicians is it a score that they sort of just look at on the patient um record and sort of assess what's going on there is it a like alarm that goes off when the score goes above something and everyone drops everything and runs over is it something that we're just sort of passively putting in the background and maybe if they want to check they can look it up and if they don't if for example if they're unsure about patients then they would check the score versus they don't and then they sort of proceed on their normal clinical delivery anyway i think understanding where in the healthcare system a score like this would be used can better help us understand you know what kind of machine learning methods we should be developing in the first place so is the focus of your research primarily on developing tools to help uh either data scientists in this field or clinicians or is it more broadly kind of understanding the you know the nature of these disparities and how they're introduced in the system or you know something you know yet altogether different or some combination i like to think of i like to think of the machine learning pipeline from like all the way to the beginning where you're like collect you know you figure out what do you want to study in the first place and then you're collecting the data and then maybe you're you know deciding what kind of prediction task what are the x's and y's going to be based on the data you've collected and then we have the algorithm development which we of course spent a lot of time on and then we have deployment and then monitoring what's going on with the machine learning algorithm in the clinical context and my sort of thesis is that we can think about each step and think about questions about ethics and equity and inclusion at each possible step and so that's sort of what i've been focusing my research on which is that everything we do everything we touch in machine learning for healthcare should be thought about how we can make sure we include the entire patient population so the example about risk stratification and then let's say auditing an algorithm for bias that comes at the very end like you've basically almost got an algorithm going you're a step away from pushing the button and having it in a hospital uh-oh what's going on there how could we debug what's gone wrong fairness-wise i have other work that sort of moves up the development uh moves up the pipeline and saying when you're developing the algorithm what kinds of considerations can we take into account maybe different people have different access to health care and we should sort of build that into the model so that we're not accidentally wrong for people that don't come into the hospital as much and therefore we're not effectively penalizing them and then i have also worked sort of at the very very beginning thinking about what problems are we even solving with machine learning so i recently started a project looking at domestic violence predicting which patients are high risk going back to risk stratification uh for being victims of domestic violence and thinking about this is a task that we don't really give much attention to clinicians are not really trained to assess this and and it's it's a tough problem because we don't even know what the labels are going back to the questions about cats and dogs you know who is a victim who is not or a survivor as some people like to call them uh and and then how do we figure out the right clinical data set to even be able to collect for that and how do we bring that through the entire pipeline so i would say my uh i i love my research and i love that it can focus on different parts of this pipeline hoping in the end that as we build out machine learning for healthcare as we start to get more into hospitals or better understanding how diseases progress or anything you know anything in between that we're able to better think about all of the patient population and who we might be forgetting along the way
i'd love to hear more about this domestic violence project that's kind of at the beginning of this life cycle what's your ultimate goal there we're we're building an early detection program for um so intimate partner violence is a subset of domestic violence and that's what we're focusing on so this is violence between uh present and past uh intimate partners and our goal is to be able to build assess evaluate and then eventually deploy a detection algorithm at the clinical level so in the emergency department or at the radiologist's level if someone comes in multiple times um or if they have a series of markers for example um you know an ulna fracture exactly um so if they have a series of markers or biomarkers that are sort of high risk there would be a flag or some sort of alert that would go off that would allow the clinician or any kind of healthcare practitioner to start a conversation and currently the state is that intimate partner violence is a pretty widespread and urgent concern and it's not clear what there is to do a lot of survivors are reluctant to come forward because of stigma about the situation or they don't have the resources or they distrust healthcare professionals there's sort of a lot of factors that come into play and the idea is that if we are able to do early detection then we can help sort of speed up this process where eventually a healthcare professional could broach the subject provide resources or at least monitor what's going on this is work that's done in collaboration with some fantastic colleagues at brigham and women's hospital in downtown boston and we've validated a healthcare algorithm a machine learning algorithm and now we're sort of working on enlarging our data set to even more patients and validating what's going on there before hopefully deploying it at somewhat large scale at least in one hospital to start with
because it tackles the glass half full it's the pitch this is what machine learning was promised for us but also as we deploy that we're also seriously concerned about maybe um you know different socioeconomic statuses people might manifest differently and we might be omitting different people or if we're training what you know how are we determining a label for intimate partner violence are we saying people who come forward and say i need help i want help if we use those labels then we actually might be completely missing a whole other population that are not coming forward and are not being seen by healthcare professionals in the same context so thinking that through that carefully is is of ut most important is uh and is a topic for ongoing work right now to be honest and you know this may be part of this ongoing work but when you think about how do you think through in an example like this case the you know the implications of the prediction itself right um you know if someone comes in for treatment and uh they get a positive um prediction here you know that they their injury was potentially associated with some kind of active domestic about domestic violence that you know potentially sets off some chain of events that impacts their life um you know how do you uh from a research perspective like how do you unwind all of that is that in the scope of of your work i think i it's not i did not imagine it would be in the scope of my work but i actually think it has to be in the scope of everyone's work which is that no machine learning exists in a vacuum and so you know the math the mathy part of me would say something like oh well then the loss function should be weighted towards uh making sure that we have high you know no false positives and only false negatives but then also the public health part of me wants to say well false negatives aren't so good either right you want to make sure that you're actually just accurate all the time why don't we just be perfect all the time and i think thinking through these trade-offs figuring out the clinical protocol the when the score comes up and figuring out what happens afterwards and therefore how should we tune this threshold of specificity and sensitivity is a very key question and could be more important even than what kind of algorithm you know it's a linear model a non-linear model is it a you know self-attention reverse distillation fancy machine learning model and in fact you know the very end you don't want to take the you know take the football all the way to the you know one yard line and then completely miss so i think thinking through these questions are incredibly important um and you're completely right that you know figuring out what the what happens after the flag goes you know what happens after the alert goes off or for whom does the alert go off and it's right or it's wrong um thinking through all those questions is sort of a perfect merging of the clinical collab the clinical domain who has been thinking about these questions for a long time in the machine learning domain where we have the tools the computational tools to be able to parse out is this error if we have errors are they coming from lack of data aka variance or are they coming from noisy measurements aka noise or are they coming from the model that we're using aka bias and so that is a lot of my focus my research but we can never ever ever forget that it comes from the other side of what happens after the model comes out there's been some interest so it's not my 
there's been some interest so it's not my line of work but there's been some interesting work about how doctors interact with uh machine learning and the results are interesting in that you know oftentimes the more experienced a doctor is uh in their own professional career the less likely they are to be swayed by the ai algorithm correctly or incorrectly maybe they're just disregarding it they you know they already have a high prior on their own medical knowledge and therefore don't need the algorithm the same way whereas younger thinking of a study that specifically looks at dermatologists younger dermatologists that are earlier in their career they might be more swayed by a machine learning algorithm in this study they made the machine learning algorithm be correct or incorrect and the younger dermatologists would be you know easily swayed either way because they don't have a strong clinical knowledge domain expertise built up just yet and so i think thinking through all of that is really important as well as we you know hopefully ultimately the goal is to be able to work alongside doctors in a more concrete way are there any techniques that you've used or or recommend when you're working across this interface between the data scientist or the machine learning researcher in particular and the clinician uh to for example tune the sensitivity of a particular algorithm uh you know what language are you speaking are you speaking uh you know false positive rates and the like to them are you showing them examples uh what have you found to work given the different languages that these uh two groups are um most comfortable in i i would say the best thing you can do is have as tight of a feedback loop between clinicians and machine learning practitioners as possible so my advisor is david sontag at mit and i think he's done a fantastic job in the lab of bringing in clinicians who actually sit in the lab not currently because we're all remote but before there was a day you know there were desks and clinicians would just sit there and they would do their work and we would do our work but if something happened oh wait this blood test is giving me weird things or you know does this model look reasonably correct does this seem like something that's working we could just swivel over and talk to the clinicians who are right there and i think that's you know such an underrated part of clinical collaborations which is that you need you need to be able to talk to each other you need to be able to um you know have small back and forths and make sure you're moving in the same direction i would say that the mental model a lot of people have is that you know you build your machine learning model and you throw it over the wall and a clinician says like yes or no and then throws it back over the wall and anything we can do to lower that wall as low as possible so that we're just kind of tossing a bean bag back and forth would be you know advantageous the other thing i would say is that um better understanding from the machine learning side of what works and doesn't work for the clinician so you know when we talk about this patient mortality model you know one of the things that surprised me was talking to the clinician and he said you know i know i know which patients are going to die or aren't going to die you know this model isn't telling me anything i don't already know and being able to think oh wait our baseline is not a logistic regression and seeing whether or not a linear model fit on these features will you know predict this outcome our baseline is does the clinician know like is this even helpful um is the problem worth solving even useful and thinking that through is incredibly important and that to me is a way of bridging sort of the the divide of interdisciplinary work
um you've you uh you worked on a paper focused on probabilistic uh approaches to machine learning for healthcare can you talk a little bit about that work yeah so um one of the fun things that i get to do in addition to sort of going super deep uh into one topic and trying to push the frontier of knowledge a little bit one of the things that i find really important is taking a step back and saying actually what is the field doing right now what are the things that we can take into account and through that last summer i wrote this review looking at probabilistic machine learning for healthcare one of the things that comes up often is how do we build in these levels of say uncertainty estimates or how do we express how data is distributed and a lot of this comes back to basic ideas of probabilities and so thinking about how we can express you know it's not just zero one it's everything in between um into concepts like fairness uh if we're making a prediction or if we're um thinking about how data is distributed being able to have the expressivity uh the degree of expression of using probabilities is incredibly important um probably the most uh i mean something that resonates a lot with clinicians is having uncertainty estimates you know the first time i used my iphone and i asked siri a question and she said you know i don't know that actually gave me a lot of confidence in siri for the first time when i used her she was able to express hey i thought about this or i searched my database and actually rather than give you a bad answer or make a guess i'm just going to tell you i don't know and i think expressing that mathematically is probably you know on the back end if i had to imagine something along the lines of if the probability of you know y given x is above some level then we say it and if it's below some level um or if they're equal across all of the classes or something like that then we say we don't know and similarly for clinical decision making um it shouldn't just be zeros and ones this person has this disease this person doesn't have a disease you could also give an uncertainty estimate in thinking through how we can use probabilistic machine learning in that respect
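The "say it or say I don't know" rule sketched here can be written out directly. A minimal version, with a placeholder confidence threshold to be set with clinical collaborators:

```python
# The abstention rule described above: emit a prediction only when the
# model's top probability clears a confidence threshold, else abstain.
# The 0.9 threshold is a placeholder, not a recommended clinical value.
import numpy as np

def predict_or_abstain(probs: np.ndarray, confidence: float = 0.9) -> np.ndarray:
    """probs: (n_samples, n_classes) predicted probabilities.
    Returns the class index where max probability >= confidence, else -1."""
    top = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    return np.where(top >= confidence, labels, -1)

probs = np.array([[0.97, 0.03],   # confident: predict class 0
                  [0.55, 0.45],   # near-uniform: abstain
                  [0.20, 0.80]])  # below threshold: abstain
print(predict_or_abstain(probs))  # [ 0 -1 -1]
```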
but i think maybe the larger point that i want to emphasize is that i really enjoyed writing that review of probabilistic machine learning in healthcare i actually have another review article about ethical machine learning and health care and and i've written a few commentaries about different parts of the field and i think that that's something that i encourage you know all academics to do to sort of take a step back thinking super in depth about one topic all day long for several weeks slash months slash years in a row um is an incredibly rewarding process and is why a lot of people sign up for the phd but it is also our duty as you know people who are so privileged to be able to think about these things all day uh to be able to step back and synthesize it and be able to share that back with people who maybe aren't so narrowly focused all day in that one area as well yeah yeah uh i wonder if to maybe wrap us up you have some key takeaways for folks that are um either interested in this area but you know aren't currently working in health care and machine learning or are but want some pointers for thinking about kind of ethics and inclusion you know what are the kind of top line things that you've learned in your journey that you think folks need to hear about um i would say two main pieces of advice uh one of them is that the field is way more accessible than i ever thought it would be in that there's a lot of open access large medical so electronic health record data sets out there the largest one is called mimic i believe they just released mimic-iv so the fourth version and this is data that's collected from beth israel deaconess medical center it's a hospital in downtown boston and last i checked it's like tens of thousands of adults um and also i think like up to almost 10 000 children um first it was in the intensive care units just sort of their entire stay everything that happened the notes which is incredible the clinical notes that the doctors actually wrote um all the lab tests uh and then i think they recently also added the emergency department as well so everyone who went through the emergency department at the beth israel deaconess medical center and what they've done which is tremendous is they've allowed this data set to be accessible to researchers um who can prove that they're researchers with some sort of credentialing but it's pretty light and effectively you know people use it all the time a lot of classes teach out of it students are able to download the data set and make that model you know and build a prediction model about who is going to live and die in the icu based off of the first 48 hours of clinical notes this is like a very real task that students in all kinds of introductory classes can now take on so i would say there's a lot of open medical data sets out there um if you're interested hop right in i believe andrew beam has a web page actually so type in andrew beam open medical data sets and he just lists dozens of data sets if you're interested in mammography if you're interested in colon cancer if you're interested in pcos there's sort of all kinds of data sets um available so i would say first piece of advice just jump right in get your hands dirty see what happens see if you like it see if you like the data cleaning see if that's kind of annoying to you see how you enjoy it
the second piece of advice i would say is that this field is incredibly collaborative i've just waxed poetically about my clinical collaborators and you know how much i miss being able to swivel over and annoy them but uh i would i i think when i started my phd i had this notion um especially at a very you know prestigious place like mit where you feel like everyone is stressed out all the time that there's sort of a genius who sits alone in the room and you know just spits out papers alone just sort of just manages to to to create you know brilliance just by themselves and my experience in the phd has been anything but that um even you know the lone genius is actually reading papers by other people and is able to sort of build on top of them yeah and in the phd you can actually you know really tighten that loop and instead of waiting to read someone's paper you can talk to people and say hey i have this cool idea what do you think and they can say oh i have this cool idea and you're able to collaborate through that and so something that i've really enjoyed is both the collaborations with people in my lab my lab is the awesome clinical machine learning group at mit but also people at conferences clinicians random people who read my papers and email me twitter direct messages i think that being able to tap into a whole ecosystem of very excited people that span machine learning people ethicists i recently had a co-author who was an anthropologist for the first time and that was you know insanely cool and so being able to tap into that entire network of people has been incredible oftentimes i don't know what i'm talking about oftentimes i feel like i'm learning so much from them that i'm bringing to the table and i think that's the good part is being able to come into the room figure out who knows what they're talking about learn from them and then be able to shape your own ideas so um you know i am only still a phd student but i'm very excited that i get to be part of this community and this community really means like machine learning people healthcare people anyone else who is vaguely interested in the implications of what's going on now we're expanding to like hci people human computer interaction people and so thinking about all of those communities coming together is what gets me up in the morning honestly and it powers my research um as i you know race towards finishing it are you close uh the hope is that next year i'll graduate so i'll check back with you say in a year and we'll see where i am but um i've had such an amazing time at mit and um you know when i started in 2016 the fairness field was really not a thing uh machine learning and health was like this tiny workshop at neurips um the main machine learning uh conference and now the machine learning workshop is like the biggest workshop at neurips fairness has its own set of conferences you know two three four conferences and and i can't imagine what's going to be in another five ten years like i'm so excited yeah that's awesome irene thanks so much for sharing a bit about what you're up to it was completely my pleasure sam thank you so much for having me on thank you
the toughest problems things we hadn't even decided we haven't even figured out yet so i went back to school i got now i'm in my phd at mit focusing on healthcare which is a new part of my training but luckily mit has tremendous amount of classes and i've been able to take classes at harvard medical school and really dig into how to bridge the gap between machine learning and these questions of healthcare and is the degree program that you're in uh now still applied math or computer science or is it a what a healthcare degree whatever that might be oh so it's uh it's a eecs electrical engineering computer science and then there's a great certificate program through hst so that's harvard science technologies that allows you to get sort of a like a four class certificate um through basically harvard medical school where you can take i've taken pathology physiology and currently i'm doing a well remote clinical preceptorship where you get to hear from all these clinicians about sort of the questions that keep them up at night so it's a really good bridge between the computer science the math the technical stuff that i love and i'm very familiar with and then the medical field which is a new area for me and deciding how to best bridge that gap and make sure that collaborators and anyone we work with is engaged and are really wanting to work with us nice nice so what are the questions that keep uh practitioners up at night there are a lot of questions certainly i think from the machine learning side um machine learning on working on healthcare data is like everything you know about machine learning but then you add a question mark to it so the very classic machine learning scenario maybe the one of the first problem sets you do is you get a bunch of pictures of cats and dogs and you want to classify them this picture is either cat or a dog when you think about health care it's not so easy to say oh this person has diabetes or doesn't have diabetes and now we want to classify who has diabetes or not it turns out you know whether or not someone has diabetes that label can be wrong right maybe that person hasn't been diagnosed with diabetes yet maybe that person doesn't like doctors and hasn't had covenant that very much yeah so all of a sudden the labels are in question and then also the picture you get you know you don't get a clear picture of a cat you don't get a clear picture of a dog instead you get all of the longitudinal visits that person has ever had with one healthcare system but oh wait they moved so actually we don't know what happened to them or we have people who have lots of tests lots of information from them but we are not sure why there's huge fluctuations of what's going on maybe there are medications that make them have different readings than we would expect so the data we get is really noisy and confounded by a bunch of different things and the labels we have are also combined by different things and then also you know to make everything even worse healthcare is a very high-stakes field so if you misclassify a cat or a dog maybe you insult someone's pet beloved pet but if you misclassify someone getting diabetes that could be really detrimental later on as they you know can contemplate treatment plans and figure out how to manage a chronic disease so the uncertainties in in healthcare coupled with the stakes make it uh really confusing for practitioners to try to figure out that's exactly right sam and then i think on a technical level you know we think of our data as maybe 
a huge matrix where each row is a patient and then all the columns are you know for one visit all the data we've collected or the next visit what's going on and already you can start to imagine some things might be wonky one of which is that not everyone has all of the data measured not everyone comes in once a year or once every six months and gets the full slate of blood tests and everything so all of a sudden you have very sparse data data where you know most people have zeros for all of these fields and then also maybe you don't have a ton of data for all the talk about big data in healthcare a lot of times when you dice it down to one chronic disease there may be only a few thousand people who fit your inclusion criteria so all of a sudden the data we're working with is like pretty dicey uh pretty sparse and pretty hard to do you know classic machine learning on so a lot of the work that i do is developing machine learning methods that can handle these sort of more longitudinal long-term analyses and figure out how we make these best set predictions for stratification algorithms that clinicians are interested in yeah it sounds like from the examples that you're giving your focus is more on like population health and and healthcare delivery from a systems perspective as opposed to um you know a particular diagnostic approach medical images or or something like that um maybe let's take a step back and have you talk broadly about your research and you know what are some of the problems that that you're thinking about yeah uh so my research focuses on uh developing new machine learning methods specifically for healthcare and then through the lens of questions of equity and inclusion so a great example of this would be you know you're working in a hospital and you want to build a really good risk stratification algorithm so you mentioned that you know right now there's really a lot of work in scoped acute tasks that you could deploy at a hospital for example when someone comes into the icu if we could predict the patient mortality during the hospital stay that would be really beneficial to clinicians because they can allocate resources they can you know talk to the patients they can develop a treatment plan the problem is that when you develop a healthcare algorithm like this there are questions about fairness and bias that might arise because the data that we're training on may have systemic health disparities baked in so all of a sudden and this is something i found fairly early in my phd you might develop a you know a supervised trait a supervised learning algorithm that tries to predict based on the first 48 hours of a patient's stay who's going to live or die in the hospital during the rest of their hospital stay and what i found is that the algorithm that i developed was less accurate for some racial groups than for others and this is not great um this is problematic for a bunch of different reasons and least of which it makes the engineer the person who developed the model practitioner the person who developed algorithm very confused and they don't know why what's going on so a lot of my research looks at how can we think about these machine learning for healthcare algorithms from the risk stratification sort of the very scoped acute tasks all the way to thinking about like longitudinal chronic disease work and thinking about how can we ask questions about how these algorithms are affecting all populations and how can we design new models that work across the entire patient 
population not just the people that maybe are overrepresented in the data or have already benefited from the health care system maybe want to focus this on people who for whom there have already been health disparities enacted in the case of the model that you described what was happening that caused your model to um have such disparate results based on the ethnicity of the patient yeah so there's for for about a year my phd it became the mystery of what's you know what's going on with this model certainly i wasn't there sprinkling and bias being like ah my my turn to make this evil algorithm and if you talk to a bunch of different people a lot of people especially when they think about questions of fairness have different hypotheses so some people thought oh it's because certain racial groups are smaller compared to other racial groups and therefore there's not enough data we need to go out and collect more data for say the asian population which had two percent um in this data set compared to the white population which was 70 that would explain why the asian population is having higher errors or other people would say actually it's because the data we collect is just noisier for some groups some groups might have say historical mistrust of the day of the healthcare system maybe the measures we don't collect as many measurements as often and therefore we can't get any better they're sort of already baked in issues that we as algorithm makers aren't to blame and therefore we should just sort of say like actually this is what it is and we can't blame this person as it turns out um this algorithm was a little bit of both of those two so one of them is that the data set could have been bigger the data set for certain populations we're not measuring in the same way this patient population was too small for some groups that explained some of it and there are tools that we produced to be able to estimate what was going on there and the other half is that actually there are some groups and there are some conditions that we don't know as much or we're not collecting the right information or we don't we're not able to differentiate for those patients who lives and dies in the hospital based on the information we collect and that's sort of a like i wish sam that i had a very neat answer for you but the truth is that we only right now have a set of tools to be able to cross out hypotheses suggest new ones and then go back to the clinical collaborators and say like okay we're thinking of deploying this tool here are all the caveats do you think this would still be useful or not let's discuss you mentioned earlier uh i believe you said risk stratification can you elaborate on on that and what you mean yes there's a strata stratification in the sense of you know we might stratify a data set to address imbalances and class imbalances in the data set or is it something another use of the term risk stratification here applies to the idea that you um if you have some sort of adverse event say patient mortality you can stratify patients by their risks so you can predict essentially we're talking about essentially a binary classification problem but you can have the probabilities for example and say if one person is 98 likely to die versus two percent likely to die um maybe we should be allocating resources or um focusing more attention on the higher risk patient versus the lower risk patient already you might think well are these scores calibrated meaning does 98 really mean 98 or are we kind of being are 
things being a little weird here and that's also an important question especially when we're not just looking at classification we're not just looking at zeros and ones we're looking at probabilities as well and being able to see if we're ranking the patients inappropriately and if 98 is correct or maybe it should have been closer to 70. right right and then even if you've got uh correct numbers good numbers you've got this totally different question of where do you draw the lines which may or may not be a machine learning problem right yeah and i think that's maybe the thing that i have learned the most in the phd is that yes machine learning is fantastic and also terrible and also you know has all of these complexities but in fact it's sort of one piece that fits into this entire clinical care pipeline and then ultimately you need to figure out how this risk gratification tool for example would be used by clinicians is it a score that they sort of just look at on the patient um record and sort of assess what's going on there is it a like alarm that goes off when the score goes above something and everyone drops everything and runs over is it something that we're just sort of passively putting in the background and maybe if they want to check they can look it up and if they don't if for example if they're unsure about patients then they would check the score versus they don't and or and then they sort of proceed on their normal clinical delivery anyway i think understanding where in the healthcare system a score like this would be used can better help us understand you know what kind of machine learning methods we should be developing in the first place so in your the focus of your research is it primarily on developing tools to help uh either data scientists in this field or clinicians or is it more broadly kind of understanding the you know the nature of these disparities and how they're introduced in the system or you know something you know yet altogether different or some combination i like to think of i like to think of the machine learning pipeline from like all the way to the beginning where you're like collect you know you figure out what do you want to study in the first place and then you're collecting the data and then maybe you're you know deciding what kind of prediction task what are the x's and y is going to be based on the data you've collected and then we have the algorithm development which we of course spent a lot of time on and then we have deployment and then monitoring what's going on with the machine learning algorithm in the clinical context and my sort of thesis is that we can think about each step and think about questions about ethics and equity and inclusion at each possible step and so that's sort of what i've been focusing my research on which is that everything we do everything we touch in machine learning for healthcare should be thought about how we can make sure we include the entire patient population so the example about wrist stratification and then let's say auditing an algorithm for bias that comes at the very end like you've basically almost got an algorithm going you're a step away from pushing the button and having it in a hospital uh-oh what's going on there how could we debug what's gone wrong fairness-wise i have other work that sort of moves up the development uh moves up the pipeline and saying when you're developing the algorithm what kinds of considerations can we take into account maybe different people have different access to health 
I have other work that moves up the pipeline, asking what kinds of considerations we can take into account while developing the algorithm. Maybe different people have different access to health care, and we should build that into the model so that we're not accidentally wrong about people who don't come into the hospital as much, and therefore not effectively penalizing them. And I have also worked at the very, very beginning, thinking about what problems we are even solving with machine learning. I recently started a project on domestic violence, predicting which patients are at high risk, going back to risk stratification, of being victims of domestic violence. It's a task we don't give much attention to; clinicians are not really trained to assess it. And it's a tough problem because we don't even know what the labels are, going back to the questions about cats and dogs: who is a victim, or a survivor, as some people prefer to say, and who is not? How do we figure out the right clinical data set to even collect for that, and how do we carry it through the entire pipeline? So I love my research, and I love that it can focus on different parts of this pipeline, hoping in the end, as we build out machine learning for healthcare, as it starts to get into more hospitals, as we better understand how diseases progress, or anything in between, that we can think about the entire patient population and who we might be forgetting along the way.

I'd love to hear more about this domestic violence project that's at the beginning of this life cycle. What's your ultimate goal there?

We're building an early detection program. Intimate partner violence is a subset of domestic violence, and it's what we're focusing on: violence between present and past intimate partners. Our goal is to build, assess, evaluate, and eventually deploy a detection algorithm at the clinical level, in the emergency department or at the radiologist's level. If someone comes in multiple times, or has a series of markers, an ulna fracture, for example, or biomarkers that indicate high risk, a flag or some sort of alert would go off, allowing the clinician or any kind of healthcare practitioner to start a conversation. Currently, intimate partner violence is a widespread and urgent concern where it's not clear what to do. A lot of survivors are reluctant to come forward because of stigma around the situation, or they don't have the resources, or they distrust healthcare professionals; a lot of factors come into play. The idea is that if we can do early detection, we can speed up the process by which a healthcare professional broaches the subject, provides resources, or at least monitors what's going on. This is work done in collaboration with some fantastic colleagues at Brigham and Women's Hospital in downtown Boston. We've validated a machine learning algorithm, and now we're working on enlarging our data set to even more patients and validating what's going on there, before hopefully deploying it at a somewhat large scale, at least in one hospital to start with.
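A toy sketch of the kind of alert being described might look like the following; the features (prior emergency department visits, a high-risk injury indicator), the threshold, and the model are all illustrative assumptions, not the actual project's design.

```python
# Toy version of the alert described above: score each emergency
# department visit from simple history features and raise a flag for the
# clinician when the risk crosses a threshold. Features, threshold, and
# labels are illustrative assumptions, not the actual project.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 1000
# Hypothetical per-visit features: prior ED visits in the last year,
# and an indicator for a high-risk injury pattern (e.g., ulna fracture).
visits = rng.poisson(1.5, size=n)
injury = rng.integers(0, 2, size=n)
X = np.column_stack([visits, injury])
# Synthetic labels purely so the sketch runs end to end.
y = (rng.random(n) < 1 / (1 + np.exp(-(0.6 * visits + 1.2 * injury - 2)))).astype(int)

model = LogisticRegression().fit(X, y)

def flag_visit(n_prior_visits: int, high_risk_injury: int, threshold: float = 0.3) -> bool:
    """Return True if the visit should surface an alert to the clinician."""
    risk = model.predict_proba([[n_prior_visits, high_risk_injury]])[0, 1]
    return risk >= threshold

print(flag_visit(4, 1))   # repeat visits plus a high-risk fracture
print(flag_visit(0, 0))   # first visit, no marker
```

The point of the sketch is the interface, not the model: the output is a flag that prompts a conversation, not an automated decision.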
Okay. And within the context of applying machine learning to reduce inequality in health care, is the idea with this particular project that the population itself is underserved, so the application of machine learning is what reduces inequality? Or are you also, or primarily, focused at the micro level on biases within the detection itself and mitigating issues there?

All of the above, Sam. I would say nothing is off the table. I like to think of it as glass half full or glass half empty. Machine learning is this incredible tool, and glass half full says we now have a tool that can close inequities, that can detect conditions that haven't been detected, in populations that haven't been seen by the healthcare system as readily or as alertly as they should be. That is a tremendous opportunity: to mine all of this big data, these electronic health records, these imaging studies, wearables of any kind, feed it into machine learning algorithms, and both build prediction models that can be deployed and learn about these conditions and figure out what's going on. Glass half empty, as you may know from all your interviewing, is that oftentimes machine learning is a robot gone wrong, a knife that cuts you while you're trying to cut something else. There are so many things that can happen when you blindly learn an algorithm and deploy it, and mitigating some of what happens there and better understanding what's going on is the glass-half-empty approach, where you're constantly auditing algorithms and figuring out what's going on. I would say they're actually both in the same picture. Just as you can't build any algorithm without seriously considering the clinical landscape you're developing for, and the data you're collecting and how it's being collected, that also feeds into figuring out which questions we should most readily attack, questions of ethics and equity that we should focus on in the first place. So you're completely right that the domestic violence project is exciting because it tackles the glass half full; it's the pitch, this is what machine learning promised us. But as we deploy it, we're also seriously concerned that, for example, people of different socioeconomic status might present differently and we might be omitting people. And how are we determining a label for intimate partner violence while training? If we use as labels the people who come forward and say, I need help, I want help, then we might be completely missing a whole other population who are not coming forward and are not being seen by healthcare professionals in the same context. Thinking that through carefully is of the utmost importance, and, to be honest, it's a topic of ongoing work right now.
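That label concern can be made concrete with a toy simulation: if the recorded label is "came forward for help" rather than "experienced intimate partner violence", a group with a lower reporting rate looks artificially low-risk even when the true prevalence is identical. All rates below are invented for illustration.

```python
# Toy simulation of the labeling concern above: the observed label is
# "came forward and asked for help", not "experienced intimate partner
# violence", and reporting rates differ by group. Rates are made up.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
group = rng.choice(["high_trust", "low_trust"], size=n)
true_case = rng.random(n) < 0.10           # same true prevalence in both groups

# Assumed reporting rates: survivors in the low-trust group come
# forward far less often, so their cases never become positive labels.
report_rate = np.where(group == "high_trust", 0.6, 0.2)
observed_label = true_case & (rng.random(n) < report_rate)

for g in ["high_trust", "low_trust"]:
    m = group == g
    print(f"{g}: true prevalence {true_case[m].mean():.3f}, "
          f"observed prevalence {observed_label[m].mean():.3f}")
# A model trained on observed_label inherits this gap: the low-trust
# group looks lower-risk only because its cases go unrecorded.
```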
And this may be part of that ongoing work, but how do you think through the implications of the prediction itself in a case like this? If someone comes in for treatment and gets a positive prediction, that their injury was potentially associated with some kind of active domestic violence, that potentially sets off a chain of events that impacts their life. From a research perspective, how do you unwind all of that? Is it in the scope of your work?

I did not imagine it would be in the scope of my work, but I actually think it has to be in the scope of everyone's work, because no machine learning exists in a vacuum. The mathy part of me would say something like, well, then the loss function should be weighted so that we have no false positives and only false negatives; but the public health part of me wants to say, well, false negatives aren't so good either. You want to be accurate all the time; why don't we just be perfect all the time? Thinking through these trade-offs, figuring out the clinical protocol for when the score comes up and what happens afterwards, and therefore how we should tune this threshold between specificity and sensitivity, is a very key question, and could be more important even than what kind of algorithm it is: a linear model, a non-linear model, some fancy self-attention, reverse-distillation machine learning model. At the very end, you don't want to take the football all the way to the one-yard line and then completely miss. So thinking through these questions is incredibly important. And you're completely right that figuring out what happens after the alert goes off, and for whom it goes off, rightly or wrongly, is a perfect merging of the clinical domain, which has been thinking about these questions for a long time, and the machine learning domain, where we have the computational tools to parse out, if we have errors, whether they're coming from lack of data, a.k.a. variance, from noisy measurements, a.k.a. noise, or from the model we're using, a.k.a. bias. That's a lot of the focus of my research, but we can never, ever forget the other side: what happens after the model's output comes out.
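One hedged sketch of that specificity/sensitivity tuning: rather than accepting a default 0.5 cutoff, pick the threshold on held-out data that meets a clinically agreed sensitivity floor, then read off the specificity it costs. The 0.95 floor and the synthetic data are assumptions standing in for what a clinical protocol would dictate.

```python
# Sketch of tuning the specificity/sensitivity trade-off described above:
# choose the threshold that meets a clinically agreed sensitivity floor.
# The 0.95 target is an assumption that would come out of the clinical
# protocol, not out of the data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
probs = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

fpr, tpr, thresholds = roc_curve(y_te, probs)
target_sensitivity = 0.95
ok = tpr >= target_sensitivity            # operating points meeting the floor
best = np.argmax(ok)                      # first (highest) such threshold
print(f"threshold={thresholds[best]:.3f}  "
      f"sensitivity={tpr[best]:.3f}  specificity={1 - fpr[best]:.3f}")
```

The design choice is that the clinical side owns the sensitivity target and the machine learning side reports the trade-off, which matches the division of labor described here.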
There's been some interesting work, not my own line of work, about how doctors interact with machine learning, and the results are interesting: oftentimes the more experienced a doctor is in their professional career, the less likely they are to be swayed by the AI algorithm, correctly or incorrectly. Maybe they just disregard it; they already have a high prior on their own medical knowledge and don't need the algorithm in the same way. I'm thinking of a study that looked specifically at dermatologists. Younger dermatologists, earlier in their careers, were more swayed by the machine learning algorithm; in this study the algorithm was made to be correct or incorrect, and the younger dermatologists were easily swayed either way, because they don't have strong clinical domain expertise built up just yet. So I think working through all of that is really important too, if the ultimate goal is to work alongside doctors in a more concrete way.

Are there any techniques you've used or recommend when you're working across this interface between the data scientist, or the machine learning researcher in particular, and the clinician, to, for example, tune the sensitivity of a particular algorithm? What language are you speaking: are you talking to them in false positive rates and the like, are you showing them examples? What have you found to work, given the different languages these two groups are most comfortable in?

I would say the best thing you can do is have as tight a feedback loop between clinicians and machine learning practitioners as possible. My advisor is David Sontag at MIT, and I think he's done a fantastic job of building a lab where clinicians actually sit in the lab. Not currently, because we're all remote, but there was a day when there were desks and clinicians would just sit there; they would do their work and we would do ours, but if something happened, wait, this blood test is giving me weird values, or, does this model look reasonably correct, does this seem like something that's working, we could just swivel over and talk to the clinicians who were right there. That's such an underrated part of clinical collaborations: you need to be able to talk to each other, to have small back-and-forth exchanges, to know you're pulling in the same direction. The mental model a lot of people have is that you build your machine learning model and throw it over the wall, and a clinician says yes or no and throws it back over the wall. Anything we can do to lower that wall, so that we're just tossing a bean bag back and forth, is advantageous. The other thing I would say is to build, on the machine learning side, a better understanding of what works and doesn't work for the clinician. With the patient mortality model, one of the things that surprised me was a clinician telling me, I know which patients are going to die or aren't going to die; this model isn't telling me anything I don't already know. That made me realize our baseline is not a logistic regression, seeing whether a linear model fit on these features predicts the outcome; our baseline is, does the clinician already know this? Is it even helpful? Is the problem even worth solving? Thinking that through is incredibly important, and to me it's a way of bridging the divide in interdisciplinary work.

You worked on a paper focused on probabilistic approaches to machine learning for healthcare. Can you talk a little bit about that work?

Yeah. One of the fun things I get to do, in addition to going super deep into one topic and trying to push the frontier of knowledge a little bit, is taking a step back and asking what the field is doing right now and what we can take into account. Through that, last summer I wrote a review of probabilistic machine learning for healthcare. One of the things that comes up often is how we build in uncertainty estimates, or how we express the way the data is distributed, and a lot of this comes back to basic ideas of probability: it's not just zero or one, it's everything in between, and that feeds into concepts like fairness, whether we're making a prediction or thinking about how the data is distributed. Having the degree of expressiveness that probabilities provide is incredibly important. The thing that probably resonates most with clinicians is having uncertainty estimates. The first time I used my iPhone and asked Siri a question and she said, I don't know, that actually gave me a lot of confidence in Siri, because she was able to express: I searched my database, and rather than give you a bad answer or make a guess, I'm just going to tell you I don't know. Expressing that mathematically, I'd imagine the back end had to be something along the lines of: if the probability of y given x is above some level, we give the answer, and if it's below that level, or roughly equal across all of the classes, we say we don't know. Similarly, clinical decision making shouldn't just be zeros and ones, this person has the disease, this person doesn't; you can also give an uncertainty estimate, and think through how to use probabilistic machine learning in that respect.
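A minimal version of that "I don't know" behavior, under the rule just sketched, would predict only when the top class probability clears a confidence bar and abstain otherwise; the 0.8 bar and the synthetic data are illustrative assumptions.

```python
# Minimal version of the "I don't know" behavior described above: return
# a label only when the model's top class probability clears a confidence
# bar, and abstain otherwise. The 0.8 bar is an illustrative assumption.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_informative=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

def predict_or_abstain(x_row, bar: float = 0.8):
    """Return a class label, or None ("I don't know") below the bar."""
    p = model.predict_proba(x_row.reshape(1, -1))[0]
    return int(np.argmax(p)) if p.max() >= bar else None

preds = [predict_or_abstain(x) for x in X_te]
abstained = sum(p is None for p in preds)
print(f"abstained on {abstained} of {len(preds)} patients")
```

Raising the bar trades coverage for accuracy: the model answers fewer cases but is more reliable on the ones it does answer, which is the clinical appeal of abstention.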
But I think maybe the larger point I want to emphasize is that I really enjoyed writing that review of probabilistic machine learning in healthcare. I also have a review article about ethical machine learning and health care, and I've written a few commentaries about different parts of the field. That's something I encourage all academics to do: thinking super in depth about one topic all day long for weeks, months, or years in a row is an incredibly rewarding process, and it's why a lot of people sign up for the PhD, but it is also our duty, as people who are so privileged to think about these things all day, to step back, synthesize it, and share it with people who aren't so narrowly focused on that one area.

Maybe to wrap us up: do you have some key takeaways for folks who are interested in this area but aren't currently working in health care and machine learning, or who want some pointers for thinking about ethics and inclusion? What are the top-line things you've learned on your journey that you think people need to hear?

I would offer two main pieces of advice. The first is that the field is way more accessible than I ever thought it would be: there are a lot of open-access, large medical data sets, meaning electronic health record data sets, out there. The largest one is called MIMIC, and I believe they just released MIMIC-IV, the fourth version. This is data collected from Beth Israel Deaconess Medical Center, a hospital in downtown Boston. Last I checked, it covers tens of thousands of adults, and I think up to almost 10,000 children as well. At first it covered the intensive care units: patients' entire stays, everything that happened, including, which is incredible, the clinical notes the doctors actually wrote, and all the lab tests. I think they recently added the emergency department as well, so everyone who went through the emergency department at Beth Israel Deaconess Medical Center.
And what they've done, which is tremendous, is make this data set accessible to researchers who can prove they're researchers, through a credentialing process that's pretty light. People use it all the time, a lot of classes teach out of it, and students can download the data set and build a prediction model of who is going to live and die in the ICU based on the first 48 hours of clinical notes. That's a very real task that students in all kinds of introductory classes can now take on. So there are a lot of open medical data sets out there; if you're interested, hop right in. I believe Andrew Beam has a web page, actually: type in Andrew Beam, open medical data sets, and he lists dozens of them. If you're interested in mammography, in colon cancer, in PCOS, there are all kinds of data sets available. So my first piece of advice is to jump right in, get your hands dirty, see what happens, see if you like it, see if you like the data cleaning or find it annoying, see how you enjoy it.
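For anyone jumping in, a starter sketch of that classroom task might look like the following: bag-of-words features over the first 48 hours of notes, logistic regression on mortality. Since MIMIC requires credentialed access, the data loading is a placeholder, and the file and column names are assumptions.

```python
# Starter sketch of the classroom task above: predict ICU mortality from
# the first 48 hours of clinical notes with bag-of-words features. MIMIC
# requires credentialed access, so loading is a placeholder; the column
# names ("notes_first_48h", "died_in_icu") are assumptions.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("mimic_notes_first_48h.csv")   # placeholder path
X_tr, X_te, y_tr, y_te = train_test_split(
    df["notes_first_48h"], df["died_in_icu"], test_size=0.2, random_state=0
)

model = make_pipeline(
    TfidfVectorizer(max_features=50_000, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```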
The second piece of advice is that this field is incredibly collaborative. I've just waxed poetic about my clinical collaborators and how much I miss being able to swivel over and bother them. When I started my PhD, especially at a prestigious place like MIT, where it feels like everyone is stressed out all the time, I had this notion of a genius who sits alone in a room and just spits out papers, who manages to create brilliance all by themselves. My experience in the PhD has been anything but that. Even the lone genius is actually reading papers by other people and building on top of them, and in the PhD you can really tighten that loop: instead of waiting to read someone's paper, you can talk to people and say, hey, I have this cool idea, what do you think, and they say, oh, I have this cool idea, and you collaborate through that. Something I've really enjoyed is collaborating with people in my lab, the Clinical Machine Learning Group at MIT, which is awesome, but also with people at conferences, with clinicians, with random people who read my papers and email me or send Twitter direct messages. Being able to tap into a whole ecosystem of very excited people that spans machine learning people and ethicists, I recently had an anthropologist co-author for the first time, which was insanely cool, has been incredible. Oftentimes I feel like I'm learning so much more from them than I'm bringing to the table, and that's the good part: coming into the room, figuring out who knows what they're talking about, learning from them, and then shaping your own ideas. I am still only a PhD student, but I'm very excited to be part of this community, and this community really means machine learning people, healthcare people, and anyone else who is interested in the implications of what's going on. Now we're expanding to HCI, human-computer interaction, people, and thinking about all of those communities coming together is what gets me up in the morning, honestly, and it powers my research as I race toward finishing it.

Are you close?

The hope is that I'll graduate next year, so check back with me in a year and we'll see where I am. But I've had such an amazing time at MIT. When I started in 2016, the fairness field was really not a thing, and machine learning and health was a tiny workshop at NeurIPS, the main machine learning conference. Now the machine learning for health workshop is one of the biggest workshops at NeurIPS, and fairness has its own set of conferences, two, three, four of them. I can't imagine what it will look like in another five or ten years. I'm so excited.

That's awesome. Irene, thanks so much for sharing a bit about what you're up to.

It was completely my pleasure, Sam. Thank you so much for having me on. Thank you.