Trends in Machine Learning & Deep Learning with Zachary Lipton - #556

The Evolution of Machine Learning: From Single-Purpose Models to Complex Pipelines

Over the past several years, machine learning has evolved from single-purpose models into complex pipelines whose results can feel almost magical. This shift is driven by the growing maturity of individual models and by the need for more sophisticated interactions between them, along with the emergence of new heuristics and best practices around model deployment.

For instance, consider the experience of Shazam, a music recognition app that can identify songs with remarkable accuracy. While it may seem like magic, Shazam's success relies on a combination of clever design and the use of multiple models in different contexts. By leveraging this multi-model approach, developers can create user experiences that feel innovative and groundbreaking, even if the underlying technology is complex.

However, building such complex pipelines requires more than individual model expertise. It demands a deep understanding of how to combine models, interact with users, and optimize performance. In many cases, this means organizations need to invest in their own custom pipelines rather than relying on a single off-the-shelf solution. This is where the art of model combination comes in: a key area of innovation that has been underexplored until recently.
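To make the idea concrete, here is a minimal sketch of what such a composition might look like in code, loosely inspired by the Shazam example above. The classes, names, and confidence value (AudioFingerprinter, TrackMatcher, "example-track") are hypothetical stand-ins for illustration, not any real system's API:

```python
# Hypothetical two-stage recognition pipeline: an embedding model feeds a matcher.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    track_id: str
    confidence: float


class AudioFingerprinter:
    """First stage: turn raw audio into a compact fingerprint."""

    def embed(self, audio: bytes) -> list[float]:
        # Placeholder for a learned embedding model.
        return [float(b) / 255.0 for b in audio[:128]]


class TrackMatcher:
    """Second stage: match a fingerprint against a catalog."""

    def match(self, fingerprint: list[float]) -> Optional[Match]:
        # Placeholder nearest-neighbor lookup against a track database.
        if not fingerprint:
            return None
        return Match(track_id="example-track", confidence=0.92)


class RecognitionPipeline:
    """Glue code: the user experience is the combination, not either model alone."""

    def __init__(self) -> None:
        self.fingerprinter = AudioFingerprinter()
        self.matcher = TrackMatcher()

    def identify(self, audio: bytes) -> Optional[Match]:
        fingerprint = self.fingerprinter.embed(audio)
        return self.matcher.match(fingerprint)


if __name__ == "__main__":
    print(RecognitionPipeline().identify(b"\x00\x01" * 64))
```

Neither stage is novel on its own; the perceived magic comes from the glue that turns two ordinary models into one seamless interaction.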

The LEGO Analogy: Building Blocks for Innovation

A useful analogy to describe this process is the world of LEGO bricks. Just as LEGO has become increasingly sophisticated, with thousands of available pieces and countless possible combinations, machine learning has reached a similar point of maturity. The basic building blocks required to create innovative models are now in place, but it's how these pieces are combined that matters.

In the same way that LEGO enthusiasts can build incredible structures using simple bricks, developers can create complex pipelines by combining multiple models in creative ways. This process is not unlike designing a new LEGO piece – it requires imagination, creativity, and a deep understanding of how the individual components work together.

The Impact of Maturity: From Artist to Superstar

As machine learning has matured, the gap between great performers and superstars is no longer defined by technical proficiency alone. In music, for example, what separates a skilled artist from a superstar is not raw technique but the ability to create complex, nuanced pieces that resonate with audiences on a deeper level.

Similarly, in machine learning, the distinction between good systems and exceptional ones increasingly lies in how models are combined rather than in any single model. With sophisticated model-combination techniques, developers can build pipelines that outperform individual models in many domains, allowing organizations to unlock new capabilities and tackle problems that were previously out of reach.

The Power of Tools: Leveraging Hugging Face and MLOps

A key enabler of this innovation is the development of specialized tools and frameworks. Hugging Face's Transformers library, widely used for natural language processing (NLP), has democratized access to high-quality pretrained models, making it easier for developers to build complex pipelines.
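For example, a minimal sketch of chaining two off-the-shelf Hugging Face pipelines might look like the following; the input text and generation parameters are illustrative, and the default checkpoints are downloaded on first use:

```python
from transformers import pipeline

# Stage 1: summarize a longer document with a pretrained summarization model.
summarizer = pipeline("summarization")

# Stage 2: classify the tone of that summary with a sentiment model.
classifier = pipeline("sentiment-analysis")

document = (
    "The team shipped the new recommendation service this quarter. "
    "Latency dropped by 40 percent and user engagement improved, "
    "although the rollout required two emergency patches."
)

summary = summarizer(document, max_length=40, min_length=10)[0]["summary_text"]
sentiment = classifier(summary)[0]

print(summary)
print(sentiment["label"], round(sentiment["score"], 3))
```

Each stage here is a mature, reusable building block; the value comes from how they are wired together.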

MLOps (Machine Learning Operations) tools, such as MLflow and TensorFlow Extended (TFX), have also played a crucial role in streamlining the model deployment process. By providing a standardized framework for building, testing, and deploying machine learning models, these tools have reduced the barriers to entry for organizations looking to develop custom solutions.
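As a concrete illustration, here is a minimal sketch of tracking a training run with MLflow. It assumes mlflow and scikit-learn are installed, and the dataset, model, and hyperparameters are placeholders chosen only to keep the example self-contained:

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

with mlflow.start_run():
    params = {"n_estimators": 100, "max_depth": 3}
    model = RandomForestClassifier(**params).fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))

    # Log hyperparameters, the evaluation metric, and the model artifact
    # so the run can be compared against others and redeployed later.
    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, "model")
```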

The Future of Innovation: From Practice to Play

As machine learning continues to evolve, we can expect more innovation to emerge from the intersection of practice and play. With the fundamental building blocks in place, progress depends less on training yet another individual model and more on experimenting with creative ways to put existing pieces together.

The Role of Design: From Scales to Music

Practicing scales is necessary for a musician, but it is not what makes music. The same holds in machine learning: training competent individual models is now table stakes, and the real differentiation comes from how those models are arranged into something people actually want to use.

Advances in model combination drive part of this shift, but it also depends on design: the ability to create innovative, user-friendly interfaces that hide the complexity of the underlying technology.

The Future of Machine Learning: A World of End-to-End Solutions

As machine learning continues to evolve, the next wave of innovation is likely to come less from new individual models and more from how existing models are composed into complete systems.

In the near future, we can expect more end-to-end solutions to emerge, in which complex tasks are broken down into smaller, manageable components and each component is optimized with machine learning. This decomposition allows organizations to unlock new capabilities and solve problems that no single model could handle well on its own.

Conclusion

The evolution of machine learning from single-purpose models to complex pipelines represents a major turning point in the field's history. As we move forward, it's essential to recognize the importance of model combination, design, and innovation in unlocking the full potential of this technology.

By combining these elements, developers can create user experiences that feel groundbreaking, even when the underlying technology is complex. As the field matures, expect end-to-end solutions to become the norm, with complex tasks decomposed into manageable components and each component refined through machine learning.

"WEBVTTKind: captionsLanguage: enall right everyone welcome to another episode of ai rewind 2021 today we are joined by zachary lipton an assistant professor in the machine learning department and operations group at carnegie mellon university to talk through all things machine learning and deep learning zach last joined us on the show for the 2019 edition of rewind and i'm super excited to have him back once again zach welcome back to the chuma ai podcast cool thanks for having me sam great to see you again it is great to see you again i think the last time we uh physically had the opportunity to hang out was also 2019 in vancouver uh i think that's probably a story shared by a lot of folks in our field like that was the last opportunity that folks had to hang out in person right um how have the last couple of years been for you oh man it's been eventful um you know i'm not gonna pretend it's all been smooth but i mean some things are nice like my students are great and i think have been i think it's not been easy for everyone like some people um some people got sick some people lost someone some people didn't get to see their family for a couple of years um on the other hand like people i feel like it's a weird thing where people manage to be startlingly productive or at least you know maybe i don't want to shame anyone who doesn't feel productive i feel like from just my feeling in my life i feel like people are i mean it's just kind of like cmu culture or science i thought people have been really um yeah like locked in in research but i think there's a kind of like emotional wear and tear of just not seeing anyone like especially like some folks are like living by themselves you know and then like when they were quarantined not seeing another human for six months and for others of us just like catching up now so it's been it's been a little bit wild um but you know interesting on the research side and um you know we got a puppy so like interesting personally nice nice uh so as i mentioned in the lead up we are here to review the year in ml and deep learning this is the uh the third rewind that will publish the first couple were in nlp and computer vision uh or computer vision and nlp in the order that they were published and so far a couple of key themes have emerged uh one which was common in those first couple of uh of episodes is this idea that uh as john bohannon put it nlp eating machine learning kind of like in the same way we would say you know ai eating the software or what have you the idea that computer vision is uh adopting transformers and things like that you have echoed one of the other observations that john made in that nlp conversation and it is um that particular point is kind of a slowing down of the field and uh a little bit of a respite from that kind of breakneck pace of uh of change that we were experiencing for a while uh so maybe that is a place for you to jump in and and riff for a bit yeah happy to have you to riff um you know like justin stino maybe i mean i'm like choking a contrarian or something i'll i'll start by you know maybe pushing back on one thing that's kind of interesting of like you know that that phrase of like nlp eating ml is is kind of cute because it's sort of well among other things right like in some sense there there's like the the line for for the longest time for the last seven years has sort of been machine learning eating nlp and that like if you look at the set of people going into uh like sort of an nlp oriented like grad party 
there's a point where like nlp sat really close to like you know they were like nlp and computational linguistics like sort of two sides of a coin and they sat not so far from their like philosophers of linguistics or whatever and now you have it you have a moment for the last however many years where the median person in nlp knows absolutely nothing about language there's nothing interesting to say about language that couldn't just as easily and it's not to say nobody or that there isn't anyone with something interesting interesting observations or interesting experiments that are that are kind of hitting on both sides but say that like the center of gravity of the field has moved to this way that almost there there's almost no l in nlp you know it's just like yeah um it's just sort of like you know a set of tools where if if the like commercial demand was more on music than on nlp you would use almost you know conceivably like the same set of models because all they care is just like a sequence of tokens like a very generic sort of approach and so in some sense it's sort of just been like deep learning eating nlp has been the story for a while and i think that like this version of nlp eating ml is well i guess one they don't really mean an ml but really more just like other application areas yeah yeah i think you know and they don't really mean nlp as much as the thing that ate nlp which is transformers right whatever is like that like new organism that displaced mlp is now coming across but right it's more like you know there was like a discipline of computer vision where you had people that like the the typical person who was in there knew something about like the physics of light and optics and like was sort of like doing this sort of like you know that that was the angle they were like a real expert on like the the modality of vision and the person nlp knew something about language and i think they both got eight by deep learning in such a way that you know over the last seven years ideas that would hit on one side could very easily pour it across to the other and for most of that history um i think it's hard to say precisely why if there's some reason or if it's just sort of the the order in which things happen um like the those breakthrough image net results that really caught people's attention were envisioned first but i think for most of that history it's been very one-dimensional like very one-directional of um i think things going from you know mostly in the direction of computer vision to nlp and i think if anything this is not really enough eating vision but it's just notable that maybe one of the bigger things happening in vision is crossing in the other direction like contrary to that pattern um but yeah you know broadly on the like things slowing down pattern i i think i've been i've been noticing writing about this for i don't know maybe like four years now but i think there's there's definitely uh a moment where like if we were to look through the history and say like 2012 imagenet 2014 like sequence of sequence models 2015 uh alphago um 2016-17 big advances and like uh the kind of like uh perceptual quality of like generative models 2017 transformers 2018 um bert um there was a kind of change that you know i i don't think these are necessarily all profound in the sense of like some big intellectual move but they are like qualitative changes in capabilities it's like there's a big move in the sense of like what set of problems do i think are best tackled with these 
tools and what sort of performance can i expect from them and a big difference in the sense of if i'm a practitioner in the field and somebody hits me with a typical like industry problem what's my go-to tool and if we look at like 2021 and now we're saying well if someone hits you with a a classification task what are you going to do is going to use a resnet from 2014 and 2015 you know someone hits you with nlp tasks it's like basically fine-tuning bur or a bert like you know very roberta alberto you know whatever it's like you know right like there's this moment where you know um like i i think that in some sense i think it's okay in that um like researchers now need to like start looking somewhere else other than just like what if i tweak the architecture a little bit like this there's a and as i was telling you like when we were riffing before is that i think the research is there is some aspect of people like roping around in the dark looking for looking for a way in it's almost like they're like swinging at a pinata with a blindfold on and trying to find like where where is where is there an angle that like where is there something big and i think you want to have a lot of researchers in that mindset of like i'm looking for the blind spot like i'm looking for the big prize that other people aren't looking for um and look every now and then someone gets every now and then someone really lands a mark and the pinata rips open and a bunch of candy falls on the floor and then everybody rushes on and there's some period of time where nobody is worried about like nobody even knows where the bat is everybody's just picking candy off the floor um and you know i think where we saw that period of like people finding all these you know every it wasn't like you didn't need a big intellectual breakthrough to have a big um to have an impactful breakthrough uh in all those years and i think we're getting to the point where like you know there is some amount of stagnation because like most of the good candy has been picked up and people are like looking at the old grimy like moldy stuff that's like you know maybe there's still something in there it's like it's like those like the sloppy seconds on the uh the the the research you know pinata uh with that analogy in mind you know where should research be swinging the bat do you have is that a is that it you know it's a crystal ball kind of question but um what does your intuition tell you where opportunities might be well i always look for you know what what is it that we actually care about when people are selling a story an aspirational story about um you know i'm not like a mathematician first i'm not just like what's a hard problem like let's just solve for that reason you know i got into it too late you know uh i kind of back into it from like what's actually like what's the dream and if you look at like the dream people are selling people even people with like existing companies right now the claim they're making if you look at like ibm watson which went up in flames look at the claims they were making like what are we gonna do for you it's like we're gonna you're gonna make better decisions you're gonna provide personalized healthcare you're gonna help people to you know have better health outcomes than they otherwise would have without our ai if you look at this kind of stuff you know what are people selling what are people hoping to actually achieve and then you look at like what are people actually doing and if i like i always kind of look 
back and forth between those and say like what's what's like the missing part like if you actually want to realize the dream of what you know people i just seem to want what's what's gone like what's what's not even being addressed in a mature way and so i think one one thing that sort of jumps out is that everybody's sort of premise you know everyone says like these things are good it's always based on some notion of like accuracy or the return of an rl system as evaluated on some fixed static environment and and then you look at what people are actually doing is like they're taking some model trained in some context on some set of data and deploying that crap in some different environment which is changing and unpredictable ways um and where the whole environment is not just changing like in a kind of benign or passive way often it's changing in direct response to like you know like think about google search right you deploy uh google every single time they tweak their algorithm what's the first thing that happens and it's like all the message boards light up and all the seo goons they're like oh like seo change the algorithm now now you need to add this keyword you need to do this and i think that ml doesn't address that kind it's like i might say ml doesn't i don't mean no nothing that we aspire to in l but i mean like the the main thing you know the main thing that practitioners do they're the toolkit the mature one but like you know i know how to use pie torch and train you know resnet and resnext and whatever whatever um that world it's like completely set in in the environment of like i train a model evaluate on a sort of like iid hold outset or even if you evaluate on some kind of challenge set it's not like with any coherent principle for why you should expect this model to do well on that challenge that or why you should think the performance on that challenge that's representative of what you should encounter in the world so i think performing in a dynamic world making decisions and not just predictions right because everybody's sort of saying ultimately if you think you're going to make money off of this or you think you're going to affect some kind of like societally beneficial outcome by using aia if you think you're going to do anything then ultimately it's like what you're the claim at some some point what you're hoping to do is guide some kind of decision or automate some kind of decision right you're actually hoping to have an outcome not just to like be a passive observer to the world and make accurate predictions about what would happen were you not to take any action at all and and this kind of setting of like actually providing guidance for what you should do in the world is is we are you know it's a thing like yeah there are people working on causal inference there are people who are trying to bring reinforcement learning closer to the real world by you know maybe incorporating some ideas from causal influence like to think about confounding that might exist in the data to be able to build models and off policy kind of way you know so that you're not just saying i'm just going to deploy some randomly acting system and uh important application and have it suck for two million years until it learns um but they're relatively immature and then they get relatively little attention if you were to look at like you know what do people you know what is you know i'm not telling beat up on our buddies in the press because what is like you know kade gonna write a big article 
about in the new york times it's not typically like the the the slog of like scientific advances and making robust machine learning or um you know off policy rl or something like this it's it's uh you know there's a big neural network that you know has nine trillion parameters and a billion dollar investment from microsoft and this kind of and and so i think right um decision making uh robustness and dynamic environments and and actually addressing certain societal desert erotic you know people have sort of noticed the problems that arise in terms of the um ways sort of like an ai system can affect whether it's like um unethical sort of outcomes if you plug on the naively into certain decision systems but the sort of like field of actually like developing systems that could in some coherent way align with societal desert is quite primitive right so there's like a recognition that there's a problem but we're very early stages on getting towards solutions so i think that to me like these are the areas i think are are more interesting um you know in that like look if you can get if you can squeeze half a point out on like you know all the nlp benchmarks by like making a slight variation on bert you'll get a lot of citations that everyone will use it and they should and it is useful but it's i feel like not it's it's like a direct it's like a a slight change in degree it's not a change in kind and so i think you know when you when i look at the field i think the the luxury of being an academia the reason it'd be in academia is to think that like i don't have to just think how do i do epsilon better than someone at the same crop we're all doing tomorrow but like what what actually is something that you know addresses you know some problem that nobody's even engaging with intellectually right now it sounds like your answer then to the where to swing the bat is in getting closer to you know real world problems that the folks are having and you mentioned a lot of different elements um you know i heard some aspects of domain generalization in there i heard um you know aspects of even like user interface like how you're presenting the information heard aspects of fairness in there but uh broadly it sounds like you're kind of also calling into the question kind of the simplification that often happens in research of problems that removes them from you know all of the constraints and fuzziness of the real world yeah and like it's tricky right because everybody's everybody thinks they're doing that well it's more like everybody's gotta choose to focus on something and and to focus on the thing they want to focus on they got to compromise on something else um and and it's not that like one thing is right or wrong like i don't think it's wrong if there's people out there building bigger language models i don't think that's fundamentally wrong um i think you got to be like um you know maybe like gotta use your brain to think about how how what kind of claims i can make about these things or or how should they be used in the real world but i think like look it's interesting to turn that knob and say what if i make this big or what you know what happens um there there's plenty of work to be done um you know like one kind of like trade-off that you often have is that um you know you have like if i want to get close to what real data looks like you know and i want to get close to things i can actually do in a bunch of domains often like all that's available is you know uh data's um like there's there's a way 
that like predictive modeling and like the status quo you know is closer to the real world in that it touches real data and it gets within it's like narrow aspiration of like just predict well on like iid data like under a naive assumption about how the world doesn't change it's able to do that on really complex high-dimensional real-world data um on the other hand what you give up when you focus only on just that problem is any kind of consideration of you know i think people just only think about how do i get better predict you know building predictive models um is you know like they're getting close to dealing with real data but they're asking a very narrow set of questions about it which is like how do i get higher accuracy then on the other side you know you have folks say for example trying to get at like um fundamental questions about um what sort of like causal queries i can make and very often in order to flesh out those questions now another they're taking on like a more ambitious set of like kinds of queries that i can ask but in order to make progress like understanding the fundamental form of those questions this often starts with well i gotta analyze like fundamentally one of these questions even answerable from the data sets that i have in order to like maybe get some of that analysis to go through i have to make some simplifying assumptions about the form of the data like i assume that the whole world is linear and it's not high dimensional and it's not whatever so you know you have plenty of people who are who are doing you know work that's like really more like ambitious and expansive on the front of like the kinds of questions i can ask but they're making really simplifying assumptions uh in terms of like the source of data i have and the number of variables i have and uh not worried about that but i mean that's the compromise they make there's other folks building predictive models and trying to get close to like do something that works on real data but be naive about the kind of questions you can ask and you know not worry too much about just how limited is like what you could do with those predictions or their power to guide decisions in the real world and and then i think once you have that kind of tension of like okay everybody's looking at something and not looking at something else you know i think the question you have is like a research community is like are you over leveraged somewhere you know i think oftentimes people there's like a naive form of of a of a criticism which is like oh this thing sucks and this thing is good but like there's a more mature version of it which is like we're way over leveraged on this thing and it's paying way too much attention to this thing and neglecting these other things so it's like you know like more matter of like moving the needle it's not that like nobody should be building a bigger language model or or tinkering with architectures but it's sort of like okay like we're at a point where we're not getting nearly as much juice per squeeze um doing that why do we have 99 of the community uh engaged in this why do we have so many papers that are being submitted that most of which know you know are are not actually contributing anything either as an idea or as a result and so yeah i think we're we're sitting in like some funny terrain like that so were there notable uh papers or research advances that you think kind of poked at the at some of these issues that you're raising or you think are swinging the bat in the right 
direction yeah i think there's this very weird climate now which is like i think for a lot of these questions you have sort of like like uh a growing recognition that they're problems but then you have like uh a subset of people that are just kind of like taking advantage of the uh way in which the like peer review system is like over text and uh scattered and and like sort of just using the language of those problems but not actually addressing them and i think you see this in all of these and you see us in the fairness literature i think you see it in the the robustness literature i think you see in the causal like in that like people submitting papers that sound like they're addressing causal problems they're not actually people just saying this model's robust in a way where it's like by the way you can never just say a model is robust like if you state nothing about the ways in which the environment's allowed to change and you know um there's no such thing as like general robustness right because i can post two different assumptions about the world where um in one of them when the environment changes and you know this is what i should be doing and the other one when it changes that's what i should be doing i have no way of discerning which world i'm in right like class would be like do i live in the label shift assumption like if i make that assumption that like the distribution of categories is changing but the the class conditional distribution what does it what does uh you know kovid versus not covet look like is what's not changing but the prevalence is changing first do i assume that like the covariate distribution is changing but that the the label the conditional the probability of the label given the impulse is different i might have no way of discerning whether i'm in world a or world b but you know one thing is the right like the robust model in this setting and you know should do this and the robust model in that setting should do that um so you have like a like a set of people that are just kind of doing the deep learning thing which you know like prediction lets you get away with it like let me throw spaghetti at the wall because i get to evaluate on the hold out data and like how well i'm doing is identified so i don't have to be able to state in terms of any principle if you get like a causal effect you don't get to observe the causal effect so it's like if what you're doing doesn't actually identify the causal effect you could call it a causal you could you could you could just like use the language of causality in a deep learning favor it's not actually like addressing causality in any kind of sound way and fool a reviewer but not necessarily be doing it and so you know i i tend to think that like a lot of these other problems are kind of more foundational like they're not problems where like we know how to evaluate systems let's have people try stuff and whatever their problems were um so i'll give you an example like in um you know um the distribution shift world i mean there's a i think a handful of things people are doing that are a little bit more interesting or sound or actually um giving a path forward um there there's a group at stanford uh some of percy students um uh um shiori and pongway among them put together this really expansive benchmark called wilds and it's a uh like a collection across a whole lot of different application domains of a whole lot of different settings where you have some kind of subpopulation shift or some other kind of distribution 
shift and it provides like a sort of unified resource for a whole bunch of settings again like you still need to have some kind of you know you can't just like use the data set and say oh i tried this thing and just one domain and it generalized well to these two others therefore it's robust but at least it gives you some um like unified resource for asking a question you know like if you compare to the world where basically people are just saying like i have pictures of uh uh eminist images and then eminence images on funky backgrounds i think it's like a big advance towards like uh a nice sanity check and putting people in touch with the sorts of problems that are arising in the real world um there's uh one formulation of these domain adaptation settings is that uh there's a version called domain generalization um and here it's like sort of saying i have like a a bunch of different environments that i've collected data from and now i have you know i want to generalize well to target environments um possibly using the fact that like i can look at the different source environments that i've had and they're actually marked out as different environments i could try to see something like what's stable versus unstable across environments and they've been some interesting papers um um so uh by the way before we met i uh i think some of my students feel like what do you think are uh some of the interesting favorites i want to give some credit to my students who who are now like the extension of my memory uh so my student sarah pointed out there's a lot of interesting work where you have a whole lot of methods that are proposed um but it turns out that uh if you set up a really rigorous baseline and there's some papers uh some from uh cmu from our friend alan rosenfeld um and his advisor andre russeski uh some from um uh david lopez paz at uh fair but where they've shown shown things that like for a lot of these setups it's really really hard to beat like really stupid bass lines like just dump the data together and just do erm on it like just train on all the data together and don't use the environmental labels in any sophisticated way um you know in our own lab my students saurabh has been been making a lot of progress on these distribution shift problems and uh we have some results that we've been excited about like among other things working out when you're presented with uh you know you've trained on data from you've see you have some classes you've seen before and then suddenly at test time uh you have some data that shows up from some additional class that like you never saw before can you actually on the fly look at you know this previously seen data from from some classes and now additional data from some from some unknown class and identify like oh i can i say exactly precisely what fraction of the new data is from some previously unseen class and even develop a classifier that can now start predicting it so to say oh i think these samples have this probability of belonging to that class so you'd imagine that like in the context like a model monitoring pipeline you'd eventually like to live in that world where um if the world changes in some way the model could come back to you and say hey i think with high probability like you know at least 20 of your new data actually belongs to some new class that you've never seen before and here are some examples that i think belong to that class and then you could sort of you know um take some kind of corrective action if you think that uh the 
model's wrong um so uh that's on the robustness side um on the causality side and i got some of these tips from my student uh chanting with some of the work that uh we've been we've been talking about and uh going over um i think there's so causality research is really exciting because it actually gets to the question we care about which is like what would happen if i did this versus what would have happened if i were just a passive observer watching the decisions get made as they always are and and cause unfortunately a philosophically coherent way for answering those kinds of questions but the danger is that those answers are almost always predicated on some pretty strong assumptions that like our um you know i can learn this like uh the parameters of my causal model but the structure of the causal model is given like a priori and i know it exactly and there's no one observed confounding that you know can make all my results invalid and so there's a lot of interesting things happening um among them um there's some folks like carlos chinelli who uh just started a faculty job in the statistics department at uw have been doing a lot of interesting work on sensitivity analysis so if there's measurement error or if there's some some some omitted variable bias or something um just how you know like frameworks for being able to take just how much would there have to be for me to change my causal conclusion right so getting towards like i'm not just saying oh if i'm nailing all these ridiculously precise assumptions about how the world is then this is the answer to your causal query but saying like you know this is how far off those assumptions would have to be for like me to like have to totally change my mind um eric chechenchechin is a is a researcher at wharton a statistician who um does a lot of exciting uh work in this area and johnson hit me to a paper uh that he's doing which addresses a specific problem of you know people often make this assumption there's no unobserved confounding um and you know that is such a strong assumption because it's like even if you have the right confounder if you just measure it in a slightly noisy way and there's unobserved confounding um and so he's gotten to uh this formulation we call proximal causal learning and it's like you can allow that okay i have some proxies for the underlying confounders but they're not perfect proxies um and what can i do in that situation um and finally one one one thing on the on the causal inference side that i've been um really excited about is you know a whole lot of machine learning just sort of takes its stance which is like i've i've got this set of variables and i've got a collection of examples it's like something like you know my data looks something like a table now it could be kind of complicated because if it's like text the different documents could be different length but it's sort of the typical formulation that people work with don't usually allow for the setting where it's like oh i have a collection of a bunch of different data sets and i observe this thing in this data set and this other thing and that dataset but i feel like a lot of real world decision making is actually governed by that kind of um process i think we've all gotten a little bit of a crash course in this from just watching like uh the like coveted response like unfold in the public eye and it's like oh i've got this data from the cdc but it has these features but it doesn't you know it tells you how many reported cases but it doesn't tell 
you how many tests are on you know but oh i have this other data from the manufacturers of the diagnostic equipment and that data actually tells me what fraction of tests are positive not just what number of tests are positive and i have this other data from you know the local municipalities and so you get these questions where it's like if i have some question where you know i think very often we have queries especially in economics this comes up they call them like data fusion type problems but where like the answer can't come like directly i have no one data set that can necessarily answer my query but i have a whole bunch of different data sets and it's possible that if i combine them intelligently i could sort of like triangulate to the answer to the question that i have we talked about that on the infrastructure side as well is there kind of terminology evolving on the machine learning side for thinking about problems like this i for some reason also calls to mind graphical kinds of things and that you want you'd imagine some kind of connectiveness in the data and the way they're represented to one another well you know in the econ world they call these like data fusion problems and um someone who's done some really interesting work on that from like the ai ml side is a researcher named elias barenboim so he's a professor at columbia and he was youth pearl's grad student and uh now he's a prophet is alright doing a bunch of i think a lot of the you know super exciting work in this area and you know he he's gotten to these sort of questions where i get this paper from i don't know if it was it was technically 2020 but i read it in 21 so we can we can call it 20 21. um but on uh an algorithm pretty calls like a the general general identifiability problem so it's not just saying like oh i've got this one data set is this thing you know can i answer my causal query but it's like oh i've got this collection of data sets and in this data set these variables are observed and there's others that this other variable is observed and maybe it maybe this data set was collected by someone doing a particular kind of experiment on one of the variables and other you know so it's like i might have different data sets from different experiments they're not even necessarily just different views of like the same data one of them someone was intervening in some way but if you if you kind of have this collection of data sets and some underlying causal structure now can you tell me precisely how can i combine all these data sets to answer the question that you have or or or i guess like you know with causal questions always the first step is is it possible to identify you know the the answer to the question that you have based on the data that's available and then you know if you can well give me give me the formula such that if i plug in the data from these different data sets i could you know it would give me that that estimate so i was like is it estimable and if so like how do i produce such an estimate um yeah so it's you know i i think these are general exciting uh areas there's also a lot of work happening now in causal discovery um so uh this is a really ambitious problem um because so in causal inference you basically say i i know the structure of the causal graph i know which variables potentially cause which other variables but i just don't know the functions that determine you know so it's like x you know um x and y together influence z i don't know what is the function by which the values of x 
and y determine z but i know that like z listens to x and y like if i were to intervene on x that could potentially change the value of z whereas if i were to intervene on z it wouldn't change the value of x right and so if you have this kind of structure causal inference says well how do i like figure out you know basically what are those functions so that i could then answer a causal query um but there's like a sort of like and that by itself is super hard and you know we always never agree on causal effects because it's like well if if you assume the graph looks like this and it's slightly different or you know then then all bets are off um causal discovery basically says what if i don't even know the graph app for yori or i have some partial knowledge of the graph but i don't actually know fully which arrows you know go from which variables to which other variables um so in that case you know you you ask this question like well when is it possible to recover the graph and in general you can only recover the graph up to like something called an equivalence class um but now there's a whole bunch of other papers that i i don't have all the links i could send them to you offline but things they start asking questions it says okay like well if i'm able to do use causal discovery to get the graph up to like equivalence well now i can ask questions like what set of experiments should i run in what order to as efficiently as possible resolve any lingering ambiguities so like right just the observational data might at least tell me something like certain variables aren't connected to other variables and i'm able to orient for some edges in the graph like which direction do they point but others i can't you know but then um you know the hope of causal discoveries to be able to additionally do that so that's that's one you know exciting thing and um my student champion who's been working a lot in that area so we have a paper uh that gets at this question of if you get to make these decisions about kind of like i was telling you before if you have different data sets and by combining them you can answer a question but not necessarily by using one of them alone or even if you can combine them to answer the question there's still an unresolved question of how much data should i collect from this source versus that source in order to like as efficiently as possible pin down the causal effect we've been working on that problem of basically how how do i like you know imagine you're working at a company and you have some third-party data provider that charges you for uh you know i pay this much per thousand examples right like um you know how would i make the decision sequentially of like okay based on what i know now which data source should i query next and for how many samples and then okay now i update my beliefs i make a subsequent decision which i think this decision process is always going on in the background right like if you're a company that's buying data from people or going out and actively you know doing some kind of monitoring effort data collections you're making decisions on the fly about oh uh i want to collect data from here oh now i now i know something i didn't know before this is going to guide my decision of what to collect next but we don't usually formally model that process we usually sort of assume the data is already there and then focus on how do you estimate you know something given that the data is there so um yeah those are you know some somewhere so i'm excited about 
that direction and i think on the the fairness side i think one way that things are maturing is that people been posing these questions in ways that are maybe i'm familiar with this uh philosopher charles mills who passed away recently um yes he's this great like moral political philosopher and he writes about um sort of like ideal approach to theorizing about questions of justice and in um and i think it you know and my student i had a post-doc who graduated cena fossil for is now a professor at northeastern but we wrote a paper like a couple years ago just just making a connection between what's going on in ml and and this sort of framing of ideal versus non-ideal theorizing about justice that comes from um uh among other folks charles mills um but you know he has point that you know when you start posing like a question about equity or question about justice as a sort of like technical problem and you you make up a toy model um there's this danger that you know you you sort of highlight as like salient and relevant those parts of the problem that are captured by like your toy model and you relegate as like not even of you know academic consideration everything that doesn't show up in your model right so i think that and the danger here is that like if the things that you're just like completely forgetting about are actually like the everything that really matters you wind up in a situation where you could do a lot of academic tinkering and you could even develop like elegant mathematical theories um but they have almost nothing to say about the underlying question of justice that you care about and i think this is sort of the situation that we've been in to some degree and it's not to implicate everyone but it's say like the the main thing right is that we've been posing as questions of equity in the form of like say i have a data set uh say i have a particular feature um let me just sort of start enumerating different things that should be equal and then saying oh well it's not possible to make them all equal simultaneously so let's either just like naively pick one and then flesh out an algorithm for it or just kind of like crying about how fairness is impossible um and i think like what gets lost in that whole kind of discussion is that it's almost all of it it's like taking for granted just i've got a data set there's a bunch of anonymous features i don't really say anything about what they actually mean or what real world processes they correspond to or how disparities arise and how um that consideration really bears on what is sort of the appropriate uh response from a standpoint of like affecting justice uh like when is it you know do you we don't look at every single like we don't look at um um i don't know if you don't look at like the uh major league baseball for example and say like oh well i noticed there are more players from some country that you know than from some other countries you know relatives you know and say okay let me just equalize it instead of quota system but it's also because i don't you don't believe that like i don't know say some country that like you know excels in baseball like puerto rico so you don't think they've been giving an unfair advantage in getting to the major leagues or something and so like this whole backstory of like what actually are the um the the what actually do these variables mean and to the extent that there are disparities reflected in the data like um where do they come from and and and how do they correspond to some political uh 
some some coherent political stance or theory that sort of makes a straight line from that to who is responsible to remediate who has a responsibility to remediate it these are like fundamentally the concerns that we sort of always have when we speak i think in the law or or in a broader sense about questions of justice that some for some reason i think there's something i think maybe just about the fact that it's a new field or something like that but that for some reason have been just kind of completely sidelined or completely as a strong word but by by like the main branch of fairness research and i think there's a a number of people who are who are doing interesting work here to try to actually ask the critical questions and i think like lily who and um issa caller houseman uh are two people who i think just have been like kind of asking the right kinds of questions um for for a while and and kind of framing that critique in a way that i think is what's so rare it's like both really understands what's happening in like the sort of fair ml world and also really understands the sort of context and um like the the people actually understand ethics and actually understand like legal principles of justice and are um i think able to speak from some degree of authority to sort of what's missing and the way we're posing those questions and tackling them before before you jump into their work the things that come to mind for me are this idea of you know techno solutionism being part of the problem like we're trying to you know throw technology at the problems that technology is is creating for us that yeah we talked about that a couple years ago we did we did uh and also there's kind of a nod and and the way you talked about the problem of fairness to causality and you know when we all got really excited about causality a couple years ago you know and uh i think it was that same nurips uh was you know everyone left excited about causality like it was supposed to be the savior of fairness and it was you know applying causal modeling to machine learning more broadly was going to you know give us you know transparency give us fairness give us uh interpretability and you know break open all of the black boxes and all of that like where's the heart you know i think i i just i couldn't more highly recommend um issa and lily's work here that i think you know there's a there's a handful of works you know there's some work by elizabeth and by ilya spitzer that has and and before some earlier work by like matt kuzner that sort of pose different um notions or of that that are like within a causal framing kind of like coming out of like pearls causal modeling some notions of like you know the earliest versions of that say something like well there's a lot of different versions if someone wants to say something like okay it's not just a question of is race or or gender or whatever is considered as protected attribute does it turn out to be correlated with some outcome we want to ask some question of like is it what causes the outcome um and there there's a way that like these questions have been paused and it's actually you know the causal framing is not unique to machine learning it's actually something that um i think like the legal scholarship itself often expresses things in causal terms right um you know um and um before them you know i think like economists for example like you know there's a famous experiments right where people say well we sent like the resume experiment that i think cinderella 
millennium and some others ran or they they randomized names to be uh tip more more likely sort of um uh you know like black american signing is worth more likely to be white american as you know and then they they they send the resumes to people and they measure the response and it's it's an interesting experiment it's certainly like uh uh valuable research and the fact that there is in certain contexts a difference in the response rate like you know does jump out as um you know problematic right on the other hand if you sent them out and there was no difference in response rates um should we conclude in the other direction like oh there's there's nothing wrong right and and i think the answer is you know obviously different people will have different opinions and and the answer might be answered different in different contexts but i think that there's a lot of context which i think many of us or most of us would say that that's not necessarily the case right um for example if uh right like if you if you if you like if you're like oh you know because you know it's like what does it mean to just change the name right so i could change your i could change your name but uh and that wouldn't make a difference but if i changed what college you went to from say like hbcu to uh um you know say some some other school and that made a difference even if your name by itself conditioned on everything else didn't make a difference so so there's this notion that's baked into a lot of literature that tries to pose questions about um discrimination through a causal lens that sort of tends to adopt a rather like narrow notion of what could constitute discrimination as like the direct effect of some attribute like the direct effect of gender the direct effect of race on a decision and the problem is that well what about like all of these sort of potentially indirect effects that could still be you know if i were making someone to make a decision based on some factor that is super correlated with race and also irrelevant to the decision otherwise well like would do say that that's not discrimination um and so there's this you know then there's some work by elias berenbaum and ilia schmitzer which is i think a sort of step at least conceptually in in a more interesting direction whereas what they try to do is sort of you know if you have like a a causal model over all the variables uh you could say something like well let me disentangle the like the the how the effect of some attribute of interest whether it's race or gender comes to influence some outcome of interest along all the different sort of plausible causal paths that it may take and i can sort of attribute to what extent is this you know influencing the outcome by that variable versus via this by this path versus by this other path um now keep in mind though like that's sort of i think what's cool about it is it's like a thinking tool like in practice do we actually expect that we would have a causal model that captures all the variables of interest and actually says exactly we would know precisely which variables influence like every variable that goes from somebody's gender you know to uh whether or not they got hired you know we would be able to like we're going to trace like over what scale like over the scope of someone's entire life we're going to you know in build into our graph every opportunity right every every every decision every opportunity that someone was given or not given on that account like that we're gonna have a graph that is 
so rich is to capture all of that you know it seems unlikely but at least it gives you maybe like a thinking tool as like okay it's it you know at least i can conceptualize and step back and think about the fact that there are these um but among other things it outsources the normative work right at the end of the day presumably that the reason to disambiguate these different pathways is to say someone believes that you know they usually put as like in the terms of like some past are permissible like maybe they run through like unambiguous qualifications for the position being hired for versus other paths or sort of like impermissible past because all they're really doing is telegraphing uh information but they're not actually influencing uh you know they're not actually relevant to the job qualifications or whatever the context is but um they're still outsourcing the normative work of someone well someone someone has to go and say which paths are permissible and which paths are impermissible um and lily has a really sharp critique she also has a a nice set of um blog posts that are called like disparate causes i believe on on this blog phenomenal world but it goes into this problem kind of critically and among other things you know getting at this question of what we call like a direct effect or an indirect effect is partly an artifact of the representation that we have and there are some causal questions where for any kind of process that we describe there's multiple different valid causal representations conceivably right because um you can always like zoom into if i have this variable in this variable and an edge between them i can always zoom into it and say oh like it's not just that like you know uh someone's college influences their internship it's actually their college influences this subtle decision that's made by some recruiter which influences this which influences that and so you can always zoom into it and um sort of bring more into focus and whether or not someone would like look you know now like a very generic question like what is the average treatment effect maybe as long as you had whether you had a very sort of granular or very sort of um you know coarse representation of some process if they're both valid you'll have the same answer for for a question like that but this question about like what are the pathways taken is sort of like um and are they permissible or not is sort of an artifact partly of at what resolution do you zoom into this process and do you capture it and something might look okay you know if you zoom way out and you subsume a whole lot of mediators into just like an arrow but if you zoom in closely and you knew more about how that process took place then maybe you would say oh this isn't kosher um so i i think you know at a high level um i think that like causality gives us like a set of thinking tools for thinking critically about some of these problems and they are maybe in some way like a partial step in the right direction um but at the same time you know i don't think it's like a magic bullet that like sort of addresses all questions of of of fairness or justice or discrimination and i think that often you know they're they're that the the sort of like model that was sufficiently rich to be able to even if you believe that they were like you couldn't you wouldn't actually have you know uh you wouldn't actually be able to like produce the causal model so you could fully resolve those questions and i think one nice point that lily makes um and i 
One nice point Lily makes — I think it might have been in joint work with Issa — is that it can be a dangerous distraction to set a heroic requirement: that I have to know every single variable, and estimate every single relation, before I can reach any conclusion about whether there's discrimination. That might not actually be necessary, and it's not in general what we do. There are situations where we can size up, at a bird's-eye view, that there's some fundamental inequity in society and conclude that we have some responsibility to do something about it, and that doesn't need to be contingent on having exactly estimated every causal functional on every pathway of every factor that plays any role on the path to some decision made about someone's life. That would set too high a bar. We're able to recognize cases of discrimination in plenty of situations where we can't pull off that kind of herculean numerical feat.

Let's maybe shift gears and talk a little about use cases or application areas that made notable progress in 2021. Anything come to mind?

Well, look, one obvious one — and as much as I might be counted on to be the contrarian, you've got to give some credit — is AlphaFold from DeepMind. I'm not a protein-folding expert, but I know people who work in the area who are not just gullible deep learning boosters, and as far as I can tell it's actually a pretty significant leap forward — work that could very well have won a significant science prize, that level of accomplishment. That's a little bit hearsay, in that I'm not an expert in protein folding, but as far as I understand it really is a legitimate, significant contribution, and in an area where deep learning wasn't yet entrenched as an essential tool. So that's certainly one use case.

I think you're also starting to see progress on the use cases that were maybe the obvious ones — radiology, for example. It was an obvious initial target, and harkening back to our earlier conversation about the difference between predictions and decisions: part of why people see radiology as a big target is that there are certain roles where the radiologist really is involved in decision making — interventional radiology — but there are also lots of people who literally are looking at images and making classifications; diagnostic imaging. And that's a case where we've known, since the moment the big image recognition results started hitting in 2012, 2013, that radiology was a potential target, and you had some maybe overly optimistic statements, like Geoff Hinton's: if you're in medical school now, do not specialize in radiology.
It hasn't quite gotten to that point — it hasn't taken the radiologist out of the loop — but I've been chatting with a lot of radiologists recently, and I've been surprised to find two things. On one side, some of the systems really are quite good, and you have systems deployed already, actually piping information into patient records. At the same time, some of the problems we discussed earlier about what can go wrong are happening on the ground: you have situations where systems that work well on one set of equipment are not performing well on some new scanner, even though it's sufficiently similar to all the other scanners that a human radiologist would have no problem. And these are not adversarial examples — nobody out there is designing a scanner to break all the previous deep learning models so radiologists can keep their jobs. So you have both a moment of the technology actually making landfall, and a moment of the rubber hitting the road, where people are seeing up close some of the ways the technology is brittle and dangerous.

And largely — this might be an unsexy story, because it's not "what's the big sexy application" — but if I take a bird's-eye view of the economy and just watch what AI is doing, I think the story of 2014, 2015 was fundamentally new use cases popping up: machine translation teams swapping out the old guts and sticking in deep learning systems; suddenly every mobile phone having the capacity to run a small deep learning model, because it's being used to recognize objects in the camera and do the face recognition that unlocks your phone. The bigger story of the last couple of years has been more on the side of deployment, diffusion, and the maturity of the operations around ML. I notice more and more companies whose pain point isn't that they need someone who can train a model; their pain point is that they need an MLOps person — someone who can keep the thing running day in and day out. That's a real, serious, specialized discipline, and a pure ML researcher like me doesn't have that skill set; I haven't spent my life keeping software working day in and day out. It's amazing what that discipline can do — you see companies with a software product that 400 million people use every day go seven years without a single hour of downtime, and it's absolutely bonkers how difficult that is. And machine learning throws in a weird set of additional complications, because there are all kinds of ways things can go wrong even if there aren't software bugs.
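As a concrete — and entirely illustrative, no specific tool or company practice is being described — example of the kind of statistical check that ends up on the MLOps side: compare what the model is seeing in production, say features extracted from a newly installed scanner, against a reference sample from training, with a simple two-sample test per feature.

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_report(reference, production, alpha=0.01):
    """Flag features whose production distribution differs from the reference.

    reference, production: arrays of shape (n_samples, n_features), e.g. summary
    statistics extracted from scans. Threshold choices are purely illustrative.
    """
    flagged = []
    for j in range(reference.shape[1]):
        stat, p_value = ks_2samp(reference[:, j], production[:, j])
        if p_value < alpha:
            flagged.append((j, round(stat, 3)))
    return flagged

# Hypothetical usage: features from the validated scanner vs. a newly installed one.
rng = np.random.default_rng(1)
ref = rng.normal(0.0, 1.0, size=(5000, 8))
new = rng.normal(0.0, 1.0, size=(5000, 8))
new[:, 3] += 0.5                 # the new scanner shifts one feature
print(drift_report(ref, new))    # -> flags feature 3
```

Checks like this catch the "world is the bug" failures that a conventional software test never will.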
So they need to understand enough about statistics to have some sense of what can go wrong in ways you have to model — failures that aren't software glitches; the world itself is the bug, even if everything is coded precisely — and they need to be able to interface back and forth between software developers, ML engineers, and researchers. So I think the maturity of MLOps, and just broadly the use of ML beyond a handful of firms, is a big part of the story. There was a moment when it was Google, Amazon, Facebook, Microsoft. I don't know if you ever read this, but I wrote a satire bit back when everyone was making a big deal about which professor left for which company and what their salary was — writing about it almost like football players getting traded. So I wrote this stupid post announcing that I'd been hired as the intergalactic head of machine learning at Johnson & Johnson, or something, for some astronomical sum. It was just a dumb joke, but the point is that about a year later I met someone, I forget where, who worked at Johnson & Johnson AI research. And I think that's part of what's going on now. I'm sure — and I'm making this up, I should research it, but this is what you come here for, to speak from academic authority and make up crap on your podcast — I'm sure there was a moment in time when only a small number of elite tech firms were using modern SQL databases, back when that was fresh; I think it was IBM where it was developed. There was a moment when that was a really hot, fundamentally new technology that changed business operations, and only a handful of super technical firms knew how to do it — and now the most boring technical firm in the world uses SQL. I think that's a huge part of what's happening in AI if you size up the commercial environment. There are exciting startups using the technology in new ways, and interesting things going on at the sexy tech companies, but there's no company you could go to where it isn't showing up — I'm sure if you went to a waste management company, they're using AI for something: forecasting demand, figuring out how to route their trucks. This general progression of AI from a luxury good to a commodity is an essential part of what's going on — the fact that every company now has this as a concern. And part and parcel of that is the tooling getting better and better. What are a whole lot of these companies offering? Things that turn the stuff everyone's already been doing for a while into something anyone can do — easy to track, easy to organize. This movement of AI from "what's the new model" to "what's a stable workflow we can adopt, so that a company that can't spend half a million dollars per engineer can still use this technology successfully and profitably" — I think that's a major part of the story of the commercial application of AI right now.
And it's kind of an unsexy story — "oh, this is just becoming ordinary" — but that's what happens to everything. Sorry if you're an AI researcher, but if you're in MLOps it's pretty cool.

Oh yeah, there's a lot of really cool stuff happening in that field, and there are a lot more of those jobs at every company in the world put together than there are at Apple, Microsoft, Amazon, Facebook, whatever.

So with that — before we run out of time, I'd love to have you dust off the crystal ball a little more and share some predictions for the upcoming year or years. We've talked a bit about where you'd swing the bat from a research perspective, but how do you think about 2022, against the backdrop of — I don't know if you'd call it a cooling, or a slowing, or a boring-ification, or whatever you'd want to call it? Are there innovations where you can see the silhouette emerging from the shadows and you think something's there?

Right — the funny thing about it is that it's not all cooling or all heating up. Whenever you sum up a complex phenomenon with a single number, you lose a lot of information. It's more like the fall of the Roman Empire: Rome is still partying, the borders are still expanding, but you also have cities being lost and whole provinces going off the map. I think that's what's happening. You have Uber AI shutting down research, hiring freezes at major companies, the big leaders not offering the kinds of salaries to well-known researchers in 2022 that they were offering in 2018 — and at the same time you have whole sectors where the shockwave hasn't even hit yet, like major health systems just now starting to adopt deep learning. So there's all of that going on.

If I had to predict what's going to happen, I'm going to double down on decision making. A few things came together to make AI so hot. One was the sudden existence of easily queryable, well-organized, curated data at every firm in the world: health companies adopting electronic health records, every company being basically an internet company, everyone having a digital trace of all their customer interactions. We can have a separate normative conversation about whether we want to live in that world, or whether we're irked by the surveillance state, but from an economic standpoint, that happened together with advances in both the tooling and the algorithms around statistical modeling.
So the question became: we have this data, we have statistical tools — how do we do analytics on the data? But there's another side, which is: how do we use the data to guide actions? One thing that's underutilized by most firms — I think only a small number of people are really sophisticated about it — is focusing on the decision problem itself. Part of that is offline causal inference, some of the stuff we were talking about: how can I use causal background knowledge, together with the data I have, to infer a causal effect and use it to guide a decision? But a huge part of it is experimentation, and that's something not enough companies do. Obviously Amazon has what they call Weblab, and Google runs randomized controlled trials for which shade of green the G in Google should be, or something — but most companies grossly underutilize really methodical experimentation. Online experiments in particular — and not necessarily "online" in the reinforcement learning sense of a policy that adapts as results come in, but even experimenting at all. Look at how we guide personalized decisions today: it's often "I take passively collected traces of people's data and do some latent factor analysis or whatever to build a recommender system," versus actually randomizing choices and estimating potentially heterogeneous treatment effects — how different people respond differently to different things — estimating the effects on people's behavior or whatever the outcome is. I don't mean to sound like I'm advising that we cavalierly experiment on people without thinking about which decisions and which experiments are of ethical import; obviously a lot of considerations have to go into doing that right. But the reckoning we keep seeing is people claiming this is going to personalize this, personalize that, lead you to make all these decisions in better ways — and then finding: I naively trained a predictive model, came up with some heuristic for operationalizing it as a decision, and something didn't go as planned. People getting more into this world — using offline causal inference on observational data, but also actually experimenting in the real world and developing more mature processes for testing hypotheses and measuring the impacts of the actions available to them — I think that's going to become more and more important, and you're going to start seeing the hiring focus, and where teams move, shift toward those kinds of problems. And again, I don't think you go overnight from hiring 90 percent deep learning PyTorch jockeys to 90 percent experts in bandit algorithms and causal inference, but there is a shift here, and I'm seeing it at every level: in what looks interesting to new students, and in what looks interesting among folks hitting the hiring market. This intersection of CS, operations research, and economics — bringing to bear the tools of predictive modeling we've built, but also more sophisticated processes for experimentation, for estimating causal effects, and principles for guiding intelligent decision making — there's a growing-up process happening there.
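A minimal sketch of the contrast being drawn — fitting on passively logged traces versus randomizing the action and estimating (possibly heterogeneous) treatment effects. Everything below is simulated; the covariate, the action, and the effect sizes are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000

# Hypothetical user covariate (e.g., prior engagement) and a randomized action.
x = rng.normal(0, 1, n)
treat = rng.binomial(1, 0.5, n)          # the experiment: coin-flip assignment

# Simulated outcome; the effect is heterogeneous (bigger for high-engagement users).
y = 0.2 * x + treat * (0.1 + 0.3 * (x > 0)) + rng.normal(0, 1, n)

# Because assignment was randomized, a difference in means estimates the ATE.
ate = y[treat == 1].mean() - y[treat == 0].mean()

# A crude look at heterogeneous effects: condition on the covariate.
hi = x > 0
cate_hi = y[(treat == 1) & hi].mean() - y[(treat == 0) & hi].mean()
cate_lo = y[(treat == 1) & ~hi].mean() - y[(treat == 0) & ~hi].mean()
print(f"ATE ~ {ate:.2f}; effect for engaged users ~ {cate_hi:.2f}, others ~ {cate_lo:.2f}")
```

The point of the coin flip is that the naive difference in means is then actually an estimate of the effect of the action — something a model fit to passively logged data never guarantees.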
The other thing I'd add — and this is not a specific prediction but a meta-prediction — is about the internet, web 2.0, web 3.0, whatever the hell we're doing now: very little of what's new in how companies behave and interact with people is technologically new. There's a lot of stuff you could have done from the late 1990s — the tooling wasn't there, which restricted how many people could build it — but it was something else. A capability arrived, a few early players figured out some piece of it, like how to conquer e-commerce, like Amazon — and it still took a long time before you got to Uber. Certain innovations were really a bunch of pieces fitting together: a certain understanding of markets, or of how people use their phones, combined with a technological capability that had been there all along. And I think there's a kind of innovation in deployment that doesn't correspond to a new model. When people have gotten excited about ML recently, it's been "BERT is good at classifying text," or seq2seq, LSTMs, then transformers are good at this one thing — single-purpose models.

I'll give an example — full conflict-of-interest disclosure, I'm an advisor for a company called Abridge AI. Abridge sits between doctors and patients, in that interaction. It turns out patients are already recording their visits on their cell phones — sometimes surreptitiously, sometimes with the doctor's consent; it may or may not count as wiretapping depending on whether your state requires two-party consent. Their idea was: let's make this a normal, inline part of the doctor-patient interaction — get permission, have both parties agree to record the conversation, pull out Abridge, and record it. And then there are all kinds of things you can do: you can help the doctor draft a summary of the visit; you can help the patient — don't forget, you mentioned you'd be starting this new medication, have you picked it up, have you called in that prescription, did you schedule that follow-up?
So there are a million different places to plug in models, and any one of them by itself may or may not be a major single-purpose innovation. But look at the way you can mix and match them: I've got the conversation, I send it to an ASR model, I get back the text, I need to flag the relevant or salient parts of the conversation, and then turn that into an interface feature that provides some value to the patient. A lot of things are like that. When Alexa, or Google Home, or anything works really well, it's usually not because there's one magnificent model; the magic is in the clever way they stitch together some astute observations about common interaction patterns with the right little places to patch in machine learning, and the right intelligent heuristics and rules around it, so that the end-to-end product feels like magic. Shazam is even a bit like that — there's a clever heuristic at the core — but once you start thinking about how to decompose the problem, you can make a pipeline where every single step is kind of simple and the end result feels a little bit magical. I think this is going to be a major part of what's next. We've been watching people who are really good at building single-purpose models parlay that, or try to parlay it, into big startups. There will still be some of that — the single-purpose models are mature and they'll keep getting a little better — but what's maybe underexplored are the ways you mix and match models, together with cool interaction patterns, a clever understanding of what people want, what data is available, and so on, to build user experiences that under the hood are invoking seven different models in seven different contexts, hidden from the user in a clever way, so that it adds up to a new capability that no one model, or piece of software by itself, would provide. There's an element of: we've built a bunch of cool LEGOs, and we haven't given people that many years to play with them yet. Some innovation comes from designing a new LEGO piece, but I think a lot of innovation will come from people who don't have off-the-charts skills at building the bricks themselves but have a real design sense for cool ways to put them together.
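A skeletal sketch of that mix-and-match pattern, in the spirit of the doctor-visit example — to be clear, nothing here is Abridge's actual stack; the ASR stub, the keyword rules, and every function name are placeholders. The point is that each stage is individually unremarkable and the value is in the composition.

```python
from dataclasses import dataclass, field

@dataclass
class VisitSummary:
    transcript: str
    medications: list = field(default_factory=list)
    followups: list = field(default_factory=list)

def transcribe(audio_path: str) -> str:
    """Placeholder for an off-the-shelf ASR model."""
    raise NotImplementedError

def extract_salient(transcript: str) -> VisitSummary:
    """Each step can be simple: a tagger here, keyword rules there."""
    summary = VisitSummary(transcript=transcript)
    for line in transcript.splitlines():
        lowered = line.lower()
        if "prescrib" in lowered or "medication" in lowered:
            summary.medications.append(line.strip())
        if "follow-up" in lowered or "schedule" in lowered:
            summary.followups.append(line.strip())
    return summary

def patient_reminders(summary: VisitSummary) -> list:
    """Turn extractions into the interface feature the patient actually sees."""
    return [f"Reminder: {item}" for item in summary.medications + summary.followups]

# Demo with a stand-in transcript instead of real ASR output.
demo = "Doctor: I'll prescribe lisinopril today.\nDoctor: Please schedule a follow-up in two weeks."
print(patient_reminders(extract_salient(demo)))
```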
Yeah, I think that's a natural consequence of the broader maturity conversation we've been having, right? The LEGO pieces — not that we've come up with every LEGO piece that's ever going to be created, or that there aren't some cool ones still to come — but all the basic pieces required to build really cool stuff are in place, and now it's all about how you put them together. And even more so than the pieces themselves, the tools to easily put them in place: you've got your Hugging Face, you've got your MLOps tools. It's a great time to be a builder, right?

Yeah — and it also takes some of that work away and lets you focus. I think music is like that a little bit. When you're learning an instrument, you're practicing articulation, rudiments, scales, doing the same things over and over and over when you're 10, 11, 12, 13, 14 years old. You get to some point where maybe you still practice that one hour a day, but when you go to play, you're not even thinking at that level at all. And there's an element of that here: people using machine learning have been thinking "how do I get the data and train a single model," and once you're in a world where, in a lot of contexts, you maybe don't even need to train a model — there's an off-the-shelf model sufficiently good at the task, one that will work better than anything you could train even applied to somewhat domain-shifted data — then you get to the point where the difference between a great artist and a boring artist isn't that the great artist is better at scales. It's not that Miles Davis played cleaner scales than, I don't know, some competent session player, or was better at staying on key. So I think there's a lot of innovation to be had on that side.

Awesome. Well, Zach, it has been wonderful catching up — let's make sure it's not two years until next time.

Yeah, right — who knows what pandemic will be in full swing by then.

Awesome. Well, thanks so much for helping us reflect on 2021 in the ML and DL domains, and catch you next time.

Yeah, thanks for having me, Sam — great to see you.

All right everyone, welcome to another episode of AI Rewind 2021. Today we are joined by Zachary Lipton, an assistant professor in the Machine Learning Department and Operations Research group at Carnegie Mellon University, to talk through all things machine learning and deep learning. Zach last joined us on the show for the 2019 edition of Rewind, and I'm super excited to have him back once again. Zach, welcome back to the TWIML AI Podcast.

Cool — thanks for having me, Sam. Great to see you again.

It is great to see you again. I think the last time we physically had the opportunity to hang out was also 2019, in Vancouver — and that's probably a story shared by a lot of folks in our field, that that was the last opportunity to hang out in person. How have the last couple of years been for you?

Oh man, it's been eventful. I'm not going to pretend it's all been smooth, but some things are nice — my students are great. It hasn't been easy for everyone: some people got sick, some people lost someone, some people didn't get to see their family for a couple of years. On the other hand, people managed to be startlingly productive — or at least, I don't want to shame anyone who doesn't feel productive, but from where I sit, and maybe it's just CMU culture or something,
people have been really locked in on research. But there's a kind of emotional wear and tear from just not seeing anyone — especially folks living by themselves who, when they were quarantined, didn't see another human for six months — and for the rest of us, we're just catching up now. So it's been a little bit wild, but interesting on the research side, and we got a puppy, so interesting personally.

Nice, nice. So, as I mentioned in the lead-up, we're here to review the year in ML and deep learning. This is the third Rewind we'll publish; the first couple were NLP and computer vision — or computer vision and NLP, in the order they were published — and so far a couple of key themes have emerged. One, which was common to those first couple of episodes, is the idea that, as John Bohannon put it, NLP is eating machine learning, kind of the same way we'd say AI is eating software — the idea that computer vision is adopting transformers and things like that. And you've echoed another observation John made in that NLP conversation, which is a slowing down of the field, a bit of a respite from the breakneck pace of change we were experiencing for a while. So maybe that's a place for you to jump in and riff for a bit.

Happy to riff. Maybe I'm playing the contrarian a bit, but I'll start by pushing back on one thing, because that phrase — NLP eating ML — is kind of cute, but in some sense the line for the longest time, for the last seven years, has been machine learning eating NLP. If you look at the set of people going into an NLP-oriented grad program: there was a point where NLP sat really close to computational linguistics, like two sides of a coin, and they sat not so far from the philosophers of language, or whatever. Now you have a moment, for the last however many years, where the median person in NLP knows almost nothing about language — has nothing interesting to say about language. That's not to say nobody does, or that there isn't anyone making interesting observations or running interesting experiments that hit on both sides, but the center of gravity of the field has moved to the point where there's almost no L in NLP. It's just a set of tools where, if the commercial demand were more on music than on NLP, you would conceivably use the same set of models, because all they care about is a sequence of tokens — a very generic approach. So in some sense the story for a while has just been deep learning eating NLP, and this version of "NLP eating ML" — well, one, they don't really mean ML, they really mean other application areas.

Yeah — and they don't really mean NLP so much as the thing that ate NLP, which is transformers.
Right — whatever that new organism is that displaced NLP is now crossing over. There was a discipline of computer vision where the typical person knew something about the physics of light and optics — a real expert on the modality of vision — and the typical person in NLP knew something about language, and both got eaten by deep learning in such a way that, over the last seven years, ideas that hit on one side could very easily port across to the other. For most of that history — and it's hard to say precisely why, whether there's some deep reason or it's just the order in which things happened, since the breakthrough ImageNet results that caught people's attention were in vision first — the flow was very one-directional, mostly from computer vision to NLP. So this isn't really NLP eating vision; it's just notable that one of the bigger things happening in vision is crossing in the other direction, contrary to that pattern.

But broadly, on the slowing-down point: I've been noticing this, and writing about it, for maybe four years now, and there's definitely a moment. Look through the history: 2012, ImageNet; 2014, sequence-to-sequence models; 2015, AlphaGo; 2016-17, big advances in the perceptual quality of generative models; 2017, transformers; 2018, BERT. These aren't necessarily all profound in the sense of a big intellectual move, but they are qualitative changes in capabilities: big moves in what set of problems you think are best tackled with these tools, what performance you can expect from them, and — if you're a practitioner and somebody hits you with a typical industry problem — what your go-to tool is. And if we look at 2021: someone hits you with a classification task, you're going to use a ResNet from 2014 or 2015; someone hits you with an NLP task, it's basically fine-tuning BERT or a BERT-like variant — RoBERTa, ALBERT, whatever.
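For concreteness, that "go-to tool" really is a few lines at this point. A minimal sketch, assuming the Hugging Face transformers and datasets libraries; the checkpoint, dataset, and training settings are arbitrary illustrations rather than recommendations.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "distilbert-base-uncased"   # any BERT-family checkpoint; the choice is arbitrary
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

dataset = load_dataset("imdb")           # stand-in for "someone hands you a text classification task"
encoded = dataset.map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=encoded["train"].shuffle(seed=0).select(range(2_000)),  # small slice for illustration
    eval_dataset=encoded["test"].select(range(2_000)),
    tokenizer=tokenizer,                 # so batches get padded dynamically
)
trainer.train()
```

That this is boilerplate is precisely the maturity point: the interesting research questions have moved elsewhere.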
Right — and there's a moment here where, in some sense, I think that's okay, in that researchers now need to start looking somewhere other than "what if I tweak the architecture a little bit like this." As I was telling you when we were riffing before the show, there's an aspect of researchers groping around in the dark, looking for a way in. It's almost like swinging at a piñata with a blindfold on, trying to find where there's an angle, where there's something big — and you want a lot of researchers in that mindset: I'm looking for the blind spot, for the big prize other people aren't looking at. Every now and then someone really connects, the piñata rips open, and a bunch of candy falls on the floor; then everybody rushes in, and for some period of time nobody even cares where the bat is — everybody's just picking candy up off the floor. We had that period, where you didn't need a big intellectual breakthrough to have an impactful one. And I think we're getting to the point where there's some amount of stagnation, because most of the good candy has been picked up, and people are eyeing the old grimy, moldy stuff, thinking maybe there's still something in there. It's the sloppy seconds of the research piñata.

With that analogy in mind, where should research be swinging the bat? It's a crystal-ball kind of question, but what does your intuition tell you about where the opportunities might be?

Well, I always look at what it is we actually care about when people sell an aspirational story. I'm not a mathematician-first person — "what's a hard problem, let's solve it for its own sake"; I got into this too late for that — I back into it from: what's the dream? And if you look at the dream people are selling, even people with existing companies right now — look at IBM Watson, which went up in flames, and the claims they were making: what are we going to do for you? You're going to make better decisions, you're going to provide personalized healthcare, you're going to help people have better health outcomes than they otherwise would have without our AI. Look at what people are selling and hoping to achieve, then look at what people are actually doing, and go back and forth between the two asking: what's the missing part? If you actually wanted to realize the dream people seem to want, what's not even being addressed in a mature way? One thing that jumps out is that everybody's premise — every claim that these things are good — rests on some notion of accuracy, or the return of an RL system, as evaluated on some fixed, static environment. And then you look at what people actually do: take a model trained in some context, on some set of data, and deploy that crap in some different environment, which is changing in unpredictable ways — and not changing in a benign or passive way, either, but often changing in direct response to the system itself. Think about Google search: every single time they tweak the algorithm, what's the first thing that happens? All the message boards light up, and the SEO goons go, oh, they changed the algorithm — now you need to add this keyword, now you need to do this. And ML doesn't address that. I don't mean nothing we aspire to in ML addresses it; I mean the main thing practitioners do, the mature toolkit —
I know how to use PyTorch and train ResNet and ResNeXt and whatever — that whole world is set in the regime of: I train a model and evaluate it on an IID holdout set. Even if you evaluate on some kind of challenge set, there's no coherent principle for why you should expect the model to do well on that challenge set, or why performance on it is representative of what you'll encounter in the world. So: performing in a dynamic world, and making decisions, not just predictions. Because ultimately, if you think you're going to make money off this, or affect some societally beneficial outcome by using AI — if you think you're going to do anything — then the claim, at some point, is that you're hoping to guide or automate some kind of decision. You're hoping to have an outcome, not just to be a passive observer of the world making accurate predictions about what would happen were you to take no action at all. And this setting, of actually providing guidance for what you should do in the world — yes, there are people working on causal inference; there are people trying to bring reinforcement learning closer to the real world, maybe incorporating ideas from causal inference to account for confounding in the data, to build models in an off-policy way, so you're not just deploying a randomly acting system in an important application and having it suck for two million years until it learns. But those areas are relatively immature, and they get relatively little attention. And not to beat up on our buddies in the press, but what is Cade going to write a big article about in the New York Times? It's typically not the slog of scientific advances in making machine learning robust, or in off-policy RL; it's that there's a big neural network with nine trillion parameters and a billion-dollar investment from Microsoft. So: decision making, robustness in dynamic environments, and actually addressing societal desiderata. People have noticed the problems that arise — the ways an AI system can produce unethical outcomes if you plug it naively into certain decision systems — but the field of actually developing systems that align, in some coherent way, with what society wants is quite primitive. There's a recognition that there's a problem, but we're at a very early stage on solutions. To me, those are the more interesting areas. Look, if you can squeeze half a point out on all the NLP benchmarks with a slight variation on BERT, you'll get a lot of citations, everyone will use it, and they should — it is useful — but it's a slight change in degree, not a change in kind.
And so when I look at the field, the luxury of being in academia — the reason to be in academia — is that I don't have to think only about how to do epsilon better than what everyone else will be doing on the same crap tomorrow, but about what actually addresses some problem nobody's even engaging with intellectually right now.

It sounds like your answer to where to swing the bat is getting closer to the real-world problems folks are having. You mentioned a lot of different elements — I heard aspects of domain generalization in there, aspects of even user interface, how you present the information, aspects of fairness — but broadly it sounds like you're also calling into question the simplification that often happens in research, which removes problems from all the constraints and fuzziness of the real world.

Yeah, and it's tricky, because everybody thinks they're doing that. Really, everybody has to choose to focus on something, and to focus on the thing they want, they have to compromise on something else. It's not that one choice is right or wrong — I don't think it's fundamentally wrong that there are people out there building bigger language models. You've got to use your brain about what kinds of claims you can make about these things and how they should be used in the real world, but it's interesting to turn that knob and see what happens when you make it bigger; there's plenty of work to be done there. One trade-off you often face: if I want to get close to what real data looks like, and close to things I can actually do across a bunch of domains, there's a way in which predictive modeling — the status quo — is closer to the real world, in that it touches real data. Within its narrow aspiration — just predict well on IID data, under a naive assumption that the world doesn't change — it can do that on really complex, high-dimensional, real-world data. What you give up when you focus only on that problem is any consideration beyond better predictions: you're close to real data, but you're asking a very narrow set of questions about it, namely how to get higher accuracy. On the other side, you have folks trying to get at fundamental questions about what sorts of causal queries you can make. They're taking on a more ambitious set of questions, but to make progress on the fundamental form of those questions — often starting with whether a question is even answerable from the data sets you have — they have to make simplifying assumptions about the form of the data: assume the whole world is linear, that it's not high-dimensional, that it's not whatever.
So you have plenty of people doing work that's really ambitious and expansive on the front of what kinds of questions you can ask, but making strong simplifying assumptions about the source of the data and the number of variables, and not worrying about that — that's the compromise they make. And you have other folks building predictive models, trying to get close to something that works on real data, but being naive about the kinds of questions you can ask, and not worrying too much about how limited those predictions are, or about their power to guide decisions in the real world. Once you have that kind of tension — everybody's looking at something and not looking at something else — the question for a research community is: are you over-leveraged somewhere? There's a naive form of criticism, which is "this thing sucks and this thing is good," and a more mature version, which is "we're way over-leveraged here; we're paying way too much attention to this thing and neglecting these others." It's a matter of moving the needle: it's not that nobody should build a bigger language model or tinker with architectures, but we're at a point where we're not getting nearly as much juice per squeeze, so why is 99 percent of the community engaged in that? Why are so many papers being submitted, most of which aren't actually contributing anything, either as an idea or as a result? So yeah, I think we're sitting in some funny terrain like that.

So were there notable papers or research advances that you think poked at some of these issues, or are swinging the bat in the right direction?

Yeah — though there's a very weird climate now, where for a lot of these questions there's a growing recognition that they're problems, but there's also a subset of people taking advantage of the way the peer-review system is overtaxed and scattered: using the language of those problems without actually addressing them. You see it in the fairness literature, in the robustness literature, in the causality literature — people submitting papers that sound like they're addressing causal problems but aren't; people just saying "this model is robust." By the way, you can never just say a model is robust if you state nothing about the ways the environment is allowed to change. There's no such thing as general robustness, because I can posit two different assumptions about the world where, under one, when the environment changes, this is what I should be doing, and under the other, that is what I should be doing, and I have no way of discerning which world I'm in. The classic version: do I live under the label shift assumption, where the distribution of categories is changing — the prevalence of COVID versus not COVID — but the class-conditional distribution, what COVID versus not-COVID looks like, is not changing? Or do I assume the covariate distribution is changing, but the probability of the label given the inputs stays fixed? I might have no way of discerning from the data whether I'm in world A or world B, but the robust model under one assumption should do one thing, and the robust model under the other should do something else.
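To pin the two worlds down, here's a self-contained toy (all numbers invented): the same kind of unlabeled test batch can be generated by a label-shift story or a covariate-shift story, and the appropriate correction is different in each.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

def sample_x_given_y(y):
    """p(x | y): two Gaussian classes. Held fixed under label shift."""
    return rng.normal(loc=2.0 * y, scale=1.0)

# Training world: balanced classes.
y_train = rng.binomial(1, 0.5, n)
x_train = sample_x_given_y(y_train)

# World A, label shift: prevalence jumps (say, a COVID wave), p(x | y) unchanged.
y_a = rng.binomial(1, 0.9, n)
x_a = sample_x_given_y(y_a)

# World B, covariate shift: p(x) moves, but p(y | x) is exactly the training one.
x_b = x_train + 1.5
p1_b = 1.0 / (1.0 + np.exp(-(2.0 * x_b - 2.0)))   # training p(y=1 | x) for this toy model
y_b = rng.binomial(1, p1_b)

# Unlabeled, x_a and x_b can look alike, but the right fix differs: under A you
# reweight classes by the new prevalence; under B you reweight inputs by
# p_new(x) / p_train(x).  Calling a model "robust" means declaring which world you assume.
```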
So you have a set of people just doing the deep learning thing, which prediction lets you get away with — throw spaghetti at the wall, because you evaluate on held-out data, and how well you're doing is identified, so you never have to state a principle. With a causal effect, you don't get to observe the causal effect: if what you're doing doesn't actually identify it, you can still use the language of causality in a deep learning flavor, fool a reviewer, and not actually be addressing causality in any sound way. So I tend to think a lot of these other problems are more foundational — they're not problems where we already know how to evaluate systems and can just let people try stuff.

But I'll give some examples of things people are doing that I think are more interesting, or sound, or actually give a path forward. In the distribution shift world, there's a group at Stanford — some of Percy Liang's students, Shiori Sagawa and Pang Wei Koh among them — who put together a really expansive benchmark called WILDS. It's a collection, across a whole lot of different application domains, of settings with some kind of subpopulation shift or other distribution shift, and it provides a unified resource for a whole bunch of those settings. You still can't just use the data set and say "I tried this thing, it generalized from one domain to these two others, therefore it's robust," but compared to the world where people were basically saying "I have MNIST images, and then MNIST images on funky backgrounds," it's a big advance toward a meaningful sanity check, and it puts people in touch with the sorts of problems arising in the real world. There's also a formulation of these domain adaptation settings called domain generalization: I have a bunch of different environments I've collected data from, and I want to generalize well to target environments, possibly using the fact that the source environments are marked out as distinct — I can try to learn what's stable versus unstable across them. There have been some interesting papers there. And by the way, before we met, I asked some of my students what they thought the interesting work was, so I want to give credit to my students, who are now the extension of my memory.
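For reference, the WILDS benchmark mentioned above ships as a pip-installable package; here's a minimal sketch of pulling one of its shifted datasets, assuming the wilds package and its documented entry points (the dataset choice and transform are arbitrary).

```python
# pip install wilds
import torchvision.transforms as T
from wilds import get_dataset
from wilds.common.data_loaders import get_train_loader

dataset = get_dataset(dataset="camelyon17", download=True)   # tumor detection; domains are hospitals
train_data = dataset.get_subset(
    "train", transform=T.Compose([T.Resize((96, 96)), T.ToTensor()])
)
loader = get_train_loader("standard", train_data, batch_size=32)

for x, y, metadata in loader:   # metadata carries the domain (hospital) id for each example
    break                       # ...train as usual; held-out evaluation comes from unseen hospitals
```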
My student Sarah pointed out a line of work showing that a whole lot of proposed methods here don't survive a really rigorous baseline: there are papers — some from CMU, from Elan Rosenfeld and his advisor Andrej Risteski, and some from David Lopez-Paz at FAIR — showing that for a lot of these setups it's really, really hard to beat embarrassingly simple baselines, like just dumping all the data together and running ERM on it, training on everything without using the environment labels in any sophisticated way. In our own lab, my student Saurabh has been making a lot of progress on these distribution shift problems, and we have results we've been excited about — among other things, working out what to do when you've trained on data from some set of classes and then, at test time, data shows up from an additional class you never saw before. Can you, on the fly, look at the previously seen classes plus this new unlabeled data and say precisely what fraction of the new data comes from a previously unseen class — and even develop a classifier that starts predicting it, assigning each sample a probability of belonging to that new class? You'd imagine that in a model-monitoring pipeline, you'd eventually like to live in a world where, if the world changes in some way, the model can come back to you and say: hey, with high probability, at least 20 percent of your new data belongs to some class you've never seen before, and here are some examples I think belong to it — and then you could take some corrective action if you think the model's wrong.
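A toy version of where such an alert would sit — emphatically not the actual estimator from that line of work, just a crude confidence-threshold stand-in with made-up thresholds and function names, to show the shape of the monitoring hook.

```python
import numpy as np

def novelty_alert(probs, conf_threshold=0.5, alarm_fraction=0.2):
    """probs: softmax outputs of the deployed classifier on a fresh production batch.

    Flags examples the model finds unfamiliar (low max confidence) and raises an alert
    if their share exceeds alarm_fraction.  A real estimator would also correct for the
    confidence the model loses on ordinary hard examples from the known classes.
    """
    max_conf = probs.max(axis=1)
    candidates = np.where(max_conf < conf_threshold)[0]
    fraction = len(candidates) / len(probs)
    return fraction > alarm_fraction, fraction, candidates

# Hypothetical wiring inside a monitoring job (names are placeholders):
# alert, frac, idx = novelty_alert(model_probs_on_todays_traffic)
# if alert:
#     notify_oncall(f"~{frac:.0%} of today's data may come from an unseen class", examples=idx)
```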
So that's the robustness side. On the causality side — and I got some of these tips from one of my students, from work we've been going over together — causality research is really exciting because it gets at the question we actually care about: what would happen if I did this, versus what would have happened if I were just a passive observer watching the decisions get made as they always are. And causal inference offers a philosophically coherent way of answering those kinds of questions. The danger is that the answers are almost always predicated on pretty strong assumptions: that I can learn the parameters of my causal model, but the structure of the model is given a priori and I know it exactly, and that there's no unobserved confounding that could make all my results invalid. So there's a lot of interesting work happening here. Carlos Cinelli, who just started a faculty job in the statistics department at UW, has been doing a lot of interesting work on sensitivity analysis: if there's measurement error, or some omitted-variable bias, frameworks for asking just how much of it there would have to be to change my causal conclusion. So instead of saying "if I nail all these ridiculously precise assumptions about how the world is, then this is the answer to your causal query," it's saying: this is how far off those assumptions would have to be for me to have to totally change my mind. Eric Tchetgen Tchetgen is a statistician at Wharton who does a lot of exciting work in this area, and one of my students pointed me to a paper of his addressing a specific problem: people often assume there's no unobserved confounding, and that's such a strong assumption — even if you have the right confounder, if you measure it in a slightly noisy way, there is unobserved confounding. He's developed a formulation called proximal causal learning: allow that I have some proxies for the underlying confounders, but they're not perfect proxies — then what can I do?

And finally, one thing on the causal inference side that I've been really excited about: a whole lot of machine learning takes the stance that I've got this set of variables and a collection of examples — my data looks something like a table. It can get complicated, with text the documents have different lengths, but that's the typical formulation, and it doesn't usually allow for the setting where I have a collection of different data sets and I observe this thing in this data set and that thing in that one. A lot of real-world decision making is governed by exactly that kind of process. We've all gotten a crash course in this from watching the COVID response unfold in the public eye: I've got this data from the CDC, and it tells me how many reported cases but not how many tests were run; I have this other data from the manufacturers of the diagnostic equipment, and that tells me what fraction of tests are positive, not just how many; I have this other data from the local municipalities. So you get questions — this comes up especially in economics, where they call them data fusion problems — where no single data set can answer my query directly, but I have a whole bunch of data sets, and if I combine them intelligently I can triangulate the answer to the question I have.

We talked about that on the infrastructure side as well. Is there terminology evolving on the machine learning side for thinking about problems like this? For some reason it calls to mind graphical kinds of things — you'd imagine some kind of connectedness in the data and the way the data sets are represented relative to one another.

Well, as I said, in the econ world they call these data fusion problems, and someone who's done really interesting work on this from the AI/ML side is a researcher named Elias Bareinboim. He's a professor at Columbia, and he was Judea Pearl's grad student.
Eric Tchetgen Tchetgen is a researcher at Wharton, a statistician who does a lot of exciting work in this area, and someone pointed me to a line of work of his that addresses a specific problem: people often make the assumption that there's no unobserved confounding, and that's such a strong assumption, because even if you have the right confounder, if you measure it in a slightly noisy way, you effectively have unobserved confounding. He's gotten to this formulation called proximal causal learning: you can allow that, okay, I have some proxies for the underlying confounders, but they're not perfect proxies — and ask what you can do in that situation.

And finally, one thing on the causal inference side I've been really excited about: a whole lot of machine learning takes the stance that I've got a set of variables and a collection of examples — my data looks something like a table. It can get more complicated, because if it's text, different documents can have different lengths, but the typical formulations people work with don't usually allow for the setting where I have a collection of different datasets, and I observe this thing in this dataset and that thing in that dataset. I feel like a lot of real-world decision making is actually governed by that kind of process. We've all gotten a bit of a crash course in this from watching the COVID response unfold in public: I've got this data from the CDC, and it tells me how many reported cases there are but not how many tests were run; I have other data from the manufacturers of the diagnostic equipment, and that actually tells me what fraction of tests are positive, not just how many; and I have other data from the local municipalities. So you get questions — in economics this comes up a lot, they call them data fusion problems — where no one dataset can answer my query directly, but I have a whole bunch of different datasets, and if I combine them intelligently I can triangulate to the answer to the question I have.

We talked about that on the infrastructure side as well. Is there terminology evolving on the machine learning side for thinking about problems like this? For some reason it also calls to mind graphical kinds of things — you'd imagine some kind of connectedness in the data and the way the datasets are represented relative to one another.

Well, in the econ world they call these data fusion problems, and someone who's done some really interesting work on that from the AI/ML side is a researcher named Elias Bareinboim. He's a professor at Columbia; he was Judea Pearl's grad student, and now he's doing a lot of, I think, the super exciting work in this area. He's gotten to these sorts of questions — I got this paper that was technically from 2020, I think, but I read it in '21, so we can call it 2021 — on an algorithm for what he calls the general identifiability problem. So it's not just, I've got this one dataset, can I answer my causal query? It's: I've got a collection of datasets, and in this dataset these variables are observed, and in that one other variables are observed, and maybe this dataset was collected by someone running a particular kind of experiment on one of the variables. I might have different datasets from different experiments — they're not even necessarily just different views of the same data; in one of them, someone was intervening in some way. But if you have this collection of datasets and some underlying causal structure, can you tell me precisely how to combine them all to answer the question you have? Or, I guess, with causal questions the first step is always: is it even possible to identify the answer to your question from the data that's available? And if so, give me the formula such that if I plug in the data from these different datasets, it gives me that estimate. Is it estimable, and if so, how do I produce the estimate?

So I think these are generally exciting areas. There's also a lot of work happening now in causal discovery, which is a really ambitious problem. In causal inference you basically say: I know the structure of the causal graph, I know which variables potentially cause which other variables, I just don't know the functions that determine them. So, X and Y together influence Z; I don't know the function by which the values of X and Y determine Z, but I know that Z listens to X and Y — if I intervened on X that could change the value of Z, whereas if I intervened on Z it wouldn't change the value of X. Given that structure, causal inference asks how to figure out those functions so you can answer a causal query. That by itself is super hard, and people never agree on causal effects, because if you assume the graph looks like this and it's actually slightly different, all bets are off. Causal discovery says: what if I don't even know the graph a priori, or I have only partial knowledge of it, and I don't fully know which arrows go from which variables to which other variables? In that case you ask when it's even possible to recover the graph — and in general you can only recover it up to something called an equivalence class.
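A tiny illustration of that equivalence-class point, using an invented linear-Gaussian example: with purely observational data, X causing Y and Y causing X can look the same, but an intervention breaks the symmetry.

```python
# Observational data alone can't orient the edge between two variables in this
# toy model, but intervening on X (versus Y) distinguishes the two directions.
import numpy as np

rng = np.random.default_rng(2)
n = 20000

# True model: X causes Y.
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

corr_obs = np.corrcoef(x, y)[0, 1]
print(f"observational correlation: {corr_obs:.2f}")  # symmetric; says nothing about direction

# do(X = 3): under the true model, Y shifts in response.
x_do = np.full(n, 3.0)
y_after_do_x = 2.0 * x_do + rng.normal(size=n)
print(f"mean Y under do(X=3): {y_after_do_x.mean():.2f}")  # ~6

# do(Y = 3): X's mechanism is unchanged, so X is generated exactly as before.
x_after_do_y = rng.normal(size=n)
print(f"mean X under do(Y=3): {x_after_do_y.mean():.2f}")  # ~0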
But now there's a whole bunch of other papers — I don't have all the links handy, I can send them to you offline — that start asking: okay, if I can use causal discovery to get the graph up to equivalence, what set of experiments should I run, and in what order, to resolve any lingering ambiguities as efficiently as possible? Just the observational data might at least tell me that certain variables aren't connected to other variables, and I can orient some of the edges in the graph — which direction they point — but others I can't; the hope of causal discovery is to be able to go further than that. So that's one exciting thing, and one of my students has been working a lot in that area. We have a paper that gets at this question of — kind of like I was telling you before — if you have different datasets, and by combining them you can answer a question you couldn't answer with any one of them alone, or even if you can combine them to answer it, there's still an unresolved question of how much data to collect from this source versus that source in order to pin down the causal effect as efficiently as possible. We've been working on that problem. Imagine you're at a company and you have a third-party data provider that charges you so much per thousand examples: how would I decide, sequentially, which data source to query next and for how many samples, then update my beliefs and make the next decision? That decision process is always going on in the background — if you're a company that's buying data from people, or actively going out and doing some kind of monitoring or data collection effort, you're making decisions on the fly about where to collect data next, and what you've just learned guides what you collect next. But we don't usually formally model that process; we usually assume the data is already there and focus on how to estimate something given that it's there.
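A toy stand-in for that sequential data-acquisition question — several sources with different per-sample costs and noise levels, and a greedy variance-reduction-per-cost rule for where to spend the next budget increment. This is just an invented heuristic for illustration, not the method from the paper discussed here.

```python
# Greedy allocation across data sources: after a small pilot from each source,
# repeatedly buy the next batch from whichever source gives the largest drop in
# the variance of its mean estimate per dollar spent.
import numpy as np

rng = np.random.default_rng(3)
true_mean = 1.0  # stand-in for the quantity we want to pin down
sources = [
    {"noise": 0.5, "cost": 1.0, "samples": []},  # accurate but expensive
    {"noise": 2.0, "cost": 0.2, "samples": []},  # noisy but very cheap
    {"noise": 1.0, "cost": 0.5, "samples": []},
]

def draw(src, k):
    src["samples"].extend(rng.normal(true_mean, src["noise"], size=k).tolist())

for src in sources:
    draw(src, 10)  # small pilot from every source

budget = 50.0
while budget > 0:
    def score(src, batch=10):
        n = len(src["samples"])
        s2 = np.var(src["samples"], ddof=1)
        reduction = s2 / n - s2 / (n + batch)  # drop in variance of this source's mean
        return reduction / (batch * src["cost"])
    best = max(sources, key=score)
    draw(best, 10)
    budget -= 10 * best["cost"]

for i, src in enumerate(sources):
    print(f"source {i}: {len(src['samples'])} samples, mean estimate {np.mean(src['samples']):.2f}")
```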
So, yeah, those are some directions I'm excited about. On the fairness side, I think one way things are maturing is in how people pose the questions. Maybe you're familiar with the philosopher Charles Mills, who passed away recently — he was this great moral and political philosopher, and he wrote about ideal versus non-ideal approaches to theorizing about questions of justice. I had a postdoc, Sina Fazelpour, who is now a professor at Northeastern, and a couple of years ago we wrote a paper just making the connection between what's going on in ML and this framing of ideal versus non-ideal theorizing about justice that comes from, among others, Charles Mills. His point is that when you pose a question about equity or justice as a technical problem and you make up a toy model, there's a danger that you highlight as salient and relevant the parts of the problem captured by your toy model, and you relegate everything that doesn't show up in the model as not even worthy of academic consideration. And if the things you're completely forgetting about are actually everything that really matters, you wind up in a situation where you can do a lot of academic tinkering, and even develop elegant mathematical theories, but they have almost nothing to say about the underlying question of justice you care about.

I think that's, to some degree, the situation we've been in — not to implicate everyone, but the main pattern is that we've been posing questions of equity in the form of: say I have a dataset, say I have a particular protected feature, let me start enumerating different things that should be equal; then, oh, it's not possible to make them all equal simultaneously, so let's either naively pick one and flesh out an algorithm for it, or just cry about how fairness is impossible. What gets lost in almost all of that discussion is that it takes for granted: I've got a dataset, there's a bunch of anonymous features, and I don't say anything about what they actually mean, what real-world processes they correspond to, how disparities arise, or how those considerations bear on what the appropriate response is from the standpoint of actually effecting justice. We don't look at, say, Major League Baseball and say, well, I notice there are more players from some countries than from others, so let me just equalize it with a quota system — and that's because you don't believe that a country that excels at baseball, say Puerto Rico, has been given an unfair advantage in getting into the major leagues. This whole backstory — what do the variables actually mean, and to the extent that there are disparities reflected in the data, where do they come from, and how do they correspond to some coherent political stance or theory that draws a straight line from that to who has a responsibility to remediate it — these are fundamentally the concerns we always have when we speak, in the law or in a broader sense, about questions of justice. And for some reason — maybe just because it's a new field — they've been largely sidelined (completely is too strong a word) by the main branch of fairness research.

There are a number of people doing interesting work here, trying to actually ask the critical questions. Lily Hu and Issa Kohler-Hausmann are two people who I think have been asking the right kinds of questions for a while, and framing that critique in a way that's rare: they really understand what's happening in the fair-ML world, and they really understand the context — they actually understand ethics and legal principles of justice — so they're able to speak with some authority about what's missing in the way we're posing those questions and tackling them.
Before you jump into their work — the things that come to mind for me are this idea of techno-solutionism being part of the problem, that we're trying to throw technology at the problems technology is creating for us.

Yeah, we talked about that a couple of years ago.

We did. And also there's kind of a nod, in the way you talked about the problem of fairness, to causality. When we all got really excited about causality a couple of years ago — I think it was at that same NeurIPS that everyone left excited about causality — it was supposed to be the savior of fairness, and applying causal modeling to machine learning more broadly was going to give us transparency, fairness, interpretability, break open all the black boxes, all of that. Where did that land?

I couldn't recommend Issa and Lily's work here more highly. There's a handful of works — some by Ilya Shpitser and colleagues, and before that some earlier work by Matt Kusner and co-authors — that pose different notions of fairness within a causal framing, coming out of Pearl-style causal modeling. The earliest versions say something like: it's not just a question of whether race or gender, or whatever is considered the protected attribute, turns out to be correlated with some outcome; we want to ask whether it causes the outcome. And there's a way these questions have been posed — the causal framing isn't unique to machine learning; legal scholarship itself often expresses things in causal terms. And before that, economists: there are the famous resume audit experiments, which I think Sendhil Mullainathan and others ran, where they randomized the names on resumes to be more likely read as Black American or white American, sent the resumes out, and measured the response rates. It's an interesting experiment and certainly valuable research, and the fact that in certain contexts there's a difference in response rates does jump out as problematic. On the other hand, if you sent them out and there were no difference in response rates, should we conclude in the other direction that there's nothing wrong? Obviously different people will have different opinions, and the answer might differ across contexts, but I think there are a lot of contexts in which many or most of us would say that's not necessarily the case. Because what does it mean to just change the name? I could change your name and it wouldn't make a difference — but if I changed what college you went to, say from an HBCU to some other school, and that made a difference, even though your name by itself, conditioned on everything else, didn't make a difference...
So there's this notion baked into a lot of the literature that tries to pose questions about discrimination through a causal lens: it tends to adopt a rather narrow notion of what could constitute discrimination — the direct effect of some attribute, the direct effect of gender or the direct effect of race, on a decision. And the problem is, what about all of the potentially indirect effects? If someone makes a decision based on some factor that is highly correlated with race and otherwise irrelevant to the decision, would we say that's not discrimination? Then there's some work by Elias Bareinboim and by Ilya Shpitser which I think is, at least conceptually, a step in a more interesting direction. What they try to do is say: if you have a causal model over all the variables, you can disentangle how the effect of some attribute of interest, whether it's race or gender, comes to influence some outcome of interest along all the different plausible causal paths it may take, and attribute to what extent the outcome is influenced via this path versus that path.
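To make the path-decomposition idea concrete, here is a toy linear structural model with a direct path and a mediated path; the variables and coefficients are invented for illustration, and this is only a simple controlled-effect decomposition, not the actual formulations in the papers mentioned.

```python
# Tiny linear SCM: attribute A -> mediator M -> outcome Y, plus a direct A -> Y path.
# Simulate interventions to separate the total effect into direct and mediated parts.
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

def simulate(a_value, m_override=None):
    """Mean outcome under do(A = a_value); optionally also hold the mediator fixed."""
    a = np.full(n, float(a_value))
    m = 1.5 * a + rng.normal(size=n) if m_override is None else np.full(n, float(m_override))
    y = 0.5 * a + 2.0 * m + rng.normal(size=n)  # direct path A->Y plus mediated path A->M->Y
    return y.mean()

total_effect = simulate(1) - simulate(0)                                  # ~0.5 + 1.5 * 2.0 = 3.5
direct_effect = simulate(1, m_override=0.0) - simulate(0, m_override=0.0)  # mediator held fixed: ~0.5
indirect_effect = total_effect - direct_effect                             # ~3.0

print(f"total effect    ~ {total_effect:.2f}")
print(f"direct effect   ~ {direct_effect:.2f}")
print(f"indirect effect ~ {indirect_effect:.2f}")
```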
Keep in mind, though — what's cool about it is that it's a thinking tool. In practice, do we actually expect we'd have a causal model that captures all the variables of interest and says precisely which variables influence which, tracing everything from somebody's gender to whether or not they got hired? Over what scale — the scope of someone's entire life? Are we going to build into our graph every decision, every opportunity someone was given or not given on that account? That we'd have a graph rich enough to capture all of that seems unlikely. But it at least gives you a thinking tool: I can conceptualize, step back, and recognize that these pathways exist. Among other things, though, it outsources the normative work. Presumably the reason to disambiguate these pathways is that someone believes — they usually put it in terms of some paths being permissible, maybe because they run through unambiguous qualifications for the position being hired for, and other paths being impermissible, because all they're really doing is telegraphing information without being relevant to the job qualifications or whatever the context is. But someone still has to go and say which paths are permissible and which are impermissible. Lily has a really sharp critique of this — she has a nice set of blog posts called, I believe, Disparate Causes, on the blog Phenomenal World — that goes into the problem critically. Among other things, it gets at the point that what we call a direct effect or an indirect effect is partly an artifact of the representation we've chosen: for any process we describe, there are conceivably multiple different valid causal representations, because you can always zoom in. If I have this variable and that variable and an edge between them, I can always zoom in and say, it's not just that someone's college influences their internship; their college influences this subtle decision made by some recruiter, which influences this, which influences that. You can always zoom in and bring more into focus. For a very generic question like what is the average treatment effect, whether you have a very granular or a very coarse representation of the process, as long as both are valid you'll get the same answer. But the question of which pathways the effect takes, and whether they're permissible, is partly an artifact of the resolution at which you zoom into the process. Something might look okay if you zoom way out and subsume a whole lot of mediators into a single arrow, but if you zoom in closely and knew more about how the process actually took place, maybe you'd say, oh, this isn't kosher.

So at a high level, I think causality gives us a set of thinking tools for reasoning critically about some of these problems, and maybe a partial step in the right direction. But I don't think it's a magic bullet that addresses all questions of fairness or justice or discrimination, and even if you believed in the approach, you often couldn't actually produce a causal model rich enough to fully resolve those questions. One nice point Lily makes — I think it might have been in joint work with Issa — is that this can become a dangerous distraction: if you insist there's some heroic requirement that I have to know every single variable and estimate every single relation before I can draw any conclusion about whether there's discrimination, that might not actually be necessary, and it's not in general what we do. There are situations where we can size up, at a bird's-eye view, that there's some fundamental inequity in society and conclude that we have a responsibility to do something about it, and that doesn't need to be contingent on having exactly estimated every causal functional on the pathway of every factor that plays any role on the path to some decision made about someone's life. That would set too high a bar. We're able to recognize cases of discrimination in plenty of cases where we're not able to pull off that kind of herculean numerical feat.

Let's maybe shift gears and talk a little bit about use cases or application areas that made notable progress in 2021 — anything come to mind there?
Well, look, one obvious one — and as much as I might be counted on to be contrarian, I think you've got to give some credit — is AlphaFold from DeepMind. I'm not a protein folding expert, but I know people who work in the area and are not just gullible deep learning boosters, and as far as I can tell it's actually a pretty significant leap forward — work that could very well merit a significant science prize, that level of accomplishment. That's a little bit hearsay, in that I'm not an expert in protein folding, but as far as I understand it really is a legitimate, significant contribution, and in an area where deep learning maybe wasn't yet established as an essential tool. So that's certainly a use case.

I also think you're starting to see a lot of the use cases that were maybe the obvious ones. For example, radiology. It was an obvious initial target — and, harkening back to our earlier conversation about the difference between predictions and decisions, part of why people see radiology as this big target is that there are certain roles where radiologists really are involved in decision making, like interventional radiology, but there are also lots of people who literally are looking at images and making classifications — diagnostic imaging. We've known since the moment the big image recognition results started hitting, around 2012, 2013, that radiology was a potential target, and you had some maybe overly optimistic statements, like Geoff Hinton saying that if you're in medical school now, you shouldn't specialize in radiology. It hasn't quite gotten to that point — it hasn't taken the radiologist out of the loop — but I've been chatting with a lot of radiologists recently, and I've been surprised to find two things. On one side, some of the systems really are quite good, and you actually have systems being deployed already, actually piping information into patient records. And at the same time, some of the problems we discussed earlier about what can go wrong are happening on the ground: you have situations where, for example, systems that work well on one set of equipment do not perform well on some new scanner that is sufficiently similar to all the other scanners that a human radiologist would have no problem with it. And these are not adversarial examples — nobody is out there designing a scanner to screw up all the previous deep learning models so radiologists can keep their jobs. So I think you have both a moment of the technology actually making landfall, and a moment of the rubber hitting the road, with people seeing up close some of the ways in which the technology is brittle and dangerous.
And I think that, largely — this might be an unsexy story, because it's not the big sexy application — if I take a bird's-eye view of the economy and just watch what AI is doing, the story of 2014, 2015 was new use cases: fundamentally new things popping up, things we weren't doing with deep learning suddenly being done with it, like machine translation, where people swapped out the old guts and stuck in deep learning systems, and suddenly every mobile phone having the capacity to run some small deep learning model because it's being used to recognize objects in the camera and do the face recognition that unlocks your phone. The bigger story of the last couple of years has been more on the side of deployment, diffusion, and the maturity of the operations around ML. I notice more and more companies whose pain point isn't that they need someone who can train a model; their pain point is that they need an MLOps person — someone who can actually keep the thing running day in and day out. That's a specialized skill set; a pure ML researcher like me doesn't have it — I haven't spent my life on it, and there's a real, serious discipline to keeping software working day in and day out. It's amazing what people can do: you see companies with a software product that 400 million people use every day go years without an hour of downtime, and it's absolutely bonkers how difficult that is. Machine learning throws in a weird additional set of complications, because there are all kinds of ways things can go wrong even when there aren't software bugs. So these people need to understand enough about statistics to have some sense of what can go wrong in ways that aren't software glitches — they're world-changing glitches; the world is the bug, even if everything is coded precisely — and they need to be able to interface back and forth between software developers, ML engineers, and researchers. So I think the maturity of MLOps, and more broadly the use of ML beyond the usual places, is the story. There was a moment when it was Google, Amazon, Facebook, Microsoft. I don't know if you ever read this, but I wrote a satire bit once — everyone was making a big deal about whatever professor leaving for whatever company and what their salary was, writing about it almost like football players getting traded — so I wrote this stupid post announcing that I had been hired as the intergalactic head of machine learning by Johnson & Johnson, or something, for some astronomical sum. It was just a dumb joke, but the point is that about a year later, I forget where, I met someone who worked at Johnson & Johnson AI research.
And I think that's part of what's going on now. I'm sure — and I'm making this up, I haven't researched it, but that's what you come on here for, to speak from academic authority and make up crap on your podcast — I'm sure there was a moment in time when only a small number of elite tech firms were using modern SQL databases, back when it was fresh; I think it was IBM where it was developed. There was probably a moment when that was a really hot, fundamentally new technology that changed business operations, and only a handful of super technical firms knew how to use it — and now the most boring technical firm in the world uses SQL. I think that's a huge part of what's happening in AI if you size up the commercial environment. There are exciting startups using the technology in new ways, and interesting things going on at the sexy tech companies, but there's also no company you can go to — I'm sure if you went to a waste management company, they're using AI for something, forecasting demand or figuring out how to route their trucks. This general progression of AI from a luxury good to a commodity is an essential part of what's going on — the fact that every company now has this as a concern. And part and parcel of that is the way the tooling keeps getting better. What are a whole lot of these companies offering? Things that make the stuff everyone's already been doing for a while something anyone can do — easy to track, easy to organize. This movement of AI from a concern of "what's the new model" to "what's a stable workflow we can adopt, such that a company that can't spend half a million dollars per engineer can still use this technology successfully and profitably" — I think that's a major part of the story of the commercial application of AI right now. It's kind of an unsexy story — oh, this is just becoming normal — but that's what happens to everything, right?

Sorry if you're an AI researcher, but if you're in MLOps it's pretty cool.

Oh yeah, there's a lot of really cool stuff happening in that field, and there are a lot more of those jobs at every company in the world put together than there are at Apple, Microsoft, Amazon, Facebook, whatever.

So with that — before we run out of time, I'd love to have you dust off the crystal ball a little more and share some of your predictions for the upcoming year or years. We've talked a little bit about where you'd swing the bat from a research perspective, but how do you think about 2022, with the backdrop of — I don't know if you'd call it a cooling, or a slowing, or a boring-ification, or whatever you want to call it? Are there innovations where you can see the silhouette emerging from the shadows and you think something's there?

The funny thing about that is that it's not all cooling or all heating up.
The interesting thing about it is that whenever you sum up a complex phenomenon with a single number, you lose a lot of information. It's more like the fall of the Roman Empire: Rome is still partying, the borders are still expanding, but you also have cities being lost and whole regions dropping off the map. I think that's what's happening. You have Uber shutting down its AI research group, hiring freezes at major companies — the big leaders having major hiring freezes, not offering well-known researchers the kinds of salaries in 2022 that they were offering in 2018 — and at the same time you have whole companies where the shockwave hasn't even hit them yet, like major health systems just now starting to adopt deep learning.

If I had to predict what's going to happen, I'm going to double down on decision making. Something I'm already seeing a lot of: a few things came together that made AI so hot. One was the sudden existence of easily queryable, well-organized, curated data at every single firm in the world — health companies adopting electronic health records, every company being basically an internet company, everyone having a digital trace of all their customer interactions. We can get to a separate, normative point about whether we want to live in that world, or whether we're irked by the surveillance state, but from an economic standpoint, that happened together with advances in both the tooling and the algorithms around statistical modeling, and so the question became: we have this data, we have statistical tools, how do we do analytics on the data? But there's another side, which is: how do we use the data to guide actions? I think one thing that's underutilized by most firms, and that only a small number of people are really sophisticated about, is really focusing on the decision problem. Part of that is offline causal inference, which is some of the stuff we were talking about — how can I use some causal background knowledge, together with the data I have, to infer a causal effect and use it to guide a decision? But a huge part of it is experimentation, and that's a huge thing that not enough companies do and that you're going to start seeing. Obviously Amazon has what they call Weblab, and Google runs randomized controlled trials for which shade of green the G in Google should be, or something — but I think most companies grossly underutilize experimentation, really methodical experiments, because that plays into the data picture. Online experiments in particular — well, not necessarily online in the narrow sense; online is part of it, in the sense of doing reinforcement learning and having a policy that adapts as you get results — but even experimenting at all.
Look at how we guide personalized decisions: it's often in the context of taking passively collected traces of people's data and doing some kind of latent factor analysis or whatever to build a recommender system, versus actually randomizing choices and trying to estimate the potentially heterogeneous treatment effects — how different people respond differently to different things — actually estimating the effects on people's behavior or whatever the outcome is. I don't mean to sound like I'm advising that we cavalierly experiment on people without thinking about which decisions or which experiments are of ethical import; obviously there are a lot of considerations that need to go into how you do that and doing it right. But the reckoning we're seeing, over and over, is people claiming that something is going to personalize this, personalize that, lead you to make all these decisions in better ways, and then finding: oh, I just naively trained a predictive model, came up with some heuristic for how to operationalize it as a decision, and something didn't go as planned. I think people getting more into this world — using offline causal inference on observational data, but also actually experimenting in the real world, and developing more mature processes for how to test hypotheses and see what the impacts of different actions are — that's going to become more and more important.
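As a minimal sketch of the kind of experiment-driven decision making described here — randomize an action, then estimate its average effect with a difference in means and a rough confidence interval. The synthetic "conversion" numbers are made up for illustration.

```python
# Simulated randomized experiment: binary treatment with a small true lift,
# estimated by a difference in means with an approximate 95% interval.
import numpy as np

rng = np.random.default_rng(5)
n = 10_000
treat = rng.integers(0, 2, size=n)            # randomized assignment
baseline = 0.10                                # 10% baseline conversion rate
lift = 0.02                                    # true +2 point effect of the treatment
outcome = rng.binomial(1, baseline + lift * treat)

y1, y0 = outcome[treat == 1], outcome[treat == 0]
ate = y1.mean() - y0.mean()
se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
print(f"estimated effect: {ate:.3f} (95% CI roughly {ate - 1.96 * se:.3f} to {ate + 1.96 * se:.3f})")
```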
You're going to start seeing the hiring focus, and where teams move, go toward those kinds of problems. Again, it's not overnight — you're not going to go from hiring 90 percent deep learning PyTorch jockeys to 90 percent experts in bandit algorithms and causal inference — but there is a shift here, and I'm seeing it at every level: in what looks interesting to new students, in what looks interesting among folks hitting the hiring market. This intersection of CS, operations research, and economics — bringing to bear the tools of predictive modeling we've developed, but also more sophisticated processes of experimentation, estimating causal effects, and principles for guiding intelligent decision making — I think there's a growing-up process happening there.

The other thing, though — and this isn't a specific prediction but a meta-prediction — is that, like the internet, like web 2.0, web 3.0, whatever the hell we're doing, very little of the new stuff we're seeing in the way companies behave and interact with people is technologically new. There's a lot you could have done since the late 1990s; the tooling wasn't there, which restricted how many people could build it, but it was something else too. There was a capability that arrived, and a few early players figured out how to conquer e-commerce, like Amazon, but it took a long time before you got to Uber. There were innovations there where a bunch of pieces had to fit together — a certain understanding of markets, a certain understanding of how people use their phones — with a technological capability that had been there all along. I think there's a kind of innovation in deployment that doesn't correspond to a new model. When people have gotten stoked about ML recently, it's been, oh, BERT is good at classifying text, or seq2seq, LSTMs, and then transformers are good at this one thing — single-purpose models.

But I'll give an example — full disclosure of my conflict of interest: I'm an advisor for a company called Abridge AI. Abridge sits between doctors and patients, in that interaction. It turns out patients are already recording their visits on their cell phones, sometimes surreptitiously, sometimes with the doctor's consent — it may or may not be wiretapping depending on whether your particular state requires two-party consent. So their idea was: let's make this a normal part of the doctor-patient interaction. Both parties give permission and agree to record the conversation; they pull out Abridge and record it. And then there are all kinds of things you can do: help the doctor draft a summary of the visit; help the patient understand and remember — don't forget, you mentioned you'd be starting this new medication, have you picked it up or called in that prescription, did you schedule that follow-up? There are a million different places to plug in models, and no one of them by itself is necessarily a major single-purpose innovation. But look at the ways you can mix and match them: I've got the conversation, I send it to an ASR model, I get back the text, I need to flag the relevant or salient parts of the conversation, and then I turn that into an interface feature that provides some value to the patient. A lot of things are like that. When Alexa works really well, or Google Home, or anything like that, it's usually not because there's one model that's magnificent. The magic is in the clever way they stitch together some astute observations about common interaction patterns with the right little places to patch in machine learning, and the right intelligent heuristics and rules around it, such that you get an end-to-end product that feels like magic. Even Shazam is a bit like that — a few clever heuristics — and if you start thinking about how to decompose the problem into something that works, you can make a pipeline where every single step is kind of simple, but the end result feels a little bit magical.
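A sketch of that "pipeline of simple pieces" idea in the doctor-visit setting: speech-to-text, then a salience pass, then a reminder extractor. Every function here is a hypothetical stub standing in for a real model or service — none of this is Abridge's actual system; the point is the composition, not any particular component.

```python
# Compose several simple stand-in components into one end-to-end visit pipeline.
from dataclasses import dataclass

@dataclass
class VisitNote:
    transcript: str
    highlights: list
    reminders: list

def transcribe(audio_path: str) -> str:
    # Stand-in for an off-the-shelf ASR model or service.
    return "doctor: let's start you on medication X ... please schedule a follow-up in two weeks"

def extract_highlights(transcript: str) -> list:
    # Stand-in for a model that flags clinically salient spans; here, a crude keyword rule.
    keywords = ("medication", "follow-up", "dose")
    return [span for span in transcript.split("...") if any(k in span for k in keywords)]

def extract_reminders(highlights: list) -> list:
    # Stand-in for a step that turns salient spans into patient-facing to-dos.
    reminders = []
    for span in highlights:
        if "medication" in span:
            reminders.append("Pick up the new prescription")
        if "follow-up" in span:
            reminders.append("Schedule the follow-up visit")
    return reminders

def process_visit(audio_path: str) -> VisitNote:
    transcript = transcribe(audio_path)
    highlights = extract_highlights(transcript)
    return VisitNote(transcript, highlights, extract_reminders(highlights))

print(process_visit("visit.wav").reminders)
```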
And I think that's going to be a major part of this. We've been looking at people who are really good at building single-purpose models and then parlaying that into a startup, and there will still be some of that — the single-purpose models are mature and they'll keep getting a little better — but what's maybe underexplored are the ways you mix and match models together with cool interaction patterns, some clever understanding of what people want and what data is available, to build user experiences that under the hood might be invoking seven different models in seven different contexts, hidden from the user in a clever way, so that it adds up to a new capability that no one model or piece of software by itself would provide. So I do think there's an element of: we've built a bunch of cool Legos, and we haven't given people that many years to play with them yet. Some innovation comes from designing a new Lego piece, but a lot of innovation will come from people who don't necessarily have off-the-charts skills at building Legos, but have a design sense for cool ways to put them together.

Yeah, I think that's a natural consequence of the broader maturity conversation we've been having. Not that we've come up with every Lego piece that's ever going to be created, or that there aren't some cool ones still to come, but all the basic pieces required to build really cool stuff are in place, and now it's about how you put them together. And even more so than the pieces themselves, the tools to easily put them in place — you've got your Hugging Face, you've got your MLOps tools. It's a great time to be a builder.

Yeah, and it also takes some of that work away, which lets you focus. I think music is like that a little bit. When you're learning an instrument, you think: I've got to practice articulation, I've got to practice rudiments, I've got to practice scales — you're sitting there drilling the same patterns over and over when you're 10, 11, 12, 13, 14 years old. But you get to a point where maybe you still practice like that an hour a day, but when you go to play, you're not thinking at that level at all. And there's some element of that here: people using machine learning have been thinking, how do I get the data and train a single model? Once you're in a world where, in a lot of contexts, maybe you don't even need to train a model — maybe there's an off-the-shelf model that's sufficiently good at the task and will work better than anything you could train, even if you're applying it to slightly domain-shifted data — then you get to the point where the difference between a great artist and a boring artist isn't that the great artist is better at scales.
It's not like Miles Davis played cleaner scales than, I don't know, whoever the state of the art at staying on key is. It's not like that. So I think there's a lot of innovation to be had on that side.

Awesome, awesome. Well, Zach, it has been wonderful catching up. Let's make sure it's not two years until the next time.

Right — who knows what pandemic will be in full swing by then.

Awesome. Well, thanks so much for helping us reflect on 2021 in the ML and DL domains, and catch you next time.

Yeah, thanks for having me, Sam. Great to see you.