#50 Weapons of Math Destruction (with Cathy O'Neil)

**The Ethics of Artificial Intelligence: A New Framework for Algorithmic Auditing**

As technology continues to advance at an unprecedented rate, one of the most pressing concerns facing us today is the ethics of artificial intelligence. Mathematician and data scientist Cathy O'Neil has been leading the charge in this area, and her work on "Weapons of Math Destruction" highlights the need for a more nuanced understanding of the impact that algorithms can have on society.

**The Problem with Current Approaches**

When asked whether an algorithm is working, most people's initial answer is simply "yes," by which they usually mean that it is efficient or profitable. As O'Neil points out, this oversimplification neglects the broader social and economic effects that are often in play. For example, an algorithm designed to optimize efficiency in a factory may also have unintended consequences for workers who lose their jobs to automation. To address these concerns, we need a new framework for evaluating algorithms, one that accounts for the potential impact on stakeholders beyond the immediate users of the technology.

**The Concept of an Ethical Matrix**

One promising approach to this challenge is the concept of an "ethical matrix." The idea is to identify all the stakeholders who may be affected by an algorithm, not just its individual users but also broader groups such as employees, customers, and even the environment. Stakeholders form the rows of the matrix and their concerns, such as fairness, transparency, false positives, and false negatives, form the columns; each cell is then assessed for how badly things could go wrong for that stakeholder on that concern. The result is a rubric that supports more informed decisions about how to design and deploy algorithms that are both effective and responsible.
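To make the structure concrete, here is a minimal sketch of how an ethical matrix might be represented and queried. The stakeholders, concerns, and risk ratings are hypothetical, invented for a generic hiring algorithm rather than taken from O'Neil's paper:

```python
# Minimal sketch of an ethical matrix: rows are stakeholders, columns are
# concerns, and each cell records a judgment about how badly things could go
# wrong for that stakeholder on that concern. All entries are hypothetical.

ethical_matrix = {
    "applicants":      {"fairness": "high risk",   "transparency": "high risk",
                        "false negatives": "high risk",   "profit": "low risk"},
    "hiring managers": {"fairness": "medium risk", "transparency": "medium risk",
                        "false negatives": "medium risk", "profit": "medium risk"},
    "company":         {"fairness": "medium risk", "transparency": "low risk",
                        "false negatives": "medium risk", "profit": "high risk"},
    "general public":  {"fairness": "high risk",   "transparency": "medium risk",
                        "false negatives": "low risk",    "profit": "low risk"},
}

def high_risk_cells(matrix):
    """Return the (stakeholder, concern) pairs flagged as high risk,
    i.e. the cells that deserve explicit monitoring before deployment."""
    return [(s, c) for s, row in matrix.items()
            for c, level in row.items() if level == "high risk"]

for stakeholder, concern in high_risk_cells(ethical_matrix):
    print(f"Monitor: {concern} for {stakeholder}")
```

Filling in the cells is the point of the exercise; ideally the ratings come out of conversations with the stakeholders themselves rather than being assigned by the modeling team alone.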

**The Importance of Stakeholder Engagement**

O'Neil emphasizes that the key to creating an ethical matrix is to engage with stakeholders from the outset. This means involving not just technical experts but also social scientists, policymakers, and other subject-matter specialists in designing the matrix. By doing so, we can ensure that the concerns and values of all relevant stakeholders are taken into account rather than imposed on them by external actors.

**The Benefits of Auditing Algorithms**

Another important aspect of O'Neil's work is "algorithmic auditing": evaluating the performance and impact of algorithms in a systematic and transparent way. The central question, as O'Neil frames it, is "for whom does this algorithm fail?" Does it fail more often for Black people than for white people, or for women than for men? Fairness, she argues, is a statistical concept that has to be assessed at the aggregate level; you cannot establish it just by inspecting your own score. Auditing can surface flaws and biases that are not apparent from any individual case, as well as provide insight into how these algorithms are actually being used in practice.
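One simple form such an audit can take is to disaggregate a model's error rates by group. The sketch below uses a tiny fabricated dataset purely to show the mechanics; it is not O'Neil's auditing methodology, just one illustrative check:

```python
import pandas as pd

# Toy audit: compare false-positive and false-negative rates across groups.
# The data is fabricated for illustration; a real audit would use the
# deployed model's actual predictions and observed outcomes.
df = pd.DataFrame({
    "group":     ["A", "A", "A", "A", "B", "B", "B", "B"],
    "actual":    [0,   1,   0,   1,   0,   1,   0,   1],
    "predicted": [0,   1,   1,   1,   1,   0,   1,   1],
})

def error_rates(frame):
    fp = ((frame.predicted == 1) & (frame.actual == 0)).sum()
    fn = ((frame.predicted == 0) & (frame.actual == 1)).sum()
    negatives = (frame.actual == 0).sum()
    positives = (frame.actual == 1).sum()
    return pd.Series({
        "false_positive_rate": fp / negatives if negatives else float("nan"),
        "false_negative_rate": fn / positives if positives else float("nan"),
    })

# A large gap between groups is a signal to investigate, not a verdict.
print(df.groupby("group")[["actual", "predicted"]].apply(error_rates))
```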

**The Case of the Teaching Value-Added Model**

One specific example of why auditing matters is the teaching value-added model, which turned out to be overly sensitive to small changes in the underlying student data: perturb one student's score, or swap one student in or out of a class, and a teacher's rating can swing wildly. That instability had serious consequences for teachers who were penalized, and in some cases fired, based on factors beyond their control. O'Neil proposes a sensitivity-analysis style of transparency for models like this: first confirm that the data about you is correct, then ask how your score would change under small, plausible changes to the inputs. If one small change can flip a rating from bad to good, the score is too unstable to justify high-stakes decisions.
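A rough sketch of that kind of sensitivity probe is shown below. The scoring function is a deliberately simplistic stand-in (the real value-added model is proprietary and far more complex), so the numbers only illustrate the idea of measuring how much a score can swing under small perturbations:

```python
import random

def value_added_score(student_gains):
    """Stand-in for a teacher scoring model: here, just the average of the
    students' year-over-year test-score gains. The real model is not public;
    this exists only so the probe below has something to call."""
    return sum(student_gains) / len(student_gains)

def sensitivity_probe(student_gains, perturbation=5.0, trials=100):
    """Perturb one randomly chosen student's gain by a small amount and
    report the largest resulting swing in the teacher's score."""
    baseline = value_added_score(student_gains)
    max_swing = 0.0
    for _ in range(trials):
        perturbed = list(student_gains)
        i = random.randrange(len(perturbed))
        perturbed[i] += random.uniform(-perturbation, perturbation)
        max_swing = max(max_swing, abs(value_added_score(perturbed) - baseline))
    return baseline, max_swing

gains = [3.0, -1.5, 2.0, 0.5, 4.0]  # hypothetical class of five students
baseline, swing = sensitivity_probe(gains)
print(f"baseline score: {baseline:.2f}, worst-case swing: {swing:.2f}")
```

With a class this small, a modest change to a single student's data moves the score substantially, which is exactly the kind of instability a sensitivity check is meant to expose.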

**The Future of Algorithmic Auditing**

Ultimately, the goal of algorithmic auditing is not just to identify problems with existing algorithms but also to develop new approaches that prioritize transparency, accountability, and social responsibility. By incorporating an ethical matrix into our design process, we can create algorithms that are more responsive to the needs of all stakeholders, rather than just serving the interests of a privileged few.

**A New Framework for Data Science**

O'Neil's work on algorithmic auditing and the ethical matrix represents a turning point in the development of data science as a field. By prioritizing ethics and social responsibility alongside technical expertise, we can build a more inclusive and equitable approach to data-driven decision making that benefits everyone involved. As O'Neil herself puts it, "data science doesn't just predict the future, it causes the future." It's time for us to take that power seriously.

**Resources**

For more information on Cathy O'Neil's work on algorithmic auditing and the ethical matrix, please visit [link]. We also invite you to take the survey in the show notes and share your thoughts on how we can promote a more responsible approach to data science.

"WEBVTTKind: captionsLanguage: enwell in this episode 50 our season 1 finale of data framed a data camp podcast I have the great pleasure of speaking with Cathy O'Neill data scientists investigative journalist consultant algorithmic auditor and author of the critically acclaimed book weapons of mass destruction Cathy and I will discuss the ingredients that make up weapons of mass destruction which are algorithms and models that are important in society secret and harmful from models that decide whether you keep your job a credit card or insurance or algorithms that decide how we're policed sentenced to prison or given parole Cathy and I will be discussing the current lack of fairness in artificial intelligence how societal biases are perpetuated by algorithms and how both transparency and auditability of algorithms will be necessary for a fairer future what does this mean in practice stick around to find out as Cathy says fairness is a statistical concept it's a notion that we need to understand at an aggregate level and moreover data science doesn't just predict the future it causes the future I'm Hugo von Anderson a date assigned as the data camp and this is data frame welcome to data frame a weekly data cam podcast exploring what data science looks like on the ground for working data scientists and what problems have been solved I'm your host Hugo Bound Anderson you can follow me on Twitter that's you go back and data camp at data cap you can find all our episodes and show notes at data camp comm slash community slash podcast listeners as always check out the show notes for more material on the conversation today I've also included a survey in the show notes and a link to a forum where you can make suggestions for future episodes I'd really appreciate it if you take the survey so I can make sure that we're producing episodes that you want to hear now back to our regularly scheduled programming before we dive in I'd like to say thank you all for tuning in all year and thank you to all our wonderful guests we have had so much fun producing these 50 episodes and cannot wait to be back on the airwaves in 2019 we'll be back words season to early in 2019 and to keep you thinking curious and data focused in between seasons we're having a data frame challenge the winner will get to join me on a segment here on data framed the challenge is to listen to as many episodes as you can and to tweet excerpts that you find illuminating to at data camp at Hugo Bound and the relevant guests using the hashtag data framed challenge that's hashtag data framed challenge at the start of season two will randomly select the sender of one of these tweets to join me on a podcast segment the more tweets you send the more chance you have however will delete the duplicates first that's hashtag data framed challenge hi there Kathy and welcome to data framed thank you I'm glad to be here that's such a great pleasure to have you on the show and I'm really excited to be here to talk about you know a lot of different things surrounding data science and ethics and algorithmic bias and all of these types of things but before we get into the the nitty-gritty I'd love to know a bit about you so perhaps you can start off by telling us what you do and what you're known for in the data community well I'm a data science consultant and I started company to audit algorithms but I I guess I've been a data scientist for almost as long as there's been that title and actually I would argue that I was a data scientist before that title 
existed because I worked as a quant in finance starting in 2007 and I think of that as a data science job even though other people might not agree so I mean and the reason I'm being I'm waffling is because when I enter data science maybe I would say in 2011 in a large question in my mind was like to what extent is this a thing and so I wrote a book actually called doing data science just to explore that question I co-authored it with Rachel shut the idea of that book was like what is data science is it a thing is it new is it important what you know is it powerful is it too powerful things like that so what am i known for I think I'm known for being like a gadfly for sort of calling out print for possibly you know people think of me as overly negative about the field I think of myself as the antidote to kool-aid well yeah I think of a lot of the work you've done as kind of a restoring or restorative force to a lot of what's happening at blinding speed in tech in data science in how algorithms are becoming more and more a part of our daily lives yeah that sounds nice thank you you're welcome and I like kind of the way you motivated your book that you co-wrote doing data science in terms of you know exploring what what data science is because you actually have a nice working definition in there which is something along the lines of like a data savvy quantitatively minded coding literate problem solver and that's how I like to think of data science work in general as well yeah you know I've actually kind of discarded that definition in in preference for a new definition which I just came up with like a couple months ago but I'm into and this is a way of distinguishing the new part of data science from the old part of data science so it's not it's not really a definition of data science per se but it is what is sort of a definition of what I worry about in data science if you will which is that data science doesn't just predict the future it causes the future so that distinguishes it from astronomers likes astronomers you use a lot of quantitative techniques they use tons of data they're not new so are they data scientists in the first definition that you just told me for my book probably yes but in the second definition no because like the point is that they can tell us when Haley's comment is coming back but they're not going to affect when Haley's comment is coming back and and that's that's the thing that data science or I should say data scientists need to understand about what they're doing for the most part they are not just sort of predicting but they're causing and that's that's where it gets these sort of society-wide feedback loops that we need to start worrying about agree completely and I look forward to delving into these these feedback loops and this idea of a feedback loop in data science work and in algorithms and modeling is you know one of the key ingredients of what you call a weapon of mass destruction which I really look forward to getting back to but I like the idea that you've moved on from a quasi definition you had in in your book doing data science because a question I mean doing data science was was written five or six years ago now is that right yeah 2012 I think yeah right so I'm wondering looking back on that if you were to rewrite it or do it again what what do you think is worth talking about now that you couldn't see then well I mean it to be clear it was it was each chapter was a was a different lecture in a class at Columbia it was taken from those lectures 
including a lecture I gave about finance including a lecture that we had the you know that the data scientists from square speak you know we had people that were probably not considered data scientists but statisticians speaking etc so he was like a grab-bag and in that way it was actually really cool because it was all over the place and broad and we could see how sort of these techniques were broadly applicable various techniques and we could also go into you know networks in one chapter and you know time series in another and that was that was neat because we could sort of like have a survey if you will of stuff but it wasn't meant to be a deep dive in any given direction if I rewrote it now I would probably if I kept with that survey approach I would be surveying a totally different world because we have very different kinds of things going on now I guess we also have some sort of through streams like we have some some things that are still happening that we're happening then that we emphasize more I think in particular I would spend a lot more time on recommendation engines although we do have a chapter on recommendation engines from the former CEO of hunch I believe to try to understand a person and by 20 questions and then sort of recommend what kind of iPhone they should buy or something like that but nowadays you know I spend a lot more time exploring things like to what extent do the YouTube recommendations radicalize our youth that's really interesting because I think what what that does is it puts data science and data science work as as we've been discussing already into a broader societal context and assesses and communicates around the impact of all the work that happens in data science so I think that provides a nice segue into a lot of the work you've done which culminated in your in your book weapons of mass mass destruction that I'd like to spend a bit of time on so could you tell me kind of the basic ingredients of what a weapon of mass destruction actually is sure a weapon of mass destruction is is a kind of algorithm that I feel we're not worrying enough about it's important and it's secret and its destructive those are the three characteristics important by important I mean it's widespread it's scale that's used on a lot of people for important decisions I usually think of the categories of decisions in the following like financial decisions so it could it be a credit card or insurance or housing or livelihood decisions like do you get a job do you keep your job are you good at your job do you get a raise or Liberty so how are you police how are you sentenced to prison how are you given parole your actual Liberty and then the fourth category would be information so how are you fed information how is your environment online in particular informed through algorithms and what kind of long-term effects are those having on different parts of the population so those are the four categories they're important one of the things that I absolutely insist on when we talk about welcoming weapons about destruction or algorithms or regulation in particular is that we really focus in on important algorithms there's just too many algorithms to worry about so we have to sort of triage and think about which ones actually matter to people and then the second thing is that there secret almost all of these are secret even people don't even know that they exist nevermind understand how they work and then finally they're making important secret decisions about people's lives and they up 
like they they make mistakes and it's destructive for that individual who doesn't get the opportunity or the job or the credit card or the housing opportunity or they get in prison too long so it's destructive for them but as an observation this goes back to the feedback loop thing it's not just destructive for an individual but it actually sort of undermines the original goal the algorithm and creates a destructive feedback loop on the level of society yeah and a point you making in your book which we may get to is that they can also feed into each other and exacerbate conditions had already exist in society such as being unfair on already underrepresented groups so before we get there though could you provide I mean you've provided a nice kind of framework of the different buckets of these algorithms and WMDs and where they fall but could you provide a few concrete examples of what you consider to be the most harmful WMDs yeah I'll give you a few and I choose these in part because they're horrible but also because they all fail in a totally different ways and I want to make the point that there's like not one solution to this problem so the first one comes from the world of teaching public school teaching so there was a thing called the value-added model for teachers which was used to fire a bunch of teachers and unfairly because it turned out it was not much better than a random number generator didn't contain a lot of information about a specific teacher and in instances were where it did seem to be a sort of an extreme value that it was often it was manipulated by previous teachers cheating so like you couldn't really control your numbers but if you're a previous teacher cheated then your number would go down so it was like in this crazy system yeah because if I remember correctly the baseline is set by where your students were in the previous year or yeah with them right yeah the idea was like how well did your students do relative to it their expected performance in a standardized test and it was a very noisy question in terms of statistics unless the previous teacher in the previous year had cheated on the under those kids tests and those kids did extremely well relative to what they actually understood which would force them of course to do extremely badly the next year even if you're a good teacher so it would look really bad for you but long story short it was normally speaking when there wasn't cheating involved just a terrible statistical non robust model and yet it was being used to fire people so that's the first example the next example is this example from hiring which is a story about Kyle beam this young man who noticed in a personality test that he had to take to get a job that he failed he noticed some of the questions were exactly the same questions that he had been given in a mental health assessment and when he was being treated for bipolar disorder so that was like an embedded illegal mental health assessment that you know is illegal on to the Amaris with Disability Act which makes it illegal for any kind of health exam including a mental health exam to be administered as part as part of a hiring process so that's that's another example but in and I should add that like it wasn't just one one job it was you know Kyle ended up taking seven different versions of this test I should say he ended up taking the same exact test seven different times when he applied to seven different chain stores all of them in the Atlanta Georgia area so he wasn't just precluded from 
that one job he was precluded from almost any minimum wage work in the area and it wasn't just him it was like anybody who would have failed that so health assessment which is you know to say a vast community of people with mental health status so that's a great example of the feedback loop I was mentioning you know because of the scale of this Chronos test it wasn't just destructive for the individual but it was undermining the exact goal of personality test and also undermining the the overall goal of the ABA which is to to avoid the systematic filtering out of sub population and so that's the second example and the third example I would give is what we call recidivism risk algorithms in the criminal justice system where you have basically again questionnaires that end up with a score for recidivism risk that is handed to a judge and being told to the judge like this is objective scientific measurement of somebody's risk of recidivism recidivism being the likelihood of being arrested after leaving prison and the problem with that well there's lots of problems with that but the very immediate problem with that is that the questions on the questionnaire are almost entirely proxies for race and class so they ask questions like did you grow up in a high crime neighborhood I mean you grew up in a high crime neighborhood if you're a poor black person in fact like that's almost the definition of high crime neighborhood that's where the police are sent to arrest people historically from the broken windows policy the theory of police saying to the present day and by the way I should add like in part that has been propagated by another algorithm which is predictive policing so you're being asked all these proxies for poverty proxies for race and class other questions are like are you a member of a gang do you have mental health problems do you have addiction problems a lot of this kind of information is only available be cut or only held against you if you are poor and you know Pete rich or people white people get treated they don't get punished for this kind of thing so long story short it's a you know basically a test to see how poor you are and how minority you are and then if your score is higher which you which it is if you are poor and if you're black then you get sent Prison for longer now I should say like as toxic as that algorithm is and as obvious as it is that the that it creates negative feedback loops one of the things that sort of the jury is still out on is like whether that is actually that different from what we we have already we have already sort of racist classist a system not to mention judges and we have evidence for that and the idea was we're going to get better we're going to be more scientific we're gonna be more objective it's not at all clear that kind of scoring system would do so nor is it clear by the way because there's been lots of not lots but there's been some amount of testing since my book came out about how judges actually use these scoring systems it's not clear that they use them the way that they're intended and there's all sorts of evidence now that judges either ignore them or they ignore them in certain cases but listen to them other cases like for example they ignore them in black court courtrooms and they they use them in white courtrooms so they actually like keep a lot of people and especially if they're being used for pretrial detention considerations like they'll let white people out of incarceration pretrial but then they're they're going to 
ignore them in urban districts where they're gonna keep black people incarcerated before trial long story short there's also a lot of questions around how they're actually being used but it's it's a great example of a weapon of mass destruction sort of just created as if it just that the nature of algorithms will make things more fair I mean I guess going to your earlier point no algorithm is perfect and we couldn't expect that to be perfect but it's the reason these sort of society-wide just destructive feedback loops get propagated get created by these algorithms isn't just because they're imperfect it's because they're being used as I said in that example but more broadly they're more like funneling people in different classes and different for different genders or races or different mental health status or disability status they're funneling them in onto a path which they were sort of quote-unquote already on depending on their demographics yeah and I think speaking to your point of the fact that these algorithms may not be creating new biases I mean they may as well but that their encoding societal biases and keeping people on a path that they may have been on already I think something distinct from that is that they're actually scalable as well right yeah right so we shouldn't be surprised of course now that we say that loud like the propagating past practices or automating the status quo they're just doing what was done in the past and acting like Oh since this happened in the past in a pattern it's we should predict that it will happen in the future but the way they're being actually utilized it means not just that it we predict it will happen but we're gonna cause it to happen if you are more likely to pay back alone you're more likely to get a loan so the people who are who are deemed less likely are going to be cut out of the system they're not going to be offered a credit card and since all the algorithms work in concert and similarly to each other this becomes like a rule and it's highly scaled even if it's not the exact same algorithm which it was in the case of Kyle being with a Chronos algorithm the same exact algorithm being used but even if it isn't the fact is data scientists do their job similarly across across different companies in the same industry so online credit decisioning is going to be based on similar kinds of demographic questions we'll jump right back into our interview with Cathy O'Neil after a short segment now it's time for a segment called data science best practices I'm here with Heather nullus a machine learning engineer at t-mobile hey Heather hey you go Heather we have a lot of problems in data science and a huge practical one is running code on different computers operating systems in production and in the cloud and so on right absolutely so for me whenever I started a new job the worst part is that first day a few days where you just spend your time trying to get your computer to work like you'd expect a development or data science computer to work it requires a lot of setup and you have to install the correct operating systems get your programming languages installed find the IDE that everybody else is using make sure that you have the right version then download all of your right packages any libraries that you went and so on and this takes a long time and it's really complicated and for me it's like I said the most painful part of starting a new job and no matter how well somebody thinks that they've documented the process to set up a computer 
to run their code they probably haven't this becomes an even bigger problem when you're thinking about writing something in production or on the cloud or on tons of other computers where you have to have multiple versions stabili exists in a scalable capacity because how can you make sure that the setup is correct on all of those machines are there any solutions to this there are and so today I would like to tell you about one called docker so just imagine that you can write all of the machines for setting up your computer once in a very clear understandable legible language and then have that setup happened on any machine that you want that is essentially what docker does and so for data scientists the most useful thing is it eliminates the machines setup it allows for seamless code handoff so you're working on a model on your machine you really need it to run on your co-workers machine you can just wrap it all up in a docker container and hand it over and they don't even need to develop an AR Python or whatever your they don't need to have any language installing their computer in fact yeah I'm a huge fan of docker can you tell us a bit about how it works a docker file is a set of instructions that tells the Machine how to set itself up and then to run your code when a darker file is finished building we call that a snapshot and that's a docker image it's just like a picture in time of what this little miniature machine looks like when the images run it creates a container which is a mini virtual machine containers are really cool because you can have a lot of them running on your computer and they don't interact unless you tell them to and so for instance when I'm writing programs and Python 2.7 and 3.6 on the same machine I run into a lot of trouble because this libraries communicate with each other and this is a huge hassle to keep them from interacting but if I do it in containers I could have a container of written in Python 2.7 and a container written in Python 3.6 running on the same machine at the same time and neither one of them complains so this is wonderful protection environments where people are writing in all sorts of languages because each container is isolated it allows for all sorts of code to be used scaled and maintained on a single server and so for data scientists the important thing is if you can wrap up your models into a docker file and no setup as needed then you can just hand that to an engineer and your code will just run even cooler than this is others can do the work of constructing a docker image for you so you're like I don't know anything about docker I want to do this for the very first time you can go to docker hub and find images that other people has already created so you can find images for Python images for our images for Python 2.6 with a really specific package installed there's probably a darker container for Fortran if you want to like absolutely anything that you want somebody's probably started making it and do you use docker at t-mobile we do so at t-mobile our goal is to create machine learning models that run constantly in production in customer-facing ways not wanting to create our models and then recreate them in Python we decided to use our to create all of our api's dark docker is perfect to do this because the setup of our model is really complicated when you go to run our models we do deep learning and so we need are a bunch of our packages including Charis Charis runs Python on the back end then you need a bunch of really specific 
Linux libraries to make sure that your Python is running correctly and then we need a way for our API to pass the security requirements for t-mobile as a whole because you can't just have these API is open out in the world and our initial docker image for doing this top six gigs which caused a ton of trouble it actually took down production clusters but after a lot of work we've gotten it down to one four eight five gigs which is well within the range of acceptable by our DevOps team and we're going to release it in a few weeks on docker hub so it includes all the necessary functionality and security features that you would need to run a deep learning are tensorflow Kerris model in production and it will be open sourced because we're super passionate about using our and production or you want to empower data scientists who are most comfortable and are to go ahead and make their own api's without having to rely on a python dev to eventually do that work for them are there any potential pitfalls you'd like to warn our listeners about learning darker can be a little bit confusing at first just because it's a totally different way of setting up a file and if you've never used containers before it can take a little bit of a brain switch to get used to but after that it's really simple and then aside from that the other thing that you just really have to look out for is if you want your container super production stable they do have to be really small by really small I mean probably under two gigs when you go putting out 20 gig docker containers because it was easy DevOps doesn't get too happy and with our that means eliminating a lot of the extra packages and our is not super super great at tracking packages so you just have to be really thoughtful whenever you install things into your docker image about how it will affect the size after that it's pretty intuitive thanks Heather for that introduction to docker after that interlude it's time to jump back into our chat with Kathy I also think I mean there are a lot of different avenues we can take here and for people who want more after this conversation I highly recommend Kathy's book weapons of mass destruction something I'd like to focus on is that in all of these models you know the value-added model for teaching the hiring model these models to predict recidivism rate one really important aspect of these is that they're not interpret we can't tell why they make the predictions they do the fact that they're black box in that sense and the relationship between this inability to interpret them the inability of a teacher to go and say why have you given me this rating and they're pointed to the algorithm and the fact that this combined with the scalability really makes on mass lack of accountability and lack of fairness correct yeah I mean it's exactly right I mean and I talked about that is in fact a characteristic of a weapon of mass destruction that it's secret and that's a really important part of it because when you have something that's important and secret like it's almost always going to be destructive there's no there's no you know a good data science model has a feedback loops and and it incorporates its mistakes but there's no reason for their mistakes to be incorporated when we don't alert people to them I'm so that's this sort of unaccountability is a real problem for the model but it's also obviously a real problem for the people who are scored incorrectly because they have no appeal there's no due process and to that point there 
were six teachers in Houston that won a lawsuit they were fired based on their value-added model scores they sued and won and the judge found that their due process rights had been violated and I'm sort of sitting around waiting for that to happen in every other example that I have mentioned but also in lots of lots of other examples that are similar where you have this secret important decision made about you like why is that okay so this is a retroactive I suppose I don't wanna use the word solution but a way of dealing with what has happened I agree that action needs to be taken across the board I'm wondering what some viable solutions are to stop this happening in future and I love the fact that we open this conversation with you telling us that you work in consulting now in particular in algorithmic audits and I'm wondering if that will be a part of the solution going forward and what else we can do as a data science community to make sure that we're accountable I mean yes I mean so there's two different approaches and one of them is transparency and one of them is audit ability and honestly I think we need to consider both very carefully we have to think about what it means for something to be transparent certainly it wouldn't be very useful to hand over the source code to the teachers to tell them oh this is how you're being evaluated on here are the coefficients that we have trained on on this data no that would not be useful so we need to understand what we mean by transparency and I sort of worked out a kind of idea that I think is worth a try it's kind of a sensitivity analysis I mean that's a technical term but really what it looks like is you know hey first confirmed that the data that you have about me is correct next what if you know what if something had changed a little bit what if this kid had gotten a slightly better score what if that kid hadn't been in my class what if I had yet another kid what if I'd been teaching a different school what if I'd been teaching in a different classroom in my school what if I'd had you know 30 kids instead of 20 how would my score change and it's not going to prove everything it would catch obvious errors it would catch obvious instabilities which actually that algorithm in particular had so you know if you found out that your score would go from bad to good based on one small change then you would know that this is a bogus algorithm so that's one idea at the level of transparency but I would insist on suggesting that you know you really don't know whether an algorithm is fair just knowing your own how your own score works even if you really really understood your own score you wouldn't know if it's a fair fairness is a statistical concept it's a notion that we need to understand at an aggregate level so I am pushing for the idea of auditing as just as important as transparency really to ask the questions along the lines of for whom does this algorithm fail does this fail more often for black people or white people just failed more often for women than for men etc and that's a question you cannot get to just by an understanding your own score or whether your own data is correct or incorrect it's a question that has to be asked at a much higher level which much more access now to your point that I myself have an algorithmic auditing company I do but guess what I it doesn't have that many customers sadly and it's a result of the fact that algorithms essentially don't have that much scrutiny we there's not much leverage to convince somebody 
to audit their algorithms I have a some clients and those clients are great and I love them they are clients who really do want to know whether their algorithm is working as intended and they want to know either for their own sake because that money's on the line or the reputations on the line or yeah for some third front on behalf of some third party like the the investors or their customers or the public at large they want to know whether it's working what I really started my company for though is to understand to audit algorithms that I think are generally speaking the algorithms that companies don't want to have audited if you see where I'm going with this like it's those algorithms that are profiting from racism or profiting from bypassing the Americans with Disability Act those are the very algorithms that I want to be auditing but I don't have I don't have those clients yet and I don't have them because we're still living in a sort of plausible deniability situation with respect to algorithms so it may not currently be within these company's interests to be audited right so where do these incentives or where do you see them coming from I can imagine the end game could be legislators catching up with with technology and another thing we currently have is that data scientists and the community as a whole are in relative positions of being able to make requests to their own companies so you could imagine you know we're having this conversation now around checklists versus odds versus codes of conduct within the data science community as a whole and you could imagine algorithmic order it's becoming part of a checklist or an oath or code of conduct so I'm wondering where you see the incentives for companies in late stage capitalism coming from yeah I mean I know there's a lot of really cool data scientists out there and I love them all but I don't expect their power to be sufficient to get their company that they work for to start worrying about this in general so I think it has to come from fear honestly and that's either fear of federal regulators I'm not holding my breath for that to happen or fear of litigation so that you know that essentially they're their compliance officer says you have to do this or else we're taking out too much risk and we're gonna get screwed just in his example like Kyle Beane was applying to work at Kroger's grocery store when he got red lighted by that Chronos algorithm so Kroger is grocery store was licensing the Chronos algorithm with a license agreement that said they wouldn't understand the algorithm that Chronos had built but they understood that if there was any they had this indemnification clause extra contract on top of their licensing agreement that said if there's any problem with this algorithm Chronos would pay for the problem so they would take on the risk but you know Cronus is not a very big company it was working with seven huge companies just in the Atlanta Georgia area taking on the risk which is stupid because honestly like the fair hiring law the ABA the onus is on the large company not on some small data vendor so when when Kyle's father who's a lawyer sued he filed a class-action lawsuit seven class-action lawsuits against every one of those large companies those large companies are on the hook for the settlement if it if it ends up as a settlement it's not in Chronos is going to go bankrupt very very quickly if that ends up being settled for lots of money so I it's just one example but it's I think a very important example to 
demonstrate the fact that the companies using these algorithms for HR or what-have-you and that's often the framework the setup is that like some but some third small company builds the algorithm and then licenses to some large either company or government agency in the case of predictive policing or recidivism or two for that matter teacher evaluation and they can't actually they can't just offshore the the risk because it's those large companies that are going to be on the hook for that for the lawsuits right now the world is that those large companies do not they do not see the risk they do not acknowledge the risk and for so far they've gotten away with it your discussion of Cronus they really reminded me something that really surprised me when reading weapons of mass destruction was how I mean I knew about a lot of these cases but about a lot of the data vendors and small companies that build these models I'd heard of hardly any of them that kind of shocked me with respect to how much impact they are having and can have in the future on on society yeah you know it's this kind of where we as a society are waking up and that's a very important thing the public itself is starting to say hey wait algorithms aren't necessarily fair but how do we how do we know that it's because we use Google search and we we use Facebook and we see these I would say consumer facing algorithms one one by one you know on a daily basis and so we see the flaws of those things and we see the longer term sort of societal effects of being outraged by the news we see on Facebook every day those happen to be sort of obvious examples of problematic algorithms but they're also they also happen to be like some of the hardest biggest most complex algorithms out there I would not know actually how to go about auditing them I mean let me put it this way there there be like a thousand different ways to audit them and you'd have to sort of think really hard about each way I've had how to set up a test whereas just asking whether a specific personality test or application filter which is also used you know an algorithm that filters applications for jobs whether that is legal is a much more finite doable question but because of the nature of those algorithms like we may send in an application for a job we don't even know our application is being filtered by an algorithm so how is the public gonna find out it's it's wrong or it's wrong or they their their application was wrongly classified it's completely hidden from our view and I would say that most of the algorithms that are having strong effects on our lives college admissions officers all use algorithms now too like we don't know about them we can't and we can't complain if they're they go wrong because we just were never made aware of them and yet those are the ones that desperately need to be audited and so in terms of where people can find out more about these these types of algorithms and the challenges we're facing as a society I know you know for example the recidivism work pro-public has done a lot of great work on on that I followed data in society and a inow Institute but I'm wondering do you have any suggestions for where people can read more widely out what's happening now I mean the good news is that there's lots and lots of people thinking about this the bad news is ProPublica AI now any kind of sort of outside group even with the best intentions doesn't have access to these algorithms you know I mean that's a large part of why I did not go that route I'm 
not an academic I'm not I don't have the goal of having a sort of think tank that audits algorithms from the outside because you can't you literally can't audit algorithms that are HR algorithms from the outside you have to be invited in so that's why I started a company that theoretically anyway could be invited in to audit an algorithm but then the the problem I still have you know in spite of the fact that I'm willing to sign a nondisclosure agreement is that nobody wants my services because of this plausible deniability issue literally there are people that have talked to you that they want my services but then their corporate lawyers come on the phone and they say if what if you find a problem of with our algorithm that we don't know how to fix and then like later on with somebody Sue's us and in Discovery it's found that we knew there was a problem with this algorithm that's no good we can't use your services goodbye we'll jump right back into our interview with Cathy after a short segment now it's time for a segment called data science best practices I'm here with Ben Scranton an independent data science consultant hi Ben hi q go it's great to be back on your show do you know what color a unit test is no but I do know that I'm red green colorblind it is a bit of a trick question but your unit test should be red green cream listeners may recall that we've discussed unit tests in a previous segment awesome to recap unit tests are a key software engineering tool in our Arsenal to make sure code is correct they help you catch errors immediately as you write your code and you can check if one of your rocket scientist co-workers push code which broke the build righteous data scientists always run the unit test before pushing their changes so how does this magic happen bin unit tests work by providing a framework that runs your code before the entire application is complete the framework will call your functions or instantiate objects and then check that assertions are true which lets you find out immediately if you coded your brilliant idea correctly when everything is still fresh in your mind and easiest to fix a good framework will even set up the resources to run a test such as a fake database or simulated data before calling your tests those of you who heard the Vivian UQ segment on how to think about correctness of scientific models will recall that this is the verification step and how you provide proof that your code correctly implements the model so Ben do you have a favorite unit test framework in Python check out unit tests and in our hadley wickham's test that really the most important thing is that you use a unit test framework look for the one which provides the least friction for you and your team so what is this chromatic magic you mentioned red green green is one of the best philosophies for writing unit tests and part of the test-driven development approach the idea is to make the feedback loop on whether the code you wrote is correct as tight as possible this will boost your productivity and increase the quality of your work essentially for free so how does it look the first red means that you should write your unit test code and write a stub for the function you want to test then when you run the unit test it should fail hence the name red this gives you confidence that your unit test will fail when the code doesn't work and the first green the first green means that you then implement the code and get the unit test to work at this stage you only care about correctness 
don't forget canuse comment that premature optimization is the root of all evil in programming cool and the final green the final green means that now that your code and unit tests work you can refactor it to make it faster cleaner or add other improvements because you have a unit test you know that your optimizations which you should make only if necessary were correct remember only refactor in the presence of working tests Martin Fowler may have said something to that effect I'm excited to try this out next time I write some tests yeah once you start writing code this way you will have much more confidence in your system and be more productive because unit tests enable you to catch most bugs sooner while you are first implementing the code bugs are just easier to fix when you still have all the code stacked up in your head thanks Ben for that dive into test-driven development time to get straight back into our chat with Kathy so I'd like to move slightly and just think about the broader context of data science and the data revolution and I'm wondering what other important challenges you think we're now facing with respect to the amount of data there is data privacy and all the work that's happening I mean I'd say the biggest problem is that you know we live in a putative Lee free society and we're having a lot of problems out how to deal with this in a large part because it doesn't it doesn't give way to that many individual stories like I think I found a few stories like Kyle beam story etc that I find some teachers who were fired unfairly by their value-added model but the way our policymaking works in this country is like they need to find victims and the people get outraged and then they complain and then the policymakers pass laws and the nature of this statistical harm is different and it's harder to measure and so it's harder it's harder to imagine laws being passed and that's the best case scenario when you live in a society that is actually cares I guess the best best case scenario might be happening in Europe where they actually do pass laws although I think it's much more focused on privacy and less focused on this kind of algorithmic discrimination but in terms of what I worry about the most I'm looking at places like China with their social credit score which are intrinsically not trying to be fair they are just sort of explicitly trying to nudge people or strong-arm people really into behaving well and sort of there are social control mechanisms and they're going to be very very successful so what lessons do you think we can take from history to help approach this issue and I mean in particular from you know previous technological revolutions such as the Industrial Revolution but there may be others well I mean I so there's lots of different answers to that one of them is like it took us a while to catch up with pollution because it was like who in particular is harmed by a polluted river so it's kind of mixed journal what is it called externality and so we have externalities here which we are not keeping track of in the same kind of way and it took us a while to actually care enough about our environment to worry about how chemicals change it but we ended up doing that in large part because of the book Silent Spring but other things as well and then another example I like to give is if you think about the exciting new invention called the car like people were super excited by the car but it was also really dangerous and over time we have kept track of car related deaths 
and we have like lower them quite a bit because of inventions like the seatbelts and the crash test dummies etc and we started paying attention to like what makes something safer not to say that they're totally safe because they're not they're still not but we have traded the convenience for the risk I feel like best-case scenario in our future interactions with algorithms we're going to be doing a similar kind of trade where we're like we need algorithms they're so efficient and convenient but we have to be aware of that risk and the first step of that is to sort of measure the deaths we measured car deaths car related deaths we need to measure algorithmic related harm and that goes back to the point I'm most making at least twice already which is that we aren't we aren't not aware at currently of the harm because we're because it's invisible to us and so when I talk to policy makers which I do I beg them to not to regulate algorithms by saying you know here's how you have to make an algorithm cuz I think that would be possibly too restrictive but regulate algorithms and saying tell us how this is going wrong measure your harm show us who's getting harmed that's the very first step in understanding how to make things safer and I think this speaks also to a greater general cultural contextual challenge we're facing is in that as part of a political cycle the amount of debts incurred in a society forms a fundamental part in a lot of respects you know in America and in other countries but the amount of unfairness and poverty isn't necessarily something that's discussed in the same framework right can you say that again yes so deaths are something which immediately quantifiable and be able to brought to legislators and politicians as part of the political cycle whereas the amount of poverty isn't necessarily something that is as interesting in the news cycle and the political cycle yeah that's that's a good point I mean it's it's harder to quantify inequality than it is to quantify deaths and you know that goes back to like the question of what does our political system respond to if anything I mean right now it's just a complete show but like you know even in the best of times it responds better to you know stories of cruelty and death and it does to silent mistakes that nevertheless cost people real opportunities so it's you know it's hard to measure what is an opportunity loss cost to like not getting a particular job or not being here's another one that's not even relevant to current lawsuits going on with Facebook like not being shown at an ad for a job that you might have wanted because Facebook's getting in trouble for showing ads to only to young people and so it's like an age issue or only two men so so women don't get to see the STEM related job ads and so that and so well how much harm is that for a given person you know it's it's not obviously the most harmful thing that's ever happened to someone so it's not as exciting at a policy level but if it happens systemically which it does like it's a problem for society yeah and that speaks to another really important point in terms of accountability and trance Farren see that you can be shown stuff in your online experience that I and I'm shown something totally different and legislators are shown something entirely different and this is something that we this type of targeting is it is a relatively new phenomenon that's right I mean it's one of the reasons it's so hard to pin down is that it's going to my earlier point you get to see 
what you get to see but but that's not a statistical statement about what people get to see flipping that from in the other direction it is an amazing tool for predatory for predatory actions like payday lending or for-profit colleges it's like they can't believe how lucky they got they used to have a lot of trouble locating their victims desperate poor people but now it's like they couldn't be happier because they've got this this system and it's called the internet that finds them for for them and they're cheaply and on mass and scaling and is in a way that's exceedingly easy to scale so it's it's a fantasy come true for for those kinds of bad actors but then the question becomes how do we even keep track of that if they are actually going after those people that are in sort of a very real way voiceless and don't have the political capital to make their problems a priority so I've got one final question for you Kathy you know we're in the business of data science education here at data camp because of that a lot of our listeners will be the data analysts and data scientists of the future and I'd like to know what you'd like to see them do in their practice that isn't happening yet I just wrote a paper it's not out yet but it will be out pretty soon about ethics and artificial intelligence with philosopher named Hannigan it's called the ethical matrix and actually don't know it is called but I think it's something along the lines of the ethical matrix and at least it introduces this concept of an ethical matrix and it's a very simple idea the idea is to broaden our definition of what it means for an algorithm to work so when you ask somebody does this algorithm work they always say yes and then you say what do you mean and they just like oh it's efficient and so you know you're like oh but beyond that does it work and and you know that's when it becomes like what do you mean is that an and and then even if they want to go there they're like is that an in complicated question that I don't know how to attack I'm so the idea of this ethical matrix is a sort of to give a rubric to address this question and it's something that I claim but we should do before we start building an algorithm that we should do with everyone who is involved and so like to that point the first step in building of ethical matrix is to understand who the stakeholders are I mean to get those stakeholders involved in the construction of the of the matrix and to embed the the values of the stakeholders you know in a balanced way relative to their concerns so the rows are the sake holders the columns are the concerns and then you go through each cell of the matrix and try to decide are these stakeholders at high risk for this concern to go very very wrong it's as basically as simple as that but our theory is that if this is if this becomes part of the yoga of building a data-driven algorithm then it will theoretically at least help us consider much more broadly what it means for an algorithm to work what it means for it to have long-term negative consequences things to monitor for making sure that they're not going wrong etc and it will bring us from the from the narrow point of view it's working because it's working for me and I'm making money which is like I call the one by one ethical matrix the stakeholder is me or my company and the only concern is profit brought in that out to look at all the people that were affecting look at all this including maybe the environment look at all the concerns they might have fairness 
In this episode 50, our season one finale of DataFramed, a DataCamp podcast, I have the great pleasure of speaking with Cathy O'Neil: data scientist, investigative journalist, consultant, algorithmic auditor, and author of the critically acclaimed book Weapons of Math Destruction. Cathy and I will discuss the ingredients that make up weapons of math destruction, which are algorithms and models that are important in society, secret, and harmful, from models that decide whether you keep your job, a credit card, or insurance, to algorithms that decide how we're policed, sentenced to prison, or given parole. Cathy and I will be discussing the current lack of fairness in artificial intelligence, how societal biases are perpetuated by algorithms, and how both transparency and auditability of algorithms will be necessary for a fairer future. What does this mean in practice? Stick around to find out.
As Cathy says, fairness is a statistical concept, a notion that we need to understand at an aggregate level. And moreover, data science doesn't just predict the future, it causes the future. I'm Hugo Bowne-Anderson, a data scientist at DataCamp, and this is DataFramed.

Welcome to DataFramed, a weekly DataCamp podcast exploring what data science looks like on the ground for working data scientists and what problems are being solved. I'm your host, Hugo Bowne-Anderson. You can follow me on Twitter at @hugobowne and DataCamp at @datacamp. You can find all our episodes and show notes at datacamp.com/community/podcast.

Listeners, as always, check out the show notes for more material on today's conversation. I've also included a survey in the show notes and a link to a forum where you can make suggestions for future episodes. I'd really appreciate it if you take the survey, so I can make sure that we're producing episodes that you want to hear. Now, back to our regularly scheduled programming. Before we dive in, I'd like to say thank you all for tuning in all year, and thank you to all our wonderful guests. We have had so much fun producing these 50 episodes and cannot wait to be back on the airwaves in 2019. We'll be back with season two early in 2019, and to keep you thinking, curious, and data-focused between seasons, we're having a DataFramed challenge. The winner will get to join me on a segment here on DataFramed. The challenge is to listen to as many episodes as you can and to tweet excerpts that you find illuminating to @datacamp, @hugobowne, and the relevant guests, using the hashtag #DataFramedChallenge. At the start of season two, we'll randomly select the sender of one of these tweets to join me on a podcast segment. The more tweets you send, the more chances you have; however, we'll delete the duplicates first. That's #DataFramedChallenge.

Hi there, Cathy, and welcome to DataFramed.

Thank you, I'm glad to be here.

It's such a great pleasure to have you on the show, and I'm really excited to talk about a lot of different things surrounding data science, ethics, algorithmic bias, and all of these types of things. But before we get into the nitty-gritty, I'd love to know a bit about you. So perhaps you can start off by telling us what you do and what you're known for in the data community.

Well, I'm a data science consultant, and I started a company to audit algorithms, but I guess I've been a data scientist for almost as long as there's been that title. Actually, I would argue that I was a data scientist before that title existed, because I worked as a quant in finance starting in 2007, and I think of that as a data science job, even though other people might not agree. The reason I'm waffling is that when I entered data science, maybe I would say in 2011, a large question in my mind was: to what extent is this a thing? So I wrote a book called Doing Data Science just to explore that question. I co-authored it with Rachel Schutt, and the idea of that book was: what is data science? Is it a thing? Is it new? Is it important? Is it powerful? Is it too powerful? Things like that. So what am I known for? I think I'm known for being a gadfly, for sort of calling things out. Possibly, you know, people think of me as overly negative about the field; I think of myself as the antidote to Kool-Aid.
Well, yeah, I think of a lot of the work you've done as a kind of restoring or restorative force to a lot of what's happening at blinding speed in tech, in data science, and in how algorithms are becoming more and more a part of our daily lives.

Yeah, that sounds nice, thank you.

You're welcome. And I like the way you motivated the book that you co-wrote, Doing Data Science, in terms of exploring what data science is, because you actually have a nice working definition in there, which is something along the lines of a data-savvy, quantitatively minded, coding-literate problem solver, and that's how I like to think of data science work in general as well.

Yeah, you know, I've actually kind of discarded that definition in preference for a new definition, which I just came up with a couple of months ago but I'm into. This is a way of distinguishing the new part of data science from the old part of data science, so it's not really a definition of data science per se, but it is sort of a definition of what I worry about in data science, if you will, which is that data science doesn't just predict the future, it causes the future. That distinguishes it from, say, astronomers. Astronomers use a lot of quantitative techniques, they use tons of data, they're not new. So are they data scientists in the first definition that you just told me from my book? Probably yes. But in the second definition, no, because the point is that they can tell us when Halley's Comet is coming back, but they're not going to affect when Halley's Comet is coming back. And that's the thing that data scientists need to understand about what they're doing: for the most part, they are not just predicting, they are causing, and that's where you get these sort of society-wide feedback loops that we need to start worrying about.

I agree completely, and I look forward to delving into these feedback loops. This idea of a feedback loop in data science work, in algorithms and modeling, is one of the key ingredients of what you call a weapon of math destruction, which I really look forward to getting back to. But I like the idea that you've moved on from the quasi-definition you had in your book Doing Data Science, because Doing Data Science was written five or six years ago now, is that right?

Yeah, 2012, I think.

Right. So I'm wondering, looking back on that, if you were to rewrite it or do it again, what do you think is worth talking about now that you couldn't see then?

Well, to be clear, each chapter was a different lecture in a class at Columbia. It was taken from those lectures, including a lecture I gave about finance, including a lecture where we had the data scientists from Square speak; we had people who were probably not considered data scientists but statisticians speaking, et cetera. So it was like a grab bag, and in that way it was actually really cool, because it was all over the place and broad, and we could see how these various techniques were broadly applicable, and we could also go into networks in one chapter and time series in another. That was neat, because we could have a survey, if you will, of stuff, but it wasn't meant to be a deep dive in any given direction. If I rewrote it now, and if I kept with that survey approach, I would be surveying a totally different world, because we have very different kinds of things going on now.
I guess we also have some throughlines, some things that are still happening now that were happening then, that I'd emphasize more. In particular, I think I would spend a lot more time on recommendation engines, although we do have a chapter on recommendation engines from the former CEO of Hunch, I believe, on trying to understand a person via twenty questions and then recommend what kind of iPhone they should buy, or something like that. But nowadays I spend a lot more time exploring things like: to what extent do the YouTube recommendations radicalize our youth?

That's really interesting, because I think what that does is put data science and data science work, as we've been discussing already, into a broader societal context, and assess and communicate around the impact of all the work that happens in data science. So I think that provides a nice segue into a lot of the work you've done, which culminated in your book Weapons of Math Destruction, that I'd like to spend a bit of time on. Could you tell me the basic ingredients of what a weapon of math destruction actually is?

Sure. A weapon of math destruction is a kind of algorithm that I feel we're not worrying enough about. It's important, and it's secret, and it's destructive. Those are the three characteristics. By important I mean it's widespread, it's scaled, it's used on a lot of people for important decisions. I usually think of the categories of decisions as the following: financial decisions, so it could be a credit card or insurance or housing; livelihood decisions, like do you get a job, do you keep your job, are you good at your job, do you get a raise; liberty, so how are you policed, how are you sentenced to prison, how are you given parole, your actual liberty; and then the fourth category would be information, so how are you fed information, how is your environment, online in particular, shaped by algorithms, and what kind of long-term effects is that having on different parts of the population. So those are the four categories; they're important. One of the things that I absolutely insist on when we talk about weapons of math destruction, or algorithms, or regulation in particular, is that we really focus in on important algorithms. There are just too many algorithms to worry about, so we have to triage and think about which ones actually matter to people. The second thing is that they're secret. Almost all of these are secret; people don't even know that they exist, never mind understand how they work. And then finally, they're making important, secret decisions about people's lives, and they make mistakes, and that's destructive for the individual who doesn't get the opportunity, or the job, or the credit card, or the housing opportunity, or who gets imprisoned too long. So it's destructive for them, but as an observation, and this goes back to the feedback-loop thing, it's not just destructive for an individual: it actually undermines the original goal of the algorithm and creates a destructive feedback loop at the level of society.

Yeah, and a point you make in your book, which we may get to, is that they can also feed into each other and exacerbate conditions that already exist in society, such as being unfair to already underrepresented groups. Before we get there, though, you've provided a nice framework for the different buckets these algorithms and WMDs fall into, but could you provide a few concrete examples of what you consider to be the most harmful WMDs?
Yeah, I'll give you a few, and I choose these in part because they're horrible, but also because they all fail in totally different ways, and I want to make the point that there's not one solution to this problem. The first one comes from the world of teaching, public school teaching. There was a thing called the value-added model for teachers, which was used to fire a bunch of teachers, and unfairly, because it turned out it was not much better than a random number generator. It didn't contain a lot of information about a specific teacher, and in instances where it did seem to produce an extreme value, it was often because a previous teacher had cheated. So you couldn't really control your number, but if a previous teacher cheated, your number would go down. It was this crazy system.

Yeah, because if I remember correctly, the baseline is set by where your students were in the previous year?

Right, yeah. The idea was: how well did your students do relative to their expected performance on a standardized test? That was a very noisy question, in terms of statistics, unless the previous teacher in the previous year had cheated on those kids' tests, and those kids did extremely well relative to what they actually understood, which would force them, of course, to do extremely badly the next year, even if you're a good teacher, so it would look really bad for you. Long story short, even when there wasn't cheating involved, it was just a terrible, statistically non-robust model, and yet it was being used to fire people. So that's the first example.

The next example is from hiring, and it's a story about Kyle Behm, a young man who noticed, in a personality test that he had to take to get a job, and that he failed, that some of the questions were exactly the same questions he had been given in a mental health assessment when he was being treated for bipolar disorder. So that was an embedded, illegal mental health assessment; it's illegal under the Americans with Disabilities Act, which makes it illegal for any kind of health exam, including a mental health exam, to be administered as part of a hiring process. That's another example. And I should add that it wasn't just one job: Kyle ended up taking the same exact test seven different times, when he applied to seven different chain stores, all of them in the Atlanta, Georgia area. So he wasn't just precluded from that one job; he was precluded from almost any minimum-wage work in the area. And it wasn't just him, it was anybody who would have failed that mental health assessment, which is to say a vast community of people with mental health conditions. So that's a great example of the feedback loop I was mentioning: because of the scale of this Kronos test, it wasn't just destructive for the individual, it was undermining the exact goal of the personality test, and also undermining the overall goal of the ADA, which is to avoid the systematic filtering out of a subpopulation. That's the second example.

The third example I would give is what we call recidivism risk algorithms in the criminal justice system, where you have, basically, again, questionnaires that end up with a score for recidivism risk that is handed to a judge, and the judge is told: this is an objective, scientific measurement of somebody's risk of recidivism.
Recidivism being the likelihood of being arrested again after leaving prison. And the problem with that, well, there are lots of problems with that, but the very immediate problem is that the questions on the questionnaire are almost entirely proxies for race and class. They ask questions like: did you grow up in a high-crime neighborhood? I mean, you grew up in a high-crime neighborhood if you're a poor black person; in fact, that's almost the definition of a high-crime neighborhood. That's where the police are sent to arrest people, historically, from the broken-windows theory of policing to the present day, and, by the way, I should add that in part that has been propagated by another algorithm, which is predictive policing. So you're being asked all these proxies for poverty, proxies for race and class. Other questions are like: are you a member of a gang, do you have mental health problems, do you have addiction problems? A lot of this kind of information is only available, or only held against you, if you are poor. Rich people, white people, get treated; they don't get punished for this kind of thing. So, long story short, it's basically a test to see how poor you are and how minority you are, and if your score is higher, which it is if you're poor and if you're black, then you get sent to prison for longer.

Now, I should say, as toxic as that algorithm is, and as obvious as it is that it creates negative feedback loops, one of the things the jury is still out on is whether that is actually so different from what we have already. We already have a sort of racist, classist system, not to mention judges, and we have evidence for that, and the idea was: we're going to do better, we're going to be more scientific, more objective. It's not at all clear that this kind of scoring system does so. Nor is it clear, by the way, because there has been some amount of testing since my book came out on how judges actually use these scoring systems, that they use them the way they're intended. There's all sorts of evidence now that judges either ignore them, or ignore them in certain cases but listen to them in others. For example, they ignore them in black courtrooms and use them in white courtrooms, so, especially when they're being used for pretrial detention considerations, they'll let white people out of incarceration pretrial, but then ignore the scores in urban districts, where they keep black people incarcerated before trial. Long story short, there are also a lot of questions around how they're actually being used, but it's a great example of a weapon of math destruction, created as if the mere nature of algorithms will make things more fair.

I mean, I guess going to your earlier point, no algorithm is perfect, and we couldn't expect it to be perfect.

But the reason these society-wide destructive feedback loops get created by these algorithms isn't just that they're imperfect; it's how they're being used. As I said in that example, but more broadly, they're funneling people, of different classes, different genders or races, different mental health or disability statuses, onto a path which they were, quote-unquote, already on, depending on their demographics.
Yeah, and I think, speaking to your point that these algorithms may not be creating new biases, I mean, they may as well, but that they're encoding societal biases and keeping people on a path they may have been on already, something distinct from that is that they're actually scalable as well, right?

Yeah, right. So we shouldn't be surprised, of course, now that we say it out loud, that they're propagating past practices, automating the status quo. They're just doing what was done in the past and acting like, oh, since this happened in the past in a pattern, we should predict that it will happen in the future. But the way they're actually being utilized means not just that we predict it will happen, but that we're going to cause it to happen. If you are more likely to pay back a loan, you're more likely to get a loan, so the people who are deemed less likely are going to be cut out of the system; they're not going to be offered a credit card. And since all the algorithms work in concert, and similarly to each other, this becomes like a rule, and it's highly scaled even if it's not the exact same algorithm, which it was in the case of Kyle Behm with the Kronos algorithm, the same exact algorithm being used. Even if it isn't, the fact is that data scientists do their jobs similarly across different companies in the same industry, so online credit decisioning is going to be based on similar kinds of demographic questions.
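To make the feedback loop Cathy describes concrete, here is a minimal, purely illustrative Python sketch. It is not any real credit model: the cutoff, the two groups, and the update rules are all invented for the example. The only point is that when a score gates access to credit, and only people with access can improve their score, the model doesn't just predict the initial gap, it entrenches it.

```python
import random

def simulate_credit_loop(n_people=1000, n_rounds=10, cutoff=640, seed=42):
    """Toy feedback loop: the score decides who gets credit, and only people
    who already get credit have the chance to improve their score."""
    random.seed(seed)
    groups = ["A" if i % 2 == 0 else "B" for i in range(n_people)]
    # Hypothetical starting scores: group B starts slightly behind group A.
    scores = [random.gauss(620 if g == "A" else 600, 40) for g in groups]
    for _ in range(n_rounds):
        # Approved applicants build a credit history and their score rises;
        # denied applicants never get the chance, and theirs drifts down.
        scores = [s + 10 if s >= cutoff else s - 5 for s in scores]
    summary = {}
    for g in ("A", "B"):
        member_scores = [s for s, grp in zip(scores, groups) if grp == g]
        summary[g] = {
            "approval_rate": sum(s >= cutoff for s in member_scores) / len(member_scores),
            "mean_score": sum(member_scores) / len(member_scores),
        }
    return summary

print(simulate_credit_loop())
```

With these made-up update rules, nobody below the cutoff can ever climb back over it, so the initial gap between the two groups is frozen in and their score distributions pull further apart every round: the model causes the very pattern it claims merely to predict.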
We'll jump right back into our interview with Cathy O'Neil after a short segment.

Now it's time for a segment called Data Science Best Practices. I'm here with Heather Nolis, a machine learning engineer at T-Mobile. Hey, Heather.

Hey, Hugo.

Heather, we have a lot of problems in data science, and a huge practical one is running code on different computers, operating systems, in production, in the cloud, and so on, right?

Absolutely. For me, whenever I start a new job, the worst part is those first few days where you just spend your time trying to get your computer to work like you'd expect a development or data science computer to work. It requires a lot of setup: you have to install the correct operating system, get your programming languages installed, find the IDE that everybody else is using, make sure that you have the right versions, then download all of the right packages and any libraries that you want, and so on. It takes a long time, it's really complicated, and for me, like I said, it's the most painful part of starting a new job. And no matter how well somebody thinks they've documented the process to set up a computer to run their code, they probably haven't. This becomes an even bigger problem when you're thinking about running something in production, or in the cloud, or on tons of other computers, where you have to have multiple versions stably existing in a scalable capacity, because how can you make sure that the setup is correct on all of those machines?

Are there any solutions to this?

There are, and today I would like to tell you about one called Docker. Just imagine that you could write all of the instructions for setting up your computer once, in a very clear, understandable, legible language, and then have that setup happen on any machine that you want. That is essentially what Docker does. For data scientists, the most useful thing is that it eliminates the machine setup and allows for seamless code handoff: you're working on a model on your machine, you really need it to run on your co-worker's machine, and you can just wrap it all up in a Docker container and hand it over. They don't even need a development environment for R or Python or whatever you're using; they don't need to have any language installed on their computer, in fact.

Yeah, I'm a huge fan of Docker. Can you tell us a bit about how it works?

A Dockerfile is a set of instructions that tells the machine how to set itself up and then run your code. When a Dockerfile is finished building, we call that a snapshot, and that's a Docker image; it's just a picture in time of what this little miniature machine looks like. When the image is run, it creates a container, which is a mini virtual machine. Containers are really cool because you can have a lot of them running on your computer and they don't interact unless you tell them to. So, for instance, when I'm writing programs in Python 2.7 and 3.6 on the same machine, I run into a lot of trouble because the libraries communicate with each other, and it's a huge hassle to keep them from interacting. But if I do it in containers, I can have a container written in Python 2.7 and a container written in Python 3.6 running on the same machine at the same time, and neither one of them complains. This is wonderful protection in environments where people are writing in all sorts of languages: because each container is isolated, it allows for all sorts of code to be used, scaled, and maintained on a single server. For data scientists, the important thing is that if you can wrap up your model into a Dockerfile, no setup is needed: you can just hand it to an engineer and your code will just run. Even cooler, others can do the work of constructing a Docker image for you. If you're thinking, I don't know anything about Docker and I want to do this for the very first time, you can go to Docker Hub and find images that other people have already created: images for Python, images for R, images for Python 2.6 with a really specific package installed. There's probably a Docker image for Fortran if you want; absolutely anything that you want, somebody's probably started making it.

And do you use Docker at T-Mobile?

We do. At T-Mobile, our goal is to create machine learning models that run constantly, in production, in customer-facing ways. Not wanting to create our models and then recreate them in Python, we decided to use R to create all of our APIs. Docker is perfect for this, because the setup of our models is really complicated: we do deep learning, so we need R and a bunch of R packages, including Keras, and Keras runs Python on the back end; then you need a bunch of really specific Linux libraries to make sure that your Python is running correctly; and then we need a way for our API to pass the security requirements for T-Mobile as a whole, because you can't just have these APIs open out in the world. Our initial Docker image for doing this topped six gigs, which caused a ton of trouble; it actually took down production clusters. But after a lot of work we've gotten it down to about 1.485 gigs, which is well within the range our DevOps team finds acceptable, and we're going to release it in a few weeks on Docker Hub. It includes all the necessary functionality and security features you would need to run a deep learning R TensorFlow/Keras model in production, and it will be open-sourced, because we're super passionate about using R in production and we want to empower data scientists who are most comfortable in R to go ahead and make their own APIs without having to rely on a Python dev to eventually do that work for them.
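As a hedged illustration of the container isolation Heather describes, here is a short Python sketch using the Docker SDK for Python (the `docker` package), which is not mentioned in the episode. It assumes Docker is installed and the daemon is running, and it simply runs two containers with different Python versions side by side.

```python
# pip install docker; assumes a local Docker daemon is running.
import docker

client = docker.from_env()

# Two isolated containers with different Python versions, as in Heather's example.
for image in ("python:2.7-slim", "python:3.6-slim"):
    output = client.containers.run(
        image,
        ["python", "-c", "import sys; print(sys.version.split()[0])"],
        remove=True,  # discard the container once the command exits
    )
    print(image, "->", output.decode().strip())
```

Each run pulls the image from Docker Hub if it isn't already present locally, and the two environments never touch each other's libraries, which is exactly the property that makes handoff and side-by-side versions painless.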
Are there any potential pitfalls you'd like to warn our listeners about?

Learning Docker can be a little bit confusing at first, just because it's a totally different way of setting up a file, and if you've never used containers before it can take a bit of a brain switch to get used to, but after that it's really simple. Aside from that, the other thing you really have to look out for is that if you want your containers to be super production-stable, they have to be really small, and by really small I mean probably under two gigs. When you go putting out twenty-gig Docker containers because it was easy, DevOps doesn't get too happy. With R, that means eliminating a lot of the extra packages, and R is not super great at tracking packages, so you just have to be really thoughtful, whenever you install things into your Docker image, about how it will affect the size. After that, it's pretty intuitive.

Thanks, Heather, for that introduction to Docker. After that interlude, it's time to jump back into our chat with Cathy.

There are a lot of different avenues we can take here, and for people who want more after this conversation, I highly recommend Cathy's book, Weapons of Math Destruction. Something I'd like to focus on is that in all of these models, you know, the value-added model for teaching, the hiring model, the models to predict recidivism rates, one really important aspect is that they're not interpretable: we can't tell why they make the predictions they do. The fact that they're black-box in that sense, the relationship between this inability to interpret them, the inability of a teacher to go and ask, why have you given me this rating, only to be pointed to the algorithm, combined with the scalability, really makes for an en masse lack of accountability and lack of fairness, correct?

Yeah, that's exactly right. I talk about that as, in fact, a characteristic of a weapon of math destruction: that it's secret. And that's a really important part of it, because when you have something that's important and secret, it's almost always going to be destructive. A good data science model has feedback loops and incorporates its mistakes, but there's no reason for those mistakes to be incorporated when we don't alert people to them. So this sort of unaccountability is a real problem for the model, but it's also obviously a real problem for the people who are scored incorrectly, because they have no appeal; there's no due process. And to that point, there were six teachers in Houston who won a lawsuit: they were fired based on their value-added model scores, they sued and won, and the judge found that their due process rights had been violated. I'm sort of sitting around waiting for that to happen in every other example I've mentioned, and in lots of other similar examples, where you have this secret, important decision made about you. Like, why is that okay?

So this is a retroactive, I suppose, I don't want to use the word solution, but a way of dealing with what has happened. I agree that action needs to be taken across the board, and I'm wondering what some viable solutions are to stop this happening in the future. I love the fact that we opened this conversation with you telling us that you work in consulting now, in particular in algorithmic audits, and I'm wondering whether that will be a part of the solution going forward, and what else we can do as a data science community to make sure that we're accountable.
I mean, yes. There are two different approaches: one of them is transparency and one of them is auditability, and honestly I think we need to consider both very carefully. We have to think about what it means for something to be transparent. It certainly wouldn't be very useful to hand over the source code to the teachers and tell them, oh, this is how you're being evaluated, here are the coefficients we trained on this data. No, that would not be useful. So we need to understand what we mean by transparency, and I've worked out a kind of idea that I think is worth a try. It's a sensitivity analysis, I mean, that's the technical term, but really what it looks like is: hey, first confirm that the data you have about me is correct; next, what if something had changed a little bit? What if this kid had gotten a slightly better score? What if that kid hadn't been in my class? What if I'd had yet another kid? What if I'd been teaching at a different school, or in a different classroom in my school? What if I'd had thirty kids instead of twenty? How would my score change? It's not going to prove everything, but it would catch obvious errors, and it would catch obvious instabilities, which, actually, that algorithm in particular had. If you found out that your score would go from bad to good based on one small change, then you would know that this is a bogus algorithm. So that's one idea, at the level of transparency.

But I would insist that you really don't know whether an algorithm is fair just from knowing how your own score works. Even if you really, really understood your own score, you wouldn't know if it's fair. Fairness is a statistical concept; it's a notion that we need to understand at an aggregate level. So I am pushing for the idea of auditing as just as important as transparency: to ask questions along the lines of, for whom does this algorithm fail? Does it fail more often for black people than for white people? Does it fail more often for women than for men? Et cetera. That's a question you cannot get to just by understanding your own score, or whether your own data is correct or incorrect. It's a question that has to be asked at a much higher level, with much more access.
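A minimal sketch of the two audit ideas Cathy describes here, the one-record "what if" sensitivity check and the aggregate "for whom does it fail" question, might look something like the following Python. The model, field names, and perturbation values are hypothetical placeholders; only the shape of the checks is the point.

```python
import random
from collections import defaultdict

def sensitivity_report(model, record, what_ifs, trials=200, seed=0):
    """Perturb one field at a time and report the worst-case swing in the score.
    `what_ifs` maps a field name to the alternative values to try."""
    random.seed(seed)
    base = model(record)
    report = {}
    for field, alternatives in what_ifs.items():
        worst = 0.0
        for _ in range(trials):
            candidate = dict(record)
            candidate[field] = random.choice(alternatives)
            worst = max(worst, abs(model(candidate) - base))
        report[field] = worst
    return report

def failure_rates_by_group(records, truths, predictions, group_field):
    """For whom does the algorithm fail? Error rate per demographic group."""
    errors, counts = defaultdict(int), defaultdict(int)
    for rec, y, y_hat in zip(records, truths, predictions):
        g = rec[group_field]
        counts[g] += 1
        errors[g] += int(y != y_hat)
    return {g: errors[g] / counts[g] for g in counts}

# Hypothetical usage with a toy scoring function:
toy_model = lambda r: 0.6 * r["test_score_gain"] + 0.4 * r["class_size"] / 30
teacher = {"test_score_gain": 0.2, "class_size": 20}
print(sensitivity_report(toy_model, teacher,
                         {"test_score_gain": [0.1, 0.2, 0.3], "class_size": [20, 30]}))
```

If a small, plausible change to one field swings a score from bad to good, that is the instability Cathy describes in the value-added model; if the per-group error rates differ sharply, that is the aggregate fairness question an audit has to answer.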
Now, to your point that I myself have an algorithmic auditing company: I do, but guess what, it doesn't have that many customers, sadly, and that's a result of the fact that algorithms essentially don't face that much scrutiny. There's not much leverage to convince somebody to audit their algorithms. I have some clients, and those clients are great and I love them; they are clients who really do want to know whether their algorithm is working as intended, and they want to know either for their own sake, because their money or their reputation is on the line, or on behalf of some third party, like their investors, their customers, or the public at large. They want to know whether it's working. What I really started my company for, though, is to audit the algorithms that, generally speaking, companies don't want to have audited, if you see where I'm going with this: the algorithms that are profiting from racism, or profiting from bypassing the Americans with Disabilities Act. Those are the very algorithms I want to be auditing, but I don't have those clients yet, and I don't have them because we're still living in a sort of plausible-deniability situation with respect to algorithms.

Right, so it may not currently be within these companies' interests to be audited. So where do these incentives come from, or where do you see them coming from? I can imagine the end game could be legislators catching up with technology, and another thing we currently have is that data scientists, and the community as a whole, are in a relative position of being able to make requests of their own companies. You could imagine, you know, we're having this conversation now around checklists versus oaths versus codes of conduct within the data science community, and you could imagine algorithmic audits becoming part of a checklist or an oath or a code of conduct. So I'm wondering where you see the incentives for companies, in late-stage capitalism, coming from.

Yeah, I mean, I know there are a lot of really cool data scientists out there, and I love them all, but I don't expect their power to be sufficient to get the companies they work for to start worrying about this in general. So I think it has to come from fear, honestly, and that's either fear of federal regulators, and I'm not holding my breath for that, or fear of litigation, so that essentially their compliance officer says, you have to do this or else we're taking on too much risk and we're going to get screwed. In this example, Kyle Behm was applying to work at a Kroger grocery store when he got red-lighted by that Kronos algorithm. Kroger was licensing the Kronos algorithm, with a license agreement that said they wouldn't understand the algorithm that Kronos had built, but they had an indemnification clause, an extra contract on top of their licensing agreement, that said if there's any problem with this algorithm, Kronos would pay for it, so Kronos would take on the risk. But Kronos is not a very big company, and it was working with seven huge companies just in the Atlanta, Georgia area, taking on the risk, which is stupid, because honestly, under the fair hiring law, the ADA, the onus is on the large company, not on some small data vendor. So when Kyle's father, who is a lawyer, sued, he filed class-action lawsuits, seven class-action lawsuits, against every one of those large companies, and those large companies are on the hook for the settlement, if it ends up as a settlement; it's not on Kronos, which would go bankrupt very, very quickly if this were settled for lots of money. So it's just one example, but I think it's a very important one, because it demonstrates that the companies using these algorithms, for HR or what have you, and that's often the setup, that some small third company builds the algorithm and then licenses it to some large company or government agency, in the case of predictive policing or recidivism or, for that matter, teacher evaluation, can't just offshore the risk, because it's those large companies that are going to be on the hook for the lawsuits. Right now, the world is such that those large companies do not see the risk, they do not acknowledge the risk, and so far they've gotten away with it.

Your discussion of Kronos reminded me of something that really surprised me when reading Weapons of Math Destruction: I knew about a lot of these cases, but of the many data vendors and small companies that build these models, I'd heard of hardly any of them, and that kind of shocked me, with respect to how much impact they are having, and can have in the future, on society.
Yeah. You know, this is kind of where we as a society are waking up, and that's a very important thing: the public itself is starting to say, hey, wait, algorithms aren't necessarily fair. But how do we know that? It's because we use Google Search and we use Facebook, and we see these, I would say, consumer-facing algorithms one by one, on a daily basis, so we see the flaws of those things, and we see the longer-term, societal effects of being outraged by the news we see on Facebook every day. Those happen to be obvious examples of problematic algorithms, but they also happen to be some of the hardest, biggest, most complex algorithms out there. I would not actually know how to go about auditing them; let me put it this way, there would be like a thousand different ways to audit them, and you'd have to think really hard about how to set up a test for each one. Whereas just asking whether a specific personality test, or an application filter, which is also used, you know, an algorithm that filters applications for jobs, whether that is legal, is a much more finite, doable question. But because of the nature of those algorithms, we may send in an application for a job and not even know our application is being filtered by an algorithm, so how is the public going to find out that it's wrong, or that their application was wrongly classified? It's completely hidden from our view. And I would say that most of the algorithms that are having strong effects on our lives, college admissions offices all use algorithms now too, we don't know about, and we can't complain when they go wrong, because we were just never made aware of them. And yet those are the ones that desperately need to be audited.

So in terms of where people can find out more about these types of algorithms and the challenges we're facing as a society, I know, for example, that ProPublica has done a lot of great work on the recidivism side, and I follow Data & Society and the AI Now Institute, but I'm wondering, do you have any suggestions for where people can read more widely about what's happening now?

I mean, the good news is that there are lots and lots of people thinking about this. The bad news is that ProPublica, AI Now, any kind of outside group, even with the best intentions, doesn't have access to these algorithms. That's a large part of why I did not go that route. I'm not an academic, and I don't have the goal of running a think tank that audits algorithms from the outside, because you can't; you literally can't audit HR algorithms from the outside, you have to be invited in. That's why I started a company that, theoretically anyway, could be invited in to audit an algorithm. But the problem I still have, in spite of the fact that I'm willing to sign a nondisclosure agreement, is that nobody wants my services, because of this plausible-deniability issue. Literally, there are people who have talked to me because they want my services, but then their corporate lawyers come on the phone and say: what if you find a problem with our algorithm that we don't know how to fix, and then later on somebody sues us, and in discovery it's found that we knew there was a problem with this algorithm? That's no good, we can't use your services, goodbye.
We'll jump right back into our interview with Cathy after a short segment.

Now it's time for a segment called Data Science Best Practices. I'm here with Ben Scranton, an independent data science consultant. Hi, Ben.

Hi, Hugo, it's great to be back on your show.

Do you know what color a unit test is?

No, but I do know that I'm red-green colorblind.

It is a bit of a trick question, but your unit tests should be red-green-green. Listeners may recall that we've discussed unit tests in a previous segment.

Awesome. To recap, unit tests are a key software engineering tool in our arsenal to make sure code is correct. They help you catch errors immediately as you write your code, and you can check whether one of your rocket-scientist co-workers pushed code that broke the build. Righteous data scientists always run the unit tests before pushing their changes.

So how does this magic happen, Ben?

Unit tests work by providing a framework that runs your code before the entire application is complete. The framework will call your functions or instantiate objects and then check that assertions are true, which lets you find out immediately whether you coded your brilliant idea correctly, when everything is still fresh in your mind and easiest to fix. A good framework will even set up the resources to run a test, such as a fake database or simulated data, before calling your tests. Those of you who heard the Vivian UQ segment on how to think about the correctness of scientific models will recall that this is the verification step, and how you provide proof that your code correctly implements the model.

So Ben, do you have a favorite unit test framework?

In Python, check out unittest, and in R, Hadley Wickham's testthat. Really, the most important thing is that you use a unit test framework; look for the one that provides the least friction for you and your team.

So what is this chromatic magic you mentioned, red-green-green?

Red-green-green is one of the best philosophies for writing unit tests, and part of the test-driven development approach. The idea is to make the feedback loop on whether the code you wrote is correct as tight as possible. This will boost your productivity and increase the quality of your work, essentially for free.

So how does it look?

The first red means that you write your unit test code and a stub for the function you want to test. Then, when you run the unit test, it should fail, hence the name red. This gives you confidence that your unit test will fail when the code doesn't work.

And the first green?

The first green means that you then implement the code and get the unit test to pass. At this stage you only care about correctness; don't forget Knuth's comment that premature optimization is the root of all evil in programming.

Cool, and the final green?

The final green means that, now that your code and unit tests work, you can refactor to make it faster, cleaner, or add other improvements. Because you have a unit test, you know that your optimizations, which you should make only if necessary, were correct. Remember, only refactor in the presence of working tests; Martin Fowler may have said something to that effect.

I'm excited to try this out next time I write some tests.

Yeah, once you start writing code this way, you will have much more confidence in your system and be more productive, because unit tests enable you to catch most bugs sooner, while you are first implementing the code. Bugs are just easier to fix when you still have all the code stacked up in your head.
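A minimal, hypothetical sketch of the red-green-green flow Ben describes, using pytest-style tests in Python. The scoring function and its fields are invented for the example; the three stages live in the comments.

```python
# Stage 1 (red): write the tests first against a stub that raises
# NotImplementedError, run `pytest`, and watch them fail.
# Stage 2 (first green): replace the stub with the simplest implementation
# that makes the tests pass, as below.
# Stage 3 (final green): refactor or optimize only while the tests stay green.

def score_applicant(record):
    """Hypothetical toy scoring function under test (stage 2 implementation)."""
    base = 0.5 + 0.1 * record.get("on_time_payments", 0) - 0.2 * record.get("defaults", 0)
    return min(1.0, max(0.0, base))

def test_score_stays_in_unit_interval():
    assert 0.0 <= score_applicant({"on_time_payments": 3, "defaults": 1}) <= 1.0

def test_defaults_lower_the_score():
    clean = {"on_time_payments": 3, "defaults": 0}
    risky = {"on_time_payments": 3, "defaults": 2}
    assert score_applicant(risky) < score_applicant(clean)
```

Running `pytest` at stage 1, before the function body exists, produces the failing "red" run that gives the later green runs their meaning.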
Thanks, Ben, for that dive into test-driven development. Time to get straight back into our chat with Cathy.

So, I'd like to move slightly and think about the broader context of data science and the data revolution, and I'm wondering what other important challenges you think we're now facing, with respect to the amount of data there is, data privacy, and all the work that's happening.

I mean, I'd say the biggest problem is that we live in a putatively free society and we're having a lot of trouble figuring out how to deal with this, in large part because it doesn't give rise to that many individual stories. I think I found a few stories, like Kyle Behm's, and I found some teachers who were fired unfairly by their value-added model, but the way our policymaking works in this country is that you need to find victims, and then people get outraged, and then they complain, and then the policymakers pass laws. The nature of this statistical harm is different, and it's harder to measure, so it's harder to imagine laws being passed, and that's the best-case scenario, when you live in a society that actually cares. I guess the best best-case scenario might be happening in Europe, where they actually do pass laws, although I think it's much more focused on privacy and less focused on this kind of algorithmic discrimination. In terms of what I worry about the most, I'm looking at places like China, with their social credit score, which is intrinsically not trying to be fair: they are just explicitly trying to nudge people, or really strong-arm people, into behaving well. They are social control mechanisms, and they're going to be very, very successful.

So what lessons do you think we can take from history to help approach this issue, in particular from previous technological revolutions, such as the Industrial Revolution, though there may be others?

Well, there are lots of different answers to that. One of them is that it took us a while to catch up with pollution, because it was like, who in particular is harmed by a polluted river? So it's kind of, what is it called, an externality, and we have externalities here that we are not keeping track of in the same kind of way. It took us a while to actually care enough about our environment to worry about how chemicals change it, but we ended up doing that, in large part because of the book Silent Spring, but other things as well. Another example I like to give is the exciting new invention called the car. People were super excited by the car, but it was also really dangerous, and over time we have kept track of car-related deaths and we have lowered them quite a bit, because of inventions like seatbelts and crash test dummies, et cetera. We started paying attention to what makes something safer, not to say that cars are totally safe, because they're still not, but we have traded the convenience for the risk. I feel like, best-case scenario, in our future interactions with algorithms we're going to be making a similar kind of trade, where we say, we need algorithms, they're so efficient and convenient, but we have to be aware of the risk. And the first step of that is to measure the harm. We measured car-related deaths; we need to measure algorithm-related harm. That goes back to the point I've made at least twice already, which is that we aren't currently aware of the harm, because it's invisible to us. So when I talk to policymakers, which I do, I beg them not to regulate algorithms by saying, here's how you have to make an algorithm, because I think that would possibly be too restrictive, but to regulate them by saying: tell us how this is going wrong, measure your harm, show us who's getting harmed. That's the very first step in understanding how to make things safer.
And I think this speaks also to a broader cultural challenge we're facing, in that, as part of the political cycle, the number of deaths incurred in a society forms a fundamental part of the debate, in a lot of respects, in America and in other countries, but the amount of unfairness and poverty isn't necessarily discussed in the same framework, right?

Can you say that again?

Yes. Deaths are something immediately quantifiable that can be brought to legislators and politicians as part of the political cycle, whereas the amount of poverty isn't necessarily something that is as interesting in the news cycle and the political cycle.

Yeah, that's a good point. It's harder to quantify inequality than it is to quantify deaths, and that goes back to the question of what our political system responds to, if anything. I mean, right now it's just a complete show, but even in the best of times it responds better to stories of cruelty and death than it does to silent mistakes that nevertheless cost people real opportunities. It's hard to measure the opportunity cost of not getting a particular job, or, here's another one, which is actually relevant to current lawsuits involving Facebook, of not being shown an ad for a job that you might have wanted. Facebook is getting in trouble for showing ads only to young people, so it's an age issue, or only to men, so women don't get to see the STEM-related job ads. How much harm is that for a given person? It's obviously not the most harmful thing that's ever happened to someone, so it's not as exciting at a policy level, but if it happens systemically, which it does, it's a problem for society.

Yeah, and that speaks to another really important point in terms of accountability and transparency: you can be shown stuff in your online experience while I'm shown something totally different and legislators are shown something entirely different, and this type of targeting is a relatively new phenomenon.

That's right, and it's one of the reasons it's so hard to pin down. Going to my earlier point: you get to see what you get to see, but that's not a statistical statement about what people get to see. Flipping that around, in the other direction, it is an amazing tool for predatory actions, like payday lending or for-profit colleges. They can't believe how lucky they got: they used to have a lot of trouble locating their victims, desperate poor people, but now they couldn't be happier, because they've got this system, and it's called the internet, that finds their victims for them, cheaply, en masse, and in a way that's exceedingly easy to scale. It's a fantasy come true for those kinds of bad actors. But then the question becomes: how do we even keep track of that, when they are going after people who are, in a very real way, voiceless and don't have the political capital to make their problems a priority?

So I've got one final question for you, Cathy.
You know, we're in the business of data science education here at DataCamp, and because of that, a lot of our listeners will be the data analysts and data scientists of the future, and I'd like to know what you'd like to see them do in their practice that isn't happening yet.

I just wrote a paper, it's not out yet, but it will be out pretty soon, about ethics and artificial intelligence, with a philosopher, Hanna Gunn. It's called The Ethical Matrix, well, I actually don't know exactly what it's called, but I think it's something along the lines of the ethical matrix, and at the very least it introduces this concept of an ethical matrix. It's a very simple idea: the idea is to broaden our definition of what it means for an algorithm to work. When you ask somebody, does this algorithm work, they always say yes, and then you say, what do you mean, and they're like, oh, it's efficient. And then you say, okay, but beyond that, does it work? That's when it becomes, well, what do you mean, and even if they want to go there, they're like, that's a complicated question that I don't know how to attack. So the idea of the ethical matrix is to give a rubric to address this question, and it's something that I claim we should do before we start building an algorithm, and that we should do with everyone who is involved. To that point, the first step in building an ethical matrix is to understand who the stakeholders are, to get those stakeholders involved in the construction of the matrix, and to embed the values of the stakeholders, in a balanced way, relative to their concerns. So the rows are the stakeholders, the columns are the concerns, and then you go through each cell of the matrix and try to decide: are these stakeholders at high risk of this concern going very, very wrong? It's basically as simple as that. But our theory is that if this becomes part of the yoga of building a data-driven algorithm, then it will, theoretically at least, help us consider much more broadly what it means for an algorithm to work, what it means for it to have long-term negative consequences, things to monitor to make sure they're not going wrong, et cetera. And it will bring us from the narrow point of view of, it's working because it's working for me and I'm making money, which is what I call the one-by-one ethical matrix, where the stakeholder is me or my company and the only concern is profit, and broaden that out: look at all the people we're affecting, including maybe the environment, look at all the concerns they might have, fairness, transparency, false positives, false negatives, and consider all of these and balance their concerns explicitly.
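A minimal sketch of the ethical matrix as a data structure, in Python. The stakeholders and concerns below are illustrative placeholders drawn loosely from the episode's teacher-scoring example, not from the paper itself; the point is only that rows are stakeholders, columns are concerns, and each cell records a risk judgment made with the stakeholders before the algorithm is built.

```python
# Hypothetical stakeholders and concerns for something like a teacher-scoring model.
stakeholders = ["teachers", "students", "school district", "the public"]
concerns = ["fairness", "transparency", "false positives", "false negatives"]

def blank_ethical_matrix(stakeholders, concerns):
    """Rows are stakeholders, columns are concerns; cells start unassessed."""
    return {s: {c: None for c in concerns} for s in stakeholders}

matrix = blank_ethical_matrix(stakeholders, concerns)

# Filled in *with* the stakeholders, before any model is built, for example:
matrix["teachers"]["false positives"] = "high"      # a good teacher scored as bad
matrix["students"]["fairness"] = "medium"
matrix["school district"]["transparency"] = "high"

def high_risk_cells(matrix):
    """Which stakeholder/concern pairs need monitoring once the model ships?"""
    return [(s, c) for s, row in matrix.items()
            for c, risk in row.items() if risk == "high"]

print(high_risk_cells(matrix))
```

The high-risk cells are the things to monitor after deployment, which is exactly the move from the one-by-one matrix of "my company, profit" to a broader definition of whether the algorithm works.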
Well, I for one am really excited about reading this paper, and we'll include a link to it in the show notes as well.

Great, when it's out. Yeah, great, fantastic.

Look, Cathy, thank you so much for coming on the show. I've enjoyed this conversation so much.

Great, thank you for having me.

Thank you. Thanks for joining our conversation with Cathy O'Neil about weapons of math destruction, the harm that algorithms can do, and what we can do as a community and as a society at large to begin to correct these issues. We saw that the three defining characteristics of WMDs are that they're important, they're secret, and they're harmful. They're generally highly scalable black-box models that affect real lives on the ground, whether it be teachers being fired over the output of the value-added model, job applicants being asked proxies for mental health assessment questions in the interview process, or recidivism models that are biased against underrepresented groups being fed into parole hearings. We saw that potential solutions are offered by a future in which algorithms are both more transparent and auditable. Cathy herself works in the algorithmic auditing space, and we discussed several key techniques, such as sensitivity analysis, which figures out how sensitive any given algorithm is to its inputs; Cathy gave the example of the teaching value-added model being overly sensitive, in that small changes to a teacher's data can alter the output wildly. We also discussed the concept and practice of an ethical matrix, which lays out all the stakeholders and the different ways the algorithm in question can impact them. As Cathy said, if the ethical matrix becomes part of the yoga of building a data-driven algorithm, then it will, theoretically at least, help us consider much more broadly what it means for an algorithm to work and what it means for it to have long-term negative consequences. And, my dear listeners, remember: data science doesn't just predict the future, it causes the future.

Thank you all for tuning in all year long. We have had so much fun producing these 50 episodes and cannot wait to be back on the proverbial airwaves in 2019. Don't forget to take the survey in the show notes and to make any suggestions for future episodes. Thank you all once again, my listeners, and thank you to all my fantastic guests who've been on the show this year. I am your host, Hugo Bowne-Anderson. You can follow me on Twitter at @hugobowne and DataCamp at @datacamp. You can find all our episodes and show notes at datacamp.com/community/podcast.