How that Machine Algorithm Ended up being Racist

Vikram Bath

Vikram Bath is the pseudonym of a former business school professor living in the United States with his wife, daughter, and dog. (Dog pictured.) His current interests include amateur philosophy of science, business, and economics. Tweet at him at @vikrambath1.

52 Responses

  1. Doctor Jay says:

    I love this piece. It’s sort of an extended essay on the Fundamental Attribution Error as applied to gifted-language programs. Wonderful.

  2. Jaybird says:

    Awesome post.

    There was one study I’ve read about that treated crime the same way as an infectious disease. What made this study so interesting is that it had predictive power in addition to explanatory.

    Wouldn’t this one also be racist, using these measuring sticks?

  3. Oscar Gordon says:

    Seriously, excellent post!

  4. James K says:

    Very nicely done, Vikram.

  5. notme says:

    Clearly in this brave new liberal world having any standards or criteria at all is racist.

    • El Muneco in reply to notme says:

      Engaging with the commentariat at a site gives twofold benefits:
      – A chance to smite the unbeliever with cunning arguments. To drive their propositions from the field before you and hear the lamentations of their conclusions. After all, a simple untruth is simply countered and facial inaccuracies easily exploded.
      – Establishing a history of positive interactions provides social capital. While some might aspire to high reputation as a master of subjects, there’s no shame in honest dealing and a history of credibility. Or even a reputation for niche knowledge, pop culture references, and an occasional ability to turn a phrase. Do anything positive long enough, and it will be noticed more widely than you might think.

      Social capital, in turn, can be invested to lead debate as well as following along. Naked disparagement of the other person’s position can be seen as a form of this. If you push enough chips in, it provides a certain gravitas, and the onlooker is obliged to give it consideration – the capital is what makes it no longer just a trivial insult. What makes it not just an extended drive-by.

      Unlike a lot of people, I still believe that you are capable of engaging as a full contributor – that you just choose not to for reasons of your own. That choice can still be unmade, if you want to pull eyeballs, much less win hearts and minds.

      • notme in reply to El Muneco says:

        Thank you. Sadly, I don’t believe any of the liberals here are open to changing their minds or even having an honest discussion. It’s why we still have folks here claiming that Eric Garner was murdered for selling cigarettes.

        • El Muneco in reply to notme says:

          I see your frustration. I’ve been there. I can see where you’re coming from. I think that quite a few liberals on this site can, too. Maybe not agree, but see it as a reasonable position.

          Most questions are about the verge, about what happens in the grey area. Garner was committing a crime. A misdemeanor. The question is how to deal with him – is he going to escalate? Is he likely to? Reasonable people can disagree, but from what evidence we have, he wasn’t, so the use of force can be seen as disproportionate. Is there a systematic problem that the Garner case is pointing out? That’s a possibility. Was he murdered? That’s rhetoric. That’s preaching to the choir.

          Look, I’ve been dealing with law enforcement and former law enforcement officers for a decade, and I believe that we need a criminal justice system. I also believe that the one we have has some big holes in it. And that we need to fix the holes, not burn it all down.

          Liberals don’t necessarily oppose everything you do. Some are reachable. Some are reasonable. If you reach out and reason with them, results can happen.

          • Chip Daniels in reply to El Muneco says:

            Back when I was a conservative, I had a coworker, an older guy who was a frothing-at-the-mouth liberal, and I enjoyed teasing and taunting him with my conservative snark.

            Slowly I learned his history, that he grew up an Okie in the Depression, working as a child in the California orchards, a real Tom Joad type, experiencing literal hunger and privation. He went into the Army, thru college on the GI Bill, worked a career for the US government as an architect doing schools as part of foreign aid.

            His political beliefs were the result of this deeply engrained experience; he didn’t “believe” in the New Deal, like a fable that someone told him. He literally experienced it: the failure of the market economy, the triumph of central planning and military efficiency, the prosperity that it all brought.

            What absurdity, for me to think a snarky quote from the National Review would erase his entire lived experience!

            By the same token- I came of age in the 1970s. In my hometown there was a factory, a unionized glass bottle factory that would hire anyone, without any skills or education, any day of the week.

            You could literally walk in there, and be hired on the spot and get a starting wage of triple minimum wage. Jobs were easy to find, and anyone who claimed they needed government assistance was IMO either lying or a fool.

            The old Okie and I were separated by a gulf of understanding. Our lived experience made us impervious to the weak and feeble efforts of reason and logic.

            When we argue here, what we are really doing is giving testimony about our experience and perceptions.

        • greginak in reply to notme says:

          @notme notme. You are wrong. Plenty of the liberals here, me especially, have changed our minds or learned more about many of the things we have talked about over these long years. If you can hear this, your tactics have never and will never lead to that because you don’t discuss. You toss talking points and attack lines. We can all do that. Everybody here can learn if they want to. I would hope some of the conservatives and libertarians and other political mutts have learned also.

          • DensityDuck in reply to greginak says:

            “I became even more strongly aligned with the positions I only vaguely held at first” is not, I think, what notme means by “changing your mind”.

            • RTod in reply to DensityDuck says:

              That’s also not what he meant. Speaking for myself, I’ve had my opinions reversed on topics I wrote actual posts about.

              But, as I’ve said before, this site is made up of both Debaters and Communicators, and it’s folly to try to approach one as the other, or expect either to want to try to be the other.

              • Stillwater in reply to RTod says:

                I’ve heard you make the distinction before, Tod. I think it’s valid. And interesting, too, in terms of where we’re at politically and where we’re going and how those two dimensions are in a sorta fundamental tension, not only at the public level but in terms of what people think they’re trying to accomplish.

                Any chance of hearing more on the topic? 🙂

              • Tod Kelly in reply to Stillwater says:

                I dunno. Would it make an OK post, or would it be kind of boring at this point?

            • greginak in reply to DensityDuck says:

              Density- That is not what i meant. I have changed my mind on some things and others thought more deeply to the point where my views are at least more nuanced or in other cases less strong. I had a generally positive view of rent control until the various debates here. Now i think it is of the devil. I’m more sympathetic to some complaints of socons even if i think they are wrong in principle. I also tend to chat most on topics that i care about which tend to be health care and foreign policy. I’ve been less impressed and affected by debates here on those.

              Dave and James K, for two, have given me a lot to think about regarding regs and business.

              • Stillwater in reply to greginak says:

                Lots of people contribute to the ever-changing conversation about what’s important. It’s not at all surprising to me, tho, that DD is focusing on Obstinacy in the face of counter arguments when his stratagem for changing people’s minds seems to be ridiculing attributed strawmen.

              • greginak in reply to Stillwater says:

                People who don’t want to learn or open their minds to changing assume everybody is like them. They project their closed minds on others. They assume bad faith when they like to insult others.

                Well, most people assume everybody is like them, of course.

              • Tod Kelly in reply to greginak says:

                Yeah, most people are awesome.

              • Most people like classical music, think wrestling is silly, and hate superhero movies.

                I don’t know how I wound up surrounded by all of you.

              • DensityDuck in reply to Stillwater says:

                “It’s not at all surprising to me, tho, that DD is focusing on Obstinacy in the face of counter arguments…”

                Well, the only posts of mine that anyone seems to pay much attention to are the ones where I disagree with them.

        • Francis in reply to notme says:

          Since no one was convicted of the crime of murder, it is factually false to say that Garner was murdered. However, he did die while in the process of being arrested, and his criminal conduct was very minor.

          As the more coherent members of the Occupy movement pointed out about the banking industry, the real shocker is not the lack of arrests and prosecution but instead what is absolutely legal. The laws regarding police conduct are extraordinarily deferential.

          Moreover, prosecution in this country is handled almost entirely at the county level. County prosecutors are entirely reliant on cops to win their cases. So even if a cop does commit a crime in the course of an arrest, there is no one in the legal system with any incentive to prosecute the case with any vigor.

          The American system of government is sold to the public on the premise of checks and balances. The legitimate complaint about police conduct in this country is that there is no systemic check against misconduct.

    • James K in reply to notme says:

      @notme

      This isn’t about Political Correctness; it’s a matter of actual, literal correctness. The model was persistently mis-estimating the potential criminality of black people and white people. This is objectively a bad thing because predicting criminality is the entire point of the model.

      • Mike Schilling in reply to James K says:

        This isn’t about Political Correctness; it’s a matter of actual, literal correctness.

        That’s even worse. Try telling a conservative who’s just invoked Martin Luther King as his argument against Affirmative Action that King was in favor of it.

      • notme in reply to James K says:

        Before Vikram discusses the computer program he mentions this tidbit: the federal government now considers using criminal records to be backdoor housing discrimination. That is the kind of liberal PC BS I’m talking about. Beyond that, I take issue with calling a computer program racist. A computer program can’t be racist, though it can give invalid results based on its algorithm.

        • El Muneco in reply to notme says:

          I see the underlying point, but the big thing underlying the whole post is “there are a lot of variables that are co-correlated and no one is untangling the correlations before jumping to conclusions”.

          Is criminal record in and of itself any reason to deny housing? God knows that I would have slept better last year if my landlord had denied the dude who moved into the unit above mine and spent at least one weekend a month in the pokey and at least ten days a month having screaming rows at two AM with his girlfriend who ended up not able to take any more and stole his pickup and everything from the apartment that could fit in the back.

          Would I say he should deny the next guy? No. The last guy got laser-guided karma. The next guy might actually be trying harder to get straight.

      • Brandon Berg in reply to James K says:

        James K: The model was persistently mis-estimating the potential criminality of black people and white people.

        This is not correct. See my 11:42 comment, which links to my comment in the original thread that explains this in detail and works through an example that shows where the alleged bias comes from.

  6. Glyph says:

    Heh. Nice post. This is a real-world, somewhat more complicated example of something I once thought-experimented with Chris – I said I could easily envision police robots, deployed preferentially to high-crime areas but with no knowledge or bias w/r/t race, still ending up perpetuating racially-biased policing results – and it was because the data the program was fed would itself be tainted by racial history in the US (deployed to high-crime-neighborhood = likely-poor neighborhood = likely-minority neighborhood).

    Automated racism, without anyone meaning to do it, and in fact achieved via attempting to design a colorblind system.

  7. Brandon Berg says:

    Did you see my explanation here? The classification of defendants into high- and low-risk groups is arguably biased against whites, albeit only slightly and perhaps not statistically significantly, in that white defendants classified as high-risk actually reoffended less than black defendants classified as high-risk, and likewise for white and black defendants classified as low-risk. The bias claimed by ProPublica is an artifact of looking at the data in a particular way, and is inevitable given the probabilistic nature of the classifications and the racial skew in the actual recidivism rates.

    • trizzlor in reply to Brandon Berg says:

      >>The bias claimed by Pro Publica is an artifact of looking at the data in a particular way

      No. You are focusing on a population-level metric and they are focusing on an individual-level metric. These metrics have different interpretations, but it’s incorrect to claim that one is artifactual and the other isn’t. Let’s unpack this:

      * You are on the parole board, you assess two inmates – one white and one black – that are classified as “high-risk – 60% +/- 5% likely to reoffend” and you make your decision guided by the prediction. We follow the inmates for the next year and find that 63% of the black inmates are back in jail and 59% of the white inmates are back in jail. No significant difference. The algorithm appears to have worked.

      * You are a black inmate who will not reoffend (an Oracle has informed us that you were wrongly convicted) and you want to work on your case in the library, which you are not allowed to do if the algorithm classifies you as “high risk”. Because you’re black – and because of the correlations the algorithm has learned from *other* black folk – the algorithm is 1.9x more likely to classify you as high risk, keep you from working on your case, maybe put you in a more dangerous part of the prison. Indeed, your white bunkmate who *is* guilty and *will* reoffend (again, the Oracle told us) has basically as good a shot as you do at getting into the library because of what the algorithm has learned from *other* white folk. For you, the algorithm has failed.

      And I disagree with the claim that the model worked but its application failed. The model does not spring out of nowhere. It is developed by statisticians with certain inductive assumptions about how the world works. It is sold as a fair and just solution to a specific problem backed by statistical rigor. For a statistician to not evaluate how the model deals with common confounders is equivalent to a surgeon forgetting his tools inside the patient.
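The two scenarios in this comment can be put side by side with a small sketch. The counts below are invented purely for illustration (they are not the ProPublica or COMPAS figures), and `group_stats`, `group_a`, and `group_b` are hypothetical names. The sketch assumes a score for which 60% of "high-risk" people and 20% of "low-risk" people reoffend in both groups, and only the share of each group labeled high-risk differs.

```python
from fractions import Fraction

def group_stats(n_high, n_low, p_high=Fraction(3, 5), p_low=Fraction(1, 5)):
    """Return (reoffense rate among high-risk, false positive rate) for one group.

    n_high / n_low: how many people the score labels high-risk / low-risk.
    p_high / p_low: assumed reoffense rates within each label (same for every group).
    """
    high_reoffend = n_high * p_high
    high_clean = n_high - high_reoffend   # labeled high-risk, did not reoffend
    low_clean = n_low - n_low * p_low     # labeled low-risk, did not reoffend
    rate_among_high = high_reoffend / n_high
    # Share of all non-reoffenders who were nonetheless labeled high-risk.
    false_positive_rate = high_clean / (high_clean + low_clean)
    return rate_among_high, false_positive_rate

# Illustrative groups of 1,000 people each; only the label mix differs.
group_a = group_stats(n_high=600, n_low=400)
group_b = group_stats(n_high=200, n_low=800)

print("reoffense rate among high-risk:", group_a[0], group_b[0])
print("false positive rate:", group_a[1], group_b[1])
```

Under these assumed numbers the parole board's view (reoffense rate among those labeled high-risk) is identical for both groups, while the view of a person who will not reoffend (the chance of being labeled high-risk anyway) differs, because the groups have different shares of people labeled high-risk.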

      • Brandon Berg in reply to trizzlor says:

        Because you’re black – and because of the correlations the algorithm has learned from *other* black folk

        No, not because you’re black. Race is not part of the model. It’s because you have characteristics that predict a high risk of reoffending for both black and white offenders.

        You are a black inmate who will not reoffend (an Oracle has informed us that you were wrongly convicted) and you want to work on your case in the library, which you are not allowed to do if the algorithm classifies you as “high risk”.

        I don’t see anything in the article about restricting library privileges, so I assume you’re just wildly speculating about things people might use risk scores for. If we’re going to argue on the basis of made-up anecdotes, I have one where, in order to free up jail space, we need to parole some prisoners. Because political validity is more important than predictive validity, we throw out the model that allows us to distinguish between a subset of prisoners of whom 10% will commit additional violent crimes and a subset of whom 20% will commit violent crimes, and instead parole prisoners in accordance with the parole board’s gut feel, and in proportion to the racial makeup of the subset of prisoners eligible for parole. Consequently, instead of 10% of the paroled prisoners going on to commit violent crimes, 15% do, and in total the parolees murder 18 people instead of 12.
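The arithmetic in this made-up anecdote can be checked with a short sketch. Two figures are assumptions inferred from the comment rather than stated in it: a pool of 120 parolees (the number for which 10% is 12 and 15% is 18), and a gut-feel policy that ends up drawing equally from the 10% and 20% subsets (the mix that yields 15%). The names below are hypothetical.

```python
from fractions import Fraction

LOW_RISK_RATE = Fraction(10, 100)   # subset of whom 10% commit violent crimes
HIGH_RISK_RATE = Fraction(20, 100)  # subset of whom 20% commit violent crimes

def expected_reoffenders(n_low, n_high):
    """Expected number of violent reoffenders among the paroled prisoners."""
    return n_low * LOW_RISK_RATE + n_high * HIGH_RISK_RATE

PAROLEES = 120  # assumption: inferred from "18 people instead of 12"

# Using the model: parole only from the lower-risk subset.
with_model = expected_reoffenders(n_low=PAROLEES, n_high=0)

# Ignoring the model: assumed to draw equally from both subsets.
without_model = expected_reoffenders(n_low=PAROLEES // 2, n_high=PAROLEES // 2)

print(with_model, without_model, without_model / PAROLEES)
```

With those assumptions the sketch reproduces the comment's 12 versus 18 and its 15% rate; a different mix of the two subsets would give a different gap.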

        I get that the model doesn’t perfectly predict outcomes. It’s inherently uncertain, because people are complicated. This doesn’t bother me that much, mostly because no one’s going around randomly screening the population for risk factors and imprisoning people who haven’t done anything solely on the grounds that the computer says they have a 60% chance of committing crime in the next two years.

        That would be terrible, but that’s not what’s happening. When it’s used to make sentencing and paroling decisions, it’s within the constraints of the sentencing and parole guidelines established by law. People aren’t being sentenced to life in prison for graffiti because their risk scores are high, and they’re not having murder sentences suspended because their risk scores are low. The worst that can happen is that someone who has been convicted of an imprisonable crime serves the maximum sentence for that crime.

        Yes, that’s bad in the small minority of cases in which there was wrongful conviction, but it seems inconsistent to object to risk scoring on the grounds that it might lead to an innocent person being imprisoned too long, but not go all the way and say that we should abolish imprisonment altogether in order to avoid imprisoning innocent people for too long.

        All of which is beside my point, which is that Vikram’s example isn’t a valid analogy, suggesting that, like almost everyone else who read this highly misleading article, he didn’t actually understand the issue. Neither did people like James and Victoria, and this stuff is their day job. I only did because I read the data appendix. From reading the article, you might think that the model is only barely better than random chance, and that it works differently for black and white people, neither of which is true. The fact that you’re talking about confounders and saying stuff like “because you’re black” suggests to me that you don’t get it either, even after reading my explanation.

  8. J_A says:

    Isn’t there here a confusion between “Reality” and “Causes of that Reality”?

    Reality is that in 2016’s USA race is highly correlated with criminality. Any statistical analysis -in the USA in 2016- that doesn’t show that correlation is suspect. If you tweak your model so that the correlation disappears you are not doing science, or public service, or whatever it is you wanted to do. At best, you are lying to yourself. At worst, you are being devious.

    But knowing that race and criminality are correlated tells me nothing about why. To assume that it is because the genes for criminality are the same genes as melanin production is stupid. We need to recognize the existence of the correlation, and to try to figure out the causation (most likely related to racist policies from years and decades past, still flushing their effect through the system, even if those policies are no longer in place), aiming to revert that correlation.

    Data is data. Replacing good data with bad data in the name of political correctness is the same as rejecting that sexual orientation is innate in the name of telos or God or something.

    • Jaybird in reply to J_A says:

      There seems to be a very dangerous thing swimming under the surface and I’m trying to figure out how to phrase it.

      It’s something about whether the racism of a statement and the truth value of a statement are orthogonal to each other. To say that they are orthogonal is to say something problematic. It also seems that to say that they are tied to each other is also to say something problematic.

      Less problematic to not think about both things at the same time.

      • J_A in reply to Jaybird says:

        To say that crime and race have a correlation HERE, TODAY, is data.

        To say that it is at least plausible that policies that were in place for 80 years even though they are no longer in place might have a causal effect, is a hypothesis worthy of study.

        To say that the effect of policy changes and societal attitudes flow instantaneously across society, and therefore, the moment Civil Rights Laws were enacted, all races in America were now subject to the same societal conditions and the only remaining difference between people is melanin, is ignorance about history, culture, and society.

        To say that melanin deficiency IN ITSELF causes virtuous behaviour, and thus the more melanin, the less virtuous, irrespective of any other elements, is racism.

  9. J_A says:

    On a separate subject. There is a good correlation between growing up bilingual and being able to pick up a third or fourth language later in life.

    My mother learned her second language at 4, her third at 12, her fourth in her late 20s, and she is still now fully fluent in all four. I grew up bilingual, picked up my third at 8, my fourth in my teens and my fifth in my 40s.

    I used to travel a lot to Turkey (five or six times a year), and after a couple of trips I was able to pick up about 20% of what was written in short texts I saw in the street.

    So software that picks up kids growing up bilingual is actually picking up good candidates for a languages intensive school. These kids are good candidates for a third language.

  10. Brit says:

    Am I right in thinking this is an argument for why Chief Justice Roberts is wrong that the best way to end discrimination around race is to stop discriminating on race? Affirmative action actually takes account of relative achievement in a way race-blind policies do not?

    • J_A in reply to Brit says:

      Roberts is right in a sort of useless way, useless because he doesn’t give us a road map to get from here to there, except “Stop doing this right now!”.

      Affirmative Action and Racial Preferences are a very blunt tool (for instance, lumping together deep South blacks with traditionally more successful Afro Caribbeans), and carry with them some undesirable side effects.

      I would replace race based policies with class based policies, which would capture most of the people AA is trying to target while allowing the country to start phasing out positive race discrimination, and helping the new left-behind white underclass. Other alternatives exist, like the Top 10% of every single high school preference in the UT system that Texas enacted many years ago.

      100 years of racist policies still have effects in our culture. Those effects are probably less every day, but they are still there, and will be with us for quite some time.

      • trizzlor in reply to J_A says:

        >>I would replace race based policies with class based policies, which would capture most of the people AA is trying to target while allowing the country to start phasing out positive race discrimination, and helping the new left-behind white underclass.

        How do class-based policies address the fact that a black person with no criminal record is as likely to get called in for an interview as a white person who just got out of prison? Or that jurors consistently interpret black juveniles on trial as older, more aggressive, and more animalistic than otherwise identical white counterparts?

        • J_A in reply to trizzlor says:

          You cannot force cultural change. You can only foster policies that will flow through the system, creating cultural change over the long run.

          Race and crime are correlated HERE, NOW. That doesn’t tell us anything about the individual person. To break that correlation will take decades. As Theodore Roosevelt probably didn’t say, “If this project will take a hundred years to complete, we must then start right now.”

          Race based affirmative action in job interviews, for example, would mean mandating by law that people called Shawanda or Latoffe be called for job interviews at twice the rate of those called Mary or Cristina or Soo Ming. A class based AA would require that people, for instance, with a zip code in the 25th percentile be called for job interviews at twice the rate of those with zip codes in the 75th percentile.

          • Oscar Gordon in reply to J_A says:

            Which, of course, means people will start to game the system by giving / changing names, moving to zip codes, etc.

            • Me a couple years ago with respect to elite college admissions:

              Would you prefer that applicants have lower GPAs? Ask it, and those students will show up at your door. Do you want them to not have any global travel experience? Ask, and the next batch will assiduously avoid or hide it. If you control something incredibly valuable and name a price, don’t be surprised when those willing to pay show up at your door. Make “experienced a major life setback” a requirement for admission to Yale, and you can be sure that parents will get their kids hooked on meth so that their kids can explain how they struggled with and eventually overcame the problem. Businesses will set up summer meth camps to make it easy. The next David Brooks column will complain about applicants being uniformly perfect avatars of success in this newly defined way. “I learned a lot from my meth habit” will become the new “I learned a lot from helping those people in Mozambique.”

              • greginak in reply to Vikram Bath says:

                There is only so much people can do to fake poverty or race.

              • Oscar Gordon in reply to greginak says:

                Doesn’t matter, all that matters is the metric becomes the measure, the system will be gamed, and people who should not be taking advantage of something, will.

                I.e. there is no easy fix for it.

              • greginak in reply to Oscar Gordon says:

                Yes, to a degree. People will always try to game a system. But there are multiple ways to a goal. Will rich folk pretend to be poor? Almost certainly not. Will middle class folk? Probably not unless it is really easy to do. If, off the top of my head, a young adult or his parents have to send their tax returns to qualify for a low income scholarship, very few people will even try to game that.

                It’s like the old bs about millions of people cheating on welfare based on anecdotes about a few cheating. Were a few cheating? Yes. Were most? No. If a few people cheat that sucks but doesn’t invalidate the larger concept.

              • trizzlor in reply to greginak says:

                Yeah, I think people are taking the axiom “when the metric is the measure it becomes statistically biased” and assuming it means “when the metric is the measure the program will never achieve its desired goals”.

              • Oscar Gordon in reply to greginak says:

                Actually, my point is that, in the US, our welfare systems like nice, bright, defining lines regarding who can, and who can not, gain a benefit. I’m not worried about the trust fund kid slumming it, my concern is aimed at the folks just on the wrong side of that line. The families that make just $1000 more than is permitted.

              • Brandon Berg in reply to greginak says:

                Straight-up welfare fraud is kind of a distraction. The real issue is moral hazard and legally gaming the system. People intentionally having children they wouldn’t otherwise be able to afford, knowing that welfare will help them make up the difference. Couples not getting married so the mother will qualify for welfare. Welfare recipients cutting back on work hours to stay under key income thresholds. Married women quitting work because of the EITC. Deliberately blowing job interviews to milk unemployment. These things aren’t fraud as such, but they’re abuse of the welfare system. Worse, they exacerbate poverty.

              • DensityDuck in reply to greginak says:

                “There is only so much people can do to fake poverty or race.”

                ah-HEH.

            • J_A in reply to Oscar Gordon says:

              If you move into a bad zip code to improve your kid’s chance for college, you are a gentrifier.

              And I have a lot of nice things to say about gentrifying, since I live in a gentrified neighbourhood myself.

          • trizzlor in reply to J_A says:

            I still don’t understand why you prefer a class-based solution to a race-based problem. I mean, I get it if you say “the government has no business forcing employers to do anything” but it seems like you’re saying “the government can force employers to address discrimination, but only using indirect measures that are highly correlated to race, not race itself”.

            • J_A in reply to trizzlor says:

              Because a class-based solution will catch up poor blacks, poor whites, poor hispanics. It should have a disparate effect if more black than white people are poor, but that’s a feature, not a bug.

              At the end of the day, race based preferences are problematic and perhaps unconstitutional. They will be over at some point, for sure before the time when they are no longer needed. A class based solution is not constitutionally suspect, and is politically more palatable because there is no US vs THEM.