Every Toy in Its Box: The beautiful lie of Myers Briggs and modern personality testing
The first time I remember playing with a cootie catcher, I was perhaps in the fifth grade.
If you’ve never seen a cootie catcher (pictured here), it is simply a sheet of paper complexly folded, with a treasure of secret messages awaiting just below the folds. You would pick a number or a word and the person operating the cootie catcher would rapidly open and shut the origami contraption both vertically and horizontally in a manner that corresponded to the number you chose (or, if it was a word, the number of letters in said word). Eventually you would work your way to a single random fold, which would contain that most treasured of secret messages, the one that told you about yourself: You were strong and brave, or perhaps the smartest kid in the neighborhood. Or more likely, since it was fifth grade and you were just dipping your toe into the pool of sexual attraction, it would tell you who liked you, or who you were going to marry, or – most daring of all – who would let you kiss them were you only to ask.
Heady stuff.
I was thinking about cootie catchers this weekend, when Jaybird invited contributors and readers alike to take two separate personality tests, the Myers Briggs and the Enneagram. A lot of people took them, myself included, and it was a fun way to break the recent tedium of election bickering. Those of us that took them scanned the others’ scores, wondering who did or didn’t score exactly as might have been predicted, and whose score revealed an as-of-yet undiscovered intellectual soul mate. To me, these kinds of personality tests are the modern-day adult version of cootie catchers: they’re fun, they can break the ice and provide topics of conversation, and they allow us to focus on that most precious of topics, ourselves. The other and most important thing that personality tests have in common with cootie catchers is that they are terrible predictors for people’s behaviors, and provide no real insight save what we might glean from observing ourselves observe the results.
My first experience with the Myers Briggs was in the mid 1990s, when I needed to take (and, for lack of a better word, pass) the test as a condition of employment for a job. I was also required to undergo a handwriting analysis, which was a little worrying since I have pretty atrocious, doctor-prescription-like penmanship. But I passed each and was hired, and within a short time was heading up my department and reviewing other prospective employees’ Myers Briggs test results. (The absurd handwriting test was discarded shortly after my hire.) If I recall correctly, we paid about $350 for each Myers Briggs test administered. When I look back at the people we hired based on Myers Briggs scoring, I go back and forth on which part I find more amusing: the fact that the tests were such consistently terrible predictors of success, or the fact that my company clung so strongly to the belief that they were a necessary tool for building a stable, competent sales force despite all evidence to the contrary.
For a long while I wondered if the reason for its failings was that the Myers Briggs was too easy to cheat. We were hiring salespeople. Most of the people applying really wanted the job. So perhaps, I theorized, they weren’t answering honestly and were just telling the test what they thought it wanted to hear (in, ironically, pretty stereotypical salesperson fashion). Now, however, I have come to believe that the entire concept is deeply flawed. Even if those taking the test are being honest, I believe its predictive capabilities rely on nothing more than random luck.
There are a number of reasons I think this, but the biggest is this: People are simply more complex than the prefabricated toy boxes the Myers Briggs-like tests wish to put us away in. As an example, let me take two questions from the tests Jaybird linked to, one from the Myers Briggs and one from the Enneagram:
I’ve been: (a) a bit cynical and skeptical, or (b) mushy and sentimental
YES or NO: You prefer meeting in small groups over interaction with lots of people
Both of these questions suffer from the same problem: each mistakenly assumes that people are divided along very neat dichotomies of black and white. These assumed dichotomies do allow us to process perceived patterns more easily. But they also force us into error, because those dichotomies are little more than a beautiful lie. For myself, neither of these choices makes any sense, and I know whatever choice I make will lead to the test making an incorrect determination about my skills, comfort levels, and very essence.
The first question asks me, essentially, do I use reason or do I feel emotions; this is absurd because everyone I know does both. I might well choose either. In fact, whichever one I do choose probably says more about what meeting I just got out of or what book I just put down than it does who I am as a person.
The second question is even worse for someone like me. Because of the sentence structure, I am forced to answer “NO,” because I don’t prefer small groups to large ones – I like them both just fine. I could sit down and have a martini with the good Doctor Saunders or have pitchers of beer with the entire League staff, and both of those scenarios would be pretty space awesome. Neither makes me uncomfortable. But I am aware that if I answer “NO” the test is going to assume a lack of comfort with intimate inter-personal communication, and if I answer “YES” the test is going to assume a lack of comfort being in front of people. Neither of those things is remotely true. The truth is that I’m just more complex than that. Everybody is more complex than that.
Another problem with the tests is made obvious with questions such as these:
YES or NO: You readily help people while asking nothing in return
Much of my success has been: (a) due to my talent for making a favorable impression, or (b) achieved despite my lack of interest in developing “interpersonal skills”
Each of these questions relies on a high degree of self-awareness that far too many people lack. I suspect that almost everyone answers the first question “YES,” despite the obvious fact that under some circumstances almost everyone helps just for the sake of helping, and under most circumstances almost no one does. Similarly, I have done a lot of management and employee intervention work for clients over the years. And based on that experience, I can tell you that the more likely someone is to not question how awesome their “interpersonal skills” are, the more likely they are to be really terrible at playing with others. If there is a person out there that attributes his or her complete and constant inability to get along with most other people in a group to reasons other than “it’s the group’s fault, I’m doing everything perfectly,” then I have yet to work with them.
Lastly, a huge issue I see with personality testing is the same one that I see in pop-psychology dream analysis books: It makes the error of assuming that words show up the same to all people under all circumstances. Above, I framed the question about small groups vs. large groups as two competing social scenarios with my colleagues here at the League. It is important to note that not only is it possible that not everyone will frame it in a similar fashion, it’s quite possible that I won’t always frame it so. It depends entirely upon my mindset when taking the test. If I was taking the test having come out of a staff meeting, might I answer that question differently? Might I answer it differently if I was about to have a disciplinary talk with one of my children that had just been caught taking money out of my wallet without permission?
I thought about this yesterday while sitting in an airport in Denver, and it made me curious enough that I went back and quickly retook the Enneagram test an additional three times. The first was imagining that I was answering the questions as they pertained to managing my team at work, the second thinking of myself as a parent, and the third thinking about how I discuss difficult issues with my wife. My results for all four tests skewed like this:
Initial test: Dominant score in #3, The Achiever
Second test (thinking about managing people at work): High scores in both #3 The Achiever and #8 The Challenger
Third test (thinking about parenting): High scores in both #1 The Reformer and #8 The Challenger
Fourth test (thinking about being a husband discussing difficult issues): High scores in both #9 The Peacemaker and #5 The Investigator
That’s a lot of different results. So who is the real Tod Kelly? Is he really the Achiever, the Challenger, the Reformer, the Peacemaker or the Investigator? Which of these really best describes him?
The answer, clearly, is all of them. We humans are, as I said above, far more complex and complicated than we like to imagine ourselves. We all have most or all of these various masks inside of us (and many more as well), and we slip back and forth between them so effortlessly that we’re not even aware that we’re doing so. I may well be an extrovert – (I am) – but I still spend huge chunks of time wanting to be happily alone with my thoughts. My sister is an extrovert as well, but I suspect for years she thought of herself as an introvert: If when we were growing up I was the universally identified extrovert, she was by default the introvert – except that she wasn’t, really. Not ever.
The company I work for now – the one I am retiring from next month – has a far better track record for hiring sales people than the one I worked for back in the 90s. In the past 15 years, they have hired just under 20, which is less than most of its competitors hire every two to three years. In that time, they have never had a salesperson or partner quit or retire (though that last bit is about to change), and in that time they have only ever felt compelled to terminate one salesperson. They are also the only firm in the area that does not use any form of Myers Briggs-like testing. I do not believe this is coincidental.
When my firm hires someone for a sales or executive position (which are the positions Myers Briggs is most often used for), the process is long – usually a couple of months or longer. There are many preliminary talks, and many of those are over lunches and dinners. By the time someone is made an offer, they have spent a lot of time with most of the partners and everyone who will be part of their team. By that time, everyone on both sides of the equation has a pretty good idea of how well all the pieces might fit together. When the other firms in our area decide they want to hire someone, they decide that they need to hire someone as soon as possible; they use Myers Briggs as a tool that allows them the luxury of not having to pay attention to whoever they are looking to hire. They want a spreadsheet to quickly, ridiculously and entirely quantify the very essence of someone they spent half an hour with a couple of days ago. But as I said earlier, believing you can do that with another human being is a beautiful lie we like to tell ourselves.
It tells us, in other words, far less about the potential people to be hired than it does about the companies looking to hire them.
Plenty of personality tests are great predictors. Take Type A hostility, for example.
It’s true that personality tests do change over time. Type A people are much, much more likely to increase their Type A hostility scores over time.
Myers Briggs makes the fatal mistake of trying to make all the scales “happy” scales.
Oh, and if the group hates me, it’s probably my fault.
note: Type A hostility predicts atherosclerosis.
Myers Briggs makes the fatal mistake of trying to make all the scales “happy” scales.
I get into this with Chris over his support of The Big 5. It’s not that all of the scales are “happy”; it’s that none of them are sad. The Big 5 uses virtuous language for two of its five types and that’s two too many, as far as I’m concerned.
A personality test should never say “good” vs. “bad” as much as “mayo” vs. “mustard”. There’s not a better answer *EXCEPT* for the whole “this answer is more accurate than that answer” thing going on.
*taps foot* So now it’s good if you get atherosclerosis?
Some people are trying to use these as tools to determine what personality traits cause/exacerbate illness (with an eye towards preventing the illness, natch).
Others are trying to use this as a “let’s figure out how people are going to act” game. The second, while possible, probably requires a lot more questions than most people are likely to want to take.
Atherosclerosis is my third favorite ailment.
I’m mostly interested in MB insofar as we’re all trying to talk to each other and, many times, we can’t. I’m trying to figure out why. I think that personality preferences may hold one of the several keys we’re overlooking.
Kimmi, do these posts count as Ascii Tourettes? Does that evince itself in verbal settings as well?
Ward, be nice.
That /was/ me being nice jb
… perhaps. but really, i have no idea what you’re fucking talking about. I thought, when you first brought it up, you just meant my habit of swearing.
But now I’m confused.
If you care to explain, i’d appreciate it.
Would you be fine with the Big 5 if they changed the labels of those two factors?
I ask because it is an incredibly useful tool (for research at least, which we’ve established is all I really care about) with pretty damn good validity and reliability for a sweeping personality measure.
Would it cease to be the Big 5 at that point? Wouldn’t it be a Myers-Briggs-Plus test?
Well, we’d use the same dimensions, just relabel them so that they don’t contain value judgments. Because the labels should be purely descriptive. It’s the measures of the dimensions that are valuable. That’s why I prefer the Big 5: using well-researched measures of the Big 5 is productive.
That’d blunt my biggest issue, sure. Anything else after that could probably fall under the “criticisms” section of the Wikipedia.
Exactly. If you take the course to administer MB, you find that the statement is that all types are equally good; there are no right or wrong answers. People just have different types. It is just like another instrument, the Adaptation Innovation survey of Kirton, which relates to how one tackles a problem: either within the bounds of the existing system (adapter) or whatever is needed, damn the system.
M-B isn’t supposed to be predictive, though. I mean, I agree, in practice lots of companies use it because they have some fantasy in their head about how ABQZs are the best programmers evah and BAZQs make great sales people, but that’s because hiring managers are… underskilled.
The M-B questions that you point to (with the probably-false dichotomies) are formulated that way on purpose, which is why the number of questions you have in the test is important (and those shortened tests are worse than worthless), *and* the letter results aren’t as important as the raw score. If you consistently choose the same extroverted answer between the “false” dichotomy offered in 50 questions, that illustrates something that 1 question doesn’t. You actually *do* lean extrovert.
What does that mean, from a predictability standpoint? Not much.
I agree that the only thing an M-B test is really good for is getting people on a team to recognize that other people on the same team have different practical reflections of their underlying personalities. If people already know that, you don’t need a test to point it out. On the other hand, when a bunch of hard-extroverted people in one team run up against a bunch of hard-introverted people in another, having some sort of exercise like this can be helpful in getting the teams to stop yelling at each other all the time.
I think the reason companies use it is so that they have a fall-back when every manager rates every one of their employees “excellent, don’t fire this one or the company will fall apart”. They can say “well, we have lots of RPJDs, we need a better balance of YMPLs” and then pretend like their decision on who to lay off was something better than a coin flip.
This is interesting – I’ve had a different but similar theory that it’s used primarily as a CYA – both for explaining bad hires to people above you, and for suits about discriminatory hiring practices.
Both of these hypotheses are eminently credible.
Patrick, this OP underscores why I don’t believe social “science” is a science.
Humans are far too complex and subjective and piling up nonsense about a large number of them doesn’t make it more objective but the statistics give it that fine patina of respectability.
Even if you can’t come out and point at the moment in which 5 O’Clock shadow becomes negligence becomes a Goddamn Beard, it doesn’t mean that there’s no difference between negligence and a Goddamn Beard.
In that vein: while Wilson Bentley showed us that no two snowflakes are alike, it’s certainly possible to create categories.
Ward, I don’t know of anyone that I’d consider a credible social scientist who would defend M-B as predictive, based upon current data.
Pupil dilation is a remarkably robust measure of how much you’re thinking, at the moment.
Yes, humans are subjective, but we do have objective measures of many things.
Including upset stomachs!
Note that there is a second series of MB tests that break down into 20 subelements, but it is not very common except for those who have the training on MB. Of course the tests on the web are basically worth little; one needs the full instrument and a trained person to help one interpret the instrument. (I took the training about 12 years ago)
i’ve always found it quite odd that some of my bro/broettes of the more sam harris / rational + science = hell yeah / stone cold punching people (in their minds) for reading horoscopes and the like are, as with nearly everyone else, usually nonetheless into myers-briggs stuff. or at the very least don’t apply the same 64 ounces of virulently scoffed haterade to the situation.
At least the chinese zodiac signs symbolize something. Not that people born in the year of the pig are more reliably piggish, but…
64 ounces of virulently scoffed haterade
I love that.
i aim to please, though upon reflection i’m not even sure if it reaches the level of nonsense, though it surely is pretty.
i must admit to being a bit shaken, if not actually horrified, that companies use this personality test when evaluating job candidates. i don’t know how i’d react were i to go on an interview and an MB test was thrown my way. my first impulse might be to consider it one of those “cutesy” interview things where an interviewee is given three sheets of paper and a variety of rubber bands to construct some kind of model of a bridge or whatever.
Of course using the test that way violates the ethics of MB trained personnel, where the instrument should not be used against anyone. (See the MBTI manual for details; here is a link to it at Amazon: http://www.amazon.com/Manual-guide-development-Briggs-indicator/dp/0891061304 .)
While I’d agree with your general thesis that people are far more complex than these kinds of tests reveal, I don’t think they’re totally useless (although I’ll admit that I didn’t see a point to the Enneagram). I’ll also stipulate that they’re not great predictors of success. But that’s not really what they were meant to do and using them as qualifications for employment is, to my mind, stupid.
What MB can tell you, however, is how people have different approaches to situations. The infamous Introvert-Extrovert scale, for instance, tells you where you draw your energy from – yourself or other people. It doesn’t tell you anything about how social you are or how much you enjoy the company of others. I do think these kinds of tests can be useful in showing you how you approach problems or deal with other people and, when used in group situations, can drive home the point that everybody does things differently. What works for you may or may not work for someone else.
My employer recently started using a pretty different assessment, one that’s pretty much all about figuring out motivational factors and approaches to problem-solving. So far, it’s mainly been used for helping manager/employee relationship development (at all levels) and has that ‘it takes all kinds’ built into the literature, but it’s really hard not to notice that nearly every high ranking exec gets placed in the same quadrant.
Most ENFPs are skeptical of this sort of thing.
This particular ENFP certainly is.
Maribou said the same thing.
External validity!
My wife and I compared scores with our families. Very interesting results: Her mother, father and brother all scored the same on the Myers-Briggs as she did. On my side, my mother, my siblings and I all had wildly different scores.
My said she was proud of that.
It seems to me the issue is misuse rather than inherent illegitimacy of the MB. The purpose of the assessment seems to be to understand yourself, and your particular cognitive/emotional approaches. And it touches on communication style, so it can be helpful in understanding how you and someone else differ, and enable you to understand each other better, or to communicate and collaborate better.
Although it’s merely anecdotal, I’ve known too many people who really recognized themselves in their MB designation, and not much or at all in the other categories, to think it’s pure phrenology type voodoo. Not everyone, but those others have tended to be folks who didn’t strongly fit categories.
Keep in mind the MB is something that is reasonably subject to replicability and validation by other measures. Whether it has been, I don’t know (Chris might have some idea), but it at least can be. My prediction would be that it has a high level of validity, but even a low rate of missing the mark would make it of little value to potentially tens of thousands of people, leading to a lot of skeptics – but their experiences are also anecdotal.
But using it to plug people into jobs…that’s pretty clearly not what it is valuable for.
“Although it’s merely anecdotal, I’ve known too many people who really recognized themselves in their MB designation, and not much or at all in the other categories, to think it’s pure phrenology type voodoo.”
This is true – but I would add that I find the same true for astrological write-ups.
I didn’t put this in the OP, but I used to find that the 5-page reports we got on potential hires read very, very much like astrology readings: vague, “he likes to work with others, but that doesn’t mean he doesn’t sometimes like to work alone” stuff that, I was always convinced, people wouldn’t notice so much if they were mixed up accidentally.
I pretty much agree with this.
And as a tool for self-analysis, MB seems to me to function much in the way a daily horoscope functions: it helps you consider yourself from a different perspective, thus broadening your potential start points for considering something after. I’d think the effect short lived, mere moments after reading a horoscope; days at best after getting MB results.
Its usefulness in hiring? Probably the same thing; a slightly different skew on considering the person; much potential to reaffirm already held impressions, too.
Fortune telling, be it art or science, has a value – what it helps us see about ourselves. But the same value can be got through other more worthwhile paths; actually attempting things, evaluating your attempts, and working at something long enough to accomplish or even gain mastery being most crucial.
but I would add that I find the same true for astrological write-ups.
I am very much in disagreement. Astrologies are purposely written to be vague and non-exclusive. The average person could grab any astrology write up from anywhere at any time and find something that seems to apply to them. The MB categories, by contrast, are drawn more tightly and meant to be exclusive: You should recognize yourself in this particular description, maybe partly in this description, and not really at all in this other.
Being bipolar occasionally has its advantages. All such Personality Tests are a contradiction in terms: either you’re talking about a person on his own, in which case we can talk about Personality — or we can talk about a Test, in which case we’re talking about accomplishing anything.
I’ve run teams time out of mind: the most important aspect of building any team is working with the dynamics of internal cohesion. Does someone respond well to authority? That depends entirely on who’s giving the orders. Does a person work well with others? Again, enumerate the Others and I’ll tell you how things go on that front. Is a person introverted or extroverted? Depends on the consequences of exposing one’s self to the Others. What about cultural considerations? Or corporate culture? Nobody can tell at first glance who’s going to fit in and who won’t. What about people who need a challenge or others who respond well to pressure? What about someone who’s got the personality of a garlic fart and hygiene issues to boot, who’s so technically brilliant it’s worth shoving specifications over the transom and keeping them away from the rest of the team and especially away from the client?
Therein lies the fundamental flaw of Myers-Briggs or Enneagram or any of these personality tests. Don’t ask anyone to judge his own personality. Ask others and you might get different – and better – answers in terms of meeting objectives and building teams.
MB serves the same purpose as most HR tools: shrink the applicant pool to a manageable size.
1) I will start crushing the ice and pull out the good vermouth the second you step through my door, my friend.
2) The “skeptical vs. sentimental” question is one that jumped out at me when I took the test myself. I am both a tar-hearted cynic much of the time and ridiculously sentimental. Making them an either/or dichotomy is laughable.
3) It strikes me as the height of lunacy to use any of these tests as the basis for hiring anyone.
Great post, Tod. I agree with you, and I think the main reason I enjoy talking about M-B types is that it gets on my nerves less than “Men are from Mars, Women are from Venus” as a starting point for conversations about communication differences, personal space, etc – not because it’s true, but precisely because it is so cheerful and easy to put on like a cloak.
(Trufax: Jaybird made me read “Men Are from Mars…” when we were dating. I told him if he ever tried to pull that crap on me in a conversation, I would show him who was from Mars. Then we had a conversation about how he was raised by women from a more conventional household that was pretty interesting, but didn’t make me any more inclined to the theory. So I sent him an article about how John Gray is a fraud with a degree mill degree. Then he threatened to make me read The Rules, and I threatened him with Mary Daly, and we called a truce and went back to talking about Douglas Hofstadter.)
(PS I showed this comment to Jay and he says “I WAS COMPLETELY KIDDING”. However, since he also cast my horoscope including moon sign, and told me to read awesome books like Sophie’s World, I’m sure you can see why I took him seriously. Ah, to be 21 again.)
(Dude. I was.)
One of the important things in an equitable relationship is knowing when to put the guns away.
“Peter, you’re mad. Never dare to suggest such a thing. Whatever marriage is, it isn’t that.”
“Isn’t what, Harriet?”
“Letting your affection corrupt your judgment. What kind of life could we have if I knew that you had become less than yourself by marrying me?”
He turned away again, and when he spoke, it was in a queerly shaken tone:
“My dear girl, most women would consider it a triumph.”
“I know, I’ve heard them.” Her own scorn lashed herself—the self she had only just seen. “They boast of it—’My husband would do anything for me….’ It’s degrading. No human being ought to have such power over another.”
“It’s a very real power, Harriet.”
“Then,” she flung back passionately, “we won’t use it. If we disagree, we’ll fight it out like gentlemen. We won’t stand for matrimonial blackmail.”
He was silent for a moment, leaning back against the chimney-breast. Then he said, with a lightness that betrayed him:
“Harriet; you have no sense of dramatic values. Do you mean to say we are to play out our domestic comedy without the great bedroom scene?”
“Certainly. We’ll have nothing so vulgar.”
“Well—thank God for that!”
His strained face broke suddenly into the familiar mischievous smile. But she had been too much frightened to be able to smile back—yet.
“Bunter isn’t the only person with standards. You must do what you think right. Promise me that. What I think doesn’t matter. I swear it shall never make any difference.”
<3.