Converting 3rd Party MCAT Scores to Actual Scores

(3/26/18 – this is an unfinished post that I’ve put up so the data can be of use to others while I add to it / revise it. Also don’t look at this on mobile, the graphs are probably butchered on there)

The AAMC has released only 3 scored practice exams, leaving prospective test takers in a tough situation: to know where they stand, they have to take an AAMC scored exam. But those exams are a precious resource, as they’re the closest thing we can get to the real MCAT. If their score isn’t where they want it to be, they’re left with only 2 AAMC scored exams. Test takers of the past had 10 AAMC exams, but the post-2015 MCAT renders those practice tests worthless. Here I attempt to predict MCAT scores using 3rd party practice exam scores and user-submitted data from Reddit.

Data:

The data comes from this user-submitted score spreadsheet, which contains 844 scores reported by users of the MCAT subreddit and the Student Doctor Network forums. I only included individuals who took the MCAT between January and September 2017.

There is a tremendous amount of self-reporting bias in this data, which I’ll touch on at the end. Impossible scores were thrown out (one user reported a ‘406’ on NextStep Exam 1, which, like the real MCAT, is scored 472-528). I also excluded data from one other user who reported a 472 on the real exam after reporting 505, 504, 509, and 509 on NextStep Exams 1-4, and a 509 and 510 on AAMC #1 and #2. I don’t know if it’s possible to drop the ball that badly on test day so I’m calling it a fluke. These were the only two scores excluded.
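The range check described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual cleaning script: the row layout, the field names (`nextstep_1`, `actual`), and the choice to drop the whole row rather than just the bad score are all assumptions, and the sample rows are made up apart from the ‘406’ example. The second exclusion (the 472 on test day) was a judgment call and isn’t something a range check would catch.

```python
# Valid total-score range for the MCAT (and NextStep's scale), per the text above.
MIN_SCORE = 472
MAX_SCORE = 528


def is_possible_score(score):
    """Return True if the score falls inside the 472-528 scale."""
    return MIN_SCORE <= score <= MAX_SCORE


def drop_impossible(rows, score_fields):
    """Keep only rows whose reported scores are all possible.

    `rows` is a list of dicts; `score_fields` names the keys to check.
    Missing (None) entries are allowed, since not everyone took every exam.
    Assumption: a row with any impossible score is dropped entirely.
    """
    kept = []
    for row in rows:
        scores = [row.get(field) for field in score_fields]
        if all(s is None or is_possible_score(s) for s in scores):
            kept.append(row)
    return kept


# Hypothetical rows for illustration only (not taken from the spreadsheet),
# apart from the '406' typo mentioned above.
rows = [
    {"user": "a", "nextstep_1": 406, "actual": 510},
    {"user": "b", "nextstep_1": 503, "actual": 511},
    {"user": "c", "nextstep_1": None, "actual": 515},
]
clean = drop_impossible(rows, ["nextstep_1", "actual"])
print([r["user"] for r in clean])  # -> ['b', 'c']
```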

Kaplan (n=309):

The short story: Kaplan’s scores are heavily deflated but still have predictive power. As an extremely crude conversion, you can add 10 points to your Kaplan score to estimate your actual MCAT score. This becomes less predictive at the upper and lower extremes. Kaplan and NextStep had the strongest correlations with actual MCAT scores. That doesn’t necessarily make them the best practice material, but it does mean their scaled scores track real scores the most consistently.

NextStep (n=354):

NextStep’s exams were slightly less deflated than Kaplan’s, but they had a similarly tight distribution (r² of 0.536). An extremely crude conversion would be to add 7 to your NextStep average to estimate your actual score. NextStep seems to pride themselves on giving accurate scaled scores, which makes me wonder why theirs are still so deflated.
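For anyone who wants to plug in their own numbers, the two crude conversions above (Kaplan +10, NextStep +7) can be written as a small Python helper. This is only a sketch of the rule of thumb: the function name and the clamping to the 472-528 scale are my additions, and the offsets are rough averages that get less reliable at the extremes.

```python
# Crude additive offsets quoted in this post. These are rough rules of thumb,
# not official conversions, and they get worse at the high and low extremes.
CRUDE_OFFSETS = {
    "kaplan": 10,
    "nextstep": 7,
}

MIN_SCORE, MAX_SCORE = 472, 528


def estimate_actual(company, practice_scores):
    """Average the practice scores, add the crude offset, clamp to the scale.

    The clamp is an assumption added here so the estimate can never leave
    the 472-528 range; it is not part of the original rule of thumb.
    """
    average = sum(practice_scores) / len(practice_scores)
    estimate = round(average + CRUDE_OFFSETS[company.lower()])
    return max(MIN_SCORE, min(MAX_SCORE, estimate))


print(estimate_actual("Kaplan", [500, 502, 504]))  # -> 512
print(estimate_actual("NextStep", [508]))          # -> 515
```

Treat the output as a ballpark, not a prediction interval. The spread around these averages is wide.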

The Princeton Review (n=190):

Princeton Review’s exams are absurdly deflated. The average person who scores a 503 on a TPR exam gets a 518 on the real exam. Princeton Review’s exams had the worst correlation to actual MCAT scores (this graph is an unfinished stand-in for one I’ll post later but the data should be the same).

Notes:

So why are these tests so deflated? Part of it is my skewed data set (see next section). But I think it’s primarily because the test prep companies would rather have someone score a 505 on their practice exams and a 515 on the actual MCAT than the opposite. Their “100% money back guarantees” rely on you outperforming your practice test score. Kaplan’s “Higher Score Guarantee” program can only be redeemed if your actual MCAT score is below your diagnostic score. They’ve built a ~10-point cushion into the scaling of their practice exams to ensure this won’t be redeemed often. Kaplan is the most popular MCAT prep company, and they have a treasure trove of student data that could be used to give accurate scores, if they desired.

Princeton Review offers a similar program, but the baseline score can be either your previous actual MCAT score (if taken within 90 days of the start of the review course) or the Princeton Review diagnostic exam taken at the beginning of your course. If you aren’t a retaker, and the latter is used, there is almost no chance you fail to beat that score.

I don’t think it’s a coincidence that NextStep, a prep company without a full-refund ‘Better Score Guarantee’, doesn’t deflate their practice test scores as heavily. They do have a guarantee for their tutoring, but it only redeems a free 2-hour tutoring session.

Personally, I was dejected when I got a 500 on my first Princeton Review practice exam. In reality, most people who score that on their exams get between 511 and 516 on the actual exam. Had I not worked with this data, I would have thought I was on track for a 500 on the real exam – a score that has a dreadful mean acceptance rate of 22.3%, per the AAMC [1]. Princeton Review at no point notes how deflated their scores are, and many students probably don’t realize this. I now understand it’s their way of protecting their revenue while still offering a money-back score guarantee.

When I have more time, I’d like to add more test prep companies (namely Altius and Examkrackers), and compare individual sections to actual MCAT section scores to see if any company has a notably higher correlation on one section.

Pitfalls:

Among the many shortcomings of this data, the most damning is that the 3rd party test prep companies can change their scaling algorithms at any time without warning (and they do). NextStep has stated via email that they are constantly fine-tuning their algorithm, with a major overhaul in January 2017. As discussed above, I think NextStep may actually want to be accurate in their score reporting, and I imagine they adjust more frequently than other companies. Kaplan appears to adjust its scaling as well. But there is a host of other problems:

I am betting that users visiting the MCAT subreddit and other pre-medical internet forums are more dedicated than the average test taker.
Massive self-reporting bias – people are more likely to submit their scores if they’re impressive (the average score in my data was an absurdly high 515).
These practice tests are expensive, and the demographics of these test takers are almost certainly skewed towards rich folk.
3rd party exam scores were averaged and no consideration was given to how many exams they took. An individual who took 10 Kaplan exams and got 510 each time was treated exactly the same as an individual who took 1 Kaplan exam and got a 510, even though I suspect the person who took 10 exams is better prepared to score well on the actual exam. I’ll look into this one when I have more time but I’m guessing it’s less important than the scores themselves.
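The averaging described in the last point can be sketched as follows. This is a minimal illustration under stated assumptions: it fits an ordinary least-squares line and computes r² by hand, which may not match exactly how the graphs in this post were produced, and the sample reports are invented placeholders rather than rows from the spreadsheet. Note how a user with three exams and a user with one exam each contribute exactly one point, which is the limitation described above.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, plus r^2."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Invented placeholder reports: (practice exam scores, actual MCAT score).
# These are NOT from the spreadsheet; they only show the shape of the data.
reports = [
    ([500, 502], 511),
    ([505], 516),
    ([509, 511, 510], 520),
]

# Each user collapses to a single average, regardless of how many exams
# they took. The exam count is thrown away at this step.
practice_averages = [mean(scores) for scores, _ in reports]
actual_scores = [actual for _, actual in reports]

slope, intercept, r_squared = fit_line(practice_averages, actual_scores)
print(f"actual ~= {slope:.2f} * practice + {intercept:.1f} (r^2 = {r_squared:.3f})")
```

One way to account for exam count later would be to weight each point by the number of exams taken, but that is a separate analysis and not something this sketch attempts.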

All in all, this is an extremely imperfect science with many pitfalls, but it was able to predict my score within two points, and it should serve as a nice ballpark estimate for those wondering where they stand without any AAMC exam scores in hand.

Footnotes:

1: Applicants with MCAT scores between 498 and 501 had an acceptance rate of 22.3% (1,241 / 5,571) for 2016-2017 and 2017-2018, from AAMC’s Table A-23.