Similarity Scores and Uniqueness Index

I was going to add this stat to the FAQ, but ended up with enough material that I felt it warranted a brief post. I’ve been working with a new stat: Uniqueness Index (UQI).

Similarity scores are becoming increasingly common in the draftnik world. They were originally used by Bill James to track baseball career arcs, and the same concepts can be applied to relate athletic profiles. We can use this kind of tool to compare a player from the current class to the historical NFL data set, determining a list of similar players to aid in analysis.

Uniqueness Index is a simple extension of this idea. In my simScore formulation, I define “S80” as the list of all players who achieve a similarity score of 80 or above. This forms the list of all significant comparable players, with 80 representing one-half of a standard deviation from a perfect match. This is a pretty common idea with similarity scores.

I also calculate “S60,” the list of players who fall, on average, within a full standard deviation of the given athletic profile. 60 represents a very weak comparison, and I don’t typically use any comps that fall below 80. Still, the players above 60 are, in a general sense, from the same category as the target profile.
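To make the two thresholds concrete, here is a minimal sketch of how S80 and S60 lists could be built. The actual simScore formula isn't spelled out in this post, so the linear mapping below (100 minus 40 points per standard deviation of average gap) is only an assumption chosen because it puts a half-sigma gap at 80 and a full-sigma gap at 60. The function names, player labels, and z-scores are all hypothetical.

```python
from statistics import mean


def sim_score(target_z, other_z):
    """Hypothetical similarity score between two athletic profiles.

    Both arguments are z-scores for the same drills, in the same order.
    Assumed mapping (not the real simScore): 100 minus 40 points per
    standard deviation of average absolute gap, floored at 0.
    """
    avg_gap = mean(abs(t - o) for t, o in zip(target_z, other_z))
    return max(0.0, 100.0 - 40.0 * avg_gap)


def comps_at_or_above(target_z, pool, threshold):
    """Names from the pool scoring at or above the threshold, best first."""
    scored = [(sim_score(target_z, z), name) for name, z in pool.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


# Made-up z-scores for illustration only.
target = [1.0, 0.5, -0.25]
pool = {
    "Player A": [1.0, 0.5, -0.25],   # identical profile
    "Player B": [1.5, 1.0, 0.25],    # half a sigma off on every drill
    "Player C": [2.0, 1.5, 0.75],    # a full sigma off on every drill
    "Player D": [-1.0, -1.0, -1.0],  # a different kind of athlete
}

s80 = comps_at_or_above(target, pool, 80)
s60 = comps_at_or_above(target, pool, 60)
print(s80)
print(s60)
```

With these made-up numbers, Player B lands right on the S80 line, Player C right on the S60 line, and Player D misses both.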

UQI is simply the percentage of players in the given position group who fail to meet the S60 requirements. The basic formula is as shown below:
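Written out from that definition, the formula would look something like the following. The symbol names are labels chosen here for illustration: N_S60 is the number of players in the position group who meet the S60 requirement, and N_pos is the total number of players in the position group.

```latex
\mathrm{UQI} = 100 \times \left( 1 - \frac{N_{S60}}{N_{\mathrm{pos}}} \right)
```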

Players of a unique profile will have a high UQI as a smaller portion of the available data set meets the S60 similarity requirements. Standard athletic profiles will be within reach of a large portion of the available data set, and will thus have a low uniqueness.
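As a rough sketch of the calculation, assuming you already have a similarity score for every player in the position group, UQI is just the share of scores that fall below 60. The function name and the sample scores are hypothetical, and whether the target player counts as part of the group is a detail this sketch leaves to the caller.

```python
def uniqueness_index(scores, cutoff=60.0):
    """Percentage of similarity scores that fall below the cutoff.

    `scores` holds one similarity score per player in the position group,
    each measured against the target profile. The default cutoff of 60
    corresponds to the S60 requirement.
    """
    if not scores:
        raise ValueError("need at least one similarity score")
    misses = sum(1 for s in scores if s < cutoff)
    return 100.0 * misses / len(scores)


# Made-up scores for illustration: four of the eight fall below 60.
example_scores = [92, 81, 74, 66, 58, 55, 41, 30]
print(uniqueness_index(example_scores))
```

A profile that almost nobody in the group can match at the 60 level pushes the result toward 100, while a profile that most of the group can match pulls it down.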

While it’s theoretically possible to achieve a 100 UQI, the highest I have on record is 99, a mark reached by 3 of the 4 current members of the 3sigma club — Calvin Johnson, Evan Mathis, and Lane Johnson. J.J. Watt, the Fourth Musketeer, rings in at a 97.

The lower end of UQI is typically about 25-30, which represents the most dead-average profile possible. LSU OT prospect La’el Collins has one of the lower marks in the current draft class with a 27 UQI. It’s not a judgment of his ability, but it does show that his build and athleticism are entirely fungible.

The correlation between UQI and SPARQ exists but is not perfect. Players who test well above or well below average will typically have fewer peers than those who test in the average range, but there are examples of average testers with unique profiles. For example, Wes Saxton and Jesse James are side-by-side in the TE SPARQ chart, but their UQIs are very different.

The two have a nearly identical pSPARQ but fall in very different worlds of UQI. Jesse James is a rarer athlete than Wes Saxton.

That’s not necessarily good or bad. Part of the idea behind UQI is that we gain an appreciation of what we have and haven’t seen before. We can try to project Bud Dupree, but with a UQI of 95, we haven’t really seen many athletes like Bud Dupree before. To give an idea of how it works, here are a few other UQIs from the current class:

I’ll be expanding on some further metadata related to similarity scores, but UQI is a good enough start for now.

-ZW
