Analytics and the NFL Draft

Welcome to 3SigmaAthlete.com — I’ll be talking about SPARQ, Approximate Value, the Combine, etc., and if you’re wondering what these things are, please refer to the FAQ section of the site. I’ve tried to lay out all methodology and relevant information there.

With Combine week upon us, there will be those who feel compelled to remind the masses that football is played in pads and not underwear, fought on a gridiron and not a track. To many people, the idea that such a complex sport could be influenced by a participant’s vertical jump or short shuttle time is laughable and quickly discarded. This isn’t entirely unreasonable, as there’s too much variability inherent in the career arc of any given prospect for there to be a universally accurate projection system.

This doesn’t mean that all athleticism data should be thrown out without examination. The best approach to the draft is to look at all available data, considering information with as little bias as possible. This should never be a scouting vs. analytics issue, but rather a scouting and analytics method.

Because listing things is much easier than writing points into the natural flow of an article, here’s a list of three key ways in which analytics can inform the scouting process.

  1. Players generally won’t succeed without a certain level of athletic ability, or “functional athleticism.”

One of the biggest stumbling blocks in draft season is the comparison to an outlier. These comparisons go something along the lines of:

“Anquan Boldin ran slow at the Combine and ended up being Anquan Boldin, so Young Slow Receiver X can also have a long and successful career!”

This isn’t necessarily a criticism of these comparisons. It’s difficult to avoid the outliers because, by their nature, outliers are the data points that stick in our minds. It’s easy to remember Anquan Boldin. The problem is that there aren’t many Anquan Boldins out there.

As discussed in the FAQ on this site, the z-score stat I cite is relative to the NFL positional average. This means that a z-score of 0 refers to a player who is athletically comparable to the 50th-percentile NFL athlete. A 0 z-score is average, a -1 z-score is below average, and a -2 z-score is a real problem.
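As a rough illustration of the z-score idea described above, here is a minimal sketch. The reference scores are made-up numbers, not real SPARQ data, and the use of the population standard deviation is an assumption; the FAQ remains the authority on how the site actually computes the stat.

```python
from statistics import mean, pstdev


def z_score(value, reference):
    """Standard deviations between `value` and the mean of `reference`."""
    mu = mean(reference)
    sigma = pstdev(reference)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("reference group has no spread")
    return (value - mu) / sigma


# Hypothetical athleticism scores for NFL players at one position (made up).
nfl_reference = [100.0, 110.0, 120.0, 130.0, 140.0]

# Three hypothetical prospects: one at the positional average, two below it.
for prospect in (120.0, 105.0, 90.0):
    print(prospect, round(z_score(prospect, nfl_reference), 2))
```

A prospect who scores exactly at the reference mean comes out at 0, and scores further below the mean come out increasingly negative.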

Drawing from a database of all players drafted since 1999 and using Approximate Value (explained in the FAQ) as a rough measure of ability, there has been 1 significant guard with a z-score below -1.5. There has been 1 significant center with a z-score under -1.5. There has been 1 significant offensive tackle with a z-score less than -1.5.

For running backs, you’re looking at Reuben Droughns and Domanick Williams as the most successful players with a z-score below -1.5. At corner, Brent Grimes is just about the only successful player with a z-score that falls below -1.0.

Just listing names isn’t as powerful as actual data analysis, and I’ll get to that shortly. I’m simply trying to make the case that while the outliers exist, they’re much stronger in our memories than in reality. I don’t think every offensive lineman needs to test out like Lane Johnson, but the data shows that they need to at least meet a minimum athletic requirement.

  2. More athletic athletes are better athletes.

Athleticism matters. It doesn’t matter in every case, for every Boldin or Wes Welker, but it generally matters. I’ve spent the last 10 months collecting and processing data in the hope of proving this correlation statistically, and I discuss the result at great length in a separate piece. While I find it all fascinating, you may not be as interested in the t-tests and methodology. Here’s the plot and regression that I ultimately arrived at:

[Figure: scatter plot of athleticism (x-axis) against AV3 (y-axis) with the fitted regression line]

If you skipped the linked article, the x-axis (horizontal axis) represents athleticism and the y-axis (vertical axis) represents NFL production. What we see is that there’s a clear trend toward more athletic players producing a higher AV3. If there were no relationship between athleticism and production, this line would be flat, parallel to the x-axis (i.e., zero slope). This relationship is statistically significant with a p-value of approximately zero.

This doesn’t mean that the more athletic player is always going to produce at a greater rate. It means that consistently picking players from a more athletic group will yield more production over the long run.
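The flat-line argument can be sketched in a few lines. This is only an illustration under stated assumptions: the data is synthetic (generated with a built-in positive relationship plus plenty of noise), the function names are hypothetical, and the p-value comes from a simple permutation test rather than the t-tests used in the linked piece.

```python
import random
from statistics import mean


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def permutation_p_value(x, y, n_shuffles=2000, seed=0):
    """Share of shuffled datasets whose slope is at least as steep as the real one."""
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # shuffling breaks any real x-y relationship
        if abs(ols_slope(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


# Synthetic stand-ins for athleticism z-scores and AV3 (not real draft data).
rng = random.Random(1)
athleticism = [rng.gauss(0, 1) for _ in range(200)]
production = [5 + 2 * z + rng.gauss(0, 4) for z in athleticism]

slope = ols_slope(athleticism, production)
p_value = permutation_p_value(athleticism, production)
print(f"slope={slope:.2f}, p-value={p_value:.4f}")
```

Even with a real underlying trend, plenty of individual points in data like this sit far from the line, which is the distinction between a trend across the group and a guarantee for any one player.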

Let’s look at the first line of the Wikipedia page on analytics.

“Analytics is the discovery and communication of meaningful patterns in data.”

The goal is to find meaningful patterns in data, often on a large scale. We aren’t able to look at 5 players and make definitive rulings on their career paths at the Combine. Over the course of time, we want to make the best value propositions with our draft capital, and you will generally find more success consistently selecting from a pool of ‘plus’ athletes than average ones.

  3. Analytics should make us ask questions and re-evaluate.

Analytical athleticism comparisons can help us look back at past evaluations and question what skill or ability makes the current prospect more able to adjust to the NFL.

We don’t have Combine values for Nelson Agholor yet, but, courtesy of data maven Tony Wiltshire, I have some rough numbers he put up at his USC junior day.

[Figure: Nelson Agholor athletic comparison]

Now, we don’t know that Marqise Lee won’t be a good NFL player, and he struggled through injury in his rookie season. But we know that he’s a pretty similar athlete to Agholor, will probably slot in at a similar draft position, and didn’t have an overly impressive rookie season.

In this particular case, it’s not difficult to isolate the skill that differentiates Agholor from Lee: Nelson frequently catches the ball when it’s thrown to him. The athletic comp asks how the two players are different, and we’re able to answer the question.

There’s also the case of certain athletic profiles tending to produce a high number of successful NFL players. Jordan Matthews wasn’t generally regarded as an explosive athlete last spring, but his top 4 athletic comparisons showed that he probably had the requisite athleticism to be a good NFL player. Players built like Jordan Matthews tend to do pretty well.

[Figure: Jordan Matthews athletic comparisons]

This doesn’t mean that he’s necessarily going to be as good as AJ Green. The intent is to look back and question the initial evaluation. Is there something that shows up on the field that contradicts the results of his athletic testing? Is there a reason to grade down his future potential on an athletic basis?

For a given prospect, the athletic profile may not tell the full story, but it’s always worth looking at the numbers and going back to the initial evaluation.
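One plausible way to generate athletic comps like the ones discussed above is a nearest-neighbor search over standardized test results. The sketch below is an assumption about how such a comparison could work, not a description of the method used on this site; the player names and measurements are hypothetical.

```python
from math import dist
from statistics import mean, pstdev


def standardize(rows):
    """Convert each column to z-scores so no single drill dominates the distance."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard against zero spread
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sigmas)) for row in rows]


def closest_comps(prospect, pool, k=4):
    """Names of the k players in `pool` whose profiles sit nearest the prospect's."""
    names = list(pool)
    rows = standardize([prospect] + [pool[name] for name in names])
    target, others = rows[0], rows[1:]
    ranked = sorted(zip(names, others), key=lambda pair: dist(target, pair[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical profiles: (40-yard dash in seconds, vertical in inches, short shuttle in seconds).
pool = {
    "Player A": (4.45, 36.0, 4.10),
    "Player B": (4.60, 30.0, 4.40),
    "Player C": (4.47, 35.5, 4.12),
    "Player D": (4.70, 28.0, 4.50),
    "Player E": (4.40, 38.0, 4.00),
}
prospect = (4.46, 36.0, 4.11)

print(closest_comps(prospect, pool, k=2))
```

Standardizing first matters because the drills are on different scales; without it, a two-inch gap in the vertical would swamp a tenth of a second in the 40.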

Contrary to the beliefs held by some, analytics aren’t an attempt to replace scouting, nor are they based on a belief that everything is calculable with enough spreadsheet cells. The very best NFL evaluators consistently fail. It’s a difficult job, and the hit rate isn’t great for anyone. The use of analytics is an attempt to do things more efficiently, to get incrementally better at what we’re doing.

Combine week starts tomorrow, of course, so check back in later on and I’ll have a few previews and a recap of each testing day in Indy.

5 thoughts on “Analytics and the NFL Draft”

  1. Doug

    What happens if you eliminate the AV3 scores that are approximately equal to zero? An AV3 of zero or near zero occurs (I am assuming) due to injury or suspensions or other events that are otherwise “noise”. The only reason to include this data would be if there was a relationship between pSPARQ and injury (or other random events). Am I wrong?

    1. zachwhitman Post author

      Hey Doug — I did run the regression with a bound that only included players for whom AV3 > 0. It didn’t significantly impact the results. While injury can be the reason for a player accruing zero stats, it’s also quite common for 6th and 7th-round picks to never play because they can’t make a roster.

      I do think the injury stuff is noise. That’s why I’ve gone with AV3 and not an AV of the player’s first 4 years — I’d really like to isolate the best possible peak performance of the player, and get rid of the injury influence.

  2. David

    First, thanks for doing this. Fascinating stuff.

    Second, do you see a role for incorporating other nonSPARQ data to see if the fit of the data changes? I’m thinking of other measurable traits like hand size, wing span, Wonderlic etc.
