It’s draft time, and we are currently being bombarded by draft analysis, both written and spoken. Jazz fans have been subjected to David Locke’s “percentiles” for the past 6 weeks. And with the rise of artificial intelligence (think ChatGPT), we see some of the more tech-savvy analysts developing predictive models to identify top draft prospects. One such method for building these models is called “machine learning”. A machine learning model is trained on a set of observations with known outcomes and then applied to a new set of observations with unknown outcomes to predict their results. For example, if we were to create a simple model, we could train it with basic per-game stats (like points, rebounds, and assists) and some measure of NBA success (e.g. All-NBA). After that, we’d be able to feed it the per-game stats for draft candidates this year and see what predictions look like.
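To make the "train on known outcomes, predict unknown ones" idea concrete, here is a minimal sketch of that simple stats-based model. Everything in it is an assumption for illustration: the stat lines and labels are made up (not real players), the model is a bare-bones logistic regression written by hand, and the scaling divisors are arbitrary.

```python
import math

# Made-up training rows: (points, rebounds, assists) per game, plus a label
# (1 = reached the success measure, 0 = did not). Not real player data.
TRAIN = [
    ((22.0, 8.0, 5.0), 1),
    ((19.5, 4.0, 7.5), 1),
    ((24.0, 10.0, 3.0), 1),
    ((9.0, 3.0, 1.5), 0),
    ((11.0, 5.0, 2.0), 0),
    ((7.5, 2.5, 3.0), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def scale(stats):
    # Rough fixed scaling so gradient descent behaves; divisors are arbitrary.
    return [stats[0] / 30.0, stats[1] / 15.0, stats[2] / 10.0]

def train(rows, epochs=2000, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights = [0.0, 0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for stats, label in rows:
            x = scale(stats)
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - label
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def predict(model, stats):
    """Return the predicted probability of the success label for one stat line."""
    weights, bias = model
    x = scale(stats)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

model = train(TRAIN)
high = predict(model, (21.0, 7.0, 6.0))  # hypothetical prospect with big numbers
low = predict(model, (8.0, 3.0, 2.0))    # hypothetical prospect with small numbers
print(round(high, 2), round(low, 2))
```

With real data you would want far more rows and a held-out set of players to check the predictions against.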
But we don’t go for simple at LowerTheRim. Instead of numbers, what if we build our model with words? You see, there’s an absurd amount of draft analysis word vomit floating around on the web. Most of it manages to say the same thing about a player in slightly different ways. This isn’t to say the content is bad – I read this stuff every day and enjoy it. But there’s only so much we can say about a player. Amen Thompson is a freakish athlete, has good size at the point guard slot, possesses strong passing instincts, but he can’t shoot right now. We’ve heard it hundreds of times by now from all kinds of analysts. But what does that mean? Is there a history of draft candidates who possess a similar combination of strengths and weaknesses? Did they become really good players, or were they forever hindered by a poor jump shot? What happens if you consider that Amen is one of the hardest workers of them all? Something that can’t exactly be quantified, but has been put into words in most mock draft write-ups. I think there’s an opportunity here to build something that is capable of considering all of this information together and generating a useful probabilistic forecast for NBA success.
Here we go. I fed a machine learning model a bunch of NBA draft analysis words (from https://www.nbadraft.net) and told it which players became all stars within 7 years, and which players didn’t. Why 7 years? By then a player might still be with the team that drafted him, and he has had ample time to show his potential. To summarize: our model uses written draft analysis to predict the probability of a player becoming an all star within 7 years.
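For a feel of how words can turn into a probability, here is a rough sketch of one common text-classification approach: a bag-of-words naive Bayes classifier. This is an assumption for illustration only, not necessarily the model used for this post, and the scouting snippets and labels below are invented rather than taken from nbadraft.net.

```python
import math
import re
from collections import Counter

# Invented snippets standing in for scouting reports; labels are also invented
# (1 = became an all star within 7 years, 0 = did not).
REPORTS = [
    ("elite athlete with great court vision and a relentless motor", 1),
    ("explosive scorer who creates his own shot and works hard", 1),
    ("great size and vision, elite instincts as a passer", 1),
    ("limited athlete with a poor jump shot and slow feet", 0),
    ("struggles to create his own shot, poor motor", 0),
    ("slow defender with limited range and poor instincts", 0),
]

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(reports):
    """Count how often each word appears in each class of report."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in reports:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab

def all_star_probability(model, text):
    """Score a new write-up with naive Bayes and return P(all star)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothing so unseen word/class pairs are not zero
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[label] = score
    # Turn the two log scores into a probability for class 1
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    return exp_scores[1] / (exp_scores[0] + exp_scores[1])

model = train(REPORTS)
p_good = all_star_probability(model, "elite athlete with great vision")
p_bad = all_star_probability(model, "poor jump shot and limited range")
print(round(p_good, 2), round(p_bad, 2))
```

The appeal of this kind of approach is that phrases like "works hard" get weighed alongside everything else in the write-up, which is exactly the unquantifiable stuff we are after.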
Let’s get the numbers out of the way. The model is over 80% accurate. It rarely falsely predicts an all star scenario for a player, while it will more frequently falsely predict a non-all-star scenario. In other words, you’d better pay attention if the probability is greater than 50%, because more often than not that player goes on to become an all star. On the other hand, a non-all star prediction does not necessarily guarantee that a player won’t go on to become one within the next 7 years. Curious about what past draft predictions look like? Take a look at 2009:
Our model picked out 5 of the 6 all stars from the 2009 draft, and missed on Jeff Teague simply because we do not have draft analysis data for him. Take a look at Hasheem Thabeet – an 11% chance as a number 2 pick! Jrue Holiday was pick 17 and yet he showed a 63% chance for an all star outcome. I believe this demonstrates the power of a human-written draft analysis – something that goes beyond just the per-game numbers and advanced stats. A subjective analysis includes opinion, which is inherently imperfect, but that imperfect view captures so much that the numbers can’t. Perhaps we’re picking out all of the right attributes and aspects that are predictive of success in a player. All that’s left to do is consider all of it together over the backdrop of historic analysis – a whole is greater than the sum of its parts type thing.
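The earlier point about rarely predicting a false all star, while missing some real ones more often, is the difference between precision and recall. Here is a small sketch of how those numbers fall out of a list of predictions. The probabilities and outcomes are invented for illustration and are not the model's actual results; the 50% cutoff matches the threshold mentioned above.

```python
def confusion_counts(probabilities, actual, threshold=0.5):
    """Count true/false positives and negatives at a probability cutoff."""
    tp = fp = tn = fn = 0
    for prob, outcome in zip(probabilities, actual):
        predicted = 1 if prob > threshold else 0
        if predicted == 1 and outcome == 1:
            tp += 1
        elif predicted == 1 and outcome == 0:
            fp += 1
        elif predicted == 0 and outcome == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def summarize(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: when we call someone an all star, how often are we right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the real all stars, how many did we catch?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Invented predictions for ten hypothetical players (1 = became an all star).
probs = [0.92, 0.71, 0.58, 0.41, 0.33, 0.12, 0.09, 0.04, 0.22, 0.16]
actual = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]

counts = confusion_counts(probs, actual)
accuracy, precision, recall = summarize(*counts)
print(counts, accuracy, precision, recall)
```

In this toy example every "all star" call is correct (high precision), but two eventual all stars sit below the cutoff (lower recall), which is the same shape of behavior described for the real model.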
With all of this context in mind, let’s take a look at the 2023 draft predictions:
The absurdly high Wemby all star probability is a great sanity check here. As long as he’s healthy, Victor Wembanyama is a sure-fire all star. Moving down the list we find Nick Smith, Taylor Hendricks, Bilal Coulibaly, Anthony Black, and Scoot Henderson with similarly decent shots at an all star future. The Nick Smith result is most surprising to me, as I’ve been low on him from the very beginning. In fact, most draft talk I read doesn’t like him that much either. Even the input analysis I used for this model appears to be underwhelming. Jazz fans should feel great about the predictions. Four of the top six have a decent chance of being available to them at either pick 9 or 16. I’m taking Hendricks for sure if he’s there at 9.
There’s a decent drop-off after that list of players. Ausar Thompson looks like he could be a bust. So does Brandon Miller. Jarace Walker also slides quite a bit relative to his projected draft slot. If I were taking a swing at someone based on these predictions in the back half of the first round, I might try Rayan Rupert or Jordan Hawkins. Both have mock draft ranges from picks 15-35 and both show a greater than 10% chance at that all star scenario.
With all that said, this model isn’t perfect. In the 2008 draft, it thought Jerryd Bayless was the most likely all star and only gave Kevin Love a 5% chance. Overall, the model shows promise though. Perhaps if we scrape the web for more words to throw at it, it will get even better. For now, let’s store these results in the archives and pull them out 7 years later so I can either say “told you so” or “I should probably change careers now”.