What makes an NBA game fun to watch? (Hint: It’s not the basketball)

A statistical analysis of what factors make an NBA game popular

Imagine you watched an NBA game, and the next day a friend remarks, “Ahh dang, I missed the game! Was it any good?”

What would you respond with?

Maybe you would talk about how high-scoring the game was. Or maybe you’d talk about big plays made down the stretch by All Stars. Maybe it was even the passing or the defense.

In this article, I’m going to use statistical modeling of all the NBA games in the 2019–2020 season to answer what makes an NBA game fun to watch.

TL;DR: People want extremely close games above all. Having All Stars score big numbers is also important to a lesser degree. Strikingly, nothing about the actual game of basketball is that important.

(Feel free to skip the next 3 sections if you just want analysis! Just know that popularity is on a 10pt scale.)

How to Define Popularity

First, how to define popularity. There is a site (wikihoops.com) where, after each NBA game ends, users upvote or downvote it based on what they thought of it. This provides a wonderful way to give each NBA game a “popularity” score.

As opposed to upvotes on Reddit game threads or TV viewership numbers (which are not even accessible to me), these scores are less affected by which games were just on national TV and had a lot of viewers. Wikihoops.com is more about the actual quality of a game (upvotes AND downvotes) than just how many people watched.

Technically, I am converting these upvotes and downvotes into a Wilson lower bound and then normalizing it to a popularity score between 0 and 10. This folds both the upvote percentage and the total number of votes into one number.
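For readers who have not met the Wilson lower bound before, here is a minimal Python sketch of the idea. The 95% confidence level (`z=1.96`), the min-max rescaling to a 0 to 10 range, the function names, and the vote counts are all assumptions for illustration; the exact normalization used for this analysis is not spelled out above.

```python
import math

def wilson_lower_bound(upvotes, downvotes, z=1.96):
    """Lower end of the Wilson score interval for the upvote share.

    z=1.96 corresponds to a 95% interval (an assumed choice).
    """
    n = upvotes + downvotes
    if n == 0:
        return 0.0  # no votes, no evidence the game was liked
    p_hat = upvotes / n
    denom = 1 + z * z / n
    centre = p_hat + z * z / (2 * n)
    margin = z * math.sqrt((p_hat * (1 - p_hat) + z * z / (4 * n)) / n)
    return (centre - margin) / denom

def to_ten_point_scale(bounds):
    """Min-max rescale a list of bounds to 0..10 (assumed normalization)."""
    low, high = min(bounds), max(bounds)
    span = high - low
    if span == 0:
        return [0.0 for _ in bounds]  # all games tied; scale is undefined
    return [10.0 * ((b - low) / span) for b in bounds]

# Hypothetical (upvotes, downvotes) for three games, made up for the demo.
votes = [(90, 10), (9, 1), (40, 60)]
bounds = [wilson_lower_bound(up, down) for up, down in votes]
scores = to_ten_point_scale(bounds)
for (up, down), score in zip(votes, scores):
    print(f"{up} up / {down} down -> popularity {score:.1f}")
```

Note how the first two made-up games both have 90% upvotes, but the one with ten times as many votes gets the higher bound. That is the "balance" between upvote share and vote count in action.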

It’s important to note that this score comes from people who presumably watched the game already. This analysis is NOT about what makes a random person interested enough to turn on the TV; it’s about what those who DO watch find entertaining.

The Variables

As predictor variables, I looked at a wide array of public data.

  • Box Score: Margin of Victory (MOV), Total PTS, FG, FGA, FG%, 3P, 3PA, 3P%, FT, FTA, FT%, ORB, DRB, TRB, AST, STL, BLK, TOV, PF, TSA, TS%, 20/30/40/50 point scorers, and whether the game went into overtime or not.
  • Play-by-Play: Average margin during the game, largest comeback, largest 4th qtr comeback, number of lead changes, number of 1st/2nd/3rd/4th qtr lead changes, scoring differential in the 1st/2nd/3rd/4th qtr, and largest score difference during the whole game.
  • Advanced Data: Total Passes, Deflections, Charges Drawn, Screen Assists, Loose Balls Recovered, Box Outs, Points off TOV, 2nd Chance Pts, Fastbreak PTS, Points in the Paint, Distance Run, and Hockey Assists.
  • Team Quality: Total All Stars, Avg wins for both teams, Difference in wins between the two teams.
  • Other: Indicator variables for all 30 teams.

The Methods

I’m taking a “wisdom of the crowds” path here and using three different statistical models for different purposes before combining them into insights in the next section.

1. (Most important) LASSO Regression: Bootstrapped LASSO models with cross-validated tuning parameters (using the 1 SE rule). I am building confidence intervals for each predictor from its coefficients across the bootstrapped models.

2. Random Forests Regression: Used just to get feature importance after using cross validation to select hyperparameters.

3. Stepwise Linear Regression: Stepwise methods are a little taboo to use, but it’s still telling to see which variables are chosen first and how much relative increase in performance is gained at each step.
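To make the first method in the list above more concrete, here is a rough Python sketch of bootstrapped LASSO with percentile confidence intervals. It is a simplified stand-in, not the actual pipeline: the penalty is fixed rather than tuned by cross-validation with the 1 SE rule, the solver is a plain coordinate-descent loop, the predictors are not standardized, and the data is synthetic. All function and variable names are hypothetical.

```python
import random

def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; exactly zero inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso_fit(X, y, penalty, max_iter=100, tol=1e-8):
    """Coordinate-descent LASSO on centered data.

    Minimizes (1 / (2n)) * ||y - Xb||^2 + penalty * ||b||_1 and
    returns one coefficient per column (intercept handled by centering).
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    col_sq = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    resid = [yi - y_mean for yi in y]  # residuals when all coefs are 0
    coefs = [0.0] * p
    for _ in range(max_iter):
        biggest_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n
            rho += col_sq[j] * coefs[j]
            new_coef = soft_threshold(rho, penalty) / col_sq[j]
            change = new_coef - coefs[j]
            if change != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * change
                coefs[j] = new_coef
            biggest_change = max(biggest_change, abs(change))
        if biggest_change < tol:
            break
    return coefs

def bootstrap_lasso_intervals(X, y, penalty, n_boot=100, alpha=0.05, seed=0):
    """Percentile interval per predictor from LASSO fits on resampled games."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    draws = [[] for _ in range(p)]
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        coefs = lasso_fit([X[i] for i in idx], [y[i] for i in idx], penalty)
        for j in range(p):
            draws[j].append(coefs[j])
    intervals = []
    for j in range(p):
        ordered = sorted(draws[j])
        low = ordered[int((alpha / 2) * (n_boot - 1))]
        high = ordered[int((1 - alpha / 2) * (n_boot - 1))]
        intervals.append((low, high))
    return intervals

# Synthetic demo data (NOT the real games): popularity falls with MOV,
# and the second column is pure noise.
rng = random.Random(42)
mov = [rng.randint(1, 30) for _ in range(100)]
noise_col = [rng.gauss(0, 1) for _ in range(100)]
X = [[m, z] for m, z in zip(mov, noise_col)]
y = [8 - 0.15 * m + rng.gauss(0, 0.5) for m in mov]

intervals = bootstrap_lasso_intervals(X, y, penalty=0.1)
for name, (low, high) in zip(["MOV", "noise"], intervals):
    print(f"{name}: [{low:.3f}, {high:.3f}]")
```

One common way to read intervals like these is to treat a predictor as significant when its interval excludes zero. In the synthetic example, the MOV interval should sit clearly below zero while the noise column's interval should hug it.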

In order, the three above models had a cross-validated R² of 45.6%, 44.6%, and 47.7%. All very similar.
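As a reminder of what a cross-validated R² measures, here is a small Python sketch: each game is predicted by a model that never saw it, and R² is computed on those held-out predictions. This uses a one-variable linear fit and pooled out-of-fold predictions, which is one common convention (averaging per-fold R² is another), and it runs on synthetic data. It is an illustration under those assumptions rather than the exact procedure behind the numbers above.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, y_mean - slope * x_mean

def cross_validated_r2(xs, ys, k=5, seed=0):
    """R^2 computed on pooled out-of-fold predictions from k-fold CV."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    ss_res = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        ss_res += sum((ys[i] - (intercept + slope * xs[i])) ** 2 for i in fold)
    y_mean = sum(ys) / len(ys)
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic demo data (NOT the real games).
rng = random.Random(7)
mov = [rng.randint(1, 30) for _ in range(100)]
popularity = [8 - 0.15 * m + rng.gauss(0, 0.5) for m in mov]
unrelated = [rng.gauss(5, 1) for _ in mov]

print(f"signal: {cross_validated_r2(mov, popularity):.2f}")
print(f"no signal: {cross_validated_r2(mov, unrelated):.2f}")
```

Unlike in-sample R², this version can dip below zero when a model predicts held-out games worse than simply guessing the average.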

Important: Close Games

The one thing that has been subpar about this season so far is the large number of blowouts. As a perfect example, there was a lot of talk about how boring the Christmas games were because every game was lopsided.

The data backs this up. All three models point to close games being BY FAR the most important aspect of a fun game. A simple plot of margin of victory (MOV) vs. popularity makes this very clear.

Not all close games are fun, but pretty much all blowouts are boring.

Of the >70 variables I tested, only 6 actually were significant, with 4 of those being related to how close the game was in the end: MOV, whether the game went into overtime, 4th qtr lead changes, and the largest 4th qtr comeback.

My Random Forests model gave MOV a 71% feature importance while the next most important variable only had a feature importance of 9% (whoa!). MOV was also the 1st variable selected in my stepwise model, giving by far the most improvement in R².

Using my LASSO model, I estimate the difference between a 20pt blowout and a 1pt overtime game to be at least ~2.5 popularity points (on a 10pt popularity scale).

This is one big reason the 2019–20 Bucks had so few fun games. They were just too good in the regular season, winning too many games in boring blowouts. While we know superteams can draw big ratings, it’s important that another team can at least challenge them or else the game will be a dud.

Important: Big games from All Stars

The only other two significant variables in my model were (1) the number of All Stars playing and (2) the number of 30pt scorers in the game. These are both fairly simple measures of star power and individual scoring.

My model estimates that the difference between a game with 0 and 4 All Stars is ~0.9 popularity points, all else equal. Similarly, the difference between a game with 0 and 4 thirty-point scorers is ~1.1 popularity points.
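Since a LASSO model is linear in its predictors, those 4-unit gaps can be turned into rough per-unit effects. The quick Python sketch below does that arithmetic, assuming both counts enter the model untransformed (an assumption; the write-up does not say either way). The names are hypothetical.

```python
# Approximate popularity gaps quoted above for going from 0 to 4.
ALL_STAR_GAP = 0.9
THIRTY_PT_GAP = 1.1
UNITS = 4

def per_unit_effect(gap, units):
    """Per-unit effect implied by a linear model (assumed untransformed)."""
    return gap / units

per_all_star = per_unit_effect(ALL_STAR_GAP, UNITS)
per_thirty_pt_scorer = per_unit_effect(THIRTY_PT_GAP, UNITS)
print(round(per_all_star, 3), round(per_thirty_pt_scorer, 3))
```

Under that assumption, each extra All Star is worth a bit over 0.2 popularity points and each extra thirty-point scorer a bit under 0.3, both small next to the ~2.5 points separating a blowout from an overtime thriller.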

Not Important: Basketball

In this analysis, I made a big effort to look at a ton of variables about games to try and tease out what was fun to watch. To my surprise, no variable about the actual game of basketball was even close to being significant.

Pace, total PTS, passing, fouls, points in the paint, three pointers, etc.… I did not find evidence that any of these affect how much people enjoy a game.

As a big basketball fan, my prior opinion was that maybe people liked high-scoring games with very few fouls, but nope! Bummer for me because I thought this analysis would have a ton more nuance and interesting findings!

Conclusion + Implications

What I’m trying to say here may seem obvious. “Duh, nerd! People like close games where All Stars go off!”

But, the degree to which the game of basketball is unimportant to enjoyment of a basketball game is striking.

This is strong evidence that people mostly watch NBA games for the narrative. A blowout win is like a mystery novel that’s solved on page 15. That’s no fun at all!

All Stars having big scoring games provides a convenient “Hero’s narrative” in a close game. Which individual will triumph over their opponent? If the highest scorer only has 22pts, it’s not easy for casual fans to follow exactly what is going on.

Much is made of how to make the NBA product more popular. Fix the charge rule! More offense! No, more defense! More diversity in playstyle! But, based on this data, my recommendation is to focus way more on making the ends of games as close as possible.

The average fan cares about having a tight game down the stretch, not how much zone defense is played.

This is certainly not the easiest thing to control, but there are some possible steps that the NBA can take here. For one, the Elam Ending deserves a long hard look as a way to spice up the ends of games. Likewise, the new play-in games are a great way to get equally-skilled teams into some exciting elimination games.

Hello! I am currently a data scientist at Facebook NYC who likes writing a few articles on topics that I enjoy!