We have almost a full season of men’s college basketball behind us and are well into tournament time. I wanted to play around a bit with the data and see if anything interesting was there. I used the advanced school statistics from College Basketball Reference: http://www.sports-reference.com/cbb/seasons/2017-advanced-school-stats.html
I wanted to look at what correlated highly with win percentage. To be even the least bit intellectually honest: looking at these statistics one at a time assumes each can be held independent of the others, when in fact there’s all kinds of correlation among them. For instance, true shooting percentage and assist rate are correlated, because a pass only counts as an assist when the shot goes in, so teams that hit their shots pile up assists. This is just exploratory playing around on an internet site, so I think we can violate a few rules of reality in the name of a little fun. It’s sports!
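As a rough sketch of the independence problem, here is how you could check how much the stats move together. The numbers in `toy` are made up purely for illustration (they are not the 2017 data); only the column names match the ones in my code further down.

```r
# Toy numbers, invented for illustration only (NOT the real 2017 data).
toy <- data.frame(
  WLPct     = c(0.85, 0.70, 0.62, 0.55, 0.48, 0.40, 0.31, 0.25),
  TrueShoot = c(0.60, 0.58, 0.56, 0.55, 0.54, 0.53, 0.51, 0.50),
  AssPct    = c(60, 57, 58, 54, 55, 51, 52, 49),
  Pace      = c(68, 71, 66, 70, 69, 72, 67, 73)
)

# Pearson correlation of every column with every other column.
cors <- cor(toy)
print(round(cors, 2))

# Correlation of each stat with win percentage, taken one at a time.
with_wins <- cors["WLPct", colnames(cors) != "WLPct"]
print(round(with_wins, 2))

# The kind of entry that breaks independence: two "separate" stats moving together.
print(round(cors["TrueShoot", "AssPct"], 2))
```

With the real file loaded, the same idea would be something like cor(ncaa[, c("WLPct", "TrueShoot", "AssPct", "Pace")]), assuming those columns all read in as numeric.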
First, I looked at pace, which is a measure of possessions per 40 minutes.
Oof. Rough start. A slight negative correlation and a fair amount of variance. Interestingly, some of the outliers that play slower are better. Anyway, not a lot here.
Assist Percentage is a measure of how many made field goals were assisted by a teammate. An assist is a pass that leads directly to a score (not one where I pass it to you and you dribble around for 15 seconds).
This one’s a bit of a mess. There is a positive correlation but there’s a ton of variance. This might be one to look at in a bit more detail.
3 point attempt rate measures the percentage of field goal attempts that were 3 pointers. If you’ve watched the college game lately, there is an absolute plague of 3 point chuckers who are not that good but take a lot of threes since the line is so close.
Yeah, that’s about what I guessed. A decent amount of variance, but not a lot of relationship to winning. By the looks of this plot, this one might could use some transformation.
A better measure of whether you’re actually a good 3 point shooter is true shooting percentage, which measures your shooting percentage weighted for the value of your shots, free throws included. That is, shooting 40 percent on threes is worth the same points per shot as shooting 60 percent on twos. Likely, if you have a high true shooting percentage you’re not only taking a lot of threes, you’re actually good at them.
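Here is the arithmetic behind that 40-versus-60 comparison, plus a small sketch of the usual true shooting formula. The `true_shooting` function and its `ft_weight` argument are my own illustration; 0.44 is the free throw weight commonly quoted for the NBA, and the site may use a slightly different value for college, so check its glossary before trusting the default.

```r
# Points per attempt is what makes 40% on threes equal to 60% on twos.
pts_per_three <- 0.40 * 3
pts_per_two   <- 0.60 * 2

# True shooting: points divided by twice the "true" shot attempts,
# where each free throw attempt counts as a fraction of a shot.
# ft_weight = 0.44 is an ASSUMPTION (the commonly quoted NBA value).
true_shooting <- function(pts, fga, fta, ft_weight = 0.44) {
  pts / (2 * (fga + ft_weight * fta))
}

# Hypothetical team: 10 made threes on 25 attempts, no twos, no free throws.
ts <- true_shooting(pts = 30, fga = 25, fta = 0)
print(c(three = pts_per_three, two = pts_per_two, ts = ts))
```

So a team that only shot threes and hit 40 percent of them would get the same true shooting number as a team that only shot twos and hit 60 percent.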
Ahh…there we go. If you take shots and hit them, especially threes, it stands to reason that you’re going to win a lot, which is why the 3 pointer has also taken over the NBA game.
But there’s one that always will be highly correlated: point differential.
I did this in R and I’m pasting my code below in case anyone wants to use this same data.
ncaa <- read.table(file="untitled.txt", sep=",", header=TRUE, quote="")
# Column names in the file include:
# "TeamPts","OppPts","Pace","ORtg","FTrate","ThreeAttRate","TrueShoot","TotalRb","AssPct","StlPct","BlkPct",
#Pace
plot(ncaa$WLPct, ncaa$Pace, xlab="Win Loss Percentage", ylab="Pace", main="Pace and Winning")
#Assist Percentage
plot(ncaa$WLPct, ncaa$AssPct, xlab="Win Loss Percentage", ylab="Assist Percentage", main="Assist % and Winning")
#Three Attempt Rate
plot(ncaa$WLPct, ncaa$ThreeAttRate, xlab="Win Loss Percentage", ylab="Three Point Attempt Rate", main="3 Point Attempt Rate")
#True Shooting
plot(ncaa$WLPct, ncaa$TrueShoot, xlab="Win Loss Percentage", ylab="TrueShooting", main="True Shooting and Winning")
#Point Differential
plot(ncaa$WLPct, ncaa$PtDiff, xlab="Win Loss Percentage", ylab="Point Differential", main="Point Differential and Winning")
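The plots above show the scatter but don’t put a number on it. If you want one, a minimal sketch is to add a straight-line fit and the correlation coefficient to a plot. The `toy` data frame here is invented for illustration (not the real data); swap in `ncaa` and whichever column you like.

```r
# Made-up numbers for illustration only (NOT the real 2017 data).
toy <- data.frame(
  WLPct  = c(0.25, 0.31, 0.40, 0.48, 0.55, 0.62, 0.70, 0.85),
  PtDiff = c(-9.5, -6.0, -3.2, -1.0, 1.5, 3.8, 7.1, 12.4)
)

plot(toy$WLPct, toy$PtDiff, xlab="Win Loss Percentage",
     ylab="Point Differential", main="Point Differential and Winning")

# Least-squares line through the points.
fit <- lm(PtDiff ~ WLPct, data = toy)
abline(fit)

# Pearson correlation, written above the plot area.
r <- cor(toy$WLPct, toy$PtDiff)
mtext(sprintf("r = %.2f", r), side = 3, line = 0.25)
```

A straight line is only a rough summary here; for the stats that looked like they might need a transformation, the correlation number will undersell whatever relationship is there.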