Ranking College Football Coaches

Jun 5, 2020 Data Science

Photo credit: CollegeFootballNews.com

Background

I’m an avid college football fan with a keen interest in data science. For my first dive into data science, I will use some basic statistics and data manipulation to evaluate coaching in college football.

Defining the Problem

Football is the crown jewel and breadwinner of any collegiate athletic department. A successful football program is not only a source of revenue for the university but also a powerful marketing tool: studies have shown winning to be associated with increased donations, a stronger academic reputation, and lower acceptance rates. It's no surprise, then, that coaches wield significant power and command lofty salaries.

Coaches also carry the burden of high expectations, poor job security, and an average tenure of only 3.8 years. While the decision to hire or fire a coach is a multi-factorial process, it is primarily driven by their record. Consequently, a coach’s value, particularly with donors and fans, is inextricably tied to their performance on the field.

However, wins and losses are driven by several factors, many of which are outside a coach’s control, including conference affiliation, prestige, location, historical success, and financial investment from the university. As such, a coach’s record is not the most objective way to assess their value. In this project we will set out to define an approach for evaluating coaches while minimizing the impact of these confounding factors. Let’s get started!

Continue reading on Medium: Using Data Science to Evaluate Recruiting and Player Development in College Football

Code can be downloaded at: https://github.com/arsakhar/NCAAF

Technical Skills

Data Visualization

Histograms, scatterplots, Q-Q plots, boxplots

Data Scraping

Scraping data from four different websites

Data Cleaning

Removing empty rows in dataframe
Removing non-numeric / NaN rows in dataframe
Filtering dataframe by a specific keyword or attribute
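
The cleaning steps listed above can be sketched with pandas roughly as follows. This is a minimal illustration, not the code from the project repository: the table, its column names (coach, conference, wins), and all values are made up for the example.

```python
import pandas as pd

# Hypothetical scraped table; names and values are invented for illustration.
raw = pd.DataFrame(
    {
        "coach": ["Coach A", "Coach B", None, "Coach C", "Coach D"],
        "conference": ["SEC", "Big Ten", None, "SEC", "ACC"],
        "wins": ["10", "7", None, "N/A", "9"],
    }
)

# Remove rows in which every column is empty.
cleaned = raw.dropna(how="all")

# Coerce the wins column to numbers; entries that cannot be parsed become NaN.
cleaned = cleaned.assign(wins=pd.to_numeric(cleaned["wins"], errors="coerce"))

# Remove the rows whose wins value was non-numeric or missing.
cleaned = cleaned.dropna(subset=["wins"])

# Keep only the rows that match a specific attribute value.
sec_only = cleaned[cleaned["conference"] == "SEC"]

print(cleaned)
print(sec_only)
```

Coercing with errors="coerce" before dropping means a single dropna call handles both missing and unparseable values.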

Data Manipulation

Joining dataframes
Transforming values on a dataframe column
Applying a function elementwise on a dataframe column
Filtering dataframe based on a specific attribute or keyword
Aggregating on a dataframe column (standard deviation, mean, min, max, sum, count)
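
As a rough sketch of the manipulation steps above, again with pandas: the two tables, their columns, and the win-percentage calculation are assumptions chosen for illustration, not taken from the project code.

```python
import pandas as pd

# Hypothetical season records and coach metadata; all values are invented.
records = pd.DataFrame(
    {
        "coach": ["Coach A", "Coach A", "Coach B", "Coach B"],
        "season": [2018, 2019, 2018, 2019],
        "wins": [10, 8, 6, 9],
        "losses": [3, 5, 7, 4],
    }
)
schools = pd.DataFrame(
    {
        "coach": ["Coach A", "Coach B"],
        "conference": ["SEC", "Big Ten"],
    }
)

# Join the two dataframes on the shared coach column.
merged = records.merge(schools, on="coach", how="left")

# Transform values: derive a win percentage from two existing columns.
merged["win_pct"] = merged["wins"] / (merged["wins"] + merged["losses"])

# Apply a function elementwise to a single column.
merged["winning_season"] = merged["win_pct"].apply(lambda pct: pct > 0.5)

# Aggregate a column per coach.
summary = merged.groupby("coach")["wins"].agg(
    ["std", "mean", "min", "max", "sum", "count"]
)

print(merged)
print(summary)
```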

Statistical Analysis

Shapiro-Wilk test for normality
Kruskal-Wallis non-parametric test for group differences
Dunn post-hoc testing for pairwise comparisons
Box-Cox transformation for reshaping a skewed distribution toward a Gaussian
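
Three of the tests above are available in SciPy, and the sketch below shows how they might be called. The data are synthetic log-normal samples for three hypothetical groups, generated only so the example runs. Dunn's post-hoc test is not part of SciPy and is left out here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic, right-skewed, strictly positive samples for three made-up groups.
groups = {
    "group_a": rng.lognormal(mean=0.0, sigma=0.5, size=40),
    "group_b": rng.lognormal(mean=0.3, sigma=0.5, size=40),
    "group_c": rng.lognormal(mean=0.6, sigma=0.5, size=40),
}

# Shapiro-Wilk: a small p-value suggests the sample is not normally distributed.
shapiro_p = {}
for name, values in groups.items():
    _, p_value = stats.shapiro(values)
    shapiro_p[name] = p_value

# Kruskal-Wallis: rank-based test for a difference among the groups.
h_stat, kruskal_p = stats.kruskal(*groups.values())

# Box-Cox: requires strictly positive data; returns the transformed values
# and the lambda that maximizes the log-likelihood.
pooled = np.concatenate(list(groups.values()))
transformed, fitted_lambda = stats.boxcox(pooled)

print(shapiro_p)
print(h_stat, kruskal_p)
print(fitted_lambda)
```

Kruskal-Wallis is the usual fallback when the normality check fails, since it compares ranks rather than assuming Gaussian groups.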

Packages

Beautiful Soup
SciPy
Matplotlib
pandas
NumPy
Scikit
statsmodels

Ashwin Sakhare

Data Scientist

I have a passion for solving clinical problems through actionable insights derived from data-driven approaches.