While most daily fantasy sports (DFS) players usually swing and miss, big data management and predictive analytics have the capacity to increase a player’s chance of winning more consistently. In this article, Milliman’s Michael Henk and Nicholas Blaubach discuss the monetary success that some advanced modelers are having on DFS websites using predictive analytics. The following excerpt highlights the steps necessary to build a DFS predictive model.
There are some basic steps that serve as general “rules of thumb” when we set out to develop our predictive model to make us millions in DFS.
First, we need an objective. We want our model to optimize our roster, giving us the most potential points. In our DFS example, we’d want a predictive model that will help us identify the best players for the cost (in order to stay under the salary caps) for any given contest.
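At its core, that objective is a constrained optimization problem: maximize projected points subject to a salary cap. As a minimal sketch (the player names, salaries, and projections below are made up for illustration, and real contests have positional requirements this ignores), a small roster can even be brute-forced:

```python
from itertools import combinations

# Hypothetical projections: (player, salary, projected points) -- illustrative only.
players = [
    ("A", 9000, 48.5),
    ("B", 7500, 41.0),
    ("C", 6200, 33.5),
    ("D", 5400, 29.0),
    ("E", 4800, 22.5),
    ("F", 3900, 18.0),
]

SALARY_CAP = 20000
ROSTER_SIZE = 3

def best_roster(players, cap, size):
    """Brute-force the roster with the most projected points under the cap."""
    best, best_pts = None, -1.0
    for roster in combinations(players, size):
        salary = sum(p[1] for p in roster)
        points = sum(p[2] for p in roster)
        if salary <= cap and points > best_pts:
            best, best_pts = roster, points
    return best, best_pts

roster, points = best_roster(players, SALARY_CAP, ROSTER_SIZE)
print([p[0] for p in roster], points)
```

For realistic roster sizes and player pools, the same problem is usually handed to an integer-programming solver rather than enumerated, but the objective and constraint are identical.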
Next, we gather our data… Gathering the data and getting it into a proper format for our predictive model is another story, but historical sports data is easy to find online. One thing to consider here is the traditional actuarial concern of credibility. If the data isn’t credible, it’s highly unlikely that we’ll be able to build a successful model from it….
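One classical way actuaries quantify credibility is the limited-fluctuation "square-root rule," which blends a small sample with a broader prior. The sketch below is illustrative only: the full-credibility threshold and the example numbers are assumptions, not standards from the article.

```python
import math

def credibility_weight(n_observations, n_full):
    """Limited-fluctuation credibility factor Z = min(1, sqrt(n / n_full)).

    n_full is the sample size we deem fully credible -- a judgment call
    that depends on the statistic being estimated.
    """
    return min(1.0, math.sqrt(n_observations / n_full))

def blended_estimate(player_mean, league_mean, n_games, n_full=100):
    """Weight a player's own average against a league-wide prior by Z."""
    z = credibility_weight(n_games, n_full)
    return z * player_mean + (1 - z) * league_mean

# A player averaging 30.0 points over only 25 games, against a 20.0 league mean:
print(blended_estimate(30.0, 20.0, 25))
```

With 25 of 100 "fully credible" games, Z = 0.5, so the estimate lands halfway between the player's average and the league prior; as games accumulate, the estimate leans increasingly on the player's own data.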
After we choose the data to use, we need to select and transform the specific variables in the data set. The structure of the predictive (or independent) variables in relation to the target (or dependent) variable determines how well a model works. We can transform variables (by taking logarithms, for example) or bucket variables to see what gives us the best fit. Sports data can have hundreds (or even thousands) of variables….
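The two transformations named above can be sketched in a few lines. The variable (minutes played) and the bucket edges here are hypothetical examples, not choices from the article:

```python
import math

def log_transform(values):
    """log1p rather than log, so zero-valued stats (a scoreless game) are handled."""
    return [math.log1p(v) for v in values]

def bucket(value, edges):
    """Map a numeric value to a bucket index given ascending edge boundaries."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)

# Hypothetical minutes-played values, bucketed into low / medium / high usage.
minutes = [8, 22, 35]
print([bucket(m, [10, 30]) for m in minutes])  # [0, 1, 2]
```

In practice we would try several transformations and bucket boundaries per variable and keep whichever gives the best fit, which is exactly why a data set with hundreds of candidate variables makes this step time-consuming.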
Next, we process and evaluate our model. The key to good model performance is a strong fit that also generalizes to new data, not just to the data the model was trained on. If we’ve done the other steps up to this point well, this step should run smoothly. Here we identify the ideal number of variables and use performance metrics to evaluate the model fits….
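The article doesn't name specific metrics, but root mean squared error (RMSE) on held-out games is a common choice for point projections. A minimal comparison of two hypothetical models' predictions against actual fantasy scores:

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted fantasy points."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical held-out results -- numbers are for illustration only.
actual  = [32.5, 18.0, 41.25, 27.0]
model_a = [30.0, 20.0, 38.00, 29.0]
model_b = [25.0, 25.0, 45.00, 20.0]

print(rmse(actual, model_a))  # lower error: the better fit of the two
print(rmse(actual, model_b))
```

Whatever metric we pick, it must be computed on data the model never saw during fitting; otherwise "best fit" rewards memorization rather than prediction.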
Once all of that is done, it’s important to not merely implement the model and ignore it. It requires routine maintenance. As time goes by and data continues to emerge, we need to take time to reinvestigate the data, update the models, and challenge some of our initial assumptions. The best models are continually updated and recalibrated, audited on a regular basis, and replaced when they are no longer effective.
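That maintenance loop can be as simple as tracking the model's live error against the error measured at deployment and flagging drift. The 20% tolerance below is an illustrative threshold, not a standard:

```python
def needs_recalibration(baseline_rmse, recent_rmse, tolerance=0.20):
    """Flag the model for review when recent prediction error drifts more
    than `tolerance` (20% by default, an assumed threshold) above the
    baseline error recorded when the model was deployed."""
    return recent_rmse > baseline_rmse * (1 + tolerance)

print(needs_recalibration(5.0, 5.5))  # False: within tolerance, keep running
print(needs_recalibration(5.0, 6.5))  # True: error has drifted, revisit the model
```

A scheduled check like this turns "audit on a regular basis" from good intentions into an automated trigger for the reinvestigation the article describes.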