After attempting to predict the MIP using machine learning, I switched my focus to trying to predict this offseason's free agent contracts. I hope y'all enjoy! (There's a TL;DR at the bottom)
Day 1 of NBA free agency is by far the biggest day of the offseason for the league. It was a doozy last season, with seismic shifts in the league's power structure. Arguably 2 of the top 5 players (when healthy) in the league, as well as an additional 3 within the top 25, changed teams.
This year's class? Not as highly regarded as last year. Anthony Davis is the crown jewel, while Brandon Ingram is the best restricted free agent available. The hype of this class was further deflated when players like Kyle Lowry and Draymond Green signed extensions with their current teams.
What I wanted to do was predict what contracts this year's free agent class might get based off previous offseasons. Stars generally get star-type money, but in tiers below, contracts of comparable players usually come up in discussing contract value.
- statistical data (regular season totals and cumulative advanced stats) from Basketball-Reference
  - I do understand that some players get paid on the strength of playoff performance
- historical free agents (2016-2019) and salary cap history also from Basketball-Reference
- 2020 free agents from Spotrac
- historical contract info from Capology, Spotrac and Basketball-Insiders
  - Capology = main source, but inconsistent with min/near-min contracts
  - Spotrac requires a premium login to view contract cap hits beyond the last and current contracts, but the cash earnings are free to view
  - When player was near-minimum & played for multiple teams in one season, Basketball-Insiders had transaction history in team salary archives (needed player's first team of season)
- set contract years to zero and salary to zero for players who went overseas, had explicitly non guaranteed first years in their contracts (training camp deals, two ways, ten days, exhibit 10s) or had blanks in their contract terms cell
- included option years and partially guaranteed years in my calculation of contract years (looked at it as both player and team intending to see out the contract)
Preprocessing the Data
I started off with contract year stats, because there's anecdotal evidence that players exert more effort in their contract year. I initially wanted to use totals to bake in availability/body fragility, but the shortened season would cause the model to declare all players to be fragile and underestimate their contracts. Stats other than games played, games started, and the advanced stats (OWS, DWS and VORP) were converted to per game. Percentages were left alone. Games played, games started, and the advanced stats were standardized (rescaled to have a mean of 0 and a standard deviation of 1).
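A minimal sketch of the two transformations described above, written in Python for illustration (the analysis itself was done in R). The rows and column layout are made up, and standardizing within a single offseason's pool of players is an assumption: the post doesn't say which group the mean and standard deviation were computed over.

```python
from statistics import mean, pstdev

def per_game(total, games):
    """Convert a season total to a per-game average (0.0 if no games played)."""
    return total / games if games else 0.0

def standardize(values):
    """Return z-scores: subtract the mean, then divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; R's scale() uses the sample SD instead
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical contract-year rows: (player, games, points, win shares)
season = [
    ("A", 70, 1400, 8.0),
    ("B", 60, 600, 3.0),
    ("C", 35, 140, 1.0),
]

# Counting stats become per-game rates...
ppg = [per_game(pts, g) for _, g, pts, _ in season]

# ...while games played and cumulative advanced stats are standardized,
# so a shortened season doesn't make everyone look unavailable.
games_z = standardize([g for _, g, _, _ in season])
ws_z = standardize([ws for _, _, _, ws in season])
```

Because each player is measured relative to the rest of that pool, a 60-game season in a shortened year and a 75-game season in a full year can land on a similar z-score.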
In addition to using contract year stats, I summed the past two years and the contract year. Why did I settle on 3 years?
- Players do get paid on past performance, so just using contract year stats was out of the question
- 2 years opens up the possibility of a fluke year
  - Kawhi would have his nine game season bring down his averages significantly from his Raptors season: adding another year somewhat lessens this effect
- On the other hand, it's quite unlikely that teams factor in stats from more than 4 years ago, since a lot would have changed (the Knicks didn't pay Derrick Rose to recapture the form of his MVP year)
Another reason I settled on 3 years is that I can keep the same model for restricted free agents:
- my thought is that the rookie year is a bonus: great if you did well, but doesn't matter in the grand scheme of things if you did poorly
- For example, if Donovan Mitchell had a worse rookie year but had the same level of play that he has achieved in his second and third year (as well as next year), I highly doubt that Utah would offer Mitchell significantly less money due to a substandard rookie year
I performed the same processing on the three-year totals, using the three-year game total as the denominator for converting to per game. I had to calculate the three-year percentages, and also re-engineered the win shares per 48 minutes metric.
- removed categories that were linear combinations of one another (total rebounds = offensive + defensive rebounds)
- kept age and experience as predictor variables, but removed position because I felt it would ultimately reflect in the stats
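Here's a rough Python sketch of the three-year aggregation described above (the original work was in R). The field names and numbers are hypothetical. It assumes the three-year percentages come from summed makes over summed attempts, and that win shares per 48 is rebuilt as 48 × total win shares / total minutes, which matches how the stat is defined.

```python
# Hypothetical three seasons for one player; field names are illustrative only.
seasons = [
    {"g": 60, "mp": 1800, "fg": 300, "fga": 700, "pts": 900, "ws": 4.0},
    {"g": 9, "mp": 210, "fg": 50, "fga": 100, "pts": 145, "ws": 1.0},
    {"g": 60, "mp": 2040, "fg": 450, "fga": 900, "pts": 1600, "ws": 9.5},
]

def three_year_line(seasons):
    """Sum the raw totals first, then derive rates from the sums."""
    totals = {key: sum(s[key] for s in seasons) for key in seasons[0]}
    games, minutes, attempts = totals["g"], totals["mp"], totals["fga"]
    return {
        "games": games,
        "ws": totals["ws"],
        # per-game stats use the three-year game total as the denominator
        "ppg": totals["pts"] / games if games else 0.0,
        # percentages are recomputed from summed makes and attempts
        "fg_pct": totals["fg"] / attempts if attempts else 0.0,
        # win shares per 48 minutes rebuilt from summed win shares and minutes
        "ws_per_48": 48 * totals["ws"] / minutes if minutes else 0.0,
    }

line = three_year_line(seasons)
```

Deriving rates from the summed totals weights each season by how much the player actually played, so a nine-game season barely moves the three-year line, which is the effect the Kawhi example is after.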
Data Visualization
[Figure: relationship between targets (first year salary as % of cap and contract years)]
- box and whisker with transparent layer of actual points (added random variation to see all points)
- correlation coefficient is 0.77 (strongly and positively correlated)
- median value of first year cap % (middle line in each box) increases w/increase in contract length
[Figure: relationship between win shares and first year salary as % of cap]
- shouldn't be too groundbreaking: better players get paid more
[Figure: distribution of contract length by offseason]
- no surprise that as contract length increases, the percent of contracts of that length given out decreases
- in 2016, the number of players who didn't receive contracts was lower than the number who received 1, 2, or 4 year contracts
  - 2016 was the year the salary cap spiked from $70 million to $94 million
  - Similar to a lot of people with new-found money, teams spent somewhat recklessly.
Dealing w/Target Correlation
As mentioned, the target variables are correlated with a Pearson correlation coefficient of 0.77. My method to combat this:
- predict one target first without the other as a predictor
- choose the best model (be that a single model or an ensemble of multiple models)
- use the first target's predictions as an input to predict the second target
So I will have a model that predicts years first and salary second, as well as a model that predicts salary first and years second. I know it's not the greatest method (correlation does not imply causation!), and I'm open to hear alternative ways y'all would have gone about it!
One potential problem is compounding errors. If there's an incorrect year prediction, it might lead to an incorrect salary prediction and vice versa.
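To make the chaining concrete, here's a small Python sketch of the idea (the actual models were trained in R). A tiny k-nearest-neighbours regressor stands in for whichever model gets picked at each step, and the players and numbers are invented. One assumption worth flagging: the second model is trained on the first model's predictions rather than the actual values, so it sees the same kind of input at training and prediction time.

```python
def knn_fit(X, y, k=2):
    """Return a predictor that averages y over the k closest training rows."""
    def predict(row):
        by_distance = sorted(
            range(len(X)),
            key=lambda i: sum(abs(a - b) for a, b in zip(X[i], row)),
        )
        return sum(y[i] for i in by_distance[:k]) / k
    return predict

def fit_chain(X, first, second, fit=knn_fit):
    """Fit target one on the stats, then target two on stats + predicted target one."""
    model_one = fit(X, first)
    first_hat = [model_one(row) for row in X]
    X_plus = [list(row) + [f] for row, f in zip(X, first_hat)]
    model_two = fit(X_plus, second)

    def predict(row):
        f = model_one(row)
        return f, model_two(list(row) + [f])
    return predict

# Hypothetical training data: [PPG, RPG] -> contract years, first year cap share
X = [[25, 7], [22, 6], [12, 4], [6, 1], [5, 0.5]]
years = [4, 4, 2, 1, 1]
cap_pct = [0.28, 0.22, 0.08, 0.02, 0.01]

# "Years first, salary second"; swap the two targets for the other ordering.
predict = fit_chain(X, years, cap_pct)
years_hat, cap_hat = predict([24, 8])
```

The compounding-error risk is visible in `predict`: whatever `model_one` returns is fed straight into `model_two`, so a bad first guess shifts the second one too.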
Algorithms to Train
- a linear regression model as a baseline
- a k-nearest neighbors model: take the distance between the statistics of two players (the absolute value of the difference) and then take the average of the outcome variable of the k nearest neighbours
A very simple example:
| Player | PPG | RPG | Contract |
|---|---|---|---|
| A | 25 | 7 | 4 yrs, $90 million |
| B | 24 | 8 | ? |
| C | 6 | 0.3 | 1 yr, $2 million |
| D | 5 | 0.5 | ? |
With a 1 nearest neighbour model, you can clearly see that B is most similar to A, and D is most similar to C. Therefore, B's predicted contract is 4 years and $90 million, and D's predicted contract is 1 year and $2 million.
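The same toy example as a short Python sketch (illustrative only; the real models were fit in R). Distance is the sum of absolute differences across the stats, as described above.

```python
# Players with known contracts and players to predict, from the table above.
known = {
    "A": ((25, 7), "4 yrs, $90 million"),
    "C": ((6, 0.3), "1 yr, $2 million"),
}
unknown = {"B": (24, 8), "D": (5, 0.5)}

def manhattan(p, q):
    """Sum of absolute differences between two stat lines."""
    return sum(abs(a - b) for a, b in zip(p, q))

def nearest_contract(stats):
    """1-nearest-neighbour: copy the contract of the closest known player."""
    name = min(known, key=lambda n: manhattan(known[n][0], stats))
    return name, known[name][1]

predictions = {player: nearest_contract(stats) for player, stats in unknown.items()}
```

With more than one neighbour, the prediction would be the average of the neighbours' outcomes instead of a straight copy.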
- a decision tree model: maybe as a player passes certain statistical thresholds, their contract increases?
  - only using for predicting the contract years; since there are so many different salary percentages, a solitary decision tree would either be useless or far too complicated
- two random forest models: better than decision trees in that they reduce instability by averaging multiple trees
  - unfortunately, the cost is we don't get an easily interpretable tree
- a support vector machine: attempts to separate classes with a hyperplane
  - support vectors are the points closest to the hyperplane, named as such because the hyperplane would change if those points were removed
  - Here's an image from Wikipedia that I believe succinctly explains SVMs
Testing the Models
Years First, Salary Second
[Figure: years performance metrics]
- Mean absolute error (MAE) is the average absolute difference between the predicted and actual values, while the root mean squared error (RMSE) penalizes large errors more heavily
- The random forests provide by far the best performance, being above 80% accuracy when no other model is above 60%. However, they have a hard time distinguishing max contract year players.
  - This is somewhat understandable, as the fifth year is only accessible to players re-signing with their current team.
  - The models could get confused when two players have similar stats, but one signed for five years and one signed for four.
To alleviate this, I propose that if a player is predicted by ANY model to be a 5 year player, their contract year prediction is 5. Else, use a median of Rborist, ranger and SVM.
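The rule above, sketched in Python for illustration (the model names are just dictionary keys here, and whole-number year predictions are assumed):

```python
from statistics import median

def ensemble_years(preds, max_years=5):
    """Combine per-model year predictions for one player.

    preds maps a model name to its predicted contract length.
    """
    # If ANY model calls for the max length, go with the max length.
    if any(p == max_years for p in preds.values()):
        return max_years
    # Otherwise take the median of the two random forests and the SVM.
    return median(preds[m] for m in ("rborist", "ranger", "svm"))

# Hypothetical predictions for two players
star = {"linear": 3, "knn": 2, "tree": 5, "rborist": 4, "ranger": 4, "svm": 3}
role_player = {"linear": 3, "knn": 2, "tree": 3, "rborist": 4, "ranger": 2, "svm": 3}
```

With three voters, the median always lands on a value at least one model actually predicted, so there's never a fractional contract length.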
Why am I including SVM? Well, it has the best performance among the other models. I’m assuming that the random forest models will generally agree with each other. On the off chance they don’t, I’ll rely on the SVM to break the tie.
[Figure: years decision tree]
The decision tree maximizes its prediction when a player does all of the following:
- has above average offensive win shares in the contract year
- plays more than 28 minutes per game in the contract year
- shoots better than 41% from the field in the contract year
- has less than 9 years of experience
The decision tree minimizes its prediction when a player does all of the following:
- has below average offensive win shares in the contract year
- has at most slightly above average defensive win shares in the contract year
- steals the ball less than 0.56 times per game in the contract year
- is an unrestricted free agent
- has played almost half a standard deviation fewer games than the average in the last 3 years
[Figure: salary performance metrics]
- With the MAEs being relatively similar, I’ll take the median of the models.
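For reference, here's how the two error metrics behave, as a small Python sketch with made-up cap-share predictions (none of these numbers come from the actual models):

```python
from math import sqrt

def mae(actual, predicted):
    """Mean absolute error: the average size of the misses."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: squaring makes big misses count for more."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical first-year salaries as a share of the cap
actual = [0.25, 0.10, 0.02, 0.30]
steady = [0.23, 0.12, 0.04, 0.28]  # every prediction is off by 0.02
spiky = [0.25, 0.10, 0.02, 0.22]   # three perfect, one off by 0.08
```

Both sets of predictions have the same MAE, but `spiky` has a higher RMSE because its whole error sits in one big miss. That's why models with similar MAEs can still differ on RMSE.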
Salary First, Years Second
[Figure: salary performance metrics]
- The MAE range for the salary-first model is much smaller than the equivalent for the salary second model.
- I’ll take the median, as I did for the previous.
[Figure: years performance metrics]
- All models achieved at least the same if not a better correct prediction percentage than when predicting years first.
- The number of max year predictions has substantially increased (and in the linear model’s case, has outstripped the actual number of max contracts given out).
- I’ll take the median of all models except KNN, which has the lowest prediction accuracy by around 8%
[Figure: years decision tree]
- The singular decision tree again has trouble with predicting max contract length.
- Salary makes up 50% (4 of 8) of the decisions in the tree.
- The decision tree minimizes its prediction when a player has a predicted salary of less than 0.76% of the salary cap.
- The decision tree maximizes its prediction when a player is less than 31 years old and their predicted salary is above 16% of the cap.
Evaluating the Models
Here's a google sheet of all predictions separated by whether a player had a player option or not
- Totals are based on a $115 million salary cap (might not happen, but is a concrete number we have at our disposal) and 5% annual raises
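A quick Python sketch of how a total can be rebuilt from a predicted cap share and contract length. The totals in the tables below are reproduced when the 5% raise compounds each year, so that's what this does; the function and parameter names are just for illustration.

```python
def contract_total(cap_pct, years, cap=115.0, raise_rate=0.05):
    """Total contract value in $ millions.

    cap_pct    -- first year salary as a fraction of the cap (0.30 = 30%)
    years      -- contract length in years
    cap        -- assumed salary cap in $ millions
    raise_rate -- annual raise, compounded on the previous year's salary
    """
    first_year = cap_pct * cap
    return sum(first_year * (1 + raise_rate) ** y for y in range(years))

# Two rows from the player option table (Y1S2 model)
hayward = contract_total(0.1788988921, 3)
davis = contract_total(0.3014811978, 5)
```

Rounded to two decimals, these come out to the 64.86 and 191.58 shown for Gordon Hayward and Anthony Davis.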
Selected Player Option Decisions
I was unsure how to deal with club options, but players who decline player options become unrestricted free agents.
| player | Y1S2 Cap % | yrs_Y1S2 | total_Y1S2 ($M) | S1Y2 Cap % | yrs_S1Y2 | total_S1Y2 ($M) | 2021 Option ($M) |
|---|---|---|---|---|---|---|---|
| Gordon Hayward | 0.1788988921 | 3 | 64.86 | 0.1812429743 | 3 | 65.71 | 34.19 |
| Andre Drummond | 0.2221571423 | 3 | 80.54 | 0.2000980177 | 4 | 99.18 | 28.75 |
| Anthony Davis | 0.3014811978 | 5 | 191.58 | 0.2868035625 | 3 | 103.98 | 28.75 |
| Otto Porter | 0.08297334043 | 2 | 19.56 | 0.07925015245 | 2 | 18.68 | 28.49 |
| DeMar DeRozan | 0.2781873551 | 3 | 100.85 | 0.2657556943 | 3 | 96.35 | 27.74 |
| Nicolas Batum | 0.01857416549 | 1 | 2.14 | 0.02604507845 | 1 | 3 | 27.13 |
| Evan Fournier | 0.213951204 | 4 | 106.05 | 0.1969257167 | 4 | 97.61 | 17 |
| James Johnson | 0.02424448669 | 1 | 2.79 | 0.03168259373 | 1 | 3.64 | 15.83 |
| Jerami Grant | 0.132122261 | 3 | 47.9 | 0.1139293428 | 3 | 41.3 | 9.35 |
| Stanley Johnson | 0.01658695123 | 1 | 1.91 | 0 | 0 | 0 | 3.8 |
While some might be surprised at the DeRozan and Drummond projections, we have to note that the model doesn't take into account intangibles like playing reputation, team fit, or willingness to take a reduced salary to be on a championship contender. If Anthony Davis re-signs with the Lakers, he is eligible for eight percent raises rather than five. That would bring the Y1S2 total to $202 million, which is exactly what he is eligible for.
In terms of the three year projection, Davis could sign that shorter contract with the last year being an option in order to be eligible for the 35% max contract after 10 years of service (but the model wouldn't know that).
Jerami Grant and surprisingly (to me at least) Evan Fournier look like players set to cash in after declining their option. Players like Nicolas Batum, James Johnson, Otto Porter and Stanley Johnson (who's predicted to be out of the league by the S1Y2 model) would be wise to accept their option.
Selected Free Agents
| player | Y1S2 Cap % | yrs_Y1S2 | total_Y1S2 ($M) | S1Y2 Cap % | yrs_S1Y2 | total_S1Y2 ($M) |
|---|---|---|---|---|---|---|
| Brandon Ingram | 0.2548324081 | 4 | 126.31 | 0.2465662829 | 4 | 122.21 |
| Danilo Gallinari | 0.2173187706 | 3 | 78.79 | 0.2149018106 | 3 | 77.91 |
| Fred VanVleet | 0.1961214299 | 4 | 97.21 | 0.1888455092 | 4 | 93.6 |
| Montrezl Harrell | 0.19201499 | 3 | 69.61 | 0.1782262995 | 4 | 88.34 |
| Hassan Whiteside | 0.1941521099 | 3 | 70.39 | 0.1723290935 | 3 | 62.48 |
Assuming Brandon Ingram (a first-time All-Star this season) doesn't make one of the three All-NBA teams, his first year contract salary is capped at 25%, which is right in line with predictions. However, I (among many others) expect Ingram to get the full five year extension.
The Heat wanted to acquire Danilo Gallinari from the Thunder this trade deadline, but ultimately were unable to broker a deal. Gallinari's career has been marred by injuries, but he has played over 80% of games the past two seasons. He's averaging 19 points per game this year, on 44% field goal shooting and 40% three-point shooting.
After his breakout in last year's Eastern Conference Finals and NBA Finals, Fred VanVleet has blossomed again in a starting role this season. Even so, Masai Ujiri and Bobby Webster might balk at paying him $24-25 million per year.
Montrezl Harrell and Hassan Whiteside are the first cases of dissension between the two models. The S1Y2 model projects Harrell to get 4 years, while the Y1S2 model only projects three. For Whiteside, the difference is almost 8 million dollars less in salary under S1Y2.
| player | Y1S2 Cap % | yrs_Y1S2 | total_Y1S2 ($M) | S1Y2 Cap % | yrs_S1Y2 | total_S1Y2 ($M) |
|---|---|---|---|---|---|---|
| Marcus Morris | 0.1675722624 | 3 | 60.75 | 0.1587575186 | 3 | 57.56 |
| Joe Harris | 0.1539417297 | 3 | 55.81 | 0.1586893002 | 3 | 57.53 |
| Serge Ibaka | 0.1536284642 | 3 | 55.7 | 0.1416514857 | 3 | 51.35 |
| Bogdan Bogdanović | 0.1284454435 | 3 | 46.57 | 0.1216267861 | 3 | 44.09 |
| Jordan Clarkson | 0.1357167335 | 3 | 49.2 | 0.1062462145 | 3 | 38.52 |
Marcus Morris reneged on a verbal 2-year, $20 million contract with the Spurs last offseason to sign a 1-year, $15 million deal with the Knicks. He was the focal point of the Knicks offense and a hot commodity at the trade deadline, ultimately ending up with the championship-contending Clippers.
In February, Zach Lowe said on his Lowe Post podcast that Joe Harris had a chance to double his current salary of $8 million. The models are even more optimistic, projecting his first year salary as over $17 million.
Bogdan Bogdanović has been mentioned as the second-best RFA in the class. He might be squeezed out of Sacramento, considering the recent contracts to Buddy Hield and Harrison Barnes as well as upcoming contracts to De'Aaron Fox and Marvin Bagley III.
Jordan Clarkson has the largest discrepancy in total salary ($11 million) between the two models.
| player | Y1S2 Cap % | yrs_Y1S2 | total_Y1S2 ($M) | S1Y2 Cap % | yrs_S1Y2 | total_S1Y2 ($M) |
|---|---|---|---|---|---|---|
| Dāvis Bertāns | 0.1170676039 | 3 | 42.44 | 0.1080289004 | 3 | 39.16 |
| Goran Dragić | 0.1030787964 | 2 | 24.3 | 0.1104918333 | 2 | 26.05 |
| Dario Šarić | 0.1045436656 | 3 | 37.9 | 0.1021354069 | 3 | 37.03 |
| Christian Wood | 0.1121053606 | 3 | 40.64 | 0.0925855695 | 3 | 33.57 |
| Derrick Favors | 0.09816080257 | 2 | 23.14 | 0.1064310171 | 3 | 38.59 |
Dāvis Bertāns has shot the lights out this season, earning himself a raise over his current salary of $7 million.
Christian Wood has flashed star-like potential since being inserted into the starting lineup in place of the departed Andre Drummond. But the key here is 12 games started. Wood played limited minutes in the first four months of the season. His contract situation, as well as that of Malik Beasley (who flourished after being traded to Minnesota mid-season), is intriguing to say the least.
Limitations, Methodology Changes and Future Work
- unable to quantify intangibles like playing reputation, team fit, or willingness to take a reduced salary to be on a championship contender
- also can’t determine which team will sign which player
  - highly depends on a sequence of events: if Team A signs this player, they don't have enough money to re-sign Player B, who then goes to Team C for less money, etc
- due to the shortened season and the uncertainty of the cap, higher-tier players might take shorter deals to give them more flexibility
- maybe I should have predicted the two targets as a tuple instead of sequentially
  - unfortunately, caret doesn't have the capability for multi-target regression
- maybe should have implemented a time factor or weighted recent years more heavily, as team decision makers may have gotten smarter
- wanted the models themselves to perform feature selection and determine what the most important variables were
- looking at the contract years as a classification problem with 6 classes (0, 1, 2, 3, 4, 5)
- try more models, like boosting (in which models are added sequentially, with later models in the sequence attempting to correct the errors of earlier models)
- predicting a third target: whether a contract will end in an option year
  - Star players are more likely to demand the last year of their contract as a player option in order to take ownership of their future.
TL;DR
- Attempted to use machine learning on NBA free agents from 2016-2019 to predict contract length & first year salary as % of the salary cap for 2020 free agents
- Used contract year stats as well as summed last-three-year stats (converted to per game)
- games played and cumulative advanced stats (WS & VORP) were scaled due to the shortened season
- since targets were correlated, I predicted one target first and then used its predictions to predict the second target
- Six models were tested: linear, k-nearest-neighbors, decision tree, two random forest algorithms and a support vector machine
- Anthony Davis is the only player predicted to get a five year deal
- Models can't quantify intangibles like playing reputation, team fit, or willingness to take a reduced salary to be on a championship contender; also star players more likely to take shorter deals to maintain flexibility in this uncertain season
I did the analysis in R, and the GitHub link is here. Hope y'all enjoyed this!