Offseason Blog Part 1: Comparing Projection Systems




There have been 3 Floored babies born in the last 4 months. Dean and Jessie welcomed Erin, Josh and Amber had Lilah, and Michael and Amina had George. Erin is in the 97th percentile in body size for her age, Lilah is in the 1st, and George is in the 2nd, so between the three of them we have 100%. 
Now that Fantasy Football season has ended, I am able to turn an eye towards, what else, Fantasy Baseball, but not without first pointing out that Floored league members finished 1st (Brian), 2nd (Max, beating Michael in the semis), 3rd (Michael), and 5th (Josh) in the league that 5 Floored league members played in. Matt…well…he had fun in that league. This time of year there isn’t much baseball to follow other than which free agents sign where and analyzing projections to see who looks undervalued for next year based on past performance, but are those projections worth anything?

First off, what are projections? Projections, in the way that I am referring to them here, are estimates of a player’s upcoming-season stats based on their past performance, the ballpark they play in, and their expected playing time given team roster construction and health history. Projections typically take the past three years of a player’s numbers into account. They can be found on sites like fangraphs.com and baseball-reference.com, as well as on the Yahoo Fantasy league players page by selecting the ‘Remaining Games (proj)’ filter. There are a number of different websites and fantasy analysts that build their own projection sets, and they frequently differ greatly.

This season I set out to figure out which projection sets were the most useful, based on an observation I had during trade evaluations the last few years: some projections really suck. Here’s the example case:
Hector Neris had just been named closer for the Phillies early in 2017. I had drafted him as an Elite RP; the incumbent closer had a few bad outings early in the year and Neris took over the job. I didn’t have much interest in closers, so I wanted to see what hitter I could get for him. Max needed closers and had a hitter off to a bad start, Hanley Ramirez. Yahoo had just started a new feature that year where you could ‘Evaluate [the] Trade’ and see how it affected your projected stats for the rest of the year. Despite Neris’s very good 2016 (2.58 ERA and 11.4 k/9), his previous years were not good (ERAs near or over 4.00 and k/9s close to 8). Yahoo was projecting him for an ERA over 4.3 with unimpressive strikeout totals (I forget exactly what they were). It got me thinking that projections were fairly bogus…or at least Yahoo’s were.

This was the first year I really dove into Fangraphs’ projections. I had watched Dean the year before with his Excel sheets on draft day, striking guys from his list as they went. I had listened to podcasts where the hosts talked about projecting stats and not being overly reactive to hot or cold starts given players’ past performances, and it all somewhat clicked…but there are faults in the logic.

Projections cannot capture skill changes quickly, or account for injuries that affected past performance or are lurking in the future. When Aaron Judge changed his swing in the offseason after 2016, there were not three years’ worth of data to capture what was to come. When Xander Bogaerts was hurt in 2017, there was no model to simulate what he would set right in 2018. When a fastball is coming at Giancarlo Stanton’s wrist…seemingly every year…there is no predictive stat that will tell you on what date it will occur. These are not what projections are for.

What projections **can** be used for, though, is everything else. Most of fantasy baseball is finding base load talent that can sustain you week in and week out while you try to find a few breakout players to put you over the top…while avoiding the bust players along the way. So how much of that is projectable?

Dean and I compiled projection sets at the beginning of 2018, and then I used end-of-season data to see how good those projections were. Every percentage you see here is based on how far off the end-of-season data point was from the preseason data point. Xander Bogaerts was projected to hit 15 Home Runs by the Steamer projection system, but he hit 23. That miss of 8 divided by the projected 15 is 53%, so his HR total was off by 53%, a big delta. On the other hand, Charlie Blackmon was projected for 116 Runs by the ZiPS projection system and he had 119, a difference of less than 3%, a very small delta.
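As a minimal sketch of that delta calculation (my own illustration, not the original worksheet; the two example cases are the Bogaerts and Blackmon numbers above):

```python
def percent_delta(projected: float, actual: float) -> float:
    """Percent delta between a preseason projection and the end-of-season
    result, expressed relative to the projection."""
    return abs(actual - projected) / projected * 100

# Examples from the text:
print(percent_delta(15, 23))    # Bogaerts HR (Steamer): ~53%, a big delta
print(percent_delta(116, 119))  # Blackmon R (ZiPS): ~2.6%, a very small delta
```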
Here is how the data resulted. Let’s talk about the hitters first:



| Offense | PA | R | H | HR | RBI | SB | BB | K | AVG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Depth Charts | 19.8% | 26.4% | 23.4% | 38.3% | 29.4% | 51.1% | 30.5% | 22.1% | 8.3% |
| Yahoo-Rotographs | 17.5% | 39.7% | 21.8% | 37.8% | 27.8% | 52.6% | 30.4% | 20.6% | 9.0% |
| ZiPS | 19.6% | 25.8% | 23.7% | 38.4% | 29.6% | 50.3% | 30.3% | 21.9% | 8.4% |
| Steamer | 19.3% | 26.3% | 22.6% | 36.8% | 27.1% | 55.1% | 29.6% | 21.6% | 8.3% |
| The BAT | 19.5% | 25.8% | 23.8% | 38.5% | 29.6% | 48.1% | 30.3% | 21.9% | 8.3% |
| Depth Charts per PA | 19.8% | 14.6% | 23.3% | 38.5% | 28.7% | 52.5% | 31.1% | 21.8% | 8.3% |
| Yahoo-Rotographs per PA | 17.5% | 33.9% | 9.1% | 30.5% | 17.2% | 53.1% | 20.5% | 12.8% | 9.0% |
| ZiPS per PA | 19.6% | 12.8% | 8.2% | 29.0% | 18.3% | 49.2% | 20.5% | 13.6% | 8.4% |
| Steamer per PA | 19.3% | 13.7% | 8.0% | 27.8% | 16.4% | 52.4% | 18.7% | 13.3% | 8.3% |
| The BAT per PA | 18.1% | 14.7% | 12.4% | 30.5% | 20.0% | 48.0% | 21.4% | 16.4% | 8.3% |

When the projection delta came in very high on gross totals, I also looked at the numbers on a per-plate-appearance basis. This represents the player’s efficiency. Remember, we aren’t trying to let noise like injuries and breakouts, which significantly change the gross totals, affect how we view the player, because projections, as we’re using them, aren’t intended to predict injuries or breakouts. Note that a breakout player may get far more plate appearances than projected, which will inflate all the counting stats (as in, everything but batting average).
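A rough sketch of the per-plate-appearance version, assuming you have projected and actual PA alongside each counting stat; the player here is hypothetical:

```python
def per_pa_delta(proj_stat: float, proj_pa: float,
                 actual_stat: float, actual_pa: float) -> float:
    """Percent delta on the per-plate-appearance rate rather than the gross
    total, so a player who simply got more (or fewer) PAs than projected
    isn't penalized for the playing-time miss itself."""
    proj_rate = proj_stat / proj_pa
    actual_rate = actual_stat / actual_pa
    return abs(actual_rate - proj_rate) / proj_rate * 100

# Hypothetical breakout: projected 20 HR in 550 PA, actually hit 28 HR in 700 PA.
print(per_pa_delta(20, 550, 28, 700))  # ~10% miss on the rate vs. 40% on the gross total
```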

The first observation is that Stolen Bases really aren’t a predictable statistic. A player or team can just decide to start or stop stealing bases very quickly. Also, Home Runs are less predictable than we’d like them to be. This makes sense from the math involved: getting a home run projection wrong by 5 HRs leads to a much larger delta than being off by 5 RBIs, because the HR total in the denominator is so much smaller (missing by 5 on a 25-HR projection is a 20% delta; missing by 5 on an 80-RBI projection is about 6%). Home Runs do get significantly more predictable from an efficiency perspective, though.

The next observation is that the Depth Charts per plate appearance deltas are way higher (that is, worse) than the other per-plate-appearance projections.

Compiling these together we can see which projection system is the best:
| Offense | Average % Delta | Average % Delta (ignoring Stolen Bases) |
| --- | --- | --- |
| Depth Charts | 27.7% | 24.8% |
| Yahoo-Rotographs | 28.5% | 25.6% |
| ZiPS | 27.6% | 24.7% |
| Steamer | 27.4% | 24.0% |
| The BAT | 27.3% | 24.7% |
| Depth Charts per PA | 26.5% | 23.3% |
| Yahoo-Rotographs per PA | 22.6% | 18.8% |
| ZiPS per PA | 19.9% | 16.3% |
| Steamer per PA | 19.8% | 15.7% |
| The BAT per PA | 21.1% | 17.7% |
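The compiled figures look like a straight average of all nine category deltas, with a second average that drops Stolen Bases; at least, that assumption reproduces the Steamer per PA numbers. A quick sketch:

```python
# Category deltas for one system (the Steamer per PA row from the first table).
steamer_per_pa = {
    "PA": 19.3, "R": 13.7, "H": 8.0, "HR": 27.8, "RBI": 16.4,
    "SB": 52.4, "BB": 18.7, "K": 13.3, "AVG": 8.3,
}

def average_delta(deltas: dict, ignore: set = frozenset()) -> float:
    """Straight average of the category deltas, optionally dropping categories."""
    vals = [v for cat, v in deltas.items() if cat not in ignore]
    return sum(vals) / len(vals)

print(round(average_delta(steamer_per_pa), 1))                 # 19.8
print(round(average_delta(steamer_per_pa, ignore={"SB"}), 1))  # 15.7
```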

All of them are very close, so in practice it may not really matter, but Steamer and ZiPS come out marginally ahead of the rest from an efficiency standpoint.

Now let’s look at how the pitching data unfolded:

| Pitching | IP | W | SV | K | ERA | WHIP | FIP | k/9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Depth Charts SP | 34.4% | 45.5% | N/A | 38.0% | 24.1% | 13.1% | 15.3% | 13.9% |
| Yahoo-Rotographs SP | 18.8% | 28.1% | N/A | 21.8% | 19.2% | 11.5% | 20.6% | 11.5% |
| ZiPS SP | 34.2% | 46.3% | N/A | 37.8% | 24.9% | 13.5% | 16.7% | 14.4% |
| Steamer SP | no preseason data available | | | | | | | |
| The BAT SP | no preseason data available | | | | | | | |
| Depth Charts RP | 65.4% | 81.3% | 69.5% | 66.2% | 40.6% | 21.4% | 30.2% | 16.0% |
| Yahoo-Rotographs RP | 40.7% | 65.2% | 80.8% | 42.7% | 45.4% | 19.9% | 41.5% | 13.4% |
| ZiPS RP | 36.8% | 60.2% | N/A | 39.2% | 40.4% | 21.6% | 31.3% | 16.1% |
| Steamer RP | no preseason data available | | | | | | | |
| The BAT RP | no preseason data available | | | | | | | |

The pitchers are a different animal; you can see how much higher these percentages are relative to the hitter projections. The only things even marginally predictable here are the sabermetric ratios (WHIP, FIP, k/9) for the starting pitchers and the k/9 of the relief pitchers (although I really want to know what Yahoo-Rotographs figured out for their SP Innings Pitched projections, because those are on point).
It makes sense now why my Hector Neris observation occurred: Relief Pitching statistics aren’t projectable. This can be for a number of reasons. For one, relievers throw far fewer innings each year, so a few bad performances can really ruin a year-long data point. Next, they are typically the less talented pitchers, so their performance can fluctuate year to year. Finally, because there aren’t as many innings pitched in the three-year sample, it is more difficult to predict the upcoming year’s output; there just isn’t as much data from which to pull.

The moral of the story is: ignore the counting metrics, especially on the pitching side. WHIP, FIP, and k/9 are quite predictable, though, and should be used to decide which pitchers are going to perform well in the upcoming year.

To wrap it all up, what do we do with this? When setting up a draft board for this coming March, projections can be used to find base load talent within about a 20% margin of error from a hitter efficiency standpoint. Starting Pitcher sabermetric ratios are even more reliable, projectable to within about 14%, so perhaps there would be more safety in Starting Pitchers early in the draft…except that gross totals for Wins and Strikeouts are two-fifths of the Starting Pitcher categories (W, ER, K, ERA, WHIP) and fluctuate even more than gross totals for hitters…so that’s a wash. Relief Pitchers are almost entirely unprojectable; the error margin for their statistics is closer to 40%.
For perspective on these numbers, the spread by year’s end from first place to last place in a category in our league is only about 30% (as in, Dean’s 931 Runs last year were only 22% above Max’s 726 Runs), so being able to project to within 20% is very valuable.

To be successful coming out of the fantasy draft, you need to find both the base load players and the breakout players. Projections, used from an efficiency and sabermetric perspective, are most effective at filling that base load, opening up the rest of the draft so we can find those breakout players.

