Showing posts with label Statistics. Show all posts

Wednesday, November 5, 2008

Midseason Progressive Results

And now, what everyone's been waiting for. We're about halfway through the season, so I think it's time to post NFLSim's impressive progressive results. I chose to leave out week 2 because it was still too early for me to be making predictions based on 1 week of data. I'm going to post Accuscore's results as well for comparison. They're the only other play-by-play NFL simulator I know of, and they're a well-established, well-funded, syndicated, sponsored, and mathematically sophisticated operation. David and Goliath? Let's see...

I'm going to give you several different numbers. First, I'll give you the overall numbers, as in the collective 50-100% predictions for winner, spread, and o/u. Then you'll get their numbers broken down. I'll show you weekly trends, % trends, etc.

Overall numbers:
Winner: 65-34 (65.7%)
Spread: 47-45 (51.1%)
O/U: 50-43 (53.8%)
Spread and O/U combined: 97-88 (52.4%)

Accuscore numbers:
Winner: 87-43 (66.9%)
Spread: 51-50 (50.5%)
O/U: 67-50 (57.2%)
Spread and O/U combined: 118-100 (54.1%)
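
For anyone who wants to check the percentages, here's a minimal Python sketch that turns a W-L record into a win percentage. The name `win_pct` is just for illustration, and it assumes pushes and skipped games are simply left out of the record.

```python
def win_pct(wins, losses):
    """Win percentage for a W-L record, rounded to one decimal place."""
    return round(100.0 * wins / (wins + losses), 1)

# Records as listed above: (label, wins, losses)
records = [
    ("NFLSim winner", 65, 34),
    ("NFLSim spread", 47, 45),
    ("NFLSim O/U", 50, 43),
    ("NFLSim spread + O/U", 97, 88),
    ("Accuscore winner", 87, 43),
    ("Accuscore spread", 51, 50),
    ("Accuscore spread + O/U", 118, 100),
]

for label, wins, losses in records:
    print(f"{label}: {wins}-{losses} ({win_pct(wins, losses)}%)")
```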

Accuscore has a slight edge picking the winner, I have a slight edge picking against the spread (ATS), and Accuscore has a sizeable advantage picking O/U. BUT! Let's look at how Black Box Sports's picks compare when confidence is at least 60%. High-confidence picks are really where NFLSim shines. Here is Black Box Sports's record when the confidence is at least 60%, compared to Accuscore's overall record (I can't find any confidence values for their picks). Fasten your seat belts.

Black Box Sports +60%:
Winner: 41-19 (68.3%) ... 22-8 (73.3%) when greater than 70%
Spread: 36-21 (63.2%)
O/U: 21-10 (67.7%)
Spread and O/U: 57-31 (64.7%)
Betting 100 units on the spread and over/under, you made +2290, ROI of 26%, halfway through the season.
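
A quick sanity check on that ROI, as a minimal sketch. It assumes a flat 100 units staked on each of the 88 spread and O/U picks (57 wins + 31 losses) and takes the +2290 profit as given, since the odds behind each bet aren't listed here. The function name `roi` is just for illustration.

```python
def roi(profit, stake_per_bet, num_bets):
    """Return on investment: profit divided by the total amount staked."""
    return profit / (stake_per_bet * num_bets)

num_bets = 57 + 31          # spread and O/U picks at 60%+ confidence
total_staked = 100 * num_bets

print(total_staked)                               # 8800
print(round(roi(2290, 100, num_bets) * 100, 1))   # 26.0
```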

Once again, Accuscore's numbers:
Accuscore:
Winner: 87-43 (66.9%)
Spread: 51-50 (50.5%)
O/U: 67-50 (57.2%)
Spread and O/U combined: 118-100 (54.1%)

Picking the winner, I'm ahead by 1.4 percentage points at 60%+ confidence and by 6.4 points at 70%+ confidence. Against the spread, I'm ahead by 12.7 points. O/U, up by 10.5. Combined, I'm up by 10.6. That's what I'm talking about. Not to mention I offer all the picks for free... Spread the word, everyone.

In tabular format:
Winner
Confidence  Wins  Games  Win %
50-59         24     39   61.5%
60-69         19     30   63.3%
70-79         17     24   70.8%
80-89          4      4  100.0%
90-100         1      2   50.0%
Total         65     99   65.7%

Spread
Confidence  Wins  Games  Win %
50-59         11     35   31.4%
60-69         24     37   64.9%
70-79         11     16   68.8%
80-89          0      3    0.0%
90-100         1      1  100.0%
Total         47     92   51.1%

Over/Under
Confidence  Wins  Games  Win %
50-59         29     62   46.8%
60-69         15     22   68.2%
70-79          2      4   50.0%
80-89          2      3   66.7%
90-100         2      2  100.0%
Total         50     93   53.8%

Spread and O/U combined
Confidence  Wins  Games  Win %
50-59         40     97   41.2%
60-69         39     59   66.1%
70-79         13     20   65.0%
80-89          2      6   33.3%
90-100         3      3  100.0%
Total         97    185   52.4%

Graphically:
Theoretically, if NFLSim were a perfect analog of reality, the dashed trend lines would fall exactly on the thick black line. That would mean the confidence values are always spot on and the games end exactly the way they should. Where a trend (dashed) line sits below the thick line, the picks are winning less often than the confidence values say they should. The more parallel the thick and trend lines are, the more accurate the changes in confidence values are, i.e., as confidence increases, accuracy increases at the correct rate. If that makes any sense. This graph shows all the picks.


You'll notice that above, the win % for 90-100 is at 50%. In week 3, the 90.03% favorite NE lost to MIA. 0.03% is just about a difference of 1 game in the entire set of hundreds and hundreds of simulated games. Had NE been an 89.97% favorite, the graph would look like this:


Check out the 'Wins' line. It overlaps the theoretical line so closely that you can't even see it. That's absolutely absurd, especially after nearly 100 games. The trendline has a slope of .09, compared to the theoretical line's .1. The "Wins" trendline has an R-squared value (a measure of how closely the data points fit the line) of 0.89. Absolutely insane. In general, I try to temper my enthusiasm, but this is unbelievable... This means that when NFLSim says a team will win 63% of the time, that team wins about 63% of the time. For those of you math-minded people, the expected number of wins (the sum of the confidence values across all picks) is approximately 64.15. The actual number of wins is 65. That blows my mind.
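
If you want to poke at those numbers yourself, here's a minimal Python sketch built only from the Winner table. Two assumptions are baked in. First, the individual confidence values aren't listed, so expected wins are approximated with each bin's midpoint (55%, 65%, and so on). Second, it assumes the trendline is a plain unweighted least-squares fit of win rate against bin number, with the NE game moved from the 90-100 bin to the 80-89 bin. The names `expected_wins` and `ols_fit` are just for illustration.

```python
# (wins, games) per confidence bin for straight-up winner picks, from the table.
winner_bins = {
    "50-59": (24, 39),
    "60-69": (19, 30),
    "70-79": (17, 24),
    "80-89": (4, 4),
    "90-100": (1, 2),
}

# Assumption: every pick in a bin sits at the bin's midpoint.
midpoints = {"50-59": 0.55, "60-69": 0.65, "70-79": 0.75,
             "80-89": 0.85, "90-100": 0.95}

def expected_wins(bins, mids):
    """Approximate expected wins: games in each bin times the bin midpoint."""
    return sum(games * mids[label] for label, (_, games) in bins.items())

def ols_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy ** 2 / (sxx * syy)

# Hypothetical scenario from the text: NE's loss lands in 80-89 instead of 90-100.
adjusted = dict(winner_bins)
adjusted["80-89"] = (4, 5)
adjusted["90-100"] = (1, 1)

xs = list(range(len(adjusted)))              # bin number 0..4
ys = [w / g for w, g in adjusted.values()]   # win rate per bin
slope, intercept, r2 = ols_fit(xs, ys)

print(round(expected_wins(winner_bins, midpoints), 2))  # 64.25
print(round(slope, 3), round(r2, 2))                    # 0.094 0.89
```

The midpoint approximation lands within a tenth of a win of the 64.15 figure, and under these assumptions the fit comes out to a slope of about .09 per bin with an R-squared of about 0.89.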

Here's a team-by-team accuracy breakdown:

Team  Win %  Cover Spread %  Over %
ARI 83.3% 40.0% 50.0%
ATL 66.7% 50.0% 33.3%
BAL 57.1% 42.9% 71.4%
BUF 66.7% 50.0% 66.7%
CAR 83.3% 40.0% 50.0%
CHI 50.0% 50.0% 66.7%
CIN 71.4% 71.4% 33.3%
CLE 42.9% 42.9% 57.1%
DAL 57.1% 57.1% 57.1%
DEN 50.0% 33.3% 20.0%
DET 100.0% 66.7% 40.0%
GB 100.0% 66.7% 66.7%
HOU 71.4% 57.1% 85.7%
IND 33.3% 50.0% 33.3%
JAX 33.3% 66.7% 60.0%
KC 83.3% 66.7% 83.3%
MIA 33.3% 66.7% 50.0%
MIN 83.3% 66.7% 50.0%
NE 33.3% 20.0% 20.0%
NO 66.7% 33.3% 50.0%
NYG 66.7% 16.7% 20.0%
NYJ 71.4% 57.1% 57.1%
OAK 57.1% 14.3% 57.1%
PHI 66.7% 0.0% 66.7%
PIT 50.0% 66.7% 66.7%
SD 83.3% 50.0% 33.3%
SEA 83.3% 50.0% 33.3%
SF 50.0% 33.3% 40.0%
STL 57.1% 50.0% 28.6%
TB 71.4% 42.9% 57.1%
TEN 100.0% 66.7% 83.3%
WAS 42.9% 28.6% 33.3%

I'll post some more stats if I have a chance.
Enjoy!

I'd love to hear everyone's reactions and questions, so don't be shy, send me some emails.

Wednesday, October 15, 2008

Passer Rating Explained

One of football's least understood metrics is the Passer Rating. It's an attempt by the statisticians to quantify a quarterback's performance, but how does it work? What does it take into account? Why the hell is the maximum rating 158.3? Given the evolution of the position, is it really a good measure of a quarterback's ability?

The QB's passer rating looks at four elements: completions, yards, interceptions, and touchdowns, all on a per-attempt basis. It does not include fumbles, sacks, or any type of rushing statistics. The maximum rating is 158.3 as opposed to a more logical 100.

The first element, Completions, "C" uses the formula: C = (COMP/ATT * 100 - 30)/20.
The third element, Interceptions: I = 2.375 - (INT/ATT * 25)
The fourth element, Touchdowns: T = TD/ATT * 20
The second element, Yards: Y = (YDS/ATT - 3)/4

The maximum possible value for each category is 2.375 and the minimum is 0. If COMP/ATT = .8, then C = 2.5; since the maximum is 2.375, C equals 2.375, not 2.5. If COMP/ATT = .2, then C = -0.5; since the minimum is 0, C equals 0. If COMP/ATT = .6, then C = 1.5, which needs no adjustment. Finally, plug the values into the formula: (C+I+T+Y)/6 * 100. The result is the passer rating.

Using Tony Romo as an example, from the heartbreaking DAL-ARI game, we see that Romo had 24 completions, 38 attempts, 321 yards, 3 touchdowns, and 0 interceptions.

Therefore:
C = (24/38 * 100 - 30)/20
C = 1.66

I = 2.375 - (0/38 *25)
I = 2.375

T = 3/38 * 20
T = 1.58

Y = (321/38 - 3)/4
Y = 1.36

Rating = (1.66+2.375+1.58+1.36)/6 * 100
Romo's Rating = 116.2

That wasn't too tough.
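
Here's the same calculation as a minimal Python sketch, using the four formulas and the 0 to 2.375 limits described above. The function and variable names are my own, picked for illustration.

```python
def clamp(value, low=0.0, high=2.375):
    """Each component is limited to the range 0 to 2.375."""
    return max(low, min(high, value))

def passer_rating(comp, att, yds, td, ints):
    """Passer rating from completions, attempts, yards, TDs, and INTs."""
    c = clamp((comp / att * 100 - 30) / 20)   # completions
    i = clamp(2.375 - ints / att * 25)        # interceptions
    t = clamp(td / att * 20)                  # touchdowns
    y = clamp((yds / att - 3) / 4)            # yards
    return (c + i + t + y) / 6 * 100

# Romo vs. ARI: 24 of 38 for 321 yards, 3 TD, 0 INT
print(round(passer_rating(24, 38, 321, 3, 0), 1))   # 116.2

# Every component maxed out gives the famous ceiling
print(round(passer_rating(10, 10, 200, 5, 0), 1))   # 158.3
```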

The passer rating was developed by Don Smith and was first used in the 1973 season. Smith was a statistician and executive with the Pro Football Hall of Fame.[1] The odd 158.3 limit came about when he decided that the average QB rating should be 66.6. That was pretty close to reality back in '73 when the average passer rating was 64.7, but last year, in '07, the average passer rating was 82.5, an increase of 28% over the decades. The idea behind his calculations was that he wanted a way to compare all QBs to the average QB at the time. Perhaps Smith's rating made sense 35 years ago when the numbers he used to calibrate the formulae were relevant, though obviously circumstances have changed since then. Completion % has gone from 51% to 61%, yards per game from 140 to 214, TDs from 1.0 to 1.4, etc.
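
To see where 66.6 and 158.3 come from, here's the arithmetic, assuming (as the calibration story above suggests) that each of the four components was scaled so that an average performance scores 1.0:

```latex
\[
\text{average rating} = \frac{1 + 1 + 1 + 1}{6} \times 100 \approx 66.7,
\qquad
\text{maximum rating} = \frac{4 \times 2.375}{6} \times 100 \approx 158.3
\]
```

Setting each component formula equal to 1.0 gives the implied baseline: 50% completions, 7 yards per attempt, a touchdown on 5% of attempts, and an interception on 5.5% of attempts.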

That's a basic overview of passer rating. Once you get through the numbers, it starts to make sense. From now on when your buddies complain about that ridiculous stat, impress them with your knowledge of C.I.T.Y (comp, int, td, yds).

For you stats-minded people out there, maybe this gives you some ideas on how to make a modern formula that makes sense. When you do make one, post it for some peer review; it will be interesting to see what everyone can think up. I'm working on a formula of my own now, so stay tuned.


1. www.baseball-statistics.com/Greats/Century/passer-rating.htm

Monday, February 4, 2008

2007 Picks and Bets

Before I start listing everything, I'll post this website: http://sports.espn.go.com/nfl/features/talent. Granted they're NFL experts, but what do they know anyways? The column you should check out is Accuscore, the industrially-sized handicapping website (also the only other play-by-play simulator). They finished the season 163-89, or 64.7%. But wait- what really matters is the spread, right? Well, sort of, but we'll get into that later. Accuscore claims an (un)impressive 54.7% spread prediction on the season, with spread and over/under combining for 60%. But isn't break-even about 53%? Hmm...
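
On that break-even number: assuming standard -110 odds on spread and over/under bets (risk 110 to win 100; the actual odds aren't stated here), the break-even win rate works out to about 52.4%. A minimal sketch, with `break_even_rate` as an illustrative name:

```python
def break_even_rate(american_odds):
    """Win rate needed to break even at the given American odds."""
    if american_odds < 0:
        risk, win = -american_odds, 100   # e.g. -110: risk 110 to win 100
    else:
        risk, win = 100, american_odds    # e.g. +120: risk 100 to win 120
    return risk / (risk + win)

print(round(break_even_rate(-110) * 100, 1))   # 52.4
```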

By the way, from weeks 12 through 16 (for week 17, it's not worth the work if most teams don't even show up), I was 67.5%. Accuscore was 63.7%.

Get ready for lots of numbers, as there are many weeks to catch up on:

Winning predictions in green, losers in red, ties in black.

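Week 12
Matchup      Pick  Win %
GB - DET     GB    52%
NYJ - DAL    DAL   72%
IND - ATL    IND   72%
TEN - CIN    TEN   58%
HOU - CLE    HOU   53%
OAK - KC     KC    60%
SEA - STL    SEA   60%
MIN - NYG    MIN   56%
WAS - TB     TB    75%
NO - CAR     NO    58%
BUF - JAX    JAX   57%
SF - ARI     SF    52%
DEN - CHI    DEN   70%
BAL - SD     SD    65%
PHI - NE     NE    78%
MIA - PIT    PIT   81%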

The table: If % Range is between 60-69, it includes the 3 teams where the win probability fell in that range, i.e. SD, SEA, and KC. Of those 3 teams, 2 of them won. Therefore the win % for teams in the 60-69% range is 66.7%.

% Range  Games  Wins  Win %
50-59        7     5   71.4%
60-69        3     2   66.7%
70-79        5     4   80.0%
80-89        1     1  100.0%
90-100       0     0     n/a
Total       16    12   75.0%
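
The binning described above can be sketched in a few lines of Python. The confidence values are the Week 12 picks listed earlier. The win counts are copied from the table rather than computed, because the per-game results aren't spelled out in the text. The names `bin_label`, `week12_picks`, and `table_wins` are just for illustration.

```python
from collections import Counter

# Predicted winner -> win probability (%), from the Week 12 picks.
week12_picks = {
    "GB": 52, "DAL": 72, "IND": 72, "TEN": 58, "HOU": 53, "KC": 60,
    "SEA": 60, "MIN": 56, "TB": 75, "NO": 58, "JAX": 57, "SF": 52,
    "DEN": 70, "SD": 65, "NE": 78, "PIT": 81,
}

# Wins per range, copied from the table (not derived from game results).
table_wins = {"50-59": 5, "60-69": 2, "70-79": 4, "80-89": 1}

def bin_label(conf):
    """Map a win probability (50-100) to its range label, e.g. 65 -> '60-69'."""
    low = min(conf // 10 * 10, 90)
    high = 100 if low == 90 else low + 9
    return f"{low}-{high}"

games_per_bin = Counter(bin_label(conf) for conf in week12_picks.values())

for label in sorted(games_per_bin):
    games = games_per_bin[label]
    wins = table_wins[label]
    print(label, games, wins, f"{wins / games:.1%}")
```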

My First Post, an Introduction

If you've read the 'About Me' column then you already know about me and my program. Over the past 6 months, I've worked every day to perfect this simulator. By week 12 I was able to start picking games straight up. I managed to get the spread/over/under feature working just in time for the playoffs.

It hadn't occurred to me to create a blog to document my progress until...this afternoon. I stumbled onto a blog here during my search for solid statistics showing the win percentages of 'experts', handicappers, and websites, some of which, as you know, charge hundreds of dollars for one week of picks. I was curious to see how the success of 'expert handicappers with 20 years of experience' compared to the success of 'some kid with a computer'. I was ecstatic to see that not one entity came close to matching me.

Unfortunately, my lack of a blog has prevented me from sharing my prognostications with the public until now. Therefore, I shall present to you all of my results, along with all 'theoretical' bets* that may have been placed on those picks. This will demonstrate both the accuracy and efficacy of my program.

*The amount of each bet may seem weird- the size of each bet was determined using a formula which takes into account the probability of a win, the payout, and the bankroll, in order to maximize profit and minimize losses...duh.
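
For the curious, a formula with those three inputs (win probability, payout, bankroll) looks a lot like the Kelly criterion. Here's a minimal sketch of Kelly sizing, offered as an assumption about the general idea rather than the exact formula used for these bets. The names `kelly_fraction` and `bet_size` and the example numbers are hypothetical.

```python
def kelly_fraction(p_win, net_odds):
    """Kelly fraction of bankroll to stake.

    p_win    -- estimated probability the bet wins (0 to 1)
    net_odds -- profit per unit staked on a win (100/110 at -110 odds)
    Returns 0 when the bet has no positive edge.
    """
    edge = p_win * net_odds - (1.0 - p_win)
    return max(0.0, edge / net_odds)

def bet_size(bankroll, p_win, net_odds):
    """Stake in units: bankroll times the Kelly fraction."""
    return bankroll * kelly_fraction(p_win, net_odds)

# Hypothetical example: a 60% pick at -110 odds with a 1,000-unit bankroll
# comes out to a stake of about 160 units.
print(round(bet_size(1000, 0.60, 100 / 110), 2))

# A coin-flip pick at -110 has no edge, so the stake is zero.
print(bet_size(1000, 0.50, 100 / 110))
```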