
Data Quality Alert! Data Quality Alert!

There's a new piece over at Hockey Analytics which I heartily recommend to those interested in furthering the use of statistics related to NHL hockey. Alan Ryder pioneered the investigation of Shot Quality, which attempts to measure the characteristics of shots (distance, type, situation) to provide a more finely detailed view of offensive and defensive performance. I use a slightly simplified version of Alan's SQ techniques in my analysis here quite often, so when the article entitled "Product Recall Notice for Shot Quality" was posted, it definitely caught my eye. While it is obvious to anyone who has read through the NHL's play-by-play files that data quality problems exist, the presumption has been that these errors are basically random and cancel each other out over the course of 70,000+ shots in an NHL season.

By looking at arena-by-arena details, however, Alan has raised some pretty serious issues with the data, demonstrating that scorers in different venues seem to have systematic biases in how shots are recorded. Games played at Madison Square Garden, for example, consistently have the most dangerous shots recorded in the logs, whereas scorers in Buffalo and Tampa tend towards the opposite view. The implications are twofold: first, we always need to keep in mind the limitations of the data that the NHL presents to us, and second, we should look into possible means of correcting for such biases (by using something like the "park effect" that baseball stats junkies use). I guess I've got one more thing added to my summer to-do list...

Back in March I did something similar with the Giveaway/Takeaway stats, as well as with how frequently different scorers record Missed Shots vs. Saves. In those Give/Take and Missed Shot pieces, I looked at how teams performed at home, how they performed on the road, and how visitors performed in their building, in order to isolate the effect of the official scorer. It was interesting to see that games in Chicago feature an absurdly low number of Giveaways and Takeaways by either team, while in Montreal or Edmonton the per-game figures are five times higher or more!
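For anyone who wants to try this kind of home/road comparison themselves, here is a rough sketch in Python. It is a simple park-factor-style ratio, not necessarily the exact method used in those earlier pieces or in Alan's article. The function name `scorer_factors`, the data layout, and the sample numbers are all hypothetical and invented for illustration; they are not real NHL figures.

```python
from collections import defaultdict


def scorer_factors(games):
    """Estimate a per-arena scorer factor for one event type.

    `games` is an iterable of (home_team, away_team, events) tuples, where
    `events` is the combined count recorded for BOTH teams in that game
    (for example, giveaways plus takeaways). This layout is an assumption.

    The factor for a team's arena is:
        (events per game in that team's home games)
      / (events per game in that team's road games)
    Counting both teams' events in each game keeps the team's own style of
    play from dominating the ratio. A factor well above 1 suggests the home
    scorer records the event generously; well below 1 suggests the opposite.
    """
    home_totals = defaultdict(lambda: [0, 0])  # team -> [events, games] at home
    road_totals = defaultdict(lambda: [0, 0])  # team -> [events, games] on the road

    for home, away, events in games:
        home_totals[home][0] += events
        home_totals[home][1] += 1
        road_totals[away][0] += events
        road_totals[away][1] += 1

    factors = {}
    for team, (h_events, h_games) in home_totals.items():
        r_events, r_games = road_totals.get(team, (0, 0))
        if r_games == 0 or r_events == 0:
            continue  # no road sample to compare against
        factors[team] = (h_events / h_games) / (r_events / r_games)
    return factors


# Made-up sample: (home, away, giveaways + takeaways recorded in the game).
sample_games = [
    ("CHI", "MTL", 10),
    ("MTL", "CHI", 50),
    ("CHI", "EDM", 12),
    ("EDM", "CHI", 48),
    ("MTL", "EDM", 52),
    ("EDM", "MTL", 50),
]

factors = scorer_factors(sample_games)
for team in sorted(factors):
    print(team, round(factors[team], 2))
```

With a real season of play-by-play data, a ratio like this could serve as a first-pass correction factor, though a small sample of games or an unbalanced schedule would make the estimates noisy.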

The potential for statistical analysis to extend our understanding of professional hockey remains largely untapped, but the quality of the data being recorded is a critical obstacle that needs to be overcome if we're to make the best progress we can. I'm not quite sure how best to pursue this issue with the NHL, but I'm open to suggestions.
