
Data Quality Alert! Data Quality Alert!

There's a new piece over at Hockey Analytics which I heartily recommend to those interested in furthering the use of statistics related to NHL hockey. Alan Ryder pioneered the investigation of Shot Quality, which attempts to measure the characteristics of shots (distance, type, situation) to provide a more finely detailed view of offensive and defensive performance. I use a slightly simplified version of Alan's SQ techniques in my analysis here quite often, so when the article entitled "Product Recall Notice for Shot Quality" was posted, it definitely caught my eye. While it is obvious to anyone who has read through the NHL's play-by-play files that data quality problems exist, the presumption has been that these errors are basically random and cancel each other out over the course of 70,000+ shots in an NHL season.

By looking at arena-by-arena details, however, Alan has raised some pretty serious issues with the data, basically demonstrating that scorers in different venues seem to have systematic biases in how shots are recorded. Games played at Madison Square Garden, for example, consistently have the most dangerous shots recorded in the logs, whereas scorers in Buffalo and Tampa tend towards the opposite view. The implications are twofold: first, we always need to keep in mind the limitations of the data that the NHL presents to us, and second, we should look into possible means of correcting for such biases (by using something like the "park effect" that baseball stats junkies use). I guess I've got one more thing added to my summer to-do list...

Back in March I did something similar with the Giveaway/Takeaway stats, as well as with how frequently different scorers record Missed Shots vs. Saves. In my Give/Take and Missed Shot pieces, I looked at how teams performed at home, how they performed on the road, and how visitors performed in their building, in order to isolate the effect of the official scorer. It was interesting to see that games in Chicago feature an absurdly low number of Giveaways and Takeaways by either team, while in Montreal or Edmonton the per-game figures are five times higher or more!
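For anyone who wants to try this kind of check themselves, here is a minimal sketch of a baseball-style "park factor" applied to official scorers. It is an illustration under my own assumptions, not necessarily the exact calculation used in those pieces: the function name `scorer_factors`, the tuple layout of each game, and the sample numbers are all made up. For each team, it compares the events (say, Giveaways plus Takeaways by both clubs) recorded per game in that team's building against the events recorded per game when the same team plays on the road.

```python
from collections import defaultdict


def scorer_factors(games):
    """Estimate a per-arena scorer factor.

    games: iterable of (home_team, away_team, home_events, away_events),
    a hypothetical layout chosen for this sketch.
    Returns {team: events per game at home / events per game on the road},
    counting events by both teams in each game.
    """
    at_home = defaultdict(lambda: [0, 0])   # team -> [events, games] in its own arena
    on_road = defaultdict(lambda: [0, 0])   # team -> [events, games] in other arenas

    for home, away, home_events, away_events in games:
        total = home_events + away_events
        at_home[home][0] += total
        at_home[home][1] += 1
        on_road[away][0] += total
        on_road[away][1] += 1

    factors = {}
    for team, (home_total, home_games) in at_home.items():
        road_total, road_games = on_road.get(team, (0, 0))
        # Skip teams without usable road data rather than divide by zero.
        if road_games == 0 or road_total == 0:
            continue
        factors[team] = (home_total / home_games) / (road_total / road_games)
    return factors


# Made-up numbers purely for illustration, not real NHL data.
sample_games = [
    ("CHI", "MTL", 4, 6),    # game in Chicago: 10 events recorded
    ("MTL", "CHI", 20, 30),  # game in Montreal: 50 events recorded
]
factors = scorer_factors(sample_games)
```

A factor well below 1 suggests the home scorer records fewer of these events than scorers elsewhere do for games involving the same team, and a factor well above 1 suggests the opposite. With a real season of data, dividing raw counts by the factor for the arena in which they were recorded would be one rough way to put them on a common footing.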

The potential for statistical analysis to extend our understanding of professional hockey remains largely untapped, but the quality of the data being recorded is a critical obstacle that needs to be overcome if we're to make the best progress we can. I'm not quite sure how best to pursue this issue with the NHL, but I'm open to suggestions.
