Data Quality Alert! Data Quality Alert!

There's a new piece over at Hockey Analytics which I heartily recommend to those interested in furthering the use of statistics related to NHL hockey. Alan Ryder pioneered the investigation of Shot Quality, which attempts to measure the characteristics of shots (distance, type, situation) to provide a more finely detailed view of offensive and defensive performance. I use a slightly simplified version of Alan's SQ techniques in my analysis here quite often, so when the article entitled "Product Recall Notice for Shot Quality" was posted, it definitely caught my eye. While it is obvious to anyone who has read through the NHL's play-by-play files that data quality problems exist, the presumption has been that these errors are basically random and cancel each other out over the course of 70,000+ shots in an NHL season.

By looking at arena-by-arena details, however, Alan has raised some pretty serious issues with the data, demonstrating that scorers in different venues seem to have systematic biases in how shots are recorded. Games played at Madison Square Garden, for example, consistently have the most dangerous shots recorded in the logs, whereas scorers in Buffalo and Tampa tend towards the opposite view. The implications are twofold: first, we always need to keep in mind the limitations of the data that the NHL presents to us, and second, we should look into possible means of correcting for such biases (by using something like the "park effect" that baseball stats junkies use). I guess I've got one more thing added to my summer to-do list...

Back in March I did something similar with the Giveaway/Takeaway stats, as well as with how frequently different scorers record Missed Shots vs. Saves. In my Give/Take and Missed Shot pieces, for example, I looked at how teams performed at home, how they performed on the road, and how visitors performed in their building, in order to isolate the effect of the official scorer. It was interesting to see that games in Chicago feature an absurdly low number of Giveaways and Takeaways by either team, while in Montreal or Edmonton the per-game figures are five times higher or more!
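For anyone who wants to try this kind of home/road comparison themselves, here is a minimal sketch in Python of a baseball-style "park factor" applied to an event count such as Giveaways plus Takeaways. It is not the exact calculation from my earlier pieces or from Alan's article. The function name `venue_factors`, the tuple layout, and the sample numbers are all made up for illustration. The idea is to compare the combined per-game count (both teams) in a club's home building against the combined per-game count in that same club's road games, so that the team's own style of play roughly cancels out and what remains is mostly the scorer.

```python
from collections import defaultdict


def venue_factors(games):
    """Estimate a scorer/venue factor for each team's home building.

    games: iterable of (home_team, away_team, home_count, away_count),
    where the counts are events of one kind (e.g. giveaways + takeaways)
    credited to each side in that game. This layout is an assumption
    made for this sketch, not the NHL's play-by-play format.
    """
    # team -> [total events by both teams, number of games]
    at_home = defaultdict(lambda: [0, 0])
    on_road = defaultdict(lambda: [0, 0])

    for home, away, home_count, away_count in games:
        total = home_count + away_count
        at_home[home][0] += total
        at_home[home][1] += 1
        on_road[away][0] += total
        on_road[away][1] += 1

    factors = {}
    for team, (home_events, home_games) in at_home.items():
        road_events, road_games = on_road.get(team, (0, 0))
        if road_games == 0 or road_events == 0:
            continue  # not enough road data to form a ratio
        home_rate = home_events / home_games
        road_rate = road_events / road_games
        # > 1.0: this building's scorer records more events than others do
        # < 1.0: this building's scorer records fewer
        factors[team] = home_rate / road_rate
    return factors


# Invented sample data, for illustration only (not real NHL figures).
games = [
    ("CHI", "MTL", 4, 5),
    ("CHI", "EDM", 3, 4),
    ("MTL", "CHI", 20, 22),
    ("EDM", "CHI", 18, 20),
    ("MTL", "EDM", 21, 19),
    ("EDM", "MTL", 19, 21),
]

factors = venue_factors(games)
for team, factor in sorted(factors.items()):
    print(f"{team}: {factor:.2f}")
```

A factor well below 1.0 suggests the home scorer records the event much less often than scorers elsewhere, and a factor well above 1.0 suggests the opposite. Over a real schedule you would want a full season of games, and probably some adjustment for the unbalanced schedule, before reading much into any single building's number.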

The potential for statistical analysis to extend our understanding of professional hockey remains largely untapped, but the quality of the data being recorded is a critical obstacle that needs to be overcome if we're to make the best progress we can. I'm not quite sure how best to pursue this issue with the NHL, but I'm open to suggestions.
