March 25, 2011
Lately, I’ve been rather enamored with the Silverlight PivotViewer control from Microsoft (all in the same realm as Power Pivot and the upcoming release of SQL Server Reporting Services). I’ve been following it since its release but only recently had a chance to play around with it…and I was definitely impressed. The general idea is that the PivotViewer makes it easy to visualize large amounts of data while letting a user slice into different aspects of that data. It is a magical blend when something really valuable also happens to be really easy to develop. Since it’s so easy, it’s hard not to go looking for a bunch of different uses for it.
Hmm…so what has a lot of data, would make a cool visual, and would be interesting to slice in a lot of different ways? Wait, I know…the WP7 Marketplace data. Thankfully, Brandon Watson posted a quick overview on how to pull some data out of the Marketplace. In less than an hour and a half, I pulled down the data for every application in the Marketplace (hovering very near 11,000 apps) and every associated application image (nearly 400 MB worth). Brandon did a great job of covering how to pull the data, and the images were easy to pull too (albeit slower, thanks to my internet connection). PLINQ is another great tool, and it helped shorten the time it took to download all of the images.
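To give a feel for the parallel download step, here is a minimal sketch of the same idea in Python using a thread pool. It is not the PLINQ code used for this post; the function names (`fetch_bytes`, `download_all`) and the worker count are assumptions made purely for illustration, and the image URLs would come from whatever Marketplace data you have already pulled.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_bytes(url, timeout=30):
    """Download one URL and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, fetch=fetch_bytes, max_workers=8):
    """Fetch every URL concurrently.

    Returns two dicts: {url: bytes} for the downloads that succeeded
    and {url: exception} for the ones that failed.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # dict.fromkeys drops duplicate URLs while keeping their order.
        futures = {url: pool.submit(fetch, url) for url in dict.fromkeys(urls)}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:
                # Record the failure so one bad image does not stop the rest.
                errors[url] = exc
    return results, errors
```

Downloads spend most of their time waiting on the network, so running several at once shortens the total time even on a slow connection. The `fetch` parameter is there so the download function can be swapped out, for example with a stub while testing.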
I then used the Pivot Collection Tool for the Command Line (pauthor) to actually build the CXML required for the PivotViewer. Despite its poorly chosen name, the tool does come with a C# library component, so don’t disregard it just because the name has Command Line in it. I’m working on a set of extensions to the pauthor tool that will make it even easier to automatically generate CXML from entities based on attributes. I’ll post some additional information on the technical details if anyone’s interested, but for now the main point of this post is the PivotViewer itself.
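For readers who have never seen CXML, the sketch below builds a tiny collection file by hand in Python. It does not use pauthor or its C# library. The element and attribute names follow the public Pivot collection schema as I understand it, so treat them as an assumption to check against the official documentation. The function name `build_cxml`, the facet names, and the `img_base` value are hypothetical.

```python
import xml.etree.ElementTree as ET

# Default namespace of the Pivot collection schema (assumed from the public docs).
CXML_NS = "http://schemas.microsoft.com/collection/metadata/2009"


def build_cxml(name, facet_types, items, img_base):
    """Return a CXML document as a string.

    facet_types maps a facet name to its type, e.g. {"Price": "Number"}.
    items is a list of dicts with "name", "href" and "facets" keys.
    img_base points at the Deep Zoom collection holding the item images.
    """
    collection = ET.Element(
        "Collection", {"xmlns": CXML_NS, "Name": name, "SchemaVersion": "1.0"}
    )

    # Declare each facet once so the viewer knows how to filter and sort on it.
    categories = ET.SubElement(collection, "FacetCategories")
    for facet_name, facet_type in facet_types.items():
        ET.SubElement(
            categories, "FacetCategory", {"Name": facet_name, "Type": facet_type}
        )

    items_el = ET.SubElement(collection, "Items", {"ImgBase": img_base})
    for index, item in enumerate(items):
        item_el = ET.SubElement(
            items_el,
            "Item",
            {
                "Id": str(index),
                "Img": f"#{index}",  # index of the image inside the Deep Zoom collection
                "Name": item["name"],
                "Href": item["href"],
            },
        )
        facets_el = ET.SubElement(item_el, "Facets")
        for facet_name, value in item["facets"].items():
            facet_el = ET.SubElement(facets_el, "Facet", {"Name": facet_name})
            # The value element is named after the facet type (String, Number, ...).
            ET.SubElement(facet_el, facet_types[facet_name], {"Value": str(value)})

    return ET.tostring(collection, encoding="unicode")
```

Each facet declared under `FacetCategories` becomes something you can filter or group by in the viewer, which is why facets such as category, price, rating, and release date are the interesting ones to generate for Marketplace data.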
Let me first caveat this by saying that the still images really don’t do the experience justice. There are a lot of animations and the discovery and mining is very rapid. As I type this up, there’s a process running on my machine that is building out all of the CXML files. Once I’ve cleaned up everything a bit, I will make the PivotViewer available so that you can interact with it directly. But I wanted to show a preview of it first.
Here’s a visual example of all “Productivity” apps. Notice that they are grouped by Release Date. By simply changing the pivots, I can start to answer questions that would be harder to answer from static information. I’m in control of the data analysis. For instance, how many of these apps have…say…a rating better than 8?
You’ll notice this is a significantly smaller portion of the apps (no real surprise there). How many of these apps have been rated by more than 8 people?
Okay, well maybe that’s just because I’m being too picky about highly rated apps. So let’s go ahead and see all of the apps that have been rated by more than 8 people…
Better, but that’s still a pretty small number compared to the total. Because it’s visual, drawing meaningful conclusions is easy and immediate. Let’s switch gears entirely, and get a feeling for pricing rather than release date.
Whoa, we clearly have some outliers here. By clicking on the one over in the $150 range, we can bring up some of the details for that particular app.
I don’t know about you, but $170 seems a bit crazy. Even after you remove the outliers, most apps are priced under $1, as you’d expect.
It’s an absolute testament to the tools (and part of why I greatly appreciate Microsoft) that I can pull the data, organize it, and assemble a really powerful set of data analysis tools in less than 3 hours. These different types of questions can be answered quickly, and because the tool is so flexible, I can get answers nearly as quickly as I can come up with questions. Because the answers are often easy and quick to find, it only encourages more questions. It’s a very empowering experience to rapidly answer questions and gain insights you would not normally get from standard charts and graphs.