What is a “Data Product”?

by Harlan Harris on March 31st, 2014

In his keynote, Tom Davenport talked a lot about “Data Products”. But what’s a data product, and what else would be the product of your work with data?

Two prominent takes on data products come from Mike Loukides of O’Reilly and DJ Patil, formerly of LinkedIn and now with Greylock Partners. DJ Patil says “a data product is a product that facilitates an end goal through the use of data.” So it’s not just an analysis, or a recommendation to executives, or an insight that leads to an improvement to a business process. It’s a visible component of a system. LinkedIn’s People You May Know is viewed by many millions of customers, and it’s based on the complex interactions of the customers themselves.

Mike Loukides asks, “Do we want products that deliver data? Or do we want products that deliver results based on data?” I don’t think that Excel is a data product, even though (in good hands) it takes data and generates deliverable insights and recommendations. Excel is a general-purpose tool, whereas a data product has to be a system that was built to address a particular type of problem.

My friend and fellow Data Community DC board member Sean Murphy further argued that data products have gotten drastically cheaper in recent years: the cost of collecting, storing, and analyzing data has collapsed, and the cost of developing and selling a product has also collapsed. Meanwhile, the demand for these differentiating products has skyrocketed. (Blame Dr. Davenport and his “competing on analytics” meme!)

I recently wrote a blog post about data products, arguing that they’re at the intersection of four things: data, domain knowledge, software engineering, and analytics. LinkedIn’s People You May Know (PYMK) is a data product because it’s based on customer data, it’s based on an understanding of the ways that social networks work, it’s built in software, and it uses mathematical recommendation systems under the hood. UPS’s route optimization system is also a data product, as it’s based on truck and traffic data, it has an understanding of traffic dynamics, it’s built in software, and it uses sophisticated mathematical programming under the hood. Many operations research (OR) systems over the years fit this definition well.

Here’s a Venn diagram I made as part of my post. Click through to read more of my (likely muddled) thoughts on data products.

This diagram is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Thoughts? Leave a comment, find me in the hallways, or tweet me @harlanh!

