Guest Post: Problem Definition and Workflows in Analytics Success

This is a guest post by Michael Zargham, who gave a sadly under-attended but content-rich talk this morning. I thought his framing was spot on around how to set up an analytics group to repeatedly and efficiently explore potential new “well-defined, solvable, impactful” problems, determine what should be built, do the research to actually get the problem solved, and deliver it in the best possible way to get the desired results. This organizational and culture problem is incredibly important to get right. These nested workflows seem like a great way to think about it.

Michael Zargham is Director of Data Science at Cadent, a media technology firm specializing in advanced analytics for advertising campaign execution. Michael has been a practicing data scientist since 2005, completed his PhD in optimization and decision theory from the University of Pennsylvania in 2014, and founded the data science team at Cadent in 2015.


A new analytics project often arises from the convergence of many disparate forces, and its stakeholders may have diverging, even competing, incentives. These ad hoc research projects often dominate the backlogs of data science and analytics research teams, but such projects are naturally recursive: how can we know what we need to do until we start exploring the data? The most common approach is to dive right in, immediately start “solving” the problem, and iteratively discover the best approach. I am a huge fan of build it, break it, make it better, but all too often we miss the step where we clearly articulate what “it” is.

[Figure: Lifecycle]

In our data science research and development workflow at Cadent, we have an explicit process for formally defining the problems our analytics research projects address; this is the “business intelligence” workflow in the figure. This process acknowledges that business stakeholders have deep expertise and hold strong opinions about what form the solution should take, even when they have not framed their questions formally enough for machine learning or optimization to be directly applicable. Disambiguating and formalizing the stated problem falls to analytics professionals (data scientists, researchers and developers), who must reconcile the stakeholders’ desires with what is technically feasible. Even then, the opinion of the data must be weighed. One could state a formally valid question for which the available tools and data seem perfectly suited, only to find that the data doesn’t support the business hypothesis and, as a result, the proposed solution cannot provide the intended business value.

In order to consistently deliver business value, we run research projects that establish the scope of our analytics development projects. Since we follow an agile analytics methodology, this can be described in sprints. While one cannot always determine how many research sprints will be required to create a well-defined, solvable problem that stakeholders agree is relevant, we have found that one to two two-week sprints are sufficient in most cases, and we use this benchmark for project planning. Often the most challenging element of these scoping projects is getting sign-off from stakeholders, who are generally less knowledgeable about the technologies in play and think of advanced analytics as a form of mathemagic. Furthermore, when the data does not corroborate their priors, many assume that the analytics, rather than their own beliefs, are incorrect.

This challenge is not overcome overnight, but rather by building relationships with key stakeholders and understanding their incentives and the particular angle from which they see the business. Using data, technology and business expertise, we merge all those views into a complete picture, allowing incentive alignment and coordinated decision making. Over time, analytics prove their value and trust in the new technology is built. At Cadent, our data-driven transition has been measurable in corporate KPIs and financial reports, but the most rewarding change has been the shift to a data-driven culture.