Speed vs. Scalability
Had an interesting debate with SPRIG President Rick Carter yesterday after the Spreadsheet Guru contest, regarding Speed versus Scalability in spreadsheet modeling, specifically when deciding whether to manually enter data or automate with formulas.
One of the things the Spreadsheet Guru contestants had to do was clean some messy input data. Here’s an example. From “Commonwealth of Australia Population in 2010 22.33 (in millions of persons)” you needed to isolate the country, year and value — in this case, Australia, 2010 and 22.33M. This had to be done for 10 countries for 2 years.
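To make the automated route concrete, here is a minimal sketch of the same extraction in Python rather than as a spreadsheet text formula (which is what the contestants actually wrote). It assumes every input line follows the shape of the one example above: an optional prefix ending in " of ", the country name, "Population in", a four-digit year, then the value. The names `PATTERN` and `parse_line` are made up for illustration, and real messy data would likely need more cases than this handles.

```python
import re

# Assumed layout, based only on the single example line:
#   "<optional prefix> of <Country> Population in <year> <value> (in millions of persons)"
PATTERN = re.compile(
    r"^(?P<name>.+?)\s+Population in\s+(?P<year>\d{4})\s+(?P<value>\d+(?:\.\d+)?)"
)

def parse_line(text):
    """Return (country, year, value_in_millions) for one messy input line."""
    match = PATTERN.search(text)
    if match is None:
        raise ValueError(f"Unrecognized line: {text!r}")
    # Assumption: anything before the last " of " is a prefix such as "Commonwealth".
    # This would mis-handle a country whose own name contains " of ".
    country = match.group("name").rsplit(" of ", 1)[-1]
    return country, int(match.group("year")), float(match.group("value"))

line = "Commonwealth of Australia Population in 2010 22.33 (in millions of persons)"
print(parse_line(line))
```

Writing and checking something like this for 20 data points almost certainly takes longer than typing them in, which is exactly the trade-off in question.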
Most of the contestants (myself included) wrote some type of text formula. Rick advocated just manually entering the data because the number of data points was so low. There are pros and cons to each approach. The manual approach is quicker for small datasets, but it doesn’t scale to large datasets or to an analysis that has to be repeated. The automated approach can take longer to implement, especially if the formulas are complicated. But as the datasets get larger, or if the analysis is repeated, it makes more sense.
For the contest, Rick probably had it right. In the day-to-day work of building and using models, however, especially in data cleansing, the choice isn’t always so obvious. Before deciding whether to go manual or automated, consider these two factors:
- How much extra time will it take to automate the process?
- How likely is it that the model will be used again? (be honest…)
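One rough way to weigh those two factors against each other is a back-of-the-envelope expected-time calculation. The sketch below is only an illustration: the function name, the simplifying assumption that reruns of an automated model cost nothing, and all the numbers are hypothetical, not figures from the contest.

```python
def expected_minutes_saved(manual_minutes, automate_minutes,
                           reuse_probability, future_runs=1):
    """Expected minutes saved by automating instead of entering data by hand."""
    # Manual entry is paid once now, and again on each future run if reuse happens.
    manual_total = manual_minutes * (1 + reuse_probability * future_runs)
    # Simplifying assumption: automation is paid once and reruns are free.
    return manual_total - automate_minutes

# Hypothetical one-off job: 10 minutes by hand, 25 to automate, 20% chance of one rerun.
print(expected_minutes_saved(10, 25, 0.2))

# Hypothetical recurring job: same costs, 90% chance of five more runs.
print(expected_minutes_saved(10, 25, 0.9, future_runs=5))
```

A negative result favors manual entry and a positive one favors automation. The honest estimate of `reuse_probability` tends to matter more than the formula itself.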
What’s your experience?