Aristotle suggested that the whole of a system – be it a human body, an organization, or a device – is more than the sum of its parts. More recently, a brilliant gentleman named Gary Flake* has been applying this same concept to the information that surrounds us.
Every day, we’re inundated with massive amounts of data. Picking a vacation destination (hundreds of options). Picking a puppy for the kid (hundreds of breeds). Buying a holiday gift (millions of options).
How do you make these choices? I’ll make a bet:
- Start with a couple options that you heard about from your friends.
- Google those options, read some webpages, read about them on Wikipedia.
- Read reviews on other sites, keep a few tabs open in your browser at once, compare-and-contrast a couple options.
- Find a few new options – maybe because they have good reviews – and add those to the mix.
- If you’re like me, rinse and repeat the last two steps for 3 hours.
- Finally, make your purchase.
Here’s a question for you: what are you really doing during that research phase? You’re looking for a needle in a haystack of options. You have a mental model – Plato’s Form of what the ideal option is – and you keep sifting through the dataset, one item at a time, comparing each item against the ideal.
There are problems with this approach: it takes forever, it leaves HUGE gaps in your data analysis, and it leaves you frustrated with the process – and frequently, the result.
How can we make this better? Flake’s answer: by merging the concepts of browse and search. Let’s look at the puppy breed task in detail. A typical Google search is fantastic for finding a specific answer, but not for outlining the shape of a landscape that you’re unfamiliar with. When you search for a specific breed, you’ll quickly find a Wikipedia page about that breed, complete with pictures. But how can you find other breeds like it? And does this breed live long or not? Without being a subject matter expert, it’s really impossible to tell whether a 12-year life expectancy is a lot or a little.
What we really want to do is see the data as a whole: ALL breeds at once, not just one breed. With that, patterns will pop: hmmm, 12 years is about average for all dog breeds. Hmmm, small dogs tend to live longer. There’s no way to find that insight from individual Wikipedia pages describing the dog breeds. Even if you were to read every one of those pages, you couldn’t keep every detail in memory, so you’d miss important trends in the data.
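To make the "whole dataset" idea concrete, here is a minimal Python sketch. It assumes you already have every breed in one list; the breed names, size classes, and lifespans below are made-up placeholders, not real breed data, and the function names are my own. The point is only that once all the rows sit side by side, "is 12 years a lot?" becomes a one-line question.

```python
from statistics import mean

# Hypothetical sample: (breed, size class, typical lifespan in years).
# The names and numbers are illustrative placeholders, not vetted breed data.
BREEDS = [
    ("Breed A", "small", 15),
    ("Breed B", "small", 14),
    ("Breed C", "medium", 12),
    ("Breed D", "medium", 13),
    ("Breed E", "large", 10),
    ("Breed F", "large", 9),
]


def average_lifespan(breeds):
    """Average lifespan across the entire dataset."""
    return mean(years for _, _, years in breeds)


def average_lifespan_by_size(breeds):
    """Group breeds by size class and average the lifespan within each group."""
    groups = {}
    for _, size, years in breeds:
        groups.setdefault(size, []).append(years)
    return {size: mean(values) for size, values in groups.items()}


overall = average_lifespan(BREEDS)
by_size = average_lifespan_by_size(BREEDS)

print(f"Average across all breeds: {overall:.1f} years")
for size, years in by_size.items():
    print(f"  {size}: {years:.1f} years")
```

With the whole table in hand, a single breed's number can be judged against the overall average and against its own size group, which is exactly what no individual page can tell you.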
Let’s look at a website that almost delivers on this promise: Amazon. Search for a digital camera inside the Electronics department, and you’ll see page after page of results. Here’s the key, though: on the left side, you’ll notice some characteristics of the ENTIRE dataset. Those characteristics make some non-obvious things pop. Panasonic is among the top 3 manufacturers, and I didn’t even know they made digital cameras. Products are evenly distributed across the 1-to-5-star review scale. WOW, cool – now that I know this, I can (1) trust the review system, since people don’t just rate everything 5 stars, and (2) buy myself a Panasonic camera, because they’re my “trusted brand.”
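That left-hand sidebar is essentially a set of counts over the full result set. Here is a rough Python sketch of the idea, not a description of how Amazon actually builds it. The brands, ratings, and prices are invented placeholders, and `facet_counts` and `narrow` are hypothetical helper names of my own.

```python
from collections import Counter

# Hypothetical search results: (manufacturer, star rating, price in dollars).
# Placeholder values for illustration only.
CAMERAS = [
    ("Brand X", 5, 450),
    ("Brand X", 3, 300),
    ("Brand X", 1, 120),
    ("Brand Y", 4, 380),
    ("Brand Y", 2, 200),
    ("Brand Z", 4, 520),
]


def facet_counts(items, index):
    """Count how many results share each value of one attribute."""
    return Counter(item[index] for item in items)


def narrow(items, manufacturer=None, min_stars=None):
    """Filter the result set by the facets the user picked; None means no filter."""
    selected = []
    for brand, stars, price in items:
        if manufacturer is not None and brand != manufacturer:
            continue
        if min_stars is not None and stars < min_stars:
            continue
        selected.append((brand, stars, price))
    return selected


by_brand = facet_counts(CAMERAS, 0)   # the "manufacturers" sidebar
by_stars = facet_counts(CAMERAS, 1)   # the "customer reviews" sidebar
picked = narrow(CAMERAS, manufacturer="Brand Y", min_stars=4)

print(by_brand.most_common(3))
print(sorted(by_stars.items()))
print(picked)
```

The counts describe the forest; `narrow` is the step where you walk into one part of it. Re-running the counts on the narrowed list gives you a fresh overview at every step.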
See how by looking at the forest, I was able to approach my selection from a completely different angle? I narrowed in on a subset of the data through a holistic, systemic analysis. This is profoundly different from the haphazard approach of “reading the reviews on the top 5 search results” that we’re so accustomed to today.
Moreover, another pattern becomes possible: once you’ve seen the forest, noticed an outlier, and learned about it, you can jump back out and expand your query in brand new directions. Wait, that Panasonic camera is WAY cheaper than I expected an SLR to be… Are there other SLRs in a price range where I’m not even looking?
This type of interaction – combining search and browse – allows you to consume the dataset as a whole. And the whole finally becomes more than just the sum of the parts. If you found this at all curious, I encourage you to look at this 6-minute TED video and this project called PivotViewer.
*Full disclosure: Gary Flake was my boss for two years. The intent of this article isn’t ass-kissing: he is no longer my boss. He’s not even at the same company.