The gap starts small.
What we see is pretty much what we know. The questionnaire, the response rates, the frequencies, and the datasets. We can see it all, from top to bottom, nothing hidden.
But over time we started tracking other things.
Every click of a button, every swipe of a credit card, every watch of a video, every share of a tweet, every open of an email, every view of an image, every friend of a reader, every opening of a door, every drive down the road, and every degree change in your thermostat can now be recorded and stored.
And for most people, this data is practically invisible. Just a necessity that allows our technological world to function. Something we choose not to think about too hard.
But it all has potential.
Correlation may not be causation, but it can be predictive.
Did you know that if you feed a computer a bunch of correlated data, it has the potential to make really good predictions?
Based on what we’ve watched in the past, and what other people like us watch, Netflix can guess what we might want to watch next.
Based on sets of player stats, baseball teams can use algorithms to figure out how to get the most value for their money.
Based on what you buy, Target can tell if you’re pregnant.
Based on some characteristics fully outside your control, courts can tell if you are likely to find your way back to jail and they can sentence accordingly…
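To make the "predictive, not causal" idea concrete, here is a minimal Python sketch built on made-up data. Everything in it is an assumption for illustration: the hidden factor, the noise levels, and the function names (`simulate`, `fit_line`, `mean_abs_error`) are hypothetical and not taken from any of the examples above. A hidden factor drives both the thing we record (`x`) and the thing we care about (`y`), so `x` never causes `y`, yet a simple line fitted on `x` should still guess `y` better than always guessing the average.

```python
import random
import statistics


def simulate(n=2000, seed=0):
    """Generate (x, y) pairs that share a hidden common cause (simulated data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hidden = rng.gauss(0, 1)        # never recorded, drives both values
        x = hidden + rng.gauss(0, 0.5)  # the data we happen to collect
        y = hidden + rng.gauss(0, 0.5)  # the outcome we want to predict
        rows.append((x, y))
    return rows


def fit_line(rows):
    """Ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def mean_abs_error(rows, predict):
    """Average absolute gap between predictions and actual outcomes."""
    return statistics.fmean([abs(y - predict(x)) for x, y in rows])


rows = simulate()
train, test = rows[:1000], rows[1000:]  # fit on one half, check on the other

slope, intercept = fit_line(train)
baseline = statistics.fmean([y for _, y in train])

err_baseline = mean_abs_error(test, lambda x: baseline)
err_model = mean_abs_error(test, lambda x: slope * x + intercept)

print(f"always guess the average: {err_baseline:.2f}")
print(f"guess from correlated x:  {err_model:.2f}")
```

On this simulated data the fitted line should show a smaller average error than the baseline, even though changing `x` would do nothing to `y`. That is the whole trick, and also the whole risk.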
Like it or not, data science is growing in usage as an evaluation tool. We are trusting unknown equations and variables (fair and considered or not) to help shape the way we see the world and make decisions.
And the biggest source? Larger datasets that are unlikely to be used in any other way. Stuff that is hard to understand due to complexity or scale.
The unseen data that rests in the area between what we see on a regular basis, and what we could see but likely won’t.
The argument for bridging the gap.
Remember, with enough correlations you can make better predictions. With better predictions you can justify using algorithms based on correlations.
And in this way, data science grows.
Turning mostly overlooked data into incredibly powerful tools. Tools that may or may not be viewing the data in a way that is fair and just. Tools that see data in a way that does not necessarily reflect the ethics of an organization.
Good interactive design has the potential to reduce the gap between what we see and what only the computers see.
And in that way, we can provide a view of the numbers behind the numbers. We can create a way in for people to see data they usually can’t see.
Or…
We close our eyes to the growing data gap.
And just keep presenting data, louder and slower.
Michael Harnar
This is good, Chris. These are similar to concerns I’ve been having about big data and its retrospective use in evaluation, rather than a prospective approach of having us at the table from the start. Create meaningful data to answer well-founded questions so that stakeholders can take ownership of the learning from the data.