IBM Scientist speaks on analyzing large data sets
Jeff Jonas, a Las Vegas resident, data expert, and popular conference speaker, shared how he uses his insights into data to shine a spotlight on the bad guys—be they casino scammers or international terrorists. In a keynote address at the annual gathering of the COMMON users group, which advocates and educates on behalf of Power Systems (IBM i, AS/400, iSeries, System i, AIX, and Linux), Jonas explained the tenets of nonobvious relationship awareness (NORA), and how he shows casinos and US government agencies how to piece large amounts of data together like a jigsaw puzzle to pop the miscreants.
Starting with a few simple assertions such as “the data must find the data” and “the relevance must find the user,” Jonas, Distinguished Engineer and Chief Scientist, Entity Analytic Solutions at IBM, provided a high-level look at how he interrogates large data sets. Without context, he says, it’s impossible to analyze data across the silos in which it finds itself. He calls this state of affairs “enterprise amnesia.” How, he asked, do you stitch everything together: the siloed data, the structured data, and the unstructured data? He framed the problem this way: “How do you accumulate context?” His answer: the arrival of each piece of data must be treated as a query. You can’t know how valuable a piece of information is until you ask of it, “How does this relate to what I know?”
When data is treated this way, a state Jonas calls “persistent presence,” a seemingly innocuous puzzle piece, such as an address change or a repeatedly misspelled name, can give rise to an epiphany that connects other puzzle pieces, or it can tell the analyst that previous assumptions were false.
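The idea of treating each arriving record as a query can be illustrated with a toy sketch. This is not NORA or any IBM implementation; the `ContextStore` class, its `observe` method, and the sample records are hypothetical names invented for illustration, and exact matching on normalized field values stands in for the far more careful matching a real entity-resolution system would need.

```python
class ContextStore:
    """Toy context accumulator (hypothetical): each arriving record is a query."""

    def __init__(self):
        self.entities = {}   # entity id -> list of records attributed to it
        self.index = {}      # (field, normalized value) -> entity id
        self._next_id = 0

    @staticmethod
    def _keys(record):
        # Normalize values so trivial differences in case or spacing still match.
        return {(f, str(v).strip().lower()) for f, v in record.items()}

    def _attach(self, entity_id, record):
        self.entities[entity_id].append(record)
        for key in self._keys(record):
            self.index[key] = entity_id

    def observe(self, record):
        """Ask "how does this relate to what I know?", then store the record."""
        # Query step: which known entities share at least one feature?
        matches = {self.index[k] for k in self._keys(record) if k in self.index}
        if matches:
            target = min(matches)
            # The new record bridges entities previously assumed to be distinct,
            # so fold the others into the target.
            for other in matches - {target}:
                for old in self.entities.pop(other):
                    self._attach(target, old)
        else:
            target = self._next_id
            self._next_id += 1
            self.entities[target] = []
        self._attach(target, record)
        return target, sorted(matches)


store = ContextStore()
store.observe({"name": "Jon Smyth", "address": "12 Elm St"})
store.observe({"name": "J. Smith", "phone": "555-0100"})
# A seemingly innocuous record connects the two earlier puzzle pieces.
target, linked = store.observe({"address": "12 Elm St", "phone": "555-0100"})
print(target, linked)  # 0 [0, 1]
```

In this sketch the third record shares an address with the first and a phone number with the second, so the store concludes that what it held as two entities is one. Matching on any single shared value is deliberately naive; a common value such as a city name would merge unrelated records, which is why this should be read only as a picture of the query-on-arrival pattern.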
Of course, if all data is marked, this means that not only can the bad guys’ activities be traced, so can everyone else’s. We’re moving toward what Jonas called a “surveillance society.” He pointed to the ACLU clock on social surveillance, designed on the model of the Atomic Scientists’ Doomsday Clock, which says that we’re at six minutes to midnight, the witching hour for total surveillance. (The Atomic Scientists’ clock is currently at five minutes to midnight. It would be an interesting exercise to unsilo the data behind these two clocks.)
Jonas anticipates that piles of data will eventually become one collective intelligence stored in the cloud. As for life in 2050, he predicts that this “collective intelligence will evaluate what you need to know and tell you.” Privacy protections will be a matter of skillful coding. And of course eternal vigilance.