Does Google Connect the Dots?

Can a search engine connect the dots to help track persons of interest?

David Linthicum of Informatica recently wrote about how data integration can improve our counterterrorism efforts. I agree that data integration, along with entity resolution and complex event processing, should play a key role in counterterrorism and relieve some of the burden of manual processes that are required today to connect the dots.

David lays out a seven-step plan for leveraging data integration to get there. While I mostly agree with his conclusion, I’d like to offer an alternate view on a couple of his points.

First, David quotes a US official who says, “[T]he system to connect the dots at the speed of light when triggered by any one piece of information is not in place. It exists, [though], because Google does it every day.” David contends that while some integration exists, it does not appear to be working.

Further, while systems to connect the dots at the speed of light do exist, Google is not one of them. Google is great at what it does, but it doesn’t resolve identities and apply predictive analysis at the speed of light.

Take, for example, Google Alerts. I sometimes get alerts for content that is days or even years old. I recently got an alert for a press release a competitor issued two years ago. I also get a lot of superfluous alerts that have little to do with the topic I’m really looking for. One alert for “identity resolution” pointed me to a policy statement titled “Resolution on transgender, gender identity, and gender expression non-discrimination.”

Imagine what would happen if our intelligence community had to rely on a system that generated alerts for threats that had already been carried out (see my post on the need for real-time border security), or inundated analysts with excessive alerts that hid the signal of real threats within a mass of noise.

That would exacerbate the problem of too few analysts trying to assimilate too much information in too little time.

Admittedly, neither David nor the US official he quoted was advocating using Google to overcome intelligence failures. But a comparison of Google’s capabilities to the true nature of the problem does point out the complexities that a true solution must overcome.

Step 4 of the plan advocates the need for data quality. While accurate, reliable data is important, in the realm of counterterrorism, the data is rarely (if ever) 100% accurate.

Pieces of information will be missing. Descriptions will be vague. Data will exhibit cultural variations, typographical errors, and spelling mistakes. A solution that requires that the data be 100% accurate will never be able to connect the right dots in a meaningful way.

Instead, counterterrorism efforts need systems that can accept fragmentary and inconsistent data and make sense of them anyway.
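To make the point concrete, here is a minimal sketch of tolerant name matching. It uses Python's standard-library difflib as a stand-in for a real entity resolution engine, which would go much further (transliteration, nicknames, dates, addresses). The watchlist names, the helper functions, and the 0.7 threshold are all invented for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and drop punctuation so trivial differences don't count."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return kept.strip()


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_matches(query, records, threshold=0.8):
    """Return (record, score) pairs at or above the threshold, best first."""
    scored = [(record, similarity(query, record)) for record in records]
    hits = [(record, score) for record, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical watchlist entries, made up for this example.
watchlist = ["Jon Smyth", "Maria Gonzales", "Li Wei"]

# The query is spelled differently from the stored record, yet still matches.
matches = candidate_matches("John Smith", watchlist, threshold=0.7)
for record, score in matches:
    print(f"{record}: {score:.2f}")
```

An exact-match lookup would miss "Jon Smyth" entirely. A similarity score lets the system surface it as a candidate for an analyst or a downstream rule to confirm.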

Finally, Step 6 describes a complex event processing system to generate alerts for suspicious activity. I agree with this one, but I might change some of his examples, including this one: “Find conditions that should create an alert such as a guy paying cash for a ticket, and getting on a plane with no luggage.”

The critical question is: how well do these behaviors actually predict threats? They’re true of Abdulmutallab. But how many travelers who posed no threat have done the same thing? I don’t know, but analysis of relevant data could tell us.

How often do legitimate, non-threatening passengers pay cash for their ticket? How often do they not check any bags? Conditional probability uses questions like these to determine the likelihood that these factors predict threats.
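As a rough illustration of that reasoning, the sketch below applies Bayes' theorem to a single factor. Every number in it is an assumption chosen only to show the arithmetic; none comes from real travel or threat data, and the function name is mine.

```python
def posterior_threat(prior, p_obs_given_threat, p_obs_given_benign):
    """Return P(threat | observation) using Bayes' theorem."""
    numerator = p_obs_given_threat * prior
    evidence = numerator + p_obs_given_benign * (1.0 - prior)
    return numerator / evidence


# All values below are hypothetical.
prior = 1e-6           # assumed share of travelers who pose a threat
p_cash_threat = 0.5    # assumed P(pays cash | threat)
p_cash_benign = 0.05   # assumed P(pays cash | no threat)

p = posterior_threat(prior, p_cash_threat, p_cash_benign)
print(f"P(threat | paid cash) = {p:.2e}")
```

With these invented inputs, paying cash raises the probability about tenfold, yet it remains roughly 1 in 100,000. How often harmless travelers do the same thing matters as much as how often attackers do.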

A sophisticated solution should be able to aggregate multiple factors to determine the likelihood that a threat is imminent. And when that likelihood exceeds a tolerable threshold, it should alert the appropriate analyst.
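One simple way to sketch that aggregation is a naive Bayes style score: treat the factors as independent, add each factor's log likelihood ratio to the prior log-odds, and alert when the result crosses a threshold. The independence assumption is a simplification of mine, not something David's plan specifies, and the ratios, prior, and threshold are hypothetical.

```python
import math


def combined_probability(prior, likelihood_ratios):
    """Combine factors assumed independent by summing log likelihood ratios."""
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


def should_alert(prior, observed, ratios, threshold):
    """Return (alert?, probability) for the factors observed on one traveler."""
    applicable = [ratios[factor] for factor in observed if factor in ratios]
    probability = combined_probability(prior, applicable)
    return probability >= threshold, probability


# Hypothetical ratios: P(factor | threat) / P(factor | no threat).
RATIOS = {"paid_cash": 10.0, "no_luggage": 8.0, "one_way_ticket": 5.0}

alert, p = should_alert(1e-6, ["paid_cash", "no_luggage"], RATIOS, threshold=1e-4)
print(alert, f"{p:.2e}")
```

Real factors are rarely independent, so a production system would need to model their correlations and calibrate the threshold against actual outcomes.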

It’s a great thing that David and many others are promoting how technology can automate connecting the dots to protect American citizens. Solutions do exist. We just need the right solution for this complex and critical problem.



1 Response

  1. Good catch Alex. A recent salon.com article I was reading made the same assertion.

    "How, in the age of Google, can the Transportation Security Administration not have an instantly searchable database containing every suspect who has come to the attention of the CIA or FBI,"

    I think it would be very beneficial to have more of these conversations that differentiate between the generic search that Google does and higher order semantic search that is needed for applications such as entity resolution.
    http://www.salon.com/news/opinion/feature/2010/01/06/failed_terror_plot/index.html
