
Thread: Intelligence failure: get the right IT system thinking

  1. #1
    Council Member davidbfpo's Avatar
    Join Date
    Mar 2006
    Location
    UK
    Posts
    13,366

    Default Intelligence failure: get the right IT system thinking

    I've looked through the Intelligence thread and cannot immediately find an appropriate thread for this.

    Bear with me; this could fit in the Detroit bombing thread: http://council.smallwarsjournal.com/...ead.php?t=9331 or the FBI investigations thread: http://council.smallwarsjournal.com/...ead.php?t=8828 - both are useful cross-references, hence the links.

    Robert Haddick today has written 'Computers must take over counter-terrorism analysis', which at first I took for another "IT can fix it" argument; see pg. 2 of this article: http://www.foreignpolicy.com/article...nment?page=0,1

    Then I recalled that Jeff Jonas, an IT expert (at IBM) who has thought hard about the issues around data, is well worth reading. His blog is: http://jeffjonas.typepad.com/ and, just to illustrate, try his post-9/11 presentation on the hijackers' associations: http://jeffjonas.typepad.com/SRD-911-connections.pdf

    After a long absence he has now commented on what he calls 'The Christmas Day Intelligence Failure'; note this is Part One: http://jeffjonas.typepad.com/jeff_jo...elligence.html

    He advocates that "data finds data":
    The December 25th event is a classic case of enterprise amnesia. Enterprise Amnesia is the condition of knowing something on one hand and knowing something on another hand and never the two data points meet....

    Abdulmutallab applies for a multi-entry visa. The terrorist database (TIDE) is checked and found to contain no such record. The State Department issues a visa. Later, a TIDE record for Abdulmutallab is added to TIDE. The split-second this record is added to TIDE, the State Department is notified the visa may need to be reconsidered.

    Devil in the details. For all this to work, the system needs to realize that despite name variations and inconsistent data, the identity in the terrorist database is the identity in the visa system...
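    A minimal Python sketch of the pattern he is describing; the records, the crude string-similarity match, and the threshold here are all invented for illustration, not how TIDE or the visa system actually work:

    Code:
    from difflib import SequenceMatcher

    # Toy records only. Real identity resolution uses aliases, dates of
    # birth, passport numbers, etc., not just name similarity.
    visa_applications = [
        {"name": "Umar Farouk Abdulmutallab", "visa": "multi-entry"},
        {"name": "Jane Q. Traveler", "visa": "single-entry"},
    ]

    def names_match(a, b, threshold=0.85):
        # Crude stand-in for proper entity resolution across name variants.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

    def on_watchlist_insert(new_name):
        # "Data finds data": the arriving record immediately searches the
        # existing data, rather than waiting for someone to run a query.
        for app in visa_applications:
            if names_match(new_name, app["name"]):
                print("ALERT: visa for %r may need to be reconsidered" % app["name"])

    # A watchlist record arrives later, with a spelling variation:
    on_watchlist_insert("Omar Farouk Abdul Mutallab")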
    Jeff raises difficult issues for non-IT outsiders to think about - as we should be the ones setting the requirements for IT help - and I will add subsequent parts as they appear.

    He is a very entertaining speaker on these issues.
    Last edited by davidbfpo; 01-16-2010 at 06:28 PM. Reason: Gradual construction with links added
    davidbfpo

  2. #2
    Council Member davidbfpo's Avatar
    Join Date
    Mar 2006
    Location
    UK
    Posts
    13,366

    Default Change needed - another view

    Another respected commentator on security issues, Bruce Schneier, adds this comment: http://www.schneier.com/blog/archive..._intellig.html

    We don't need new technologies, new laws, new bureaucratic overlords, or -- for heaven's sake -- new agencies. What prevents information sharing among intelligence organizations is the culture of the generation that built those organizations....sharing is far more important than secrecy. Our intelligence organizations need to trade techniques and expertise with industry, and they need to share information among the different parts of themselves....We need the bottom-up organization that has made the Internet the greatest collection of human knowledge and ideas ever assembled. The problem is far more social than technological.
    davidbfpo

  3. #3
    Council Member
    Join Date
    Oct 2005
    Posts
    3,169

    Default Does Bruce Schneier contradict himself?

    Critics have pointed to laws that prohibited inter-agency sharing but, as the 9/11 Commission found, the law allows for far more sharing than goes on. It doesn't happen because of inter-agency rivalries, a reliance on outdated information systems, and a culture of secrecy. What we need is an intelligence community that shares ideas and hunches and facts on their versions of Facebook, Twitter and wikis. We need the bottom-up organization that has made the Internet the greatest collection of human knowledge and ideas ever assembled.
    I agree with his points about culture change, but to deny the benefits of "value-added" technology is to be overly dismissive of a needed capability. He even states in the paragraph quoted above (emphasis is mine) that we're working with outdated information systems. Many government agencies have their own Facebook now and can share information with contacts in other agencies, but that hardly allows one to fuse all the available data and then connect the dots in a way that tells a story.

    The real challenge isn't sharing the information (we're much better at it than he gives the community credit for, though there is still much room for improvement); the bigger challenge is making sense of the volume of information. We desperately need better information technology that helps analysts sort through volumes of data, connect the dots (analytical support), and display the result in a meaningful way.

    The culture that needs to change quickest is each government agency, department, and military service storing its data in databases that are not accessible to the community of interest at large. Too much data resides in data banks and is not sharable outside the individual system. The result is intelligence failures because the data was not available to the analyst who had a hunch. If he or she had all the data available, the right analytical tools to quickly pull and sort through the relevant data, and a way to display it that tells a story (visualizing the data through link and temporal analysis), then our intelligence and law enforcement communities would actually be more effective. Facebook and Twitter are only baby steps; they are far from revolutionary enough to truly move us into the information age.

    Another technology he may be bashing is equipment to detect explosives and other potential weapons in airports. IMO it would be foolish not to invest in these technologies. Technology in many cases can do a better job at this and other tasks than humans, so why not use it? If it effectively reduces risk to a critical economic system (our air transportation system), why not invest in it? I'm sure if we did a cost comparison against what one attack costs, considering all the ripple effects, we would find it a worthwhile investment.

  4. #4
    Council Member davidbfpo's Avatar
    Join Date
    Mar 2006
    Location
    UK
    Posts
    13,366

    Default On the periphery - not overlooked?

    Earlier today I read this NYT article on the radicalization route of the Detroit bomber: http://www.nytimes.com/2010/01/17/wo...l?pagewanted=2 . I have seen similar accounts before and it is worth a read, although much of it, I fear, is news reporting rather than careful, verified investigation.

    Then Leah Farrell, an Australian CT analyst, adds her viewpoint - having cited the NYT article:
    Still, while he was seen to be “reaching out” to known extremists and appearing on “the periphery of other investigations” into radical suspects there, he was not considered a terrorist threat himself, according to a British counter-intelligence official.
    Leah adds:
    Edge of network connections–again. Of course the problem is always resourcing. There is never enough time to track down everything. But still, it seems to me that we see this over and over and over again.
    No answers are provided, just some pointers to her earlier thinking on the issue and to an IT "guru" who has attempted an answer.

    I also wonder how many people, for example, have attended a meeting on a controversy and listened, or even spoken, to a speaker who might be a 'known extremist'. Does that merit a CT record? In the Detroit incident, such an exchange of information between the UK and USA apparently did not happen.
    davidbfpo

  5. #5
    Council Member Stan's Avatar
    Join Date
    Dec 2006
    Location
    Estonia
    Posts
    3,817

    Default

    Quote Originally Posted by davidbfpo View Post

    I also wonder how many people, for example, have attended a meeting on a controversy and listened, or even spoken, to a speaker who might be a 'known extremist'. Does that merit a CT record? In the Detroit incident, such an exchange of information between the UK and USA apparently did not happen.
    David,
    It was just a few years ago, while attending MET training, that the instructors feared their POI had been copied and sent to addresses in the Middle East.

    Of the countless so-called seminars and conferences held in Europe, I don't recall one instance where my credentials were checked or my name vetted. In fact, most of these folks are so hungry for participants they dump their advertisements onto the internet. I'm a little tired of the spam every morning, but wonder who their audience actually ends up being.

    General so and so is your guest speaker with decades of experience fighting (insert key country or conflict here)
    If you want to blend in, take the bus

  6. #6
    Council Member
    Join Date
    Nov 2007
    Location
    Boston, MA
    Posts
    310

    Default

    Quote Originally Posted by davidbfpo View Post
    Robert Haddick today has written 'Computers must take over counter-terrorism analysis', which at first I took for another "IT can fix it" argument; see pg. 2 of this article: http://www.foreignpolicy.com/article...nment?page=0,1
    Automation is already spreading into the counterterrorism field, but simply saying we need more of it isn't going to produce technology that doesn't yet exist. There is no single thing you can point to and call data-mining. It is a family of hundreds of loosely connected problem spaces and solutions orbiting storage and recall. If we must resort to a crude simile, you might liken the field today to the vast, also loosely connected realms of neuroscience, cognition, and linguistics. An even less compelling, but still useful, parallel might be to the study of episodic memory.

    Then I recalled that Jeff Jonas, an IT expert (at IBM) who has thought hard about the issues around data, is well worth reading. His blog is: http://jeffjonas.typepad.com/ and, just to illustrate, try his post-9/11 presentation on the hijackers' associations: http://jeffjonas.typepad.com/SRD-911-connections.pdf
    Ten bucks says Jeff Jonas hasn't done serious work in twenty years with any database model other than relational. That is to say, the trick to his analysis here is devising a system of relations--tables in a database--that admits the properties of events to be correlated in hindsight. Put another way: even if future terrorists were careful enough not to input identical or even similar contact information, the system would break down if the method of input (say, in this day and age, Travelocity v. Expedia) changed. There is an easy enough way to fix this (and I hope they've done it), which brings us to this:

    After a long absence he has now commented on what he calls 'The Christmas Day Intelligence Failure'; note this is Part One: http://jeffjonas.typepad.com/jeff_jo...elligence.html
    First off, the visa office wouldn't consult TIDE; it would consult the TSDB--which is sourced from TIDE. The TSDB is a non-classified subset of the information contained in the IC's database. The key problem, from news reports thus far, is that information in TIDE was not transferred to TSDB. This is a classic failure in information sharing.
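    A toy sketch of that hand-off, with entirely hypothetical field names: a record reaches the downstream screening database only if an export rule nominates it, so a missing rule reproduces exactly this failure.

    Code:
    # Hypothetical field names; illustrates a filtered export from a
    # classified aggregate (TIDE-like) to an unclassified subset (TSDB-like).
    tide = [
        {"name": "Subject A", "derog": "classified detail", "nominated": True},
        {"name": "Subject B", "derog": "classified detail", "nominated": False},
    ]

    def export_to_screening_db(records):
        # Only nominated records cross the boundary, and only
        # non-classified fields survive the transfer.
        return [{"name": r["name"]} for r in records if r["nominated"]]

    tsdb = export_to_screening_db(tide)
    print(tsdb)  # Subject B never reaches the screening layer, even
                 # though the aggregate database "knows" about him.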

    The point is that there are still filters between the source repositories of collected data (which could be an airline's manifest, a booking agent's order list, or an aggregator like TIDE) and the databases operated on by analysts. Law, I imagine, plays a role in keeping those walls up; I leave that to someone with the appropriate background.

    Another wall might simply be competing data structures. The vBulletin software driving this forum has a database schema that is, for all intents and purposes, fixed during operation. I can input no more data than a relation specifies and in most cases no less than the constraints allow. The only way to change that is to change the underlying structure, which anyone who's ever even played with SQL should understand is a dicey, manual process that should never be taken lightly or without adequate testing beforehand.
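    To see how rigid that is in practice, a small sketch against SQLite (any relational engine behaves the same way): the relation rejects data it was never designed for, and widening it is an explicit migration step, not something the running application does on the fly.

    Code:
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

    # Inserting a column the relation doesn't specify fails outright:
    try:
        conn.execute("INSERT INTO posts (id, body, mood) VALUES (1, 'hi', 'wry')")
    except sqlite3.OperationalError as err:
        print("rejected:", err)  # table posts has no column named mood

    # Changing the structure is a separate, deliberate migration:
    conn.execute("ALTER TABLE posts ADD COLUMN mood TEXT")
    conn.execute("INSERT INTO posts (id, body, mood) VALUES (1, 'hi', 'wry')")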

    He advocates that "data finds data":
    That's nice, but he has a lot of technical obstacles to overcome first. The walls I listed above are not insurmountable, but they are difficult. The legal issues have to be resolved by legislation or jurisprudence. The variety of data schemas out there is tremendous. And finally, the real world's databases are not self-evolving (yet), and wishing for mature enough technology is not going to change that fact. Research may change it in the future, but for the time being human beings are going to be the principal glue that moves information from one large database to another.

    Jeff raises difficult issues for non-IT outsiders to think about - as we should be the ones setting the requirements for IT help - and I will add subsequent parts as they appear.
    The non-IT folk--the stakeholders--have laid out clear requirements, in public and on multiple occasions. I hate to see it when folks in the computer sciences hide behind so-called ambiguities in the requirements to mask the fact that a problem may be intractable at this time. This usually happens because, well in advance of release planning, a ton of promises were made about what a technology could do without any thought about what it couldn't.
    Last edited by davidbfpo; 01-19-2010 at 08:55 AM. Reason: Fix 1st quote
    PH Cannady
    Correlate Systems

  7. #7
    Council Member
    Join Date
    Nov 2007
    Location
    Boston, MA
    Posts
    310

    Default

    Oh, and for Chrissakes, why is it that every social networking tool of the last decade has been promoted as the next must-have thing for collaboration? I find it particularly disturbing whenever I hear someone say that the IC can benefit from something like Twitter, Facebook or wikis. It's rare to see such a claim accompanied by an explanation of what these tools bring to the table, and you'll never see any analysis of the pitfalls. Take the Twitter and Facebook models, for example. Are we going to rank the relevance of take based on the popularity of the source? That's what a friend-or-follower model entails. A wiki is a bit more defensible, but no more so than any other content repository with versioning and open access to anyone--a wiki is no more innovative than, say, git or svn or Alfresco.
    PH Cannady
    Correlate Systems

  8. #8
    Council Member J Wolfsberger's Avatar
    Join Date
    Jan 2007
    Location
    Michigan
    Posts
    806

    Default

    All of which points to a fundamental truth that was hammered into me years ago, and too many seem to have forgotten or never learned: Analysis is an activity that takes place in a human mind.

    There seem to be three problems involved:

    1. Information is not winding up in the right place.
    2. The eyes looking at the information are not making the right connections.
    3. All of the proactive effort I'm hearing about is focused on incidents, not the organization taking the initiative in creating the incidents.

    (Might expand on this later.)
    John Wolfsberger, Jr.

    An unruffled person with some useful skills.

  9. #9
    Council Member davidbfpo's Avatar
    Join Date
    Mar 2006
    Location
    UK
    Posts
    13,366

    Default Jeff Jonas Part 2

    Part 2: http://jeffjonas.typepad.com/jeff_jo...wish-list.html

    Those who understand IT issues will, I hope, be able to follow his views.
    davidbfpo

  10. #10
    Registered User
    Join Date
    Jan 2010
    Posts
    1

    Default

    Utilizing a FileMaker database with visual selection boxes for certain categories - "warning", "denied visa" and the like - creates a searchable record and also allows for specific commentary, key-fact listings, and personal information that can be tied to the specific person interviewing or entering data at a specific time. In combination with a paste function and the ability for tiered users to create records in the database, this allows for a searchable total database record that would be a start toward integration between analysts in an intelligence environment and entry-level FSOs in countries deemed a "terror risk."

    FileMaker is a good program for creating databases that can be linked across networks. There could be a broadband or satellite connection between the country desk in the given intelligence agency and the embassies. An FSO, when interviewing someone for a visa to the United States, would be able to create a record for that person in that country that would be simultaneously searchable by analysts. Creating the databases is easy enough; the problem is linking them between the appropriate agencies and ensuring network security.

  11. #11
    Council Member
    Join Date
    Nov 2007
    Location
    Boston, MA
    Posts
    310

    Default

    Quote Originally Posted by HowardHart View Post
    FileMaker is a good program for creating databases that can be linked across networks.
    If by linked across networks, you mean sharing DBFs, then we're really not adding much value beyond emailing Excel spreadsheets. You need a real database engine driving a real application. There is no off-the-shelf solution to this problem; it's too domain specific. You're going to have to glue components together no matter how you slice it.
    PH Cannady
    Correlate Systems

  12. #12
    Council Member davidbfpo's Avatar
    Join Date
    Mar 2006
    Location
    UK
    Posts
    13,366

    Default Jeff Jonas Part 3

    A rather shorter comment on Deadly Transparency: http://jeffjonas.typepad.com/jeff_jo...nsparency.html

    This is the bulk of his comment:
    Press accounts like this make me want to throw up.
    Excerpt: “About four months before the attempted bombing on December 25, the NSA intercepted telephone conversations in which the leaders of al Qaeda in the Arabian Peninsula talked about the possibility of using an unidentified "Nigerian" bomber in an attack, according to intelligence officials.”

    Every al Qaeda operative (big fish or little fish) in the Arabian Peninsula that said anything on the phone about this attack is going to think they were overheard. As such, they are ALL going to be more careful next time.
    Consequences: Bad guys improve their tradecraft. US intelligence loses signal – going deaf is a bad thing. Net net, the likelihood of bad things happening in future, without detection, increases.
    Such transparency is not unique to the USA and has happened throughout history; it is just quicker today, IMHO.
    davidbfpo

  13. #13
    i pwnd ur ooda loop selil's Avatar
    Join Date
    Sep 2006
    Location
    Belly of the beast
    Posts
    2,112

    Default

    I don't know much about information technology but I'm game.

    Let's see. Large-scale data warehousing has been around since the early 1960s. Some of the original data warehouses were built to handle fun things like payroll and finance. So we've got a mature set of technologies. SAP and other companies have been building enterprise-level data-warehouse systems, with multiple levels of access across physical and logical locations, for quite a while. Anybody who has tried to roll out PeopleSoft or similar can tell you this is not an easy task.

    The issue is not in the technology of delivery. The information technology exists to deliver the correct information to the correct individuals for analysis and/or action. Visual systems exist that do this within life-critical systems currently. The Federal Aviation Administration (FAA) handles thousands of targets that have to be analyzed constantly with a good amount of accuracy. This is done over a large area with multiple sensors, tied into a backbone utilizing rule sets for delivery. Not close enough to the target use case? Mail systems like Hotmail, Gmail, and others do the same basic task, but like the FAA example they are limited and the rule sets are finite.

    The issue is the rule sets.

    Can you succinctly describe terrorist behavior?

    terrorist == male, adult, radicalized, weaponized

    But wait: if terrorist absolutely equals a male adult who is radicalized and has weapons, what do we call a United States Marine?

    So that suggests you need analysis that is fuzzy, and each of the attributes needs to be defined. So we end up with (simplistic examples):

    terrorist.male = anti-american, violence prone, opportunist
    terrorist.adult = old enough to push a button, decision aware, etc.
    terrorist.radicalized = volunteered, chooses violence, ???
    terrorist.weaponized = has opportunity, can develop or deploy weapons

    What happens next in the analysis phase (oh wait, you mean you intelligence guys don't do it this way?) is that the COMPUTER needs rules. Computers are not very smart; in fact, they only do really stupid things really, really fast. So if you tell one to do the wrong thing, it will do the wrong thing fast.

    What you want is the edge cases. For each attribute (e.g. terrorist.male) you are not looking to just get the male, but the "edge" cases: those males that have all the other factors. But wait, you say, females and children are terrorists too! Well, of course. That is one of the rubs: you have to identify ALL terrorists. Take all of the known cases of terrorism and make similar rule sets. Somebody is reported, you have terrorist.female.maoist.senior_citizen(reported), and that is the case that comes up before some human or stops them from getting on an aircraft. Though I wish they'd just not get a ticket. Oh, and if you work back through the logic, you now know why no-fly lists are stupid and don't work.

    I'm sure somebody has done this somewhere already. The above is a very simplistic semantic/ontological search-and-filter sequence based on object-oriented techniques. To those of you with computer science degrees: I KNOW I BROKE THE RULES. To everybody else, I hope it made sense.
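    Something like this toy Python sketch, where every attribute, weight, and threshold is invented and only the shape of the logic matters:

    Code:
    # Score each attribute fuzzily (0.0-1.0, not booleans), sum with
    # weights, and route the result: clear cases are decided by the
    # machine, edge cases go to a human.
    WEIGHTS = {"radicalized": 0.4, "weaponized": 0.4, "reported": 0.2}

    def score(subject):
        return sum(WEIGHTS[k] * subject.get(k, 0.0) for k in WEIGHTS)

    def triage(subject, include=0.8, exclude=0.5):
        s = score(subject)
        if s >= include:
            return "flag for human review"
        if s <= exclude:
            return "exclude"
        return "edge case: gather more data"

    marine  = {"radicalized": 0.0, "weaponized": 1.0, "reported": 0.0}
    suspect = {"radicalized": 0.9, "weaponized": 0.7, "reported": 1.0}
    print(triage(marine))   # exclude: armed, but not radicalized or reported
    print(triage(suspect))  # flag for human review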

    Then again I don't know much about this stuff so...
    Sam Liles
    Selil Blog
    Don't forget to duck Secret Squirrel
    The scholarship of teaching and learning results in equal hatred from latte leftists and cappuccino conservatives.
    All opinions are mine and may or may not reflect those of my employer depending on the chance it might affect funding, politics, or the setting of the sun. As such these are my opinions you can get your own.

  14. #14
    i pwnd ur ooda loop selil's Avatar
    Join Date
    Sep 2006
    Location
    Belly of the beast
    Posts
    2,112

    Default

    So another point to this ....

    If terrorist = {Bob, Ted, Joe, Stan}, we will then assume "Bob Johnson", "Bob Smith", etc. are terrorists. Same for Ted, Joe, Stan.... Filtering at the "Bob Johnson" level doesn't fix the issue, because there are N "Bob Johnson"s in the world, where N = a really big number. So if you now spend your time vetting every "Bob Johnson", you are not looking for the one "Bob Johnson" who really is a terrorist.

    So you want to look for the specific terrorist; the edge cases are especially important. You know that if "John Doe" is a wanted terrorist and he is described as (5'10", 185 lbs, dark hair, living in Italy), then the (6', 300 lb, blond) man is likely not the guy in front of you. In fact you can coordinate with other elements and build a matrix that excludes, beyond name alone, many of the people with even close attributes.

    So we want the edge cases: those close to the boundary between exclude and include in our target identification. Once we have those attributes, we can use all that stupid computing power to quickly iterate through records. If we hit a case of include (as a threat), we flag it and have a human look at it. Most of this kind of data is already available, but we have a tendency not to trust it. No matter what my driver's license says, I'm sadly not 200 lbs. People lie. A bunch.
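    A quick sketch of that exclusion pass in the same toy Python, with made-up records and deliberately generous tolerances:

    Code:
    # Made-up records and tolerances. A name alone matches N people;
    # cheap physical attributes exclude nearly all of them first.
    wanted = {"name": "John Doe", "height_in": 70, "weight_lb": 185}

    candidates = [
        {"name": "John Doe", "height_in": 72, "weight_lb": 300},  # the 6', 300 lb guy
        {"name": "Jon Doe",  "height_in": 69, "weight_lb": 190},  # plausible
    ]

    def plausible(cand, target, h_tol=2, w_tol=25):
        # People lie (a bunch), so tolerances are generous -- but the
        # 6', 300 lb man is still excluded for a 5'10", 185 lb target.
        return (abs(cand["height_in"] - target["height_in"]) <= h_tol
                and abs(cand["weight_lb"] - target["weight_lb"]) <= w_tol)

    shortlist = [c for c in candidates if plausible(c, wanted)]
    print(shortlist)  # only the second record survives for human review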

    So we can look for secondary and tertiary mechanisms in the available data to corroborate outside the control of the individual. Facial recognition and other biometrics are only as good as the databases they are found in. Numbers and values in databases are also of variable veracity. Using our semantic construct, though, that data can be softened and, again, the edge and outlier cases identified.

    Now, don't be fooled: we're talking a lot of processing power, though not nearly as much as one Google search for the 1967 World Series MVP's batting average in his high school junior year. Once you know the right terms (iterative search), getting the answer is either impossible (edge case), or you don't get the answer (include), or you do get the answer (include/exclude). That is the principle of fuzzy logic that gets you to the least amount of human interaction required. It also stops the terrorist before they get to the airport.

    I guess I need more coffee.
    Sam Liles
    Selil Blog
    Don't forget to duck Secret Squirrel
    The scholarship of teaching and learning results in equal hatred from latte leftists and cappuccino conservatives.
    All opinions are mine and may or may not reflect those of my employer depending on the chance it might affect funding, politics, or the setting of the sun. As such these are my opinions you can get your own.

  15. #15
    Council Member
    Join Date
    Nov 2007
    Location
    Boston, MA
    Posts
    310

    Default

    Quote Originally Posted by selil View Post
    I don't know much about information technology but I'm game.

    Let's see. Large-scale data warehousing has been around since the early 1960s. Some of the original data warehouses were built to handle fun things like payroll and finance. So we've got a mature set of technologies. SAP and other companies have been building enterprise-level data-warehouse systems, with multiple levels of access across physical and logical locations, for quite a while. Anybody who has tried to roll out PeopleSoft or similar can tell you this is not an easy task.
    Setting aside the time granularity of the data, the problem's scale has increased by at least nine orders of magnitude (megabyte to petabyte) over the sum total of hard and electronic storage in that era. Even then, you're only talking about snapshots at several-month to several-year intervals per target. Also, you'd probably like a taxonomy and tools to crunch this data in a reasonable period of time; even the architectures of the early 2000s are woefully inadequate. Anyone want to guess how relevant technology from forty years earlier is?

    From an enterprise point of view, things that worked well twenty, thirty or forty years ago form a solid foundation for evolving technology. That makes sense. Businesses aren't employing orders of magnitude more people than they were before. Accounts payable still goes out mostly per diem, weekly, bi-monthly, or monthly, and an hour is still an hour. That whole area is largely concerned with tweaking around the edges.

    Business analysis--which we seriously need to think of separately from the rest of the enterprise--is a whole other animal. It concerns itself not with data pertaining to the operation of the business, but with the far larger, far finer-grained set necessary to answer arbitrary questions in an arbitrarily short amount of time.

    The issue is not in the technology of delivery. The information technology exists to deliver the correct information to the correct individuals for analysis and/or action.
    Information technology is no Oracle (no pun intended). It's perennially immature hardware and software that works precisely as badly as it's implemented. We've got maybe twenty years' experience with the sort of data centers we need for this kind of work, and less than ten in learning how to federate them properly. In fact, you could say we still don't know how to do it, because hardware and software are still catching up. We can't simply hand-wave in the technology if it doesn't exist or perform as advertised.

    Visual systems exist that do this within life-critical systems currently. The Federal Aviation Administration (FAA) handles thousands of targets that have to be analyzed constantly with a good amount of accuracy.
    That speaks more to the power of billions of dollars in cost overruns thrown at making software, hardware, and people multiply redundant than to the technology itself. On paper, the problem isn't terribly difficult, but the FAA's trials and tribulations with implementing good tracking systems are well documented. You can taste the disappointment if you think about how few life-critical systems--expensive as they are--come even close to approaching the scale of the air traffic control challenge. Launching the Space Shuttle twice a year costs about as much as the FAA's annual operations budget.

    This is done over a large area with multiple sensors, tied into a backbone utilizing rule sets for delivery. Not close enough to the target use case? Mail systems like Hotmail, Gmail, and others do the same basic task, but like the FAA example they are limited and the rule sets are finite.
    Federating data is not easy when it is not designed for from the start. It took ten years to get to the point where you could manage the logins for several webmail services seamlessly, and even today only the major players have gone ahead and done so. Only recently are we starting to see consumer tools for going the next step and providing people with a common inbox for all their accounts. Once again, the problem isn't hard on paper (or even in prototyping). It gets real hard when you start thinking about how many people are going to use the tool, how many cycles and how much storage you will need to support the demand, how much pipe you will need to move data from God knows how many places to God knows how many more, etc. We're starting to work through the problem, but that doesn't mean we're there yet. Amazon S3--as dirt-cheap a way as you can get to never having to worry about backups again--has been around for a couple of years now. Is your employer using it?

    Federating data in business analysis, intelligence, or the like by definition precludes a priori design. Doing it in real time...? It's a hard problem. I don't think it's intractable, but at some point IT needs to come clean and let the business know exactly where we are.
    Last edited by Presley Cannady; 02-15-2010 at 06:24 AM.
    PH Cannady
    Correlate Systems

  16. #16
    i pwnd ur ooda loop selil's Avatar
    Join Date
    Sep 2006
    Location
    Belly of the beast
    Posts
    2,112

    Default

    Gotta disagree. Since Knuth wrote his book, the data structures and algorithms for moving data have changed little. Moore's law is about transistors, not simply speed. It is easy to get caught up in vendor hype and techno-creep, but the reality is that the fundamental rules and patterns of information technology are older than the computer revolution. Technology is not some special animal; it is not above repeating itself again and again in new and interesting ways (according to vendors).

    Grid, cluster, cloud and many other technologies are patterns of delivery/processing that have existed for quite some time. Community Memory in the 1970s has a direct connection to the Sun Ray appliances, and that has a direct connection to the thinking behind cloud computing. And if you believe Amazon S3 means you don't need backups, you never heard of the Microsoft/Danger Sidekick debacle and various other "cloud" data losses.

    Oh, and as to processing power: the schemas and methods of high-speed processing are just one Visa terminal away, one check-cashing location, or any other place where identity is assured. Google does how many billion transactions? To say it is difficult is a red herring.

    Yes, federation is difficult if you don't build it in or don't build with open standards. LDAP and an entire set of standards do that for you, but that isn't what the original question was about. How do you create technology that will solve a specific problem, or processes that technology can support to solve that problem?

    So, what exactly is it you're disagreeing with? That the problem is solvable? Or do you have some spectacular new way you are going to suggest?
    Sam Liles
    Selil Blog
    Don't forget to duck Secret Squirrel
    The scholarship of teaching and learning results in equal hatred from latte leftists and cappuccino conservatives.
    All opinions are mine and may or may not reflect those of my employer depending on the chance it might affect funding, politics, or the setting of the sun. As such these are my opinions you can get your own.

  17. #17
    Council Member
    Join Date
    Nov 2007
    Location
    Boston, MA
    Posts
    310

    Default

    Quote Originally Posted by selil View Post
    Gotta disagree. Since Knuth wrote his book, the data structures and algorithms for moving data have changed little. Moore's law is about transistors, not simply speed. It is easy to get caught up in vendor hype and techno-creep, but the reality is that the fundamental rules and patterns of information technology are older than the computer revolution.
    We started writing down the current fundamental laws of semiclassical and quantum mechanics over a century ago, and you can express them on a single sheet of paper (to be precise, two equations and a dozen or so inequalities setting reasonable conditions). The devil emerges when you apply these laws to make actual predictions; work, discovery, and engineering yield novel and useful theory that is nothing more than increasingly large permutations of the fundamental set. We don't even have to resort to analogy to see the parallel in computer science and information technology: the latter is a superset of the former, and the former derives from the electrical engineer's application of physics to semiconductors.

    Sure, when you get down to it, data structures are essentially scalars and the connections between them, and arrays, various lists, even more numerous hashes, and the fundamental operations on them are ubiquitous. But those data structures combine to form new ones with unexpected behaviors. A tree doesn't act like a list, and an acyclic graph doesn't act like a tree. Object is a term that defies any meaningful mathematical classification, and now you have data structures that require some linguistics to understand fully.

    Moore's law may help us understand the physical limits to semiconductor utility, but it offers no insight on the direction and feasibility of solutions to tackle the problem at hand.

    Technology is not some special animal; it is not above repeating itself again and again in new and interesting ways (according to vendors).
    So you could say that the fundamental rules and patterns of information technology are wholly contained in Einstein's Field Equations and the Standard Model. Of course, that isn't a really helpful observation for the poor developers and engineers tasked with building your new total awareness solution.

    Grid, cluster, cloud and many other technologies are patterns of delivery/processing that have existed for quite some time. Community Memory in the 1970s has a direct connection to the Sun Ray appliances, and that has a direct connection to the thinking behind cloud computing. And if you believe Amazon S3 means you don't need backups, you never heard of the Microsoft/Danger Sidekick debacle and various other "cloud" data losses.
    Networking and queuing have been around for a long time, but it's only been in the last two decades that cost has dropped and speed has improved enough to permit the migration of processes across CPUs for larger enterprises, and only a decade or so since further cost reduction put the capability within reach of small firms and individuals. We're talking about going from a user population of maybe a couple hundred to billions in twenty years. That sort of drastic change in scale didn't come about merely as a result of piecing together core memory, PALs and teletype lines. Does today's hardware and software reflect its heritage? You betcha. But knowing how to build a timesharing system in 1970 doesn't spare you the years of discovery and practice behind how engineers build grids today.

    And there's a simple solution to beating risk down to near zero when backing up to S3: back up to two or more regions. If you run into a problem then, it's most likely on your end of the transaction, or else you and the entire world are facing a much more serious problem than data loss.
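    For what it's worth, a sketch of the two-region idea (the bucket names are placeholders, and it assumes the buckets already exist and boto3 credentials are configured):

    Code:
    import boto3

    # Placeholder bucket names, one per region.
    BUCKETS = {"us-east-1": "example-backup-use1", "eu-west-1": "example-backup-euw1"}

    def backup(key, data):
        # Write the same object to two regions; losing both at once means
        # you have far bigger problems than data loss.
        for region, bucket in BUCKETS.items():
            s3 = boto3.client("s3", region_name=region)
            s3.put_object(Bucket=bucket, Key=key, Body=data)

    backup("reports/nightly.txt", b"nightly snapshot")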

    Oh, and as to processing power: the schemas and methods of high-speed processing are just one Visa terminal away, one check-cashing location, or any other place where identity is assured. Google does how many billion transactions? To say it is difficult is a red herring.
    Accumulating processing power or storage isn't difficult. Setting it up to handle what looks like any arbitrary workflow is. Google's entire grid principally supports its search tool, which acquires information through a very simple set of rules and a small number of well-defined protocols. That doesn't relieve you of the problem of actually designing those tools in the first place, or of rapidly supporting protocols we may not even know exist. How do you source from an Excel spreadsheet, a scribble on the back of a receipt, an Atom feed, a paper-archived pre-print, and a blood sample in a lab in a way that doesn't involve hiring manpower to actually build the adapter? We're not at Skynet yet.

    Yes, federation is difficult if you don't build it in or don't build with open standards. LDAP and an entire set of standards do that for you, but that isn't what the original question was about. How do you create technology that will solve a specific problem, or processes that technology can support to solve that problem?
    That's the problem right there. A standard is only as good as its adoption, and in this particular case federation is extremely difficult because no one's figured out a way to get our own people, let alone your adversary's, to slavishly adapt their (for lack of a better word) enterprise to technologies you can conveniently exploit. You can toss the credit card and trade in laundered or counterfeit cash, valuables, or in kind.

    Even if the enemy standardized with you, those very standards include practices and tools that can still defeat your intelligence work flow. It's not hard at all to set up a chain of proxies and direct communications through them entirely through SSL--hell, there's software out there that'll do it for you automatically. So you evolve the standard in response, which brings you back to the adoption problem.

    So, what exactly is it you're disagreeing with? That the problem is solvable? Or do you have some spectacular new way you are going to suggest?
    I don't know. Folks seem to forget how heterogeneous the information environment is, especially when they remember that for most of recorded history people have tackled intelligence with little more than their wits and pen and paper. So when they look at computers and see that they can perform tasks millions of times faster than the fastest human calculator, they think "hey, we can do millions of times the work and that should cover it." They forget that the resolution of answers to the questions analysts routinely ask to connect the dots is very poor; I pointed out earlier that as a function of time, most of your product operates on scales covering months and years, not hours and days. Even then the gaps, inconsistencies and unconnected duplicates are glaring, as anyone in the electronic medical records field will tell you.

    What I do know is that engineers and developers need to be honest with the stakeholders, especially in a field this important. Jeff Jonas blogging about how federation will solve our problems glosses over the reality that we're still learning the ropes. It's akin to Larry Ellison noting that "cloud" is just a buzzier way to say parallel processing and redundant storage--it's an obvious point to make which doesn't begin to capture both the pain and fruits of expanding that technology to the scale envisioned.
    PH Cannady
    Correlate Systems

  18. #18
    Council Member
    Join Date
    Nov 2007
    Location
    Boston, MA
    Posts
    310

    Default

    Interesting little aside, since clouds were mentioned. I think this highlights the slow walk of IT towards even a clearly stated solution with a supposedly manageable scope. The problem here is definitely a worthy one, especially in light of news reports detailing Amazon Web Services outages. A single IaaS/PaaS provider can still represent a single point of failure. Can we federate these providers in a way that permits customers to migrate their processes and data among multiple cloud providers?

    To that end, the Cloud Computing Interoperability Forum is at least a year old. They apparently made a big splash, considering the org's list of sponsors. They're targeting the major IaaS and PaaS providers, which probably number around 20 worldwide, of which probably fewer than ten really matter. The cheerleaders from Enomaly envision a future where such providers compete over your bytes and uptime cycles. They also want to go about this in a very open way, even hosting the API portion of the project on Google Code.

    A year later, here's where the actual development effort stands.
    Last edited by Presley Cannady; 02-16-2010 at 08:04 AM.
    PH Cannady
    Correlate Systems

  19. #19
    Council Member Jason Port's Avatar
    Join Date
    Aug 2008
    Location
    Cary, NC
    Posts
    26

    Default I just finally threw the clot. . .

    The reality is that we do not have a technical problem. While I won't pretend to understand the advanced arithmetic above, I suspect that a mathematical solution to predicting behavior is not so far from reality: look at a person, observe their behavior, and apply those behaviors against a model of a "terrorist" (no, Liles, this is not a boolean - more of a sliding scale, where someone is more likely than someone else to be a baddie). In turn we can then focus our efforts on those people. Naturally, this type of system will seldom capture the angry guy who just goes off and drives his SUV through a university, or a nut case who is otherwise "ok" and shoots up Ft. Hood, but it _should_ provide us a list of people to observe more closely, so we can stop searching Cub Scouts.

    However, there is a policy issue and a people issue at hand here. Regardless of what we want to believe, our government doesn't like to share. This is often promoted by contractors who are protecting their own turf (Data makes you king, and sharing data is seen as weakening your realm). In turn, we find that various agencies can "collect" on someone, and failure to share is not met with a firing squad.

    Conversely, I posit that data entry is annoying at best, and hard at worst. Given human nature and *our* desire to find the most leisure whenever possible, people don't bother to collect the details. I flew through RDU this morning at 0600. The woman in front of me was meddling with her personal toiletries. The TSA rep told me I could jump into another line to bypass her. However, our practice should have been to report her by name for this behavior. Was it criminal? No. Suspicious? Not really. But when taken in conjunction with other behavior, it could show trends or patterns that might indicate negative or dangerous behaviors in the future. Sadly, my crack TSA agent instead made a smart-assed comment and I was on my way.

    The reality is that during a survey of any data store in the intelligence or C2 arena, we might be surprised at how many fields of data we ask for and how few are actually completed. It is really hard to do trending when 80% of your database is blank.

    So, at the end of the day, the reality is that the ideas you all are promoting are sound mathematically and technologically, and if used to highlight individuals, organizations or even regions, they can be effective in helping us plan. But until we are really serious (I mean firing some senior people in both government and industry), I will just continue fighting the good fight and hoping that we get lucky again.
    "New knowledge is the most valuable commodity on earth. The more truth we have to work with, the richer we become."

    - Kurt Vonnegut

  20. #20
    Council Member Jason Port's Avatar
    Join Date
    Aug 2008
    Location
    Cary, NC
    Posts
    26

    Default And the second clot - for Social Networking

    Until we solve all of the rest, the social media - Faceyspaces and TwitteryTweets - are all just a time suck. Trust me - I use all of them, and they all waste time. Trying to use these to paint a picture when we can't even mine structured, normalized data is simply a bridge too far.

    Last year I read about a use case for Twitter from the intel community: imagine two patrols tweeting about the same event (like an IED blast) from two vantage points, or all of the patrol members tweeting about the event. Now we have 24 reports (or whatever) about the event, and in turn our intel studs can form a complete picture based on the 24 stories of 140 characters each.

    Seriously? The market has been blown sky high, and I am jumping on my iPhone? I don't know how to text and return accurate fire at the same time. Moreover, how do we know whether it was one event or 24? Locations separated by time could create multiple events. Is that 1 or 24?
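    The one-event-or-24 question is at least computable. A toy sketch of one approach - collapse reports that are close in both time and space - where the report format, time window, and radius are all invented, and real reports would be far messier:

    Code:
    from math import hypot

    reports = [
        {"t_min": 0,  "x_km": 0.0, "y_km": 0.0},   # patrol A, blast site
        {"t_min": 2,  "x_km": 0.3, "y_km": 0.1},   # patrol B, other vantage point
        {"t_min": 90, "x_km": 6.0, "y_km": 4.0},   # later and elsewhere: new event
    ]

    def cluster(reports, window_min=15, radius_km=1.0):
        # Greedy pass: attach each report to the first event whose first
        # report is within the time window and distance radius.
        events = []
        for r in sorted(reports, key=lambda r: r["t_min"]):
            for ev in events:
                ref = ev[0]
                close_in_time = r["t_min"] - ref["t_min"] <= window_min
                close_in_space = hypot(r["x_km"] - ref["x_km"],
                                       r["y_km"] - ref["y_km"]) <= radius_km
                if close_in_time and close_in_space:
                    ev.append(r)
                    break
            else:
                events.append([r])
        return events

    print(len(cluster(reports)))  # 2 distinct events, not 3 (or 24) reports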

    Again, until we are mature enough to use the systems in play, let's keep reporting out of MilBook and Twitter. (OK - We can use Wikis - Intellipedia is supposed to be pretty hot - though it is just another island of information not accessible to the enterprise half the time)
    "New knowledge is the most valuable commodity on earth. The more truth we have to work with, the richer we become."

    - Kurt Vonnegut
