PDA

View Full Version : Intelligence failure: get the right IT system thinking



davidbfpo
01-16-2010, 06:11 PM
I've looked through the Intelligence thread and cannot immediately find an appropriate thread for this.

Bear with me, it could fit in the Detroit bombing thread: http://council.smallwarsjournal.com/showthread.php?t=9331 and FBI investigations: http://council.smallwarsjournal.com/showthread.php?t=8828 - both are useful cross references, hence the links.

Robert Haddick today has written 'Computers must take over counter-terrorism analysis', which at first I thought was another "IT can fix it" piece; see pg. 2 of this article: http://www.foreignpolicy.com/articles/2010/01/15/this_week_at_war_google_has_more_guts_than_the_us_government?page=0,1

Then I recalled that Jeff Jonas, an IT expert at IBM who has thought hard about the issues around data, is well worth reading. His blog is: http://jeffjonas.typepad.com/ and, just to illustrate, try his post-9/11 presentation on the hijackers' associations: http://jeffjonas.typepad.com/SRD-911-connections.pdf

After a long absence he has now commented on what he calls 'The Christmas Day Intelligence Failure'; note this is Part One: http://jeffjonas.typepad.com/jeff_jonas/2010/01/the-christmas-day-intelligence-failure-part-i-enterprise-amnesia-vs-enterprise-intelligence.html

He advocates that "data finds data":
The December 25th event is a classic case of enterprise amnesia. Enterprise Amnesia is the condition of knowing something on one hand and knowing something on another hand and never the two data points meet....

Abdulmutallab applies for a multi-entry visa. The terrorist database (TIDE) is checked and found to contain no such record. The State Department issues a visa. Later, a TIDE record for Abdulmutallab is added. The split-second this record is added to TIDE, the State Department is notified that the visa may need to be reconsidered.

Devil in the details. For all this to work, the system needs to realize that despite name variations and inconsistent data, the identity in the terrorist database is the identity in the visa system...
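Jonas's "devil in the details" -- recognizing that two differently-spelled records describe the same identity -- is the entity-resolution step. A minimal sketch in Python of what "data finds data" across inconsistent spellings might look like (the names, threshold, and normalization scheme are all illustrative; this is not how TIDE or any real system works):

```python
import difflib
import re

def normalize(name):
    """Crude canonical form: lowercase, drop punctuation, sort tokens so
    'Abdulmutallab, Umar Farouk' and 'Umar Farouk Abdulmutallab' align."""
    tokens = re.findall(r"[a-z]+", name.lower())
    return " ".join(sorted(tokens))

def name_similarity(a, b):
    """Similarity of the normalized forms, from 0.0 to 1.0."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()

watchlist = ["Abdulmutallab, Umar Farouk"]   # hypothetical watchlist-style entry
applicant = "Umar Farouk Abdulmutallab"      # visa-system spelling and ordering

# 'Data finds data': the visa record is matched against the watchlist
# despite inconsistent punctuation and token order.
matches = [w for w in watchlist if name_similarity(applicant, w) > 0.8]
```

Real resolution is much harder than this: a split-surname variant like "Abdul Mutallab" already drops the score below the threshold, which is why serious systems layer phonetic encodings, transliteration rules, and corroborating attributes (date of birth, passport number) on top of raw string similarity.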

Jeff raises difficult issues for non-IT outsiders to think about - as we should be the ones setting the requirements for IT help - and I will add subsequent parts as they appear.

He is a very entertaining speaker on these issues.

davidbfpo
01-16-2010, 10:45 PM
Another respected commentator on security issues, Bruce Schneier, adds this comment: http://www.schneier.com/blog/archives/2010/01/fixing_intellig.html


We don't need new technologies, new laws, new bureaucratic overlords, or -- for heaven's sake -- new agencies. What prevents information sharing among intelligence organizations is the culture of the generation that built those organizations....sharing is far more important than secrecy. Our intelligence organizations need to trade techniques and expertise with industry, and they need to share information among the different parts of themselves....We need the bottom-up organization that has made the Internet the greatest collection of human knowledge and ideas ever assembled. The problem is far more social than technological.

Bill Moore
01-17-2010, 03:45 AM
Critics have pointed to laws that prohibited inter-agency sharing but, as the 9/11 Commission found, the law allows for far more sharing than goes on. It doesn't happen because of inter-agency rivalries, a reliance on outdated information systems, and a culture of secrecy. What we need is an intelligence community that shares ideas and hunches and facts on their versions of Facebook, Twitter and wikis. We need the bottom-up organization that has made the Internet the greatest collection of human knowledge and ideas ever assembled.

I agree with his points about culture change, but to deny the benefits of "value-added" technology is to be overly dismissive of a needed capability. He even states in the paragraph quoted above (emphasis is mine) that we're working with outdated information systems. Many government agencies have their own Facebook now and can share information with contacts in other agencies, but that hardly allows one to fuse all the available data and then connect the dots in a way that tells a story.

The real challenge isn't sharing the information (we're much better at that than he gives the community credit for, though there is still room for improvement); the bigger challenge is making sense of the volume of information. We desperately need information technology that helps analysts sort through volumes of data, connect the dots (analytical support), and display the result in a meaningful way.

The culture that needs to change quickest is each government agency, department, and military service storing its data in databases that are not accessible to the wider community of interest. Too much data resides in data banks that cannot be shared outside their individual systems. The result is intelligence failures, because the data was not available to the analyst who had a hunch. If he or she had all the data available, the right analytical tools to quickly pull and sort the relevant data, and a way to display it so that it tells a story (visualizing the data through link analysis and temporal analysis), then we would have made a change that actually makes our intelligence and law enforcement communities more effective. Facebook and Twitter are only baby steps; they are far from revolutionary enough to truly move us into the information age.
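The "connect the dots in a way that tells a story" capability described above is, at bottom, link analysis over a fused graph of records. A toy sketch (every record here is invented; real feeds would be manifests, intercepts, and visa files):

```python
from collections import defaultdict, deque

# Hypothetical records from different agencies, each linking two entities
# (person-to-phone, phone-to-person, person-to-flight, and so on).
reports = [
    ("suspect_A", "phone_555"),       # intercept record (invented)
    ("phone_555", "visa_applicant"),  # consular record (invented)
    ("visa_applicant", "flight_253"), # airline manifest (invented)
]

# Fuse the feeds into one undirected graph.
graph = defaultdict(set)
for a, b in reports:
    graph[a].add(b)
    graph[b].add(a)

def path(start, goal):
    """Breadth-first search: the 'story' connecting two dots, if one exists."""
    queue, seen = deque([[start]]), {start}
    while queue:
        p = queue.popleft()
        if p[-1] == goal:
            return p
        for n in graph[p[-1]] - seen:
            seen.add(n)
            queue.append(p + [n])
    return None
```

The point of the sketch: no single agency's feed contains the path from `suspect_A` to `flight_253`; it only exists once the feeds are fused, which is exactly why data locked in inaccessible databases produces failures.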

Another technology he may be bashing is technology to detect explosives and other potential weapons in airports. IMO it would be foolish not to invest in these technologies. Technology in many cases can do a better job at this and other tasks than humans, so why not use it? If it effectively reduces risk to a critical economic system (our air transportation system), why not invest in it? I'm sure if we did a cost comparison, considering all the ripple effects of one attack, we would find it a worthwhile investment.

davidbfpo
01-18-2010, 08:42 PM
Earlier today I read this NYT article on the radicalization route of the Detroit bomber: http://www.nytimes.com/2010/01/17/world/africa/17abdulmutallab.html?pagewanted=2 . I have seen similar accounts before and it is worth a read, although much of it, I fear, is news reporting and not careful, verified investigation.

Then Leah Farrell, an Australian CT analyst, adds her viewpoint - having cited the NYT article:
Still, while he was seen to be “reaching out” to known extremists and appearing on “the periphery of other investigations” into radical suspects there, he was not considered a terrorist threat himself, according to a British counter-intelligence official.

Leah adds:
Edge of network connections–again. Of course the problem is always resourcing. There is never enough time to track down everything. But still, it seems to me that we see this over and over and over again.

No answers are provided, just some pointers to her earlier thinking on the issue and to an IT "guru" who has attempted an answer.

I also wonder how many people have, for example, attended a meeting on a controversial topic, listened, and even spoken to a speaker who might be a 'known extremist'. Does that merit a CT record? In the Detroit incident, such an exchange of information between the UK and the USA apparently did not happen.

Stan
01-18-2010, 09:23 PM
I also wonder how many people have, for example, attended a meeting on a controversial topic, listened, and even spoken to a speaker who might be a 'known extremist'. Does that merit a CT record? In the Detroit incident, such an exchange of information between the UK and the USA apparently did not happen.

David,
It was just a few years ago, while attending MET training, that the instructors feared their POI had been copied and sent to addresses in the Middle East.

Of the countless so-called seminars and conferences held in Europe, I don't recall one instance where my credentials were checked or my name vetted. In fact, most of these folks are so hungry for participants that they dump their advertisements onto the internet. I'm a little tired of the spam every morning, but I wonder who their audience actually ends up being.


General so and so is your guest speaker with decades of experience fighting (insert key country or conflict here)

Presley Cannady
01-19-2010, 02:37 AM
Robert Haddick today has written 'Computers must take over counter-terrorism analysis', which at first I thought was another "IT can fix it" piece; see pg. 2 of this article: http://www.foreignpolicy.com/articles/2010/01/15/this_week_at_war_google_has_more_guts_than_the_us_government?page=0,1

Automation is already spreading into the counterterrorism field, but simply saying we need more of it isn't going to produce technology that doesn't exist yet. There is no single thing you can point to and call data-mining. It is a family of hundreds of loosely connected problem spaces and solutions orbiting storage and recall. If we have to resort to a very crude simile, you might liken the field today to the vast, also loosely connected realms of neuroscience, cognition, and linguistics. An even less compelling, but still useful, parallel might be to the study of episodic memory.


Then I recalled that Jeff Jonas, an IT expert at IBM who has thought hard about the issues around data, is well worth reading. His blog is: http://jeffjonas.typepad.com/ and, just to illustrate, try his post-9/11 presentation on the hijackers' associations: http://jeffjonas.typepad.com/SRD-911-connections.pdf

Ten bucks says Jeff Jonas hasn't done serious work in twenty years with any database model other than relational. That is to say, the trick to his analysis here is devising a system of relations--tables in a database--that admits the properties of events to be correlated in hindsight. Even if future terrorists were careful enough not to input identical or even similar contact information, the system would break down if the method of input (say, in this day and age, Travelocity vs. Expedia) changed. There is an easy enough way to fix this (and I hope they've done it), which brings us to this:


After a long absence he has now commented on what he calls 'The Christmas Day Intelligence Failure'; note this is Part One: http://jeffjonas.typepad.com/jeff_jonas/2010/01/the-christmas-day-intelligence-failure-part-i-enterprise-amnesia-vs-enterprise-intelligence.html

First off, the visa office wouldn't consult TIDE; they'd consult the TSDB--which is sourced from TIDE and is a non-classified subset of the information contained in the IC's database. The key problem, from news reports thus far, is that information in TIDE was not transferred to the TSDB. This is a classic failure in information sharing.
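The one-way flow Cannady describes can be sketched in a few lines: the screening database only ever sees a filtered, unclassified subset of the source repository, so a record that never clears the filter simply does not exist for the analysts downstream. The field names and threshold are invented for illustration; they do not describe the actual TIDE/TSDB nomination criteria.

```python
# Hypothetical source records (a TIDE-like aggregator).
tide = [
    {"name": "subject_1", "classified": True,  "derogatory_score": 9},
    {"name": "subject_2", "classified": False, "derogatory_score": 8},
    {"name": "subject_3", "classified": False, "derogatory_score": 2},
]

def export_to_tsdb(records, threshold=5):
    """Only unclassified records above an invented nomination threshold
    are shared with the downstream screening database."""
    return [r["name"] for r in records
            if not r["classified"] and r["derogatory_score"] >= threshold]

# subject_1 is well known upstream, yet is invisible to screening.
tsdb = export_to_tsdb(tide)
```

The failure mode is structural: nothing downstream can "connect the dots" with a dot the filter never passed along.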

The point is that there are still filters between the source repository of collected data (which could be an airline's manifest, a booking agent's order list, or an aggregator like TIDE) and the databases operated on by analysts. Law, I imagine, plays a role in keeping those walls up; I leave that to someone with the appropriate background.

Another wall might simply be competing data structures. The vBulletin software driving this forum has a database schema that is, for all intents and purposes, fixed during operation. I can input no more data than a relation specifies and in most cases no less than the constraints allow. The only way to change that is to change the underlying structure, which anyone who's ever even played with SQL should understand is a dicey, manual process that should never be undertaken lightly or without adequate testing beforehand.
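The "dicey, manual process" of changing a live schema shows up even in miniature. A sketch using SQLite (vBulletin actually runs on MySQL, but the issue is the same; the table and column names are invented):

```python
import sqlite3

# The table's shape is fixed at creation time.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE visa_applications (name TEXT NOT NULL)")
con.execute("INSERT INTO visa_applications VALUES ('applicant_1')")

# New requirement: record the booking channel (Travelocity vs. Expedia).
# Old rows predate the column, so it must be nullable or given a default --
# a design decision, not a detail, and every existing query and insert
# now has to be checked against the new shape.
con.execute("ALTER TABLE visa_applications ADD COLUMN booking_channel TEXT")
rows = con.execute(
    "SELECT name, booking_channel FROM visa_applications").fetchall()
```

Pre-migration rows come back with `NULL` in the new column: the history the analysts most care about is exactly the part the new schema knows nothing about.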


He advocates that "data finds data":

That's nice, but he has a lot of technical obstacles to overcome first. The walls I listed above are not insurmountable, but they are difficult. The legal issues have to be resolved by legislation or jurisprudence. The variety in data schemas out there is tremendous. And finally, the real world's databases are not self-evolving (yet), and wishing for mature enough technology is not going to change that fact. Research may change it in the future, but for the time being human beings are going to be the principal glue that moves information from one large database to another.


Jeff raises difficult issues for non-IT outsiders to think about - as we should be the ones setting the requirements for IT help - and I will add subsequent parts as they appear.

The non-IT folk--the stakeholders--have laid out clear requirements, in public and on multiple occasions. I hate to see folks in the computer sciences hide behind so-called ambiguities in the requirements to obscure the fact that a problem may be intractable at this time. This usually happens because, in advance of release planning, a ton of promises were made about what a technology could do without any sort of thought about what it couldn't.

Presley Cannady
01-19-2010, 02:46 AM
Oh, and for Chrissakes, why is it that every social networking tool in the last decade has been promoted as the next must-have thing for collaboration? I find it particularly disturbing whenever I hear someone say that the IC can benefit from something like Twitter, Facebook or wikis. It's rare to see such a claim accompanied by an explanation of what these tools bring to the table, and you'll never see any analysis of the pitfalls. Take the Twitter and Facebook models, for example. Are we going to rank the relevance of take based on the popularity of the source? That's what a friend-or-follower model entails. A wiki is a bit more defensible, but no more so than any other content repository with versioning and open access to anyone--a wiki is no more innovative than, say, git or svn or Alfresco.

J Wolfsberger
01-19-2010, 01:14 PM
All of which points to a fundamental truth that was hammered into me years ago, and too many seem to have forgotten or never learned: Analysis is an activity that takes place in a human mind.

There seem to be three problems involved:

1. Information is not winding up in the right place.
2. The eyes looking at the information are not making the right connections.
3. All of the proactive effort I'm hearing about is focused on incidents, not the organization taking the initiative in creating the incidents.

(Might expand on this later.)

davidbfpo
01-20-2010, 08:05 PM
Part 2: http://jeffjonas.typepad.com/jeff_jonas/2010/01/the-christmas-day-intelligence-failure-part-ii-jeff-jonas-christmas-wish-list.html

Those who understand IT issues will hopefully be able to follow his views.

HowardHart
01-26-2010, 04:27 AM
Utilizing a FileMaker database with visual selection boxes for certain categories - "warning", "denied visa" and the like - creates a searchable record. It also allows for specific commentary, key-fact listings, and personal information that can be tied to the specific person interviewing or entering data at a specific time. In combination with a paste user function and the ability for tiered users to create records in the database, this allows for a searchable total database record that would be a start toward integration between analysts in an intelligence environment and entry-level FSOs in countries deemed "terror risk."

FileMaker is a good program for creating databases that can be linked across networks. There could be a broadband or satellite connection between the country desk at the given intelligence agency and the embassies. An FSO interviewing someone for a visa to the United States would be able to create a record for that person that would be simultaneously searchable by analysts. Creating the databases is easy enough; the problem is linking them between the appropriate agencies and ensuring network security.
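The record HowardHart describes -- category checkboxes, free commentary, and an audit trail tying each entry to a specific officer at a specific time -- can be sketched generically. This is not FileMaker's data model; every field name here is invented to illustrate the shape of such a record:

```python
import datetime

# Invented category flags standing in for the visual selection boxes.
CATEGORIES = {"warning", "denied visa", "watch"}

def make_record(subject, categories, comment, officer):
    """One searchable interview record with a built-in audit trail."""
    assert categories <= CATEGORIES, "unknown category flag"
    return {
        "subject": subject,
        "categories": sorted(categories),
        "comment": comment,
        "entered_by": officer,  # ties the entry to a specific person...
        "entered_at": datetime.datetime.now(
            datetime.timezone.utc).isoformat(),  # ...at a specific time
    }

db = [make_record("applicant_1", {"denied visa"},
                  "inconsistent itinerary", "fso_22")]

# An analyst's search across all posts' records:
hits = [r for r in db if "denied visa" in r["categories"]]
```

The schema is the easy part, as the post says; the hard part is replicating `db` between agencies securely so the analyst's query sees the FSO's entry at all.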

Presley Cannady
01-26-2010, 06:52 AM
FileMaker is a good program for creating databases that can be linked across networks.

If by "linked across networks" you mean sharing DBFs, then we're really not adding much value beyond emailing Excel spreadsheets. You need a real database engine driving a real application. There is no off-the-shelf solution to this problem; it's too domain-specific. You're going to have to glue components together no matter how you slice it.

davidbfpo
02-14-2010, 02:23 PM
A rather shorter comment on Deadly Transparency: http://jeffjonas.typepad.com/jeff_jonas/2010/02/the-christmas-day-intelligence-failure-part-iii-of-iii-deadly-transparency.html

This is the bulk of his comment:
Press accounts like this make me want to throw up.
Excerpt: “About four months before the attempted bombing on December 25, the NSA intercepted telephone conversations in which the leaders of al Qaeda in the Arabian Peninsula talked about the possibility of using an unidentified "Nigerian" bomber in an attack, according to intelligence officials.”

Every al Qaeda operative (big fish or little fish) in the Arabian Peninsula that said anything on the phone about this attack is going to think they were overheard. As such, they are ALL going to be more careful next time.
Consequences: Bad guys improve their tradecraft. US intelligence loses signal – going deaf is a bad thing. Net net, the likelihood of bad things happening in future, without detection, increases.

Transparency is not an issue unique to the USA and has happened throughout history; it is just quicker today, IMHO.

selil
02-14-2010, 04:42 PM
I don't know much about information technology but I'm game.

Let's see. Large-scale data warehousing has been around since the early 1960s. Some of the original data warehouses were built to handle fun things like payroll and finance, so we've got a mature set of technologies. SAP and other companies have been building enterprise-level data-warehouse systems, with multiple levels of access spanning physical and logical locations, for quite a while. Anybody who has tried to roll out PeopleSoft or similar can tell you this is not an easy task. The issue is not in the technology of delivery. The information technology exists to deliver the correct information to the correct individuals for analysis and/or action. Visual systems exist that do this within life-critical systems currently. The Federal Aviation Administration handles thousands of targets that have to be analyzed constantly with a good degree of accuracy. This is done over a large area, with multiple sensors, tied into a backbone that uses rule sets for delivery. Not close enough to the target use? Mail systems like Hotmail, Gmail, and others do the same basic task, but like the FAA example they are limited and their rule sets are finite.

The issue is the rule sets.

Can you succinctly describe terrorist behavior?

terrorist == male, adult, radicalized, weaponized

but wait: if a terrorist absolutely equals a male adult who is radicalized and has weapons, what do we call a United States Marine?

So that suggests you need analysis that is fuzzy, and each of the attributes needs to be defined. So we end up with (simplistic examples):

terrorist.male = anti-american, violence prone, opportunist
terrorist.adult = old enough to push a button, decision aware, etc.
terrorist.radicalized = volunteered, chooses violence, ???
terrorist.weaponized = has opportunity, can develop or deploy weapons

What happens next in the analysis phase (oh wait, you mean you intelligence guys don't do it this way?) is that the COMPUTER needs rules. Computers are not very smart; in fact, they only do really stupid things really, really fast. So if you tell one to do the wrong thing, it will do it fast.

What you want is the edge cases. So for each attribute (e.g. terrorist.male) you are not looking to get just any male but the "edge" cases: those males that have all the other factors. But wait, you say, females and children are terrorists too! Well, of course. That is one of the rubs. You have to identify ALL terrorists. Take all of the known cases of terrorism and make similar rule sets. Somebody is reported, you have terrorist.female.maoist.senior_citizen(reported), and that is the case that comes up before some human or stops them from getting on an aircraft. Though I wish they'd just not get a ticket. Oh, and if you work back through the logic, you now know why no-fly lists are stupid and don't work.
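selil's sketch can be made runnable: instead of a hard boolean rule ("terrorist == male, adult, radicalized, weaponized"), each attribute contributes a weighted score, and anything near the decision boundary -- the "edge cases" -- is routed to a human. The weights, thresholds, and example records below are all invented for illustration:

```python
# Invented weights standing in for the fuzzy attribute definitions above.
WEIGHTS = {"radicalized": 0.4, "weaponized": 0.4, "adult": 0.1, "reported": 0.1}

def score(person):
    """Sum the weights of the attributes this person exhibits."""
    return sum(w for attr, w in WEIGHTS.items() if person.get(attr))

def triage(person, include=0.8, exclude=0.3):
    """Hard include/exclude at the extremes; humans review the middle."""
    s = score(person)
    if s >= include:
        return "include"   # flagged before ticketing
    if s <= exclude:
        return "exclude"
    return "review"        # the edge case: route to an analyst

# The Marine problem: adult + weaponized, but not radicalized or reported.
marine = {"adult": True, "weaponized": True}
# The reported-senior-citizen case from the post, rendered as attributes.
reported = {"adult": True, "radicalized": True, "reported": True}
```

Note how the hard-boolean failure disappears: the Marine no longer "equals" a terrorist; he lands in the review band rather than the include band, which is the whole argument for fuzzy scoring over crisp rules.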

I'm sure somebody has to have done this somewhere already. The above is a very simplistic semantic/ontological search-and-filter sequence based on object-oriented techniques. To those of you with computer science degrees: I KNOW I BROKE THE RULES. To everybody else, I hope it made sense.

Then again I don't know much about this stuff so...

selil
02-14-2010, 05:04 PM
So another point to this ....

If terrorist{Bob, Ted, Joe, Stan}, we will then assume "Bob Johnson", "Bob Smith", etc. are terrorists; same for Ted, Joe, and Stan. Filtering at the "Bob Johnson" level doesn't fix the issue, because there are N "Bob Johnson"s in the world, where N = a really big number. So if you now spend your time vetting every "Bob Johnson", you are not looking for the one "Bob Johnson" who really is a terrorist.

So you want to look for the specific terrorist. The edge cases are especially important. You know that if "John Doe" is a wanted terrorist described as (5'10", 185 lbs, dark hair, living in Italy), then the (6', 300 lb, blond) man is likely not the guy in front of you. In fact, you can coordinate between other elements and build a matrix that excludes, beyond the name, many of the people with even close attributes.

So we want the edge cases: those that are close to the boundary (exclude, include) in our target identification. Once we have those attributes, we can use all that stupid computing power to quickly iterate through records. If we hit a case of include (as a threat), then we do so and have a human look at it. Most of this kind of data is already available, but we have a tendency not to trust it. No matter what my driver's license says, I'm sadly not 200 lbs. People lie. A bunch.
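The exclusion matrix described above is cheap to run: a name match alone is weak (there are N "Bob Johnson"s), but physical descriptors rule most of them out before any human gets involved. All the figures below are invented, and the tolerances are deliberately loose because, as the post says, self-reported data lies:

```python
# The wanted "John Doe" from the example above (all values invented).
wanted = {"name": "John Doe", "height_in": 70, "weight_lb": 185, "hair": "dark"}

def plausible(candidate, target, height_tol=2, weight_tol=25):
    """Keep a name match only if the descriptors are close enough.
    Loose tolerances absorb lying and stale records."""
    return (candidate["name"] == target["name"]
            and abs(candidate["height_in"] - target["height_in"]) <= height_tol
            and abs(candidate["weight_lb"] - target["weight_lb"]) <= weight_tol
            and candidate["hair"] == target["hair"])

crowd = [
    # The 6', 300 lb, blond "John Doe": excluded without human effort.
    {"name": "John Doe", "height_in": 72, "weight_lb": 300, "hair": "blond"},
    # A close match: this is the edge case that goes to a human.
    {"name": "John Doe", "height_in": 71, "weight_lb": 190, "hair": "dark"},
]
shortlist = [c for c in crowd if plausible(c, wanted)]
```

The computer burns cycles shrinking N down to the handful of edge cases; the scarce resource, the analyst, only ever sees the shortlist.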

So, we can look for secondary and tertiary mechanisms of available data to corroborate outside the control of the individual. Facial recognition and other biometrics are only as good as the databases they are found in. Numbers and values in databases are also subject to variable veracity. Using our semantic construct, though, that data can be softened and, again, the edge and outlier cases identified.

Now don't be fooled: we're talking a lot of processing power. Not nearly as much as one Google search for the 1967 World Series MVP's batting average in his high school junior year, though. Once you know the right terms (iterative search), getting the answer is either impossible (edge case), or you don't get the answer (include), or you do get the answer (include/exclude). That is the principle of fuzzy logic that gets you to the least amount of human interaction required. It also stops the terrorist before they get to the airport.

I guess I need more coffee.

Presley Cannady
02-15-2010, 06:19 AM
I don't know much about information technology but I'm game.

Let's see. Large-scale data warehousing has been around since the early 1960s. Some of the original data warehouses were built to handle fun things like payroll and finance, so we've got a mature set of technologies. SAP and other companies have been building enterprise-level data-warehouse systems, with multiple levels of access spanning physical and logical locations, for quite a while. Anybody who has tried to roll out PeopleSoft or similar can tell you this is not an easy task.

Setting aside the time granularity of the data, the problem's scale has increased by at least nine orders of magnitude (megabytes to petabytes) over the sum total of hard-copy and electronic storage in that era. Even then, you're only talking about snapshots at several-month to several-year intervals per target. Also, you'd probably like a taxonomy and tools to crunch this data in a reasonable period of time; even the architectures of the early 2000s are woefully inadequate. Anyone want to take a guess how relevant technology from forty years earlier is?

From an enterprise point of view, things that worked well twenty, thirty or forty years ago form a solid foundation for evolving technology. That makes sense. Businesses aren't employing orders of magnitude more people than they were before. Accounts payable still goes out mostly per diem, weekly, bi-monthly, or monthly, and an hour is still an hour. That whole area is largely concerned with tweaking around the edges.

Business analysis--which we seriously need to think of separately from the rest of the enterprise--is a whole other animal. It concerns itself not with data pertaining to the operation of the business, but with the far larger, far finer-grained set necessary to answer arbitrary questions in an arbitrarily short amount of time.


The issue is not in the technology of delivery. The information technology exists to deliver the correct information to the correct individuals for analysis and/or action.

Information technology is no Oracle (no pun intended). It's perennially immature hardware and software that works precisely as badly as it's implemented. We've got maybe twenty years of experience with the sort of data centers we need for this kind of work, and less than ten in learning how to federate them properly. In fact, you could say we still don't know how to do it, because hardware and software are still catching up. We can't simply hand-wave in the technology if it doesn't exist or doesn't perform as advertised.


Visual systems exist that do this within life-critical systems currently. The Federal Aviation Administration handles thousands of targets that have to be analyzed constantly with a good degree of accuracy.

That speaks more to the power of billions of dollars in cost overruns thrown at making software, hardware, and people multiply redundant than to the technology itself. On paper the problem isn't terribly difficult, but the FAA's trials and tribulations in implementing good tracking systems are well documented. You can taste the disappointment when you consider how few life-critical systems--expensive as they are--come even close to approaching the scale of the air traffic control challenge. Launching the Space Shuttle twice a year costs about as much as the FAA's annual operations budget.


This is done over a large area, with multiple sensors, tied into a backbone that uses rule sets for delivery. Not close enough to the target use? Mail systems like Hotmail, Gmail, and others do the same basic task, but like the FAA example they are limited and their rule sets are finite.

Federating data is not easy when it is not designed for from the start. It took ten years to get to the point where you could manage logins for several webmail services seamlessly, and even today only the major players have gone ahead and done so. Only recently are we starting to see consumer tools for going the next step and providing people with a common inbox for all their accounts. Once again, the problem isn't hard on paper (or even in prototyping). It gets really hard when you start thinking about how many people are going to use the tool, how many cycles and how much storage you will need to support the demand, how much pipe you will need to move data from God knows how many places to God knows how many more, and so on. We're starting to work through the problem, but that doesn't mean we're there yet. Amazon S3--as dirt-cheap a way as you can get to never having to worry about backups again--has been around for a couple of years now. Is your employer using it?

Federating data in business analysis, intelligence, or the like by definition precludes a priori design. Doing it in realtime...? It's a hard problem. I don't think it's intractable, but at some point IT needs to come clean and let the business know exactly where we're at.

selil
02-15-2010, 07:56 AM
Gotta disagree. Since Knuth wrote his book, the data structures and algorithms for moving data have changed little. Gordon Moore's law is about transistors, not simply speed. It is easy to get caught up in vendor hype and techno-creep, but the reality is that the fundamental rules and patterns of information technology are older than the computer revolution. Technology is not some special animal; it isn't above repeating itself again and again in new and interesting ways (according to vendors).

Grid, cluster, cloud and many other technologies are patterns of delivery/processing that have existed for quite some time. Community Memory in the 1970s has a direct connection to the Sun SunRay appliances, and that has a direct connection to today's notion of cloud computing. And if you believe Amazon S3 means you don't need backups, you never heard of the Microsoft/Danger Sidekick debacle and various other "cloud" data losses.

Oh, and as to processing power: the schemas and methods of high-speed processing are just one Visa terminal away, one check-cashing location away, or any other place where identity is assured. Google does how many billion transactions? To say it is difficult is a red herring.

Yes, federation is difficult if you don't build it in or don't build with open standards. LDAP and an entire set of standards do that for you, but that isn't what the original question was about. How do you create technology that will solve a specific problem, or processes that technology can support to solve that problem?

So, what is it exactly that you're disagreeing with? That the problem is solvable, or do you have some spectacular new way you are going to suggest?

Presley Cannady
02-16-2010, 07:47 AM
Gotta disagree. Since Knuth wrote his book, the data structures and algorithms for moving data have changed little. Gordon Moore's law is about transistors, not simply speed. It is easy to get caught up in vendor hype and techno-creep, but the reality is that the fundamental rules and patterns of information technology are older than the computer revolution.

We started writing down the current fundamental laws of semiclassical and quantum mechanics over a century ago, and you can express them on a single sheet of paper (to be precise, two equations and about a dozen or so inequalities setting reasonable conditions). The devil emerges when you apply these laws to make actual predictions, and the work of discovery and engineering yields novel and useful theory that is nothing more than increasingly large permutations of the fundamental set. We don't even have to resort to analogy to see the parallel in computer science and information technology: the latter is a superset of the former, and the former derives from the electrical engineer's application of physics to semiconductors.

Sure, when you get down to it, data structures are essentially scalars and the connections between them, and arrays, various lists, even more numerous hashes, and the fundamental operations on them are ubiquitous. But those data structures combine to form new ones with unexpected behaviors. A tree doesn't act like a list, and an acyclic graph doesn't act like a tree. "Object" is a term that defies any meaningful mathematical classification, and now you have data structures that require some linguistics to understand fully.

Moore's law may help us understand the physical limits to semiconductor utility, but it offers no insight on the direction and feasibility of solutions to tackle the problem at hand.


Technology is not some special animal; it isn't above repeating itself again and again in new and interesting ways (according to vendors).

So you could say that the fundamental rules and patterns of information technology are wholly contained in Einstein's Field Equations and the Standard Model. Of course, that isn't a really helpful observation for the poor developers and engineers tasked with building your new total awareness solution.


Grid, cluster, cloud and many other technologies are patterns of delivery/processing that have existed for quite some time. Community Memory in the 1970s has a direct connection to the Sun SunRay appliances, and that has a direct connection to today's notion of cloud computing. And if you believe Amazon S3 means you don't need backups, you never heard of the Microsoft/Danger Sidekick debacle and various other "cloud" data losses.

Networking and queuing have been around for a long time, but it's only in the last two decades that cost has dropped and speed has improved enough to permit migrating processes across CPUs for larger enterprises, and only a decade or so since further cost reductions put the capability within reach of small firms and individuals. We're talking about going from a user population of maybe a couple hundred to billions in twenty years. That sort of drastic change in scale didn't come about merely by piecing together core memory, PALs and teletype lines. Does today's hardware and software reflect its heritage? You betcha. But knowing how to build a timesharing system in 1970 doesn't spare you the years of discovery and practice behind how engineers build grids today.

And there's a simple solution to beating risk down to near zero when backing up to S3: back up to two or more regions. If you run into a problem then, it's most likely on your end of the transaction, or else you and the entire world are facing a much more serious problem than data loss.
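A minimal sketch of the two-region idea, with plain dicts standing in for region buckets (real S3 work would go through an SDK such as boto3; every name here is an illustrative stand-in): write everywhere, verify, and survive the loss of any one region.

```python
def replicate(key, data, regions):
    """Write the same object to every region bucket; fail loudly if a write is lost."""
    for bucket in regions.values():
        bucket[key] = data
    # The object survives as long as at least one region still holds it.
    assert all(bucket.get(key) == data for bucket in regions.values())

us_east, eu_west = {}, {}
replicate("backup/db.dump", b"snapshot bytes",
          {"us-east-1": us_east, "eu-west-1": eu_west})

# Simulate a provider-side wipe of one region (the Sidekick scenario):
us_east.clear()
recovered = eu_west["backup/db.dump"]  # still intact in the second region
```

The interesting engineering is in the verify step and in keeping the copies consistent over time, not in the fan-out itself.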


Oh, and as to processing power: the schemas and methods of high-speed processing are just one Visa terminal away, one check-cashing location, or any other place identity is assured. Google does how many billion transactions? To say it is difficult is a red herring.

Accumulating processing power or storage isn't difficult. Setting it up to handle what looks like any arbitrary workflow is. Google's entire grid principally supports its search tool, which acquires information through a very simple set of rules and a small number of well-defined protocols. That doesn't relieve you of the problem of actually designing those tools in the first place, or of rapidly supporting protocols we may not even know exist. How do you source from an Excel spreadsheet, a scribble on the back of a receipt, an Atom feed, a paper-archived pre-print, and a blood sample in a lab in a way that doesn't involve hiring manpower to actually build the adapter? We're not at Skynet yet.
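To illustrate why the adapters are the expensive part, here's a hypothetical adapter-registry sketch (all names invented): the registry pattern organizes the work nicely, but every new source format still lands on a human writing parse logic, and anything without an adapter simply can't be ingested.

```python
# Hypothetical adapter registry. The pattern scales organizationally,
# but each new format costs an engineer writing the parse logic.
ADAPTERS = {}

def adapter(fmt):
    def register(fn):
        ADAPTERS[fmt] = fn
        return fn
    return register

@adapter("csv_row")
def parse_csv_row(raw):
    name, date = raw.split(",")
    return {"name": name.strip(), "date": date.strip()}

def ingest(fmt, raw):
    if fmt not in ADAPTERS:
        raise NotImplementedError(f"no adapter for {fmt!r}: someone has to write it")
    return ADAPTERS[fmt](raw)

ingest("csv_row", "Abdulmutallab, 2009-12-25")

try:
    ingest("scribble_on_receipt", "...")
except NotImplementedError as e:
    print(e)  # the scribble stays outside the system until someone builds the adapter
```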


Yes, federation is difficult if you don't build it in or don't build with open standards. LDAP and an entire set of standards do that for you, but that isn't what the original question was about. How do you create technology that will solve a specific problem, or processes that technology can support to solve that problem?

That's the problem right there. A standard is only as good as its adoption, and in this particular case federation is extremely difficult because no one's figured out a way to get our own people, let alone our adversaries, to slavishly adapt their (for lack of a better word) enterprise to technologies you can conveniently exploit. You can toss a credit card and trade in laundered or counterfeit cash, valuables, or in kind.

Even if the enemy standardized with you, those very standards include practices and tools that can still defeat your intelligence workflow. It's not hard at all to set up a chain of proxies and direct communications through them entirely over SSL; hell, there's software out there that'll do it for you automatically. So you evolve the standard in response, which brings you back to the adoption problem.


So, what is it exactly you're disagreeing with? That the problem is solvable, or do you have some spectacular new approach you're going to suggest?

I don't know. Folks seem to forget how heterogeneous the information environment is, especially when they remember that for most of recorded history people have tackled intelligence with little more than their wits and pen and paper. So when they look at computers and see that they can perform tasks millions of times faster than the fastest human calculator, they think "hey, we can do millions of times the work, and that should cover it." They forget that the resolution of answers to the questions analysts routinely ask to connect the dots is very poor; I pointed out earlier that as a function of time, most of your product operates on scales covering months and years, not hours and days. Even then the gaps, inconsistencies and unconnected duplicates are glaring, as anyone in the electronic medical records field will tell you.
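A quick stdlib illustration of why those duplicates stay unconnected: naive string similarity gives you a fuzzy score, not an identity. The name variants below are illustrative, and real entity resolution leans on far more than string distance (dates of birth, passport numbers, network links); the point is only that exact-match lookups miss variants, while a loose threshold starts merging strangers.

```python
import difflib

def similarity(a, b):
    """Crude 0..1 string similarity; a stand-in for real identity matching."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

query = "Umar Farouk Abdulmutallab"
records = [
    "Umar Farouk Abdulmutallab",   # exact
    "Abdul Mutallab, Umar",        # reordered
    "Omar Farooq Abdulmutallib",   # transliteration drift
]

for r in records:
    print(round(similarity(query, r), 2), r)
# The exact match scores 1.0; the variants score lower, and where you
# set the cutoff decides whether you miss them or merge strangers.
```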

What I do know is that engineers and developers need to be honest with the stakeholders, especially in a field this important. Jeff Jonas blogging about how federation will solve our problems glosses over the reality that we're still learning the ropes. It's akin to Larry Ellison noting that "cloud" is just a buzzier way to say parallel processing and redundant storage--it's an obvious point to make which doesn't begin to capture both the pain and fruits of expanding that technology to the scale envisioned.

Presley Cannady
02-16-2010, 07:58 AM
Interesting little aside, since clouds were mentioned. I think this highlights the slow walk of IT towards even a clearly stated solution with a supposedly manageable scope. The problem here is definitely a worthy one, especially in light of news reports detailing Amazon Web Services outages: a single IaaS/PaaS provider can still represent a single point of failure. Can we federate these providers in a way that permits customers to migrate their processes and data among multiple cloud providers?

To that end, the Cloud Computing Interoperability Forum (http://www.cloudforum.org/) is at least a year old. They apparently made a big splash, judging by the org's list of sponsors (http://www.cloudforum.org/about/sponsors/). They're targeting the major IaaS and PaaS providers, which probably number around 20 worldwide, of which probably fewer than ten really matter. The cheerleaders from Enomaly envision a future where such providers compete over your bytes and uptime cycles. They also want to go about this in a very open way, even hosting the API portion of the project on Google Code.

A year later, here's where the actual development effort stands (http://code.google.com/p/unifiedcloud/downloads/list).

Jason Port
02-17-2010, 04:52 AM
The reality is that we do not have a technical problem. While I won't pretend to understand the advanced arithmetic above, I suspect that a mathematical solution to predicting behavior is not so far from reality: look at a person, observe their behavior, and score those behaviors against a model of a "terrorist" (no, Liles, this is not a boolean; it's more of a sliding scale, where one person is more likely than another to be a baddie). In turn we can then focus our efforts on those people. Naturally, this type of system will seldom capture the angry guy who just goes off and drives his SUV through a university, or a nut case who is otherwise "ok" and shoots up Ft. Hood, but it _should_ provide us a list of people to observe more closely, so we can stop searching Cub Scouts.
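Since "sliding scale, not boolean" is the crux, here's a toy sketch of what that looks like. The indicator names and weights are invented purely for illustration; a real model would be learned from data and validated, not hand-assigned.

```python
# Toy sliding-scale risk score: weighted indicators summed into [0, 1],
# rather than a single boolean "terrorist / not terrorist" flag.
WEIGHTS = {
    "cash_one_way_ticket": 0.30,
    "no_checked_luggage": 0.15,
    "visa_overstay": 0.20,
    "family_reported_concern": 0.35,
}

def risk_score(indicators):
    return sum(WEIGHTS[k] for k in indicators if k in WEIGHTS)

def triage(people):
    """Rank people by score so limited observation effort goes to the top of the list."""
    return sorted(people.items(), key=lambda kv: risk_score(kv[1]), reverse=True)

people = {
    "subject_a": ["cash_one_way_ticket", "no_checked_luggage", "family_reported_concern"],
    "subject_b": ["no_checked_luggage"],
    "cub_scout": [],
}
triage(people)  # subject_a first, cub_scout last
```

The output is an ordered watch list, not a verdict, which is exactly the distinction being argued for.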

However, there is a policy issue and a people issue at hand here. Regardless of what we want to believe, our government doesn't like to share. This is often promoted by contractors who are protecting their own turf (Data makes you king, and sharing data is seen as weakening your realm). In turn, we find that various agencies can "collect" on someone, and failure to share is not met with a firing squad.

Conversely, I posit that data entry is annoying at best and hard at worst. Given human nature and *our* desire to find the most leisure whenever possible, people don't bother to collect the details. I flew through RDU this morning at 0600. The woman in front of me was fumbling with her personal toiletries. The TSA rep told me I could jump into another line to bypass her. However, our practice should have been to report her by name for this behavior. Was it criminal? No. Suspicious? Not really. But taken in conjunction with other behavior, it could show trends or patterns that might indicate negative or dangerous behavior in the future. Sadly, my crack TSA agent instead made a smart-assed comment and I was on my way.

The reality is that during a survey of any data store in the intelligence or C2 arena, we might be surprised at how many fields of data we ask for and how few are actually completed. It is really hard to do trending when 80% of your database is blank.
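The blank-fields point is easy to measure. A small sketch (the field names, sample records and the "UNK" placeholder are all hypothetical) that computes per-field completeness over a record set:

```python
def completeness(rows, fields):
    """Fraction of usable (non-empty, non-placeholder) values per field."""
    filled = {f: 0 for f in fields}
    for row in rows:
        for f in fields:
            if row.get(f) not in (None, "", "UNK"):
                filled[f] += 1
    return {f: filled[f] / len(rows) for f in fields}

reports = [
    {"name": "J. Doe",   "dob": "",           "passport": None, "address": ""},
    {"name": "A. Smith", "dob": "1970-01-01", "passport": None, "address": ""},
    {"name": "",         "dob": "",           "passport": None, "address": "UNK"},
]
stats = completeness(reports, ["name", "dob", "passport", "address"])
# name is mostly filled, dob is sparse, passport and address are effectively empty;
# trending on the last two is trending on nothing.
```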

So, at the end of the day, the reality is that the ideas you all are promoting are sound mathematically and technologically, and if used to highlight individuals, organizations or even regions, these can be effective to help us plan. But until we are really serious (I mean firing some senior people in both the Government and Industry), I will just continue fighting the good fight and hoping that we get lucky again.

Jason Port
02-17-2010, 04:58 AM
Until we solve all of the rest, social media (the Faceyspaces and TwitteryTweets) are all just a time suck. Trust me: I use all of them, and they all waste time. To try to use these to paint a picture, when we can't even mine structured, normalized data, is simply a bridge too far.

During the last year, I read about a use case from the intel community for Twitter: imagine two patrols tweeting about the same event (like an IED blast) from two vantage points. Or that all of the patrol members tweet about the event. Now we have 24 reports (or whatever) about the event, and in turn our intel studs can form a complete picture based on 24 stories of 140 characters each.

Seriously? The market has just been blown sky high, and I'm jumping on my iPhone? I don't know how to text and return accurate fire at the same time. Moreover, how do we know it was one event or 24? Locations separated by time could look like multiple events. Is that 1 or 24?
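The 1-event-or-24 question is exactly what a deduplication pass has to answer. A toy sketch, with made-up thresholds (10 minutes, roughly 1 km) and flat x/y coordinates standing in for real geodesy: reports close in time and space get folded into one event, everything else stays separate.

```python
# Toy deduplication: fold many short reports into events by assuming reports
# within 10 minutes and ~1 km of each other describe the same incident.
def same_event(a, b, max_minutes=10, max_km=1.0):
    dt = abs(a["minute"] - b["minute"])
    dx = ((a["x"] - b["x"]) ** 2 + (a["y"] - b["y"]) ** 2) ** 0.5
    return dt <= max_minutes and dx <= max_km

def cluster(reports):
    events = []          # each event is a list of reports
    for r in reports:
        for ev in events:
            if any(same_event(r, e) for e in ev):
                ev.append(r)
                break
        else:
            events.append([r])
    return events

reports = (
    [{"minute": m, "x": 0.1 * i, "y": 0.0}
     for i, m in enumerate([0, 1, 2, 3])]          # one blast, four tweets
    + [{"minute": 45, "x": 9.0, "y": 9.0}]         # a separate incident
)
cluster(reports)  # 2 events, not 5
```

The hard part is the thresholds: set them loose and two blasts merge into one; set them tight and one blast splinters back into 24.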

Again, until we are mature enough to use the systems in play, let's keep reporting out of MilBook and Twitter. (OK, we can use wikis. Intellipedia is supposed to be pretty hot, though it is just another island of information, inaccessible to the enterprise half the time.)

Presley Cannady
02-17-2010, 07:04 AM
If anyone's heard of a higher-up tasking a junior to post or link to a Word doc on the wiki, that's one huge glaring indicator that Intellipedia is nothing more than a high-tech circular file. I've seen it happen too many times in business to believe a government employee makes for a better user.

Bottom line, Wikipedia works because its users--some 300,000 listed editors + God knows how many million anonymous ones--grew it to meet their mostly individual needs, and the aggregate of their contributions meets the needs of hundreds of millions more. A corporate wiki exists solely because someone ordered it deployed and then ordered someone else to contribute to it. Another example of how differences in scale pose drastically different problems.

selil
02-17-2010, 02:19 PM
A good example of the wiki problem is at my uni. My students all said how they needed a wiki (woe is me, a dullard, if I don't support their little socmed needs). So we built it (remember, this is the "user" community demanding it). They had many grandiose ideas of how they would use it. We populated it with course information, set it up, allowed some students editorial control (and the ability to grant it), and off to the races. We got stagnant pond water. It's still there (no real cost to leave it up), but the reality is that the ONE Wikipedia works, and maybe a special one here or there. I know of one socmed web forum that tried a wiki too, but nobody participated.

I do reject a few things. The government intelligence community's problem is not unique; it is a knowledge management issue (and knowledge management is a lossy system). Wikis are one form of knowledge repository, but they are not the only one. Small Wars Journal/Council is also a form, and Amazon Answers (among others) is another.

The problem with most (not all) repository systems is that they are passive/reactive. The issue with any such technology is that it will likely be event-driven and as a result not predictive. Trend analysis and similar strategies are flawed (if they weren't, we'd all be rich off the stock market). The best we can hope for is a "best case" that fails rarely. I realize my compadre Presley has a bone to pick with the tech, but there are places where similar systems work pretty well, if not perfectly. The imperfect, failure-prone, immature technology that keeps getting referred to is over-hyped, but each of those criticisms is a life-cycle issue and in many cases a development failure. You can't say all tech is bad and be any more relevant than the current failures in tech.

My personal belief (near-religious zealotry) is that the only scalable approach that works is a Mandelbrot-style fractal solution: start with the human being and integrate the technology, then replicate that person-plus-technology unit again and again, each smaller piece making up a similar larger piece. This is how wikis work, though it isn't a wiki (if that makes any sense). Each person is a writer, editor and evaluator making thousands of judgments on each topic; then larger groups and larger communities do the same. It is a known imperfect system (as many fake-editing incidents prove). What we want to do with the technology solution is take the same pattern of behavior, automate it as much as possible (the writing and data entry is all over the place, being done by outsiders), and apply some filters to look for the outliers we're interested in. Will it be perfect? Not on your life or mine. We still haven't reached Minority Report status, and personally I hope we never do.

JM2008
02-18-2010, 03:44 AM
Don't just discard the value added by IC versions of Twitter and Facebook. Though they may not fill the need for many of the topics covered in this thread, mainly terror watchlisting, they will/do provide an invaluable human networking resource. Not sure if you have seen the video for Chirp (IC Twitter), but if not, check it out. I could not find a good link, but I will keep looking and post back if I find one. As I see it, two of the biggest problems an analyst faces are 1) not being able to get the information needed to make proper assessments, because they don't know where the data they need is available, and 2) once they produce products, not being able to distribute them to the customers who need them but whom the analyst doesn't know exist. This is where Chirp really fits in: allowing analysts to publish reporting to the masses while tagging it for relevance, allowing other analysts to pull the info they need from sources they didn't know existed, and allowing them to follow those sources to keep up to date.

I know there is lots of debate about whether the push or the pull method of data dissemination is best. I think it is really neither one but a combination of the two, and that is what Chirp does. What it doesn't do is provide the human networking capacity of Facebook; that niche is currently filled by old-fashioned email. An IC Facebook would combine those functionalities in one place and likely provide even more.

-Just my 2 cents
James

davidbfpo
12-29-2011, 12:35 AM
Reviewing my workload in 2011 I found this link to a presentation by Jeff Jonas to an EU-funded project: Macro Trends in CT Technologies. It is a rather large Mb Powerpoint:http://www.detecter.bham.ac.uk/pdfs/Macro_Trends_in_CT_Technologies_JeffJonas.ppt