The Defense Advanced Research Projects Agency (DARPA) is working on a project called DEFT (Deep Exploration and Filtering of Text), which aims to develop a system that analyzes textual data at a scale beyond what human analysts can handle. With DEFT, DARPA would be able to analyze data from virtually any source, from YouTube videos to communications intercepted via spy programs.

DEFT’s aim, according to its website:

“Automated, deep natural-language processing (NLP) technology may hold a solution for more efficiently processing text information and enabling understanding connections in text that might not be readily apparent to humans. DARPA created the Deep Exploration and Filtering of Text (DEFT) program to harness the power of NLP. Sophisticated artificial intelligence of this nature has the potential to enable defense analysts to efficiently investigate orders of magnitude more documents, which would enable discovery of implicitly expressed, actionable information within those documents.”

The technology could help intelligence analysts scan more text documents and audio files, determine what is being discussed in them, and decode ambiguous statements or indirect references to people and things. When the system completes an analysis, analysts would receive an alert identifying the suspect, along with a set of related sources of information. DARPA hopes new NLP approaches will help it thoroughly understand documents at a semantic level. Another goal of DEFT is to collect all of this information and organize it into a database, making large sets of valuable data easier to access.

The agency is also investigating a number of other machine learning approaches to organizing and analyzing the large data sets that defense agencies (not just the NSA) are collecting. It is doing so through its $25 million XDATA program and by funding software and research at Stanford University, Columbia University, and Carnegie Mellon University.

