WASHINGTON – The Defense Innovation Board, a group put together by former Secretary of Defense Ash Carter and headed by Google chairman Eric Schmidt, is weighing the possibility of creating a central repository for the massive amounts of data collected by the military.

Changing how the Pentagon collects, maintains and uses data is going to be vital for the U.S. to maintain its technological edge going forward, the board members concluded at an April 4 hearing at the Pentagon. At the same time, the advisers acknowledged that the project comes loaded with thorny security and cultural questions.

When the group held its first public meeting in October, it laid out a collection of suggestions, including the creation of a chief innovation officer, new software testing rules, and a focus on machine learning. But after six months of research and visits to military bases, the group has zeroed in on the issue of data management as one that impacts every other idea on innovation.

The exploitation of data is central to any attempt by the Pentagon to employ artificial intelligence or machine learning, with Schmidt noting in his opening remarks this week that "the military, as a general statement, has data everywhere and nowhere as a result. … It's clear that without doing that a lot of the things the military would like to achieve are not going to happen."

Right now, there is no common database for the data the military collects. Instead, hundreds of different databases are scattered across the department, most of which lack the common coding that would allow them to interface.

Speaking in March, William Roper, the head of the Pentagon's Strategic Capabilities Office, said the department focuses "on data in a 1990s-era way -- data for us is like something that you use to go into the fight and win, and after that fight, the purpose of the data, its raison d'etre, is over.

"And that is not the way that the commercial world, especially the big companies that are trying to work analytics and deep learning machinery -- to them, that data is truly gold. Probably better, it's probably closer to oil," Roper said. "It's a commodity, it's a wealth, it's also a fuel, and your data keeps working for you even after you've used it."

But getting to the point where the Pentagon can use data like the commercial sector won't be easy. A good example of the data problem facing the Pentagon came during an exchange with Lt. Gen. Jack Shanahan, director for defense intelligence at the Office of the Under Secretary of Defense for Intelligence, who said the Pentagon collects 22 terabytes of data every day, roughly the equivalent of 5.5 seasons' worth of video for the National Football League.

"You cannot exploit 22 terabytes worth of data the way we are doing things today," Shanahan said. 

But in 2012, Facebook said it was handling more than 500 terabytes of data a day, a number which will only have expanded in the five years since – and that data is all processed, stored and used to impact how a Facebook user interacts with the website, including creating targeted content relevant to their interests.

In other words, what Shanahan called a "tsunami" of data is something the commercial sector could handle five years ago. 

Schmidt himself cited that 22-terabyte figure and noted that "within the business world, this is not overwhelming. Those kind of numbers are easily dealt with, with modern computing. So there is an example of a big gap between the commercial and defense worlds."

In addition, all involved agreed that getting all the data in one place is only part of the problem: figuring out how to make the various databases talk to each other, and knowing how to mine them, remains a major challenge given the number of older systems involved.

Culture and security concerns

During the hearing, six witnesses from inside the Pentagon were given time to discuss data projects and concerns, and they all largely touched on the question of culture.

The most impassioned comments – and the only ones to garner applause from the audience – came from Bess Dopkeen, an analyst with the Pentagon's Cost Assessment and Program Evaluation (CAPE) office, who said that within the department, "Everyone fights the collection and sharing of data everywhere you turn."

The bureaucracy incentivizes not sharing data, Dopkeen noted, because the perception is that sharing data means "others will take your money and your influence." Dopkeen's comments were echoed by others, especially during the comment period toward the end of the event.

And there is another cultural issue to overcome, as noted by retired Adm. William McRaven, the former head of Special Operations Command and the only member of the Defense Innovation Board to have served in the military. He noted that while the benefit from data might be easy to see in offices inside the Pentagon, those on the front lines are going to want to see tangible benefits.

McRaven's comments, combined with those of the witnesses who spoke to the group, led Schmidt to wonder if "we're going about this slightly in the wrong order," focusing on capabilities before culture.

After some back and forth on that point, Schmidt concluded that the group should look into whether there are simple, near-term projects in the fields of intelligence, surveillance and reconnaissance that can be launched to show the benefits of big data.

Another issue raised during the event was how to secure the information, with astrophysicist and television personality Neil deGrasse Tyson warning against creating a Library of Alexandria – the ancient central repository of knowledge where the information was all lost in a fire.

Other members of the board downplayed that concern, noting that the repository wouldn't be a single physical database and pointing to commercial cloud databases with strong encryption. But they acknowledged that a balance must be struck between protecting data and making it inaccessible.

Milo Medin, vice president of Access Services with Google Capital and a former NASA official, also raised a concern about the Pentagon's current data security strategy.

Inside the department, there is a "focus on 'communication security,' versus 'information security,'" he said. "We tend to say it's okay to store data in the clear at rest, we'll just protect it when we move it around. This is a fundamentally bad idea, because it opens up vectors for attack when the data is at rest, and it gives you a false sense of security for protecting the data when we're moving it around."

What concrete next steps the innovation board can take is still being debated, but the group pledged to remain focused on the subject and to work with Secretary of Defense Jim Mattis.