Data farming is not new, but it remains poorly understood, even in the operations research and modeling and simulation communities. As researchers come to understand it, though, they come to love it.
Unlike data mining, in which analysts sift enormous amounts of existing information for useful nuggets, data farmers grow their own. Researchers ask a “what if” question, build a model and run a simulation thousands or even millions of times on a supercomputer or high-performance computing grid. Then they look for trends, anomalies and outliers in the “grown” data.
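The loop described above, sweep the inputs, replicate each design point many times, then scan the grown data for trends and outliers, can be sketched in a few lines. Everything here is a hypothetical stand-in: the toy `simulate` model, its parameters and its thresholds merely play the role of a real simulation.

```python
import random
import statistics

def simulate(attacker_strength, defender_strength, seed):
    """Hypothetical toy engagement model: returns True if the attacker
    penetrates the defense. A stand-in for any stochastic simulation."""
    rng = random.Random(seed)
    margin = attacker_strength - defender_strength + rng.gauss(0, 1.0)
    return margin > 2.0  # penetration requires a clear advantage

# "Grow" data: sweep the input space and replicate each point many times.
results = []
for attacker in [1.0, 2.0, 3.0]:
    for defender in [1.0, 2.0, 3.0]:
        wins = sum(simulate(attacker, defender, seed) for seed in range(1000))
        results.append((attacker, defender, wins / 1000))

# Look for trends and outliers across the whole landscape of outcomes.
rates = [r[2] for r in results]
mean, sd = statistics.mean(rates), statistics.pstdev(rates)
outliers = [r for r in results if abs(r[2] - mean) > 2 * sd]
```

A real study would replace `simulate` with a combat or socio-cultural model and distribute the replications across a computing grid; the analysis step, looking at the full grown landscape rather than a few cases, stays the same.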
“Data farming allows the examination of whole landscapes of potential outcomes, not just a few cases,” said Gary Horne, an operations research analyst with MCR Federal in Arlington, Va., who coined the term in 1997. “It provides the capability of executing enough experiments so that outliers might be captured and examined for insights. Data farming is not intended to predict an outcome; it is used to aid intuition and to gain insight.”
The technique can help analysts identify paths to success, said Ted Meyer at the Pentagon’s Joint Improvised Explosive Device Defeat Organization. For example, you could model a relatively simple scenario in which the Red Team is protecting an objective and the Blue Team is trying to penetrate it. Perhaps you discover, after running the simulation a few thousand times, that Blue achieved its objective in only a few dozen runs.
“We only looked at the cases where Blue succeeded in penetrating, and identified what factors enabled them to do so. We then ran simulations where Blue purposely [took] that path rather than [choosing] randomly, to optimize their chances for success,” Meyer said.
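Meyer's two-phase approach, random exploration first, then directed runs along whatever enabled the rare successes, might look like the following in outline. The routes, probabilities and `run_engagement` model are all invented for illustration.

```python
import random
from collections import Counter

# Hypothetical success probabilities per approach route.
P_SUCCESS = {"north": 0.01, "south": 0.02, "tunnel": 0.15}

def run_engagement(rng, route=None):
    """One stochastic Red-vs-Blue run. If no route is given,
    Blue chooses its approach at random."""
    route = route or rng.choice(list(P_SUCCESS))
    return route, rng.random() < P_SUCCESS[route]

rng = random.Random(0)

# Phase 1: random exploration. Successes are rare; study only those.
runs = [run_engagement(rng) for _ in range(5000)]
wins = Counter(route for route, ok in runs if ok)
best_route = wins.most_common(1)[0][0]

# Phase 2: Blue deliberately takes the path that enabled success.
directed = [run_engagement(rng, best_route) for _ in range(5000)]
directed_rate = sum(ok for _, ok in directed) / len(directed)
```

The point is the workflow, not the toy model: filter the grown data down to the successful runs, extract the common factor, then farm a second crop of data with that factor fixed.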
Researchers can also vary the inputs and restudy the outputs, gaining a better understanding of which factors influence the results most strongly, he said.
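One simple way to vary the inputs and restudy the outputs is a one-factor-at-a-time sweep: perturb each input around a baseline and compare the mean outcomes. The `outcome` model below, with its made-up coefficients, is purely illustrative.

```python
import random

def outcome(speed, armor, sensors, seed):
    """Hypothetical model: mission score driven by three input factors,
    plus noise. The coefficients are invented for this sketch."""
    rng = random.Random(seed)
    return 2.0 * sensors + 0.5 * speed + 0.1 * armor + rng.gauss(0, 0.2)

def mean_score(**inputs):
    # Average over many replications to smooth out the noise.
    return sum(outcome(seed=s, **inputs) for s in range(500)) / 500

baseline = dict(speed=1.0, armor=1.0, sensors=1.0)
base = mean_score(**baseline)

# Perturb one input at a time and compare against the baseline output.
sensitivity = {}
for name in baseline:
    bumped = dict(baseline, **{name: baseline[name] + 1.0})
    sensitivity[name] = mean_score(**bumped) - base
```

Ranking the entries of `sensitivity` shows which factor moves the output most; in a real study more sophisticated experimental designs would vary several factors at once to expose interactions.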
The anomalous outcomes are generally the most interesting.
“When models behave in unexpected ways, it always tells you something interesting, even if it is just to challenge the assumptions in your model,” said Michael Lauren from the New Zealand Defense Technology Agency, which uses a modeling tool called MANA.
“We used MANA to model asymmetric threats to frigates and came up with some pretty interesting strategies. We quickly hit on the most useful tactical options, some of which would have required quite a bit of lateral thinking to come up with. Furthermore, figuring out where a tactic breaks down potentially allows you to understand what your weaknesses are and how your adversary might exploit that.”
Data farming grows even more useful as the problem grows more complex, Meyer said. Situations that involve economic, cultural and social-behavioral variables and human interaction, in which intangibles such as morale, trust, courage and charisma come into play, or that involve multiple stakeholders, become extremely complex. But a sufficiently fast computer can generate a vast array of possible outcomes and allow analysts to find the desired ones.
Santiago Balestrini-Robinson of the Georgia Tech Research Institute is working with a data farming team to develop a socio-cultural model for a disaster relief scenario where shortages, lack of sanitation, famine and other factors could incite violence.
“It’s an iterative process,” Balestrini-Robinson said. “You can get very emergent behavior, where something that may seem like a good course of action eventually has unanticipated second- and third-order effects that actually make things much worse.”
Working in multidisciplinary groups is useful. “We can show some results and get immediate feedback from people on our team, or on other teams working different problems but who may have expertise that we can tap into,” Balestrini-Robinson said.
In practical application, sometimes it is useful to divide large problems up for analysis. “We are looking at the same question base [from] different angles of view, from a very high abstraction to highly detailed truth,” said Klaus-Peter Schwierz, senior manager at EADS’ Cassidian division, where he is an operations research adviser to the German Federal Forces. “We start evaluations in the high abstracted world, looking at the universe, and then we drill down to the highly detailed world. The two methodologies are not competing [with] each other; they are completing each other.”
As with any modeling effort, data farming requires a balance.
“The model should be as simple as possible, and as detailed as necessary,” Schwierz said. “The goal is to apply the model, evaluate the results, and then visualize and present the results so that they are not just understandable to computer geeks.”
Combined with global connectivity, data farming promises a new era in reachback support. A commander in Afghanistan might send a “what if” to analysts back home, who would crunch the numbers and send back insight on how to approach the situation as it unfolds.
In theory, this would be like playing Scrabble with a cheat program that tells you the best move, based on the board and the tiles you have. As the board changes and you draw new tiles, the program updates your list of moves. In reality, obtaining and managing the necessary data remains difficult.
“Effectively gathering, preparing and analyzing the potentially huge number of results of these experiments is still a major hurdle for most analysts,” Meyer said. “Ongoing improvements in computer processing power and resources have provided scientists with the raw horsepower required to examine problems that are complex, and the methods for gaining access to these resources and implementing experiments are being developed in the context of various disciplines.”
Yet the Swedish Defense Research Agency is already working on a way to evaluate options for expeditionary operations.
“We have found the data farming methodology allows for robust evaluation of existing equipment alternatives using existing or new tactics, techniques and procedures,” said Johan Schubert, a deputy research director at the agency.
Data farming might even improve acquisition decisions across coalitions by helping national leaders understand their options in aggregating coalition force structure to meet a range of threats, said operations research analyst Steve Anderson at the Naval Surface Warfare Center. That’s critical, he said, because “force structure decisions last for decades.”
Edward Lundquist, a retired U.S. Navy captain, is a principal science writer with MCR Federal.