The Defense Department has not established guidelines for evaluating how well simulators prepare trainees, like these soldiers using an air traffic controller simulator, for their jobs. (Army Reserve)
Even as more military training takes place on or with the help of simulators, the Defense Department lacks a general method to measure how effective that training is.
Neither DoD directives nor military service regulations specify how to measure such training effectiveness. DoD has tried different approaches, but each has run into difficulty for one reason or another. Discussions with the services indicate a need for better knowledge of how effective simulators are, how they can be improved and where promising new applications lie.
We propose a straightforward, two-pronged approach. First, the military services should routinely track trainee performance on the simulators themselves. Usually, good performance on a simulator translates into good on-the-job performance. Individual training facilities can (and usually do) track simulator performance on their own.
Second, the military services should periodically survey those most directly involved with simulator training: the trainers and past trainees. These individuals can provide valuable information on the strengths and weaknesses of a particular simulator, how well it prepares trainees for military duties and what areas need improvement.
Such surveying — today, the exception rather than the rule — would deliver constructive information about simulation effectiveness at low cost without disrupting the training itself.
Insufficient Military Guidelines
DoD guidelines for simulator training focus on general requirements and when to file evaluation plans, rather than offering particulars on how to measure the effectiveness of such training.
A 1986 DoD directive applies to any simulator or training device that meets the criteria for a major automated information system acquisition program, or to any special interest program so designated by the secretary of defense. It calls for a training effectiveness evaluation plan, or TEEP, to “meet the Military Service’s training requirements and effectiveness levels.” The TEEP must be submitted six months before the evaluation of simulator training effectiveness begins.
However, the 1986 directive is silent on the tools used to measure this effectiveness. It neither discusses alternatives nor recommends any particular set of tools.
In 1997, the DoD Office of the Inspector General recommended that the Pentagon “establish policy and procedures for evaluating the training effectiveness and cost-effectiveness of large scale training simulations,” but no subsequent DoD directive or instruction has been issued on the subject. Further, current military service regulations are silent on how to make such measurements.
Defense Study Highlights Difficulties
These directives and regulations have been slow to develop for a reason. A 2000 study for DoD concluded that the military services have found it difficult to implement effectiveness measures of simulator training. That study discusses five forms of such measurement. The two most preferred are cost-benefit comparison, including calculation of return on investment, and transferability of skills learned in simulator training to actual job performance.
But these two direct measurement approaches are problematic because implementing them generally requires considerable time and resources. Some also involve controlled experiments that disrupt training and other operations.
The 2000 study indicates that analytical computer models are the third-most preferred form for measuring such effectiveness. Because this method only simulates operations, it does not disrupt them. However, experience indicates that the analytical models are unduly costly and time-consuming to develop and update.
The two remaining methods for evaluating the effectiveness of simulator training are direct measurement of learning with the simulator and pursuit of feedback. Direct measurement is not very costly, time-consuming or disruptive. It simply measures changes in performance as a student practices on the device.
The pursuit of feedback involves direct contact with trainees and trainers. These participants are usually in a strong position to evaluate how well the simulators have prepared them for their jobs, and can point out what was and was not useful. Surveys are cited as a good method to obtain this kind of information.
However, the 2000 study indicates that the military services at that time were not executing evaluation surveys very well. First, they surveyed trainees but not trainers, which limited the usefulness of the survey results. Trainers are in a good position to judge the technical properties of the simulator training process and offer improvements.
Second, the surveys often did not facilitate candid feedback. Some of the survey questions were designed in a way that tended to elicit positive responses. Conducted in-house, the surveys also may have predisposed respondents to answer positively. Survey bias can lead to misleading results and false implications. Our more recent discussions with military personnel indicate the situation is much the same now.
We discussed simulator effectiveness measurement with one of the authors of the 2000 DoD study. He offered several recommendations:
■A measurement system should build as much as possible on what the military services already are doing.
■Introducing a complicated, resource-intensive system is counterproductive.
■Any new tool should be simple and accommodate the military in terms of operations, time and resources.
Guidance from Industry
More recommendations came from the National Training and Simulation Association, which helps its member companies as they develop simulator training systems. The NTSA suggests:
■Use simulator-job transferability measures of personnel effectiveness, if possible. The effectiveness of simulator training should be evaluated with direct measurements of task performance and operator skill, where feasible and practical.
■If direct measurement is not possible, use measures that focus on how well learning takes place on the simulators. Track such learning as part of the training sessions.
■Attitudes and impressions of trainers and trainees should be measured and included in every evaluation. The inputs of trainers and trainees are essential for the continued success of training with simulators.
Our formulation bears this guidance in mind. The military services should routinely track trainee performance on the simulators themselves. They should also survey trainers and trainees to obtain firsthand feedback. The trick is to survey in a way that yields valuable information at low direct cost and with minimal disruption of operations.
Proposed Survey Tool
Our proposed survey approach takes to heart the practical methods described in the 2000 DoD study, the advice of the study author and the current NTSA guidance. It involves surveying trainers and trainees about both the skills they learn and how well those skills transfer to actual job performance.
For several reasons, the military services will need qualified outside help to conduct such surveys. A well-qualified third party should design the surveys and guarantee the confidentiality of responses (and nonresponses), an essential ingredient in eliciting forthright, candid feedback.
The design and analysis of such surveys can be tricky, often requiring expert formulation of questions and sample sizes to ensure valid, reliable results and inferences.
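To illustrate one piece of that expert formulation, the sketch below computes a survey sample size using the classic formula for estimating a proportion (n = z²p(1−p)/e²), with a finite-population correction for the small trainee pools typical of a single simulator course. The function name and the example numbers are ours, not from the study; a survey statistician would also weigh response rates, stratification and question design.

```python
import math
from statistics import NormalDist

def sample_size(margin_of_error, confidence, population=None, p=0.5):
    """Sample size needed to estimate a proportion (e.g., the share of
    trainees rating a simulator as effective) within a given margin of
    error. p = 0.5 is the conservative worst case."""
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    if population is not None:
        # Finite-population correction: fewer responses are needed
        # when the whole trainee pool is small.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

# A +/-5-point margin at 95% confidence needs 385 responses from a
# large pool, but only 218 when the trainee population is 500.
print(sample_size(0.05, 0.95))                  # 385
print(sample_size(0.05, 0.95, population=500))  # 218
```

The finite-population correction matters here: individual training facilities often graduate only a few hundred students a year, so the required sample is much smaller than textbook figures suggest.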
We call the surveys for measuring simulator effectiveness the Simulator Effectiveness Tool, or SET, which would cover two classes of respondents. The first is personnel who have experienced the training in question. Trainees should be able to offer well-informed opinions on how well a particular simulator prepared them for their eventual duties and also how well simulator training was integrated with schoolhouse and live training.
The second would be the military trainers in charge of simulator training and certification. These trainers should be able to assess simulator effectiveness from the data they collect on learning during simulator training — such as ease of achieving proficiency levels, unusually high or low “washout” rates and the adequacy of the fidelity of the simulators.
The surveys might include questions on the backgrounds of the trainees and trainers. Relating simulator training feedback to this background information can help determine whether assessments hold broadly or depend on the particulars of the respondents. Such findings are important for determining how much weight to give to the training feedback.
SET’s particular value is its ability to solicit feedback on how to improve simulator performance. For example, we could ask the following:
■How realistic is a particular simulator regarding visual effects, physical range of motion and representation of unanticipated events such as enemy actions?
■Did the simulator adequately prepare the trainee to perform on an actual job, for example, by learning how to troubleshoot problems?
■Could the use of the simulator be improved with the following: (1) more time spent per simulation, (2) more scenarios or (3) more repetitions?
A comment section at the end of the survey would give respondents an opportunity to elaborate on their answers to such questions.
With standardized surveys, respondents should be able to complete them quickly, within about 15 minutes. By accommodating busy military schedules, surveyors should see an improved response rate, leading to more data and better tracking of simulators.
Using the survey approach is also substantially cheaper than more intrusive methods of measuring simulator training effectiveness.
Ultimately, SET has the potential to provide inexpensive and timely feedback to a military service that routinely uses simulator training and is looking to improve it. In a period of fiscal austerity when more training is likely to shift to simulators, quantifying what does and does not work is more important than ever. ■
Michael Canes is a senior energy consultant with LMI, a not-for-profit consulting firm. Lawrence Schwartz is an independent consultant with expertise in government surveys, performance measurement and statistical techniques.