©1990, 1995
5.1 Criteria of suitability
If human response to complex tasks was to be studied, experiments were needed to obtain relevant data. To help in the evaluation of experiments, let us distinguish a few constituent parts of a suitable experiment.
The following criteria are not seen as specific to the author's actual position, and therefore they are presented as general methodological points, with the possibility that the same criteria could be relevant to other research into the same area. Specific options will be considered in detail below, § 5.2.
The idea that the study of games can be relevant to understanding complex dynamic systems is supported by Rivers [113]. He suggests that study of games and simulations could address questions such as: how do people generally make decisions and cope with complexity; how does the surface representation of the underlying dynamics of a situation affect people's understanding of it; how do people generally learn to behave in relation to complex dynamic systems; what is the variability between individuals on these dimensions; to what extent is it possible to predict performance in the real situation from performance in a game? This further motivates the consideration of games as well as higher-fidelity simulators and live applications, as relevant to the general aims of the present study.
The nature of complexity is not unambiguously defined, but has been discussed above, § 1.3.2. There are factors weighing both for and against greater complexity in an experimental task.
The argument in favour of greater complexity is that the relevant real-world systems and tasks are highly complex. The more closely an experimental arrangement resembles these tasks, the more relevant an experiment would be to them. In particular, the more complex a task is (using a common-sense meaning of complex) the more likely it is to exhibit complexity as has been operationally defined above (§ 1.3.2), namely, that a variety of strategies is likely to be employed either across time, or across different subjects.
When we come to consider tasks and systems of equal or comparable complexity to these real-life systems, problems emerge. Subjects will either be fully trained or not fully trained beforehand on the task. If they are fully trained, they are unlikely to be readily available as experimental subjects, as their training tends to be expensive, leading to a high cost of their time. If fully trained personnel were to be used, the experimental system would have to closely match their normal working environment. This entails either using real-life equipment, or (usually expensive) high-fidelity simulators. If, on the other hand, subjects were not previously trained on the target system, they are likely to take a long time to develop a stable skill in performing the task. This, in turn, means either that the experiments would have to be extended over a long period (with consequent expense), or that the subjects would still be learning about the task and system as the experiments were performed. If subjects are still learning, instead of there being a stable set of rules underlying their behaviour, the rules would still be changing. Modelling a stable set of rules is the more fundamental aim: so it would seem unwise to try to model rules in a learning situation without, either at the same time or beforehand, being able to model stable rules.
If a realistically complex target system is desired, but no actual system can be used as it is, there is an unknown amount of work needed to realise an effective experimental system. In the case of building a computer simulation from scratch, the time necessary is likely to be prohibitive.
The conclusion of the arguments on complexity is that we would like the most complex target system and task that come within all the practical limitations. In practice any system that conforms to these limits is unlikely to be more than fairly complex.
A related aspect of the choice of task is the choice of level of control. This considers the task with respect to the operator interface, rather than the underlying target system.
In the design of any complex task interface, there is a choice of level for the sensors and effectors. At the lowest level the primitive components of the interface correspond to individual elements of the target system---the raw sensors and effectors that are implemented in hardware. At a higher level there would be some composite sensors or effectors that in some way combine more than one lower-level sensor or effector. Let us illustrate this with a few examples.
A raw sensor might give the revolution speed of a motor, or the temperature or pressure of a certain part of the target system. There are many possibilities for higher-level sensors. A sensor which gave the estimated time to a particular condition being satisfied would have to integrate information on current values and current rates of change. A sensor for the working state of a ship's rudder needs information about the rudder angle and the angle of the water flow past the rudder. Further examples can be imagined. Any operational concept that depends only on measurable quantities could in principle have a high-level sensor built to display it.
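The "estimated time to a condition" sensor mentioned above can be sketched as follows. This is only an illustration under a stated assumption: the quantity is taken to keep changing at its current rate (linear extrapolation). The function name `time_to_threshold` and its parameters are hypothetical and do not come from any system discussed in this study.

```python
from typing import Optional


def time_to_threshold(value: float, rate: float, threshold: float) -> Optional[float]:
    """Estimate the time until `value` reaches `threshold`.

    Assumes the current rate of change stays constant (an illustrative
    simplification). Returns 0.0 if the threshold is already met, and
    None if the value is static or moving away from the threshold.
    """
    gap = threshold - value
    if gap == 0:
        return 0.0
    # The threshold is only reachable if the rate has the same sign as the gap.
    if rate == 0 or (gap > 0) != (rate > 0):
        return None
    return gap / rate
```

For instance, a temperature of 80 units rising at 2 units per second would be reported as 10 seconds away from a limit of 100. The composite sensor combines two raw readings (value and rate) into a single operational quantity.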
In complex systems, the lowest level effectors sometimes have servo systems on them which cannot be bypassed, and for this reason among others the effectors do not necessarily directly alter the quantities sensed by the lowest level sensors. In ships, typically, the direct controlling actions are to set demands for the propeller speed or rudder angle, since it is not possible for these to respond immediately. Servo mechanisms then bring the actual value towards the demanded value over a period of time, perhaps several seconds. In more everyday examples, low-level effectors often take effect simultaneously with the physical control action---gear changing in a car, for example. Higher-level effectors are set up whenever programming is done. In mechanical systems, a higher-level effector might have the same effect as a number of lower-level ones. As with sensors, construction of higher-level effectors is not constrained in principle. In terms of a game or well-defined task, the highest level effector possible would be a single button that started automatic execution of the whole task.
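The distinction between a demanded value and the actual value can be made concrete with a small sketch. It assumes a rate-limited servo, in which the actual value moves towards the demand at a fixed maximum rate. Real servo mechanisms may behave differently, and the names `servo_step` and `run_servo` are hypothetical.

```python
from typing import List


def servo_step(actual: float, demanded: float, max_rate: float, dt: float) -> float:
    """Move `actual` towards `demanded` by at most `max_rate * dt`."""
    max_change = max_rate * dt
    error = demanded - actual
    if abs(error) <= max_change:
        # Close enough to reach the demand within this time step.
        return demanded
    return actual + max_change if error > 0 else actual - max_change


def run_servo(actual: float, demanded: float, max_rate: float,
              dt: float, steps: int) -> List[float]:
    """Return the actual value after each of `steps` time steps."""
    history = []
    for _ in range(steps):
        actual = servo_step(actual, demanded, max_rate, dt)
        history.append(actual)
    return history
```

With a rudder demand of 20 degrees from amidships, a maximum rate of 5 degrees per second, and one-second steps, `run_servo(0.0, 20.0, 5.0, 1.0, 5)` gives 5, 10, 15, 20, 20. The operator's action is the change of demand; the sensed rudder angle follows only after a delay.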
The level of control has important consequences for what can be learnt by observing control actions. Observing control actions at the highest possible level would not reveal anything about the mental structures involved in task performance, because there would be no structure in the control actions. At low levels of control, the salient features of the control actions are likely to concern the lower levels. The extent to which higher-level structure is present and established in human control would depend on the extent to which the human had mastered the lower levels, and gone on to develop higher-level control strategies. Lower levels preceding higher levels of control is reflected in many human activities, where you have to learn `the basics' before you are able to learn the more advanced points, and this is largely dependent on experience gathered through time. If one wishes to study higher-level strategies, the situation to avoid is where a low-level interface is being used by a person who has not had the time to master the lower levels completely. For complex systems, mastering lower levels could take a long time.
The different levels of control are also reflected in Rasmussen's categorisation of skill-, rule-, and knowledge-based behaviour [101]. The lowest level of sensors is most likely to correspond to the skill-based level, where Rasmussen characterises the information as signals. When humans act at the skill-based level, their actions can often be clearly seen as effectors at a similar level---consider steering a car or bicycle, or being a helmsman on a ship without the autopilot. At an intermediate level of control, corresponding with Rasmussen's rule-based level, the actions taken are more abstract, but still without knowledge-based processing. For information to be appropriate to this level of control, it must be presented in terms of the antecedents of the rules being used. Rasmussen calls this information signs. Higher levels of control are more likely to correspond with Rasmussen's category of knowledge-based behaviour. However, at the highest possible level of control, where the task is completely automated, human cognitive processes are no longer necessarily involved at the time the control is being carried out.
The knowledge-based level is where both conscious mental processing, and explicit learning, are most likely. If explicit learning is going on, this suggests that some salient aspect of the cognitive structure is changing, and this is more difficult to study than an unchanging cognitive structure.
Overall, considerations of the level of interface suggest a fairly low level of control as appropriate to an experimental arrangement, but not so low as to make the task too complex and difficult to learn thoroughly.
In contrast with these arguments for a low level of control, the experience of the Simple Unstable Vehicle experiment (above, Chapter 4) warns us against control that is too much motor-skill based. There it was noted that investigation of motor-skill tasks is likely to require finding out about relatively low-level perceptual and psycho-motor skills.
In practice, complex tasks such as the ones we are holding as exemplars tend not to involve any motor skill. A ship's master would rarely take the helm: most actions are initiated by spoken commands. In most supervisory control tasks, there are no analogue controls present on which motor skill would be appropriate (beyond the everyday skills of pressing buttons, etc.). Therefore excluding motor skill from an experimental arrangement would benefit the relevance of the experiment.
There are various ways in which motor skills and psycho-motor limits could appear. One is hand-eye coordination: for example, the mouse could be used to guide the cursor along an intricate route; or the cursor coordinates on the screen could be used as an analogue input to a simulation. The limitations here would be more obvious in cases where a human had impaired limb movement. Another aspect of motor skill is in the precise timing of actions: either doing a planned action at an exact moment, or reacting as quickly as possible to an unexpected stimulus. Everyone knows about their own limit of reaction time.
Having no motor skills in an interface means ruling out a whole level of interaction. This is in opposition to the idea of ``direct manipulation'' (e.g., [126]), where the advantages of physical, reversible, incremental interaction are stressed. But removing much of the vast range inherent in analogue interaction makes the job of precisely recording the interaction much simpler, and may lose a great deal of variation which did not have any significance for the present study. A further advantage is that a task with a limited range of interaction could provide a fairer comparison of unmediated human ability with the performance of pre-programmed rules.
Whatever the target system, and interface to it, there is still the question of how the task is specified. Without a specified task, users of a system might explore it, or experiment with it, in whatever way comes to mind at the time. They may set their own goals explicitly, or may rely on unspoken implicit goals to guide their behaviour. They may not appear to have any goals at all.
Being goalless is not what is wanted for this experiment, for two reasons. Firstly, real-life complex systems rarely permit much exploration or experimentation. Typically, some aspects of an operator's task are clearly defined by his or her employers, and this may well be sufficient to prevent exploration, particularly when there is risk or danger involved. Secondly, in order to study the human approaches to a complex task, we need to have as much data as possible relating to the same task. Thus, we do not want to allow users to make up their own tasks as they go along, with the twin risks that the task changes frequently, and that it is not easy to know at any time what the current task effectively is.
In real-life tasks, any operator may be motivated by a number of factors, some of which may be common to all operators, and some which may be personal, or may be varied in the strength which different individuals attach to them. In this sense, the tasks performed by different people in the same job are not necessarily identical. This is even more likely to be true in complex tasks, where there are a variety of possible strategies, than in straightforward tasks, where there is a highly constrained set of methods and acceptable outcomes. Ideally we would want to dispense with this variation of motivating factors, for the sake of this stage of experiment.
Explicit predefined goals would avoid these problems, and provide a stable and well-defined task for operators to adapt to. This may be more motivating than trying to achieve one's own ad hoc goal, if only because it is difficult to give oneself finely-graded feedback on a self-defined task, and without fine feedback, the improvement with practice will be less noticeable, and therefore probably less motivating. An experimental subject is even less likely to set goals of the type usually encountered in complex systems: that is, multiple conflicting ones.
Another important factor for the potential subject is the inherent interest and challenge of the task. While a well-defined task is an important element in this, another important element is the nature of the task itself. It would seem likely that an operator could relate more easily to a task that has some realism in it, and where ``things happen''. This realism need not be the strict engineering realism of high-fidelity simulators, especially not so if the subject has no detailed knowledge of the target system. But it should give the sense to the subject that he or she is engaged in a real task. One way of spoiling this sense of realism is to have a component of the simulation behaving counter-intuitively. This is less likely to matter if it is only a weakly-held intuition about something of which the subject has little experience, but even in unfamiliar situations there will be some strong expectations based on general knowledge of the world, and these should be respected.
We turn now from the requirements of the subject to those of the experimenter. What does the experimenter do, if the experimental arrangement turns out to be producing data more relevant to another study than to this one, as was the case with the SUV study (Chapter 4)? In principle, the target system, the definition of the task, or the interface could be modified to change the nature of the data produced. An experimental system would be better, on this criterion, if it was able to be modified. A simulated target system may need to be altered if the behaviour of some part proves counter-intuitive. The task might need alteration if it produces behaviour which is either too knowledge-based or too motor-skill-based. The interface might need modification if it is too much of an obstacle in the way of performing the task.
Modifying the target system itself would be difficult for a system not written by the experimenter. The task could be changed in any case, but if the interface was not able to be changed, the task definition would have to be on paper, which may not be so satisfactory (as argued above). Altering the interface has similar constraints to altering the target system, except that no knowledge of simulation mathematics would be required. The main point here is that modifiability is not an easy criterion to satisfy, and therefore needs close consideration.
The need to log data is a briefly statable but centrally important criterion for a good experimental system. Without the ability to log data and analyse it, the experimental method would be severely constrained, and would have to rely on verbal reporting (for a discussion of verbal reports, see Bainbridge [6]).
Detailing this requirement, data needs to be logged in such a form as would permit the complete regeneration of experimental trials: both the situations which occurred during the experimental runs (in terms of information presented), and the actions taken by the operator. This must be machine-readable. The practical considerations of storing the data must also be taken into account.
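One possible shape for such a log is sketched below, purely as an illustration. It assumes a line-per-record JSON format with a timestamp, a record kind ("display" for information presented, "action" for operator actions), and a free-form data field. The record layout and the names `log_event` and `read_log` are assumptions made for this sketch, not the format used in the experiments.

```python
import json
from typing import Any, Dict, List, TextIO


def log_event(stream: TextIO, time_s: float, kind: str, data: Dict[str, Any]) -> None:
    """Append one record to the log as a single JSON line.

    `kind` is "display" for information presented to the operator,
    or "action" for a control action taken by the operator.
    """
    record = {"t": time_s, "kind": kind, "data": data}
    stream.write(json.dumps(record, sort_keys=True) + "\n")


def read_log(stream: TextIO) -> List[Dict[str, Any]]:
    """Read all records back, in the order in which they were written."""
    return [json.loads(line) for line in stream if line.strip()]
```

Because both the presented information and the actions are stored in time order, a trial could in principle be replayed record by record. Storage cost grows with the number of displayed quantities and the sampling rate, which is one of the practical considerations noted above.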
A final practical criterion of choice is the obvious one, that whatever system is chosen must be realisable in some way or other. For systems tied to bulky hardware (such as training simulators), this means in practice that access is needed. For ready-built simulations, the code must be available in a form which can be run on an available machine. For unimplemented simulations, the mathematics must be available, and it must not be too difficult to code. If the simulation and the interface are separate, the same considerations apply to both.
Subjects must also be obtainable, which means taking into account any need for skill or experience, and the time the subjects are needed for.