©1990, 1995 section list 8: Conclusions overview General Contents
Section 8.1 8.2 Implications subsections Section 8.3

8.2 Further implications for systems design, decision aids, and training

When discussing cognitive psychology above, the point was made that having a theory does not necessarily mean, either that is it relevant to real problems, or that it can be put to good use. Since this study was motivated by real problems in the world, it is appropriate to consider briefly how the work done could be built on to provide something of external use, which serves some of the purposes introduced in §1.3.4.

These could come from either of two levels of the analysis in this study: firstly (and more simply) from the representation, along with its implications for information use; and secondly (with more difficulty) from the rules, of a particular individual's task performance. The context structure and information usage could be useful to the incremental redesign of interfaces; but application to early design, training, and what is here called the 'Guardian Angel' approach to operator support, need a model of rules as well as context structure. Before considering these in detail, we shall first look at obstacles remaining in the way of applying the methodology to real tasks at all.

8.2.1 Preconditions for applying the methodology

The methods developed here aim not to rely on subjective judgement, but as far as possible to be objective and automatic. Hence it is important that the information automatically gathered and processed for the analysis covers as much as possible of the information actually used by the human operator in making decisions or in performing actions. In §8.1.4, we have discussed problems attending representing the information that is available, but this still leaves the problem of gathering relevant information in the first place. This research does not circumvent this problem, but rather highlights it by providing methods of using the information that has been gathered.

Applying to ship navigation

We have discussed above (§3.1) how ship navigation involves a number of aspects, and that there is limited usefulness in analysing one without the involvement of the others. But capturing data relevant to all the aspects presents major problems. Even for collision avoidance (a relatively easily represented aspect of the task) it would be difficult to capture the unmediated visual information that an officer of the watch might well use. And although it should be possible in principle automatically to capture the data presented on an advanced plan position indicator, there would still be problems in formalising that information, just as there were problems in the formalisation of graphical information in the experiments of the present study. But if we go beyond collision avoidance, the involvement of several people on the bridge clearly implies managerial aspects to the task, that would be difficult to capture and formalise automatically; and even on a one-man bridge, even in fog, the navigational decisions are greatly affected by communication with external agents such as harbour authorities and other ships. Before ship navigation can be fully analysed using the present methods, there is clearly more work to be done in gathering appropriate information.

Applying to other complex tasks

The problems in analysing ship navigation generalise easily to other complex tasks. We can imagine, in any complex system, the existence of sources of information that were either not monitored electronically, or difficult to formalise, or both. As well as the fairly obvious sights, sounds and smells, it is not uncommon in complex systems for the actions of an operator to be affected by factors relating to other people involved in the process. Keeping other people happy is not to be forgotten, but surely very difficult to capture and formalise.

The obvious way of attempting to perform a study in circumstances like navigation or other complex tasks would be to have a human observer recording significant events, and this could be fruitful; but there is the danger, at each of the stages of observation, recording, and interpretation, of the researcher's representation of the task interfering with discovering that of the subject being studied.

Fast dynamic tasks

The faster dynamic tasks include riding bicycles; many computer games, discussed in §5; and car driving, which will be used as an example below (§8.2.4). These pose particular problems in analysis, to the formalisation of both situations and actions.

The problem with the representation of situations is that there is an information input of major importance from vision, in a way that can appear to have the character of pattern-matching; and in the cases of driving and riding, there is much potential for information to be gained from such senses as proprioception and balance, which might be difficult to capture. Even to model human performance in the tasks of this kind that are easiest to analyse-the computer games-one might have to apply some sort of pattern-recognition preprocessing to get a tractable description of situations. At least, for computer games, the presented information is available for analysis: in driving or riding, automatic sensors cannot yet collect the same kind of information, to the same detail, as humans can. Simulation for these live tasks is certainly not easy, as was reported in §4 for bicycle riding, and therefore this would not be an easy alternative for analysing the content of the skill. A practical approach would therefore have to rely on gathering what data can be gathered by more straightforward means, and hoping that an analysis based on these would match a human performance even though it was not identical.

The problem with the representation of actions is that there are in the case of driving and riding only a few analogue inputs (steering, power) that are used to perform the wide range of actions that we are able to discuss for these tasks. Detecting what action had been performed, or indeed intended, from a record of physical movements of the controls would be difficult, and until this had been done, no satisfactory analysis of the skills could be expected (see the analysis and discussion in §4). Even with computer games of this type, where the input is technically digital rather than analogue, typically most, if not all the control is via a joystick or mouse, and there is still a great problem in adequately formalising and recognising the higher-level actions, given only data on the very limited range of low-level signals. Despite one reported attempt to extract machine pole-balancing skill from human performance on a pole-balancing simulation (see above, §3.2), simply in terms of the primitive left and right movements of a joystick [79], it remains doubtful how near this approach can come to modelling the human skill itself, without making higher-level characterisations of the actions.

For this kind of game, players are usually performing against their limits, such as reaction time and coordination. An adequate description of these skills would want to take into account these limits in some way, which may include modelling some psycho-motor aspects of task performance. Even for tasks in which the psycho-motor aspect is not a critical limiting factor, it is possible that psycho-motor factors affect the way the task is performed: this would again mean modelling these factors.

8.2.2 Interface redesign

If we imagine these preconditions to be satisfied, we could use the data gathered to reveal context structure to the task (whether by the methods used in this study, or by extensions and generalisations such as those discussed above, §8.1.4), and thus to progress towards interface redesign, addressing some of the problems of interfaces introduced in §1.2. The context structure could be made the basis for the interface, by being used as a fundamental unit of the organisation of the interface. It might help the immediate comprehensibility of this if the operator could agree meaningful names for each context. We could imagine, for example, one screen for each context, in an interface where one VDU is used to display many screenfuls of information. An important part of the interaction would be maintaining the appropriate context for the situation.

Within each context, the information that had been found relevant could be displayed. In general, we could expect the amount displayed at one time to be less than is generally presented simultaneously in complex system interfaces, but one should not lose sight of the value, and actual use, of redundant information in the same context, both to support a small range of alternative strategies, and to enable checking of the internal consistency of the data. The present study has not examined the use of redundant information, but one could extend the methods used here, to discover redundancy.

Some aspects of the style of the interface could also be based on knowledge gained from the analysis. If a context was highly ruly, the rules being derivable from only a small number of quantities, care could be taken on the appearance of the display to maximise the ease with which those quantities could be apprehended. This could well involve qualitative or symbolic displays, as well as the more usual continuous analogue or digital ones. But if the context did not have well-defined rules, and instead appeared either to be one where the information processing had a knowledge-based character, or one where the quantities needed were not sensed directly, one might want to design the display to enhance access to other sources of information, and the making of inferences, links, or analogies. Hypermedia-style interfaces come to mind here. For more subtle skills involving many small pieces of information, or graphic information, processed by pattern, one could design the display to enhance this ability, by ensuring that the important aspects of the pattern were salient.

If an interface were to be redesigned for a number of operators or users, the ease of using the currently suggested approach would depend on the extent to which the users all shared the same context structure and information use. If there were a lot in common, then the redesign could proceed in essentially the same way as for one individual. However, the present study would doubt whether individuals would be likely to share a lot in common in such a complex task, without at least extensive training designed to ensure conformity. If the context structure, only, were common to the different operators, and the information use different, then a redesigned interface could include all the information that any of the operators used in any one context. But if the operators' context structure were different, in terms of the divisions and boundaries between the contexts, and the higher-level rules governing transitions between them, it would be difficult to obtain a principled common redesign from the methods of the present study.

While discussing redesign, it may be pointed out that redesign of an interface is likely to change the way an operator performs the task, thus invalidating the analysis that led to the changed design. Another factor that would invalidate the design is if the user changed information use. Thus, in this paradigm, redesign should not be based on an analysis of operator's performance while that operator's representation was changing, but it would be valid if the representation had achieved stability, even if the action rules were still changing.

It is relatively easy to see how this methodology could lead to incremental improvements in an interface, particularly to do with prioritising the information that was actually used over that which was not, thus easing the workload of operators, and reducing the chances of overlooking important information. For the more difficult objective of identifying major deficiencies in the provision of information, a context analysis could provide the raw material for someone with extensive knowledge of the task to use their creative insight to suggest, for example, where a new source of information could relieve an intricate context structure built around a badly instrumented system.

8.2.3 Safety

Having suggested that this methodology is primarily able to support the redesign of interfaces for individual operators, we would agree that it would be counterproductive to design an interface to support any maladaptive strategies that might have been adopted by a particular operator. Such maladaptive strategies could not easily be detected on the basis of context structure alone, except by means of another expert's judgement on what information should be consulted in a situation. If, however, a detailed model of the rules of a particular operator had been obtained, from a deeper analysis, this model could be tested on a simulation of the task, to see whether the rules used were likely to lead into trouble or inefficiency in any situations-most likely those that were not frequently encountered, but possibly also where bad habits had consolidated. Finding such maladaptations could lead to the recommendation of remedial training on a simulator, focused on the kinds of situation where the model had shown potential problems.

This kind of detailed analysis of individual rules and their potential failures could bring a qualitative change in methods of human factors safety and risk assessment. Probabilistic methods of assessing the unreliability in the execution of a particular action cannot take into account the possible variation of reliability across different contexts, without having a context model. If errors could be analysed in a context-based way, there would be the potential of a very informative and precise attribution of causes to errors. The more accurately the rules of human task performance are known, the clearer will be the explanations of failures in that task.

8.2.4 The Guardian Angel support paradigm

The prospect, even a remote prospect, of having a rule-based model of operator performance in a complex task, stimulated the concept of a kind of operator support that invites the name, "Guardian Angel". The concept is like that of a guardian angel, supposedly looking over the shoulder perhaps, remaining mostly in the background; understanding actions in terms of intentions; intervening when action and intention do not match, or where a harmful intention is formed, or where important information is overlooked. The intervention could be either giving advice, information, or asking a question to direct the attention to something. Like a guardian angel, such a system would be inherently personal. In order to relate some examples to common experience, we will here consider potential application to driving a car. This example is chosen not because it is typical of the kind of complex task considered in the present study (it is not), but in order that we may relate the concept to a task of which most adults have extensive experience.

Probably many of us would be very glad of a voice quietly telling us that there is a police car following behind. The potential value of this is recognisable irrespective of the practical difficulty of implementing this technologically. But when would a guardian angel system make such an announcement? Not every time a police car began to follow, for that might become annoying. If a guardian angel system had learnt that in every case when I knew a police car was following, I rigorously obeyed the current speed limit, then the observation that I was not being rigorous would enable the hypothesis that I had not seen the police car. That would be one truly helpful time to let me know. Equally well, if I was already cruising along at the speed limit, pressing my foot hard down on the throttle would be clearly inappropriate. A similar warning, delivered in a timely fashion, could keep me out of trouble.

This is not universally applicable, however. A fire-engine driver would not appreciate such advice, and indeed, there would be no such rule observable from a fire-engine driver's past behaviour. Here, a guardian angel system would not intervene, because the actions were within the normal range.

A guardian angel system would have to know about different classes of passenger, as well. We would not want it to give suggestions to the effect that I was driving slower than normal, when I had an aged relative as a passenger. On the other hand, if I forgetfully drove with my normal style for lone driving, I would appreciate a reminder (preferably visible, and only to myself) that I had a granny in the back.

In other examples, there might be no simple external explanation of the deviation from the normal range of behaviour. A guardian angel system might come to know what my normal performance was in keeping my place in a lane-how much I deviated each side of the mean, what the frequency of the deviations were, etc. Large oscillations are obviously undesirable, and if I started to exceed my normal limits of performance, it would be quite in order to suggest that I should reduce my speed. If information concerning my consumption of alcohol was also routinely available (or perhaps taking of some prescribed drugs) a guardian angel system might recognise the performance as belonging to a known category of substandard performances, for whatever the reason was. An ideal system might also know from experience what to do to reduce the risk of accident in these circumstances. This would be based on generalisation of many people's behaviour.

Probably most of us are aware of a wide variety of driving style, though we would be hard put to define exactly how to measure it. To design a general driver advice system would mean designing to the common factors of a large number of drivers. Such a system would not have the ability to discriminate between the same performance when done by different people, and the different implications of this. We would not want drivers with somewhat less motor control continually to be criticised for their lack of perfection, but it would be useful to be able to consider the possible reasons for an otherwise excellent driver to be driving in a way that might be reasonable for others, but distinctly bad for him or her.

Perhaps the most important distinction between a general advice system and a guardian angel system is in the likely response from users. With a general advice system, there would be external rules and standards, and it is easy to imagine a driver rejecting advice with the riposte that "It is perfectly safe!" The idea behind a guardian angel system is that the advice given would be like one's own advice, distilled from one's own performance, gathered and approved over long periods. If delivered appropriately, that advice should not suffer from the same disadvantage of being felt as fundamentally alien.

In order to work at all, such a system would need comprehensive access to information about the quantities or variables that affect task performance. Initially, there would be a learning phase, where the guardian angel system formed a detailed model of the user's task performance skill. The only advice that could be given initially would be of the same grade as from a general advice system, possibly tailored by user's choices, or by fitting the user into a stereotype.

In time, enough data would accumulate to allow the derivation of a representation, and rules. Further data would have to be continually added and analysed, to ensure that the rules remained current. If no input was obtained from the user, the rules would reflect what he or she did, rather than what he or she thought was good performance. However, one can see much more potential coming from a system that includes value judgements from the user on his or her own performance. Most people seem to be aware when they make a mistake, or do something that they would rather not repeat, and if this value judgement could be incorporated by the guardian angel system, it could form models not only of what the user did, but what the user thought was decent performance. This might lead even to the ability to suggest causes of poor performance, and suggestions for avoiding that.

In the European Community DRIVE Programme (Dedicated Road Infrastructure for Vehicle safety in Europe) there is a project, Generic Intelligent Driver Support (GIDS) [130], which aims to deal with similar issues, specifically for driving, including the idea of an adaptive interface: but they do not consider machine learning as a tool. The advantages of approaches using rule induction are firstly that the adaptation of the interface could in principle be closer than is possible using predefined stereotypes, and secondly that it provides a searching test of the adequacy of a representation for describing performance. If such a test is not carried out, it is easy to rely on a representation that only encompasses a small subset of the real task. This is related to the problem identified earlier (§2.3.1.5) in which formalisms may fail to represent a task adequately.

8.2.5 Training and assessment

The clearest and most obvious application to training, of the analytic methods discussed in this study, would be to assess training's efficacy. Since training can potentially make a difference to the way in which a task is learnt, a detailed analysis of what has been learnt, in terms of contexts and rules, may reveal generic differences in the results of different training schemes, beyond the differences between the individual trainees. If the individual characteristics of the trainees were taken into account, there would be the potential for discovering whether different training regimes suited different types of people. In these cases, the context and rule analysis of task performance would be contributing more detailed feedback about what is learnt than is normally obtained through straightforward tests of speed or accuracy.

However, in the course of this study we have confirmed that it is more difficult to discover detail about task performance when a task is still in the early stages of learning, because in general the contexts and rules have still not stabilised. One outcome of the analyses performed is to show that different contexts differ in their degree of ruliness; and this suggests that even at the earlier stages of training one might be able to identify particular contexts where the rules were established relatively quickly.

Designing a training programme for a new task poses more problems. After such a program had been running for a while, analysis of the results of training, as above, might be able to guide a redesign. But this does not help in the initial design of training: for that, one would need general principles of what was learnable, and what a human would be likely to learn about a task. This overlaps with the problem of early design, which will be discussed below.

Related to the concept of training, we could ask whether the methods of analysis described in this study could form the basis of an assessment tool, to discover the suitability of different people to the performance of different kinds of tasks. The answer to this is by no means clear, but if an answer were to be found, it would most likely relate to parameters governing the structuring of a task, which would here be chiefly about contexts. If one were to study the task performance of subjects, in terms of rules and contexts, across a wide range of tasks, there might be consistent differences between people, for example in terms of the number of rules that could be comfortably accommodated in one context; the number of different items of information that were taken into account in each context; the number of contexts into which a given task was split; and the amount of intermediate processing necessary in the execution of the rules in any context. From these differences, it might be possible to factor out one or more dimensions of ability in general task performance.

8.2.6 Early design

To be able to help with early design, or (relatedly) to help with the construction of a training programme for a new system, a model of an operator's capabilities must exist prior to the analysis of an actual operator's performance. This model would have to be able to address the question, "how difficult is the task of controlling this system, given this amount of information?" An even more general question would be, "What information has to be provided to make this task doable?", and we could imagine answering this latter question in terms of the former one, using the extra input of the cost of providing whatever information is needed. These questions are closely related to the idea of 'cognitive task analysis' which has been raised in the discussion of the literature above (§2.1.2), and in a more extensive focused fashion elsewhere [43].

Here, as previously, to answer these questions we need to know something of the parameters governing human ability to structure a task. If we then came up with a model of one possible method of performing a task, those parameters could be applied to that model, resulting in a judgement whether that particular method was humanly possible or not. A positive result should be reassuring, that the task was indeed possible, but should not be taken to imply that any human would actually perform the task in that way. But the converse would not be true: just because one found that a particular method was implausible for a human would not mean that the task was impossible. To prove that a task was impossible would be at least much more exacting, if not itself strictly impossible.

To design a training programme, or to derive human models which could be expected to arise from a (possibly null) training programme, would need a model of human learning as well as parameters governing what is learnable. For this reason, the automatic design of a training programme is a yet more distant goal.

Next Section 8.3
General Contents Copyright