|© 1994||Home page||Publications||Paper frame|
Many tasks, in particular complex tasks, present the operator with a great deal of potentially relevant information. The design of displays for these tasks has been the subject of much discussion, particularly focused around the issue of whether it is better, on the one hand, to present all the available information all the time (perhaps around the walls of a control room, ship's bridge or aircraft cockpit), or on the other hand, to select a small subset of the information that is most relevant to the operator at any particular time (e.g., Bainbridge 1991). The option of selecting a small set of the most relevant data is attractive when constructing an interface based around display screens - the typical cathode ray tube, for example.
Some model of users' information needs is required on which to base such a selective design. It is common, however, in complex tasks, for users to differ in their strategy, and hence in their information needs. A clearer, more detailed model of human information use would be likely to help towards the design of better interfaces; and for tasks where users' information needs are found to differ, it would appear to be essential to have this kind of user model guiding the system's provision of information (Grant & Mayes 1991).
The object of this paper is to present groundwork for better models of cognition which may be used as the basis for better interface design generally; and more specifically, as the basis for more effective modeling of users' information use. Since complex tasks often allow inter-individual differences in performance, such models as are made of task knowledge in general should primarily address the overall structure of that knowledge. In particular cases, researchers may attempt to fill in the detail, and this is in principle easier if a framework is first established.
All well-known modeling frameworks, whether from AI, cognitive science, or more practical study, involve some kind of modularity. What kind? Modularity in models of complex cognition can be achieved through either functional, or contextual, differentiation. Since the main focus of this paper is on contextual modularity, functional modularity will be discussed first to clear the ground.
Where necessary in this paper, the reader may imagine the task of flying a small powered aircraft, and the design of an interface to facilitate that task. This is simpler than likely real applications in the interests of comprehensibility.
It is common practice to divide cognitive faculties into functional modules, not unlike the block diagrams that have been used in the design of the structure of computer software, or the very common data-flow diagrams used similarly in structured systems analysis. Where researchers come to divide up cognitive function by way of top-down decomposition, it is not surprising that such a structure should result. I will give two examples here of predominantly functional modularity in models of cognition.
Barnard's Interacting Cognitive Subsystems (Barnard 1987) provide an interesting example of a functionally modular model of cognition. The subsystems (such as visual, propositional, motor, etc.) are described as independently operating, connected by a data-bus-like structure. Widespread academic discussion of this model suggests that this is a promising way of giving modularity to cognition, but it does not address all the questions necessary to explain complex task performance. Quite apart from anything else, there is no attempt to explain why some subsystems should be used in particular situations, where other ones may be used in similar situations.
The work of Shallice and others (Shallice 1988; Fodor 1983) gives another view on modularity. Shallice's main motivation is to form models capable of explaining phenomena of impaired cognition. Clearly, if cognition is impaired in a particular way across all tasks, we are looking for an explanation in terms of a functional module, which may be associated with a particular location in the brain. This does not address, however, the performance of perfectly normal and competent people who happen to do things they might later regret. This needs a contextual view.
It should be clear that most models of cognition have modularity both of function and of context. The difference between the two groups identified here is only in the emphasis given to one or other.
In contrast with the functional view, contextual modularity here is taken to mean the division of cognitive structure into modules of similar functionality, but differing context, such that each module performs an analogous role in its own specific context. Each module, so divided, deals with a separate stage of a given task. Thus, whereas the casual observer may see a light aircraft taking off, flying round, and landing, the learner pilot quickly recognises this circuit as broken into different stages, including the takeoff, the downwind leg, the base leg, the final approach, and the landing itself. Each one of these stages has its own important things to observe, and its own rules for action. The precise granularity (and naming) of the modules may be debated: the issue here is rather the fact of division and the nature of that division.
Contextual modularity has no very obvious correlate in general-purpose computer systems, where large amounts of immediately accessible memory, along with virtual memory management, mean that a great deal of information is simultaneously available. But from basic knowledge about memory (see, e.g., the popular account in Baddeley 1983), it is clear that humans do not function in this way. The small size of the typical contents of human working memory suggests a small basic unit, or module, of knowledge, since it is implausible to have all of complex cognition in one large unit. The view that human knowledge structures are divided into such small units is the basic assumption underlying a contextually modular view.
There are many strands to the modeling of cognition that could be seen as coming from a contextually modular position. Taking some of the seminal ideas from the literature, we could consider any schema theory (e.g., Bartlett 1932) to be a form of this division, as could frame theories (Minsky 1985) and script-like theories (Schank 1982). On the other hand, theories such as ACT* (Anderson 1983) have little in the way of contextual modularity. In the case of Soar (Newell 1990), the only true contextual modularity is in terms of separate "problem spaces".
There is clearly no close agreement among cognitive models on the issue of contextual modularity. In view of this, a useful starting point for contextually modular modeling would be twofold: to consider the nature of the contextual modules themselves, and to consider their interrelationship.
A challenge for any cognitive model is to account not only for the regular performance of human skill, but also for the less regular features of human performance, including errors, interruptions, and biases. Here it is argued that, in particular, this requires attention to be given to the transitions between contextual modules, and that this could be a way forward to more thorough and powerful models, particularly of complex task performance.
Introduced above was the idea that there are two facets to modularity: the nature of the modules by themselves; and their interrelationship. In the case of contextual modularity, the question of the nature of the modules includes discussion of exactly what is contained within or associated with a module, with respect to both quantity and quality. The interrelationship of the modules seems to be a more difficult question. Unlike the case in functional modularity, where the functional relationship between the modules is part of the reason for their existence, for contextual modularity we must consider how, during cognition, the modules are selected or replaced, since, if they perform essentially the same function, they cannot all be primarily operating at once (possibly only one can be).
With a model of contextual modularity, there is the question of whether these two facets are addressed, and what the balance is between the facets. Here, we shall consider each facet in turn, relating them to classic models of the past.
Bartlett is one of the originators of a view of memory as having schemata (Bartlett 1932), where each schema is a known pattern. A schema is "an active organisation of past reactions", but Bartlett gives no indication of how the schemata are switched between in the course of complex task performance. In a similar vein, Schank's scripts and MOPs (memory organisation packets) (Schank 1982) have been based on the understanding of stories, rather than the performance of tasks. The scripts or packets themselves are clearly defined, but the ideas on switching are weak, and the theories do not explain complex task performance. Minsky's concepts of frames and agents (Minsky 1985) also clearly have much of the same character of contextual modules. But again here, much more effort is put into the delineation of the structure of the frames and agents, and little into detailing their interrelationship.
In an attempt to produce models of behaviour in a complex task using rule-induction techniques, Grant (Grant 1990, Chapter 7) investigated dividing up human actions according to the information that was available at the time the actions were taken. This led to a clearly contextually modular structure, where at any time the user or operator is in one or other "context", as the contextual modules were referred to. Each context has specific rules governing decisions or actions, a specific cognitive representation of the relevant variables in the context, and specific sources of information which are used in the derivation of the relevant variables. In that study, the mechanism for transition between contextual modules was not very clearly defined, but it was suggested at least that there may be learned cues, which had to be related to the currently observable quantities being monitored at the time.
The difficulty for any model without a clear concept of module interrelationship is that, although the model could simulate a part of a task well enough, it would rapidly get lost when trying to switch between appropriate modules. In the dynamic control literature, this is sometimes referred to as "situation awareness", and approaches to modeling it are not immediately obvious.
So, in contrast, here we see the other side of contextual modularity. The examples chosen here are of working computational models, and this is not surprising, because in order to make a working computational model one must have a clear execution model, including effective transition between whatever contextual modules there are.
ACT*, PUPS etc. (Anderson 1989) have a clearly defined model of learning, but not such a clear model of execution of tasks using the learned knowledge. Procedural memory, modeled as production rules, is not explicitly divided into contextual modules. As in many production systems, all of the productions are considered at each cognitive cycle. The semantic network of declarative memory works on the basis of spreading activation, rather than context. This leaves the contextual granularity at the level of the single production or semantic unit, which is smaller than is suggested by the previously cited models.
Soar (Newell 1990) has a clear execution model within a problem space. The approach is highly generalised and unified, and this is made explicit through the assumptions that are stated, such as the problem space hypothesis, and the universal subgoaling hypothesis. Unfortunately, this strictly tree-structured hierarchy does not appear to correspond well with human complex task performance (see, e.g., Bainbridge 1993). Soar has a concept of context, which is associated with the goal stack, and this is the context in which a particular production fires or not. But it is not a cognitive context in the sense of being responsible for context effects such as priming. Chunking in Soar is to do with the replacement of a sequence of cognitive processing with one cognitive operation, which, again, is not the same as a contextual module. In Soar, perhaps the closest correspondence with the contextual module is the problem space itself, as there is a different problem space for each problem, but Soar is devoted primarily to the mechanism of problem solving and learning within a problem space: hence moving between problem spaces is given little attention.
The current models specifying the interrelationship of contextual modules only really deal with predictable behaviour, not exceptions. The difficulty appears to be that emphasis on switching between the modules makes their nature less clear. This may be because one effective control mechanism, once devised, is capable of dealing with a variety of data. Without a clear commitment to cognitive plausibility, it is easy to choose a representation that is computationally convenient rather than cognitively accurate.
The foregoing discussions of modular models may be put together thus. If, in a model, there is too much focus on the nature of the module, the interrelationship between the modules can easily be glossed over, and vice versa. The choice, for a model framework, of a particular module nature may easily have consequences for the associated model of interrelationship, and again, vice versa. To illustrate this, consider two modular theories with different sized modules. The transitions between the modules would not be the same. Again, the transitions between modules would be very different depending on whether the modules themselves contained information concerning the transitions, or whether all transitions were managed by a separate function.
What is needed are theories and models that consider equally the nature of the modules, and their interrelationship. To be cognitively plausible, rather than just an exercise in AI, the modules must be of such a nature and size as is compatible with known contextual effects; and the modules must be interrelated by transitions which are compatible with what is known about human switching of context. This switching may be particularly significant in the discussion of "human error".
The work of Lisanne Bainbridge is an important step towards just this kind of model that deals with both nature and interrelationship (Bainbridge 1974). She developed on paper a model of the process control skill of a steel-worker performing a realistic simulation task. This model is in the form of a complex flow-chart, where the cognitive processes are detailed to a level that allows estimation of the load on what she terms "working storage". The cognitive processes are divided into what Bainbridge now terms "routines" and "sequencers", which both correspond to recurrent patterns of action (and associated verbal protocol). The interrelationship between these is straightforward. The routines are called by the sequencers, and return a value to them along with the control. An example of a routine, in the steel-works task, is the decision of which furnace to cut the power to. In contrast, the sequencers do not return control; rather they pass control on to another sequencer.
This style of model is still, it seems, unique in the analysis of complex tasks. It offers a plausible model of a particular operator's usual performance, in a way that takes into account cognitive capacity. However, it does not deal with the kind of module transition that is unexpected. For this, we have to suppose another mechanism, and this invites a thorough look at how the nature of contextual modules may interact with these two kinds of transition between them.
How are these cognitive contextual modules coordinated? Common observation of human tasks brings up two important points.
The first is that, as people practise performing a task, they are increasingly able to move smoothly and without apparent effort between different parts of the task. Using Rasmussen's ideas (Rasmussen, 1986), we could say that information processing becomes increasingly dominated by the skill-based level, where the information is perceived as signals. Specific cues are learned for many parts of a task, and we may well suppose that this includes moving from one stage of the task to another: one contextual module to another. What becomes clear in many tasks is that the cue itself does not determine the destination module. An airspeed of 80 knots might, according to context, be the cue for taking off, or the cue to perform the next step of an aerobatic maneuver. This suggests the involvement of the contextual modules themselves.
The second observation is that there are many situations that may be imagined where an unexpected event, or the observation of an unexpected value, may completely interrupt a task, and replace the actions that were to be carried out by quite different ones. An explosion might be a good example in many settings. In the air, this might be followed by complete loss of control, which would probably suggest bailing out, if parachutes are provided.
The hypothesis here is that there are two fundamentally different kinds of transition from one predicament to another, appropriate to the two observations above. These are firstly, a learned, context-dependent mechanism; and secondly an associative mechanism dealing with situations that have not often been met before.
The essence of the contextually modular view is that regularities appropriate to certain contexts are stored together, and are accessible together. Where decisions or actions are taken, the rules for these would be included in the module. It is only a short step from here to including the regularities governing transitions between modules. There could be, for example, rules of the same form as decision rules. This kind of transition rule would be reliable, and suitable for automatising in the course of development of a skill.
One advantage of having transition rules associated with contextual modules is that the rules for decision or action can be much simpler than they would be in a non-contextual system. To avoid ambiguity in a non-contextual system, one would need explicit conditions attached to any rule for that rule to be able to fire. As well as the extra space required to store these conditions, this requires that evaluation of the conditions be done before the action is taken. Moreover, in a non-contextual system, all the rules for the whole task would have to have their conditions checked at each decision point in the task.
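The efficiency argument above can be made concrete with a minimal sketch. All rule names, conditions and thresholds below are invented for illustration; the point is only the contrast between a flat system, which must guard and check every rule in the whole task at each decision point, and a contextual system, which consults only the active module's (smaller, guard-free) rule set:

```python
# Flat system: every rule carries an explicit context guard, and all rules
# in the whole task are checked at every decision point.
flat_rules = [
    {"context": "downwind", "condition": lambda s: s["abeam_threshold"], "action": "turn_base"},
    {"context": "final", "condition": lambda s: s["airspeed"] < 65, "action": "add_power"},
    {"context": "final", "condition": lambda s: s["runway_clear"], "action": "continue_approach"},
]

def flat_decide(context, state):
    checked = 0
    for rule in flat_rules:            # every rule considered, every cycle
        checked += 1
        if rule["context"] == context and rule["condition"](state):
            return rule["action"], checked
    return None, checked

# Contextual system: the same rules partitioned by module; only the active
# module's rules are consulted, and no guard conditions need be stored.
modular_rules = {
    "downwind": [(lambda s: s["abeam_threshold"], "turn_base")],
    "final": [(lambda s: s["airspeed"] < 65, "add_power"),
              (lambda s: s["runway_clear"], "continue_approach")],
}

def modular_decide(context, state):
    checked = 0
    for condition, action in modular_rules[context]:
        checked += 1
        if condition(state):
            return action, checked
    return None, checked

state = {"abeam_threshold": False, "airspeed": 70, "runway_clear": True}
print(flat_decide("final", state))     # checks all 3 rules
print(modular_decide("final", state))  # checks only the 2 rules in 'final'
```

The saving here is trivial, but it grows with the size of the task: the flat system's cost scales with the total number of rules, the modular system's only with the size of the current module.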
Learned transition rules are more like "goto" instructions than like procedure calls, and for this reason, they may be imagined in the form of a transition network, rather than as a tree-structured hierarchy of goals. If a graphical notation such as this is to be used, it must be remembered that a contextual module has internal structure, and is not a single atomic entity. Arrows may then be drawn starting from a particular part of the boundary of a module, which would be associated with a condition being fulfilled, and leading to the next appropriate module. An illustration of how this could be drawn is given in Figure 1.
In the figure, we see represented as 'blobs' some possible contextual modules for the stages in flying around the local airfield circuit by a beginner pilot, along with some learned transitions between these. The normal transitions between stages are learned, and therefore they are drawn coming from particular parts of the boundary of the relevant 'blob'. Each box attached to the boundary of a 'blob' represents a particular set of conditions being fulfilled, leading to a learned transition to another contextual module. This is not intended to be substantially different from other forms of graphical representation, rather it is one possible convenient way to draw such models.
There is a choice of transition from final approach, either to landing, if the criteria are satisfactory, or overshooting, if certain parameters are wrong for a safe landing. Many more transitions could, of course, be drawn in, including the practiced responses to engine failure, incipient stall, etc. The diagram gives only a first idea of how one could draw the modules and their interrelationship.
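Such a transition network is easily sketched in code. The following is a minimal illustration only (the stage names loosely follow the circuit stages discussed above, but the exit conditions and thresholds are invented): each contextual module carries its own learned transition rules, corresponding to the condition boxes on the boundary of each 'blob'.

```python
# Each module maps to its own learned exit rules: (condition, destination).
circuit = {
    "takeoff":   [(lambda s: s["airspeed"] >= 80, "climb_out")],
    "climb_out": [(lambda s: s["altitude"] >= 1000, "downwind")],
    "downwind":  [(lambda s: s["abeam_threshold"], "base_leg")],
    "base_leg":  [(lambda s: s["aligned_with_runway"], "final_approach")],
    # Choice point: land if the approach criteria are met, else overshoot.
    "final_approach": [
        (lambda s: s["approach_stable"], "landing"),
        (lambda s: not s["approach_stable"], "overshoot"),
    ],
    "landing":   [],
    "overshoot": [(lambda s: s["altitude"] >= 1000, "downwind")],
}

def next_module(current, state):
    """Learned transition: consult only the current module's exit rules."""
    for condition, destination in circuit[current]:
        if condition(state):
            return destination
    return current   # no learned cue fired; stay in the current module

# A stable approach leads to landing; an unstable one to the overshoot.
print(next_module("final_approach", {"approach_stable": True}))
print(next_module("final_approach", {"approach_stable": False}))
```

Note that, as argued above, this is a network of "goto"-like transitions rather than a call hierarchy: control passes on and does not return.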
The learned mechanism for transition between contextual modules is executed by transition rules that are specific to each module. But the problem with a learned mechanism is that it cannot deal with the multiplicity of events that would actually be a radical change of context to a human. This is not unlike the frame problem. When flying, it is clearly implausible to suggest that all the events that could stop one flying were explicitly encoded. Hence the hypothesis of an alternative mechanism for just such unexpected situations.
Where a specific transition rule has not yet been learned, there must be a mechanism to get the human out of inappropriate perseverance within that module. The model framework suggested here proposes that this happens in two stages. In the first stage, the human detects some relevant condition that makes the current state lie outside the normal operational envelope. Note particularly that the terms in which the current state is described - the local representation - differ between modules. The exact conditions in which this happens probably vary between individuals and between situations, but it is easy to think of several unanticipated events that would interfere with normal level flight, for example: unexpected responses to control movements; unpracticed modes of instrument failure; the sudden sight of another aircraft on a potential collision course; strange messages on the radio.
In these circumstances, the pilot is aware that the situation may be inappropriate for the set of rules that are currently in operation, but no learned cue has been encountered that would have led to another contextual module in a routine way. Something must be done, however, and the problem for the human (and our models of the human) is how the next module is selected. This may be envisaged in terms of the graphical state transition network introduced above. The situation comes to be seen as outside the range of convenience symbolised by the 'blob', but the exit was not via one of the recognised paths, and so there is no prearranged contextual module that follows.
At this point, an associative mechanism would serve. It could be that the known states of affairs are matched against the characteristics of possible other contextual modules: the characteristics could include salience, recency, frequency of encounter, and even associated affect, of the modules, as well as the match of their typical features with the features of the current situation.
Because of the range of features that may be used, and their variability, this associative mechanism is likely to be unreliable, and to give different results depending on chance circumstances. In this way it is very different from the learned mechanism. Nevertheless, as the same situation is encountered repeatedly, the association would become routine, and a transition rule would be learned.
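One way the associative mechanism might be sketched is as a weighted scoring of candidate modules. The weights, features and module characteristics below are invented assumptions, not part of any established model; the sketch only illustrates how feature match could combine with salience, recency and frequency of encounter:

```python
def feature_match(module_features, situation):
    """Fraction of the module's typical features present in the situation."""
    if not module_features:
        return 0.0
    return len(module_features & situation) / len(module_features)

def associative_select(modules, situation, w_match=2.0, w_salience=1.0,
                       w_recency=0.5, w_frequency=0.5):
    """Pick the candidate module with the highest weighted association score."""
    def score(m):
        return (w_match * feature_match(m["features"], situation)
                + w_salience * m["salience"]
                + w_recency * m["recency"]
                + w_frequency * m["frequency"])
    return max(modules, key=score)["name"]

modules = [
    {"name": "normal_level_flight",
     "features": {"engine_running", "controls_responsive"},
     "salience": 0.2, "recency": 0.9, "frequency": 0.9},
    {"name": "forced_landing",
     "features": {"engine_stopped", "altitude_available"},
     "salience": 0.9, "recency": 0.1, "frequency": 0.1},
]

# An engine failure in cruise: the situation's features match the rarely
# encountered but highly salient 'forced_landing' module.
situation = {"engine_stopped", "altitude_available", "controls_responsive"}
print(associative_select(modules, situation))
```

The unreliability noted above falls out of the sketch naturally: small changes in the weights, or in which features happen to be noticed, can change which module wins.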
The distinction between the two transition mechanisms is thus not yet entirely clear. What has been described here are the two extreme cases: of a well-learned transition rule; and of a situation that has never been encountered before. There must also be intermediate stages of some kind. This invites future work to be done in testing whether these mechanisms do in fact represent what happens, and if so, how one form of transition develops into the other.
There are many ways in which this work could be significantly taken onwards, first theoretically and then practically. Work in progress includes the development of computational models embodying these theories, and it is expected that this computational effort will be fruitful in guiding the model towards clarity and consistency, while providing a computational framework capable of being scaled up to the size range associated with real complex tasks. Perhaps it should be a commoner research issue, but computational modeling of complex tasks has rarely been attempted.
The issues raised in this paper suggest two complementary questions that may be asked about any model of cognition in complex tasks. If a theory has yet to be implemented computationally, one may ask how it would be done. On the basis of current examples, this is likely to reveal major problems. On the other hand, if a model is already executable, one may ask, do the data structures of the model tally with what we know about human memory and skill? Again, current models have tended to invite the answer, no.
The most important principle to keep in view is that of cognitive plausibility. The claim in this paper is that to get a good model of complex tasks, one needs both the right modularity and the right interrelationship between those modules, and that the most fruitful way of addressing these two is together. This raises several subsidiary issues.
The first issue is granularity of contextual module. The two mechanisms proposed give two, not necessarily identical, partitions of task knowledge into contextual modules. The routine transitions emphasise a granularity based on current decision or action rules, and current local representation of the task state space. The associative transitions emphasise a granularity based on points at which the task may be joined, and sections of the task which do not permit restarting other than at the beginning. Further empirical work clearly needs to be undertaken to clarify this granularity.
The second issue is the proceduralisation of task knowledge, and the development, in Rasmussen's terms, from knowledge-based behaviour through rule-based to skill-based behaviour. How can this be modeled? It is clearly an important question, and it is hoped that this paper's setting out of the two kinds of transition, with the associative used more by the novice, and the learned by the expert, may help to focus the issue.
A third topic for investigation, following from the discussion here, is how interruption and reorientation mechanisms work. Experiments could be designed to test the model presented here, in which interruption corresponds to associative transition and routine progress to learned transition. This also implies that reorientation to a task starts at the boundary of a contextual module as described here.
Based on these more theoretical considerations, practical progress could arise in a number of ways. Firstly, a suitably detailed model of a particular user performing a particular task could help to predict when that user will take decisions or make actions that could be seen as in error. This work is essential to the accurate estimation of system reliability, which at present uses many models of questionable power (Reason 1990). Aircraft crashes are frequently attributed to human error, but the cognitive analysis of these errors has not been thorough. The distinction made in this paper between learned and associative transitions between cognitive modules plays a part in possible analyses that are more thorough.
Secondly, in cases where there is sufficient commonality between different operators, mapping the cognitive structures involving the use of information in complex tasks could lead to better interfaces. If we wish to design interfaces that provide just the right information at the right time (rather than making the user waste time and effort sorting out relevant from irrelevant data), we need user models that deal with information requirements at the different stages of a task, and that predict operators' transitions between those stages. In the case of the flying example, an ideal (though improbable) redesigned interface could give just the information that was needed for each particular stage. When on the downwind leg, the appropriate checklist could appear. If the pilot seemed to be performing actions inappropriate to the context, the system could draw attention to data that could have been overlooked, and which would be likely to cause the pilot to reassess the situation.
Perhaps the most interesting long-term possibilities arise from the potential role of this modeling framework in the ability of the computer system to continually refine its model of the user's skill. Current modeling frameworks have not enabled computer systems to build good predictive models of human skill from an observation of their actions. The current suggestions may not lead all the way there, but we may conjecture that an advance in the understanding of the detailed structure of task knowledge in general could well lead to corresponding advances in the ability of the system to induce models of users' skill from traces of actions. This would, if it were successful, enable systems to offer interfaces to complex tasks that would be both truly adapted and adapting to individuals' patterns of information use. For example, there are many possible styles of final approach to landing, and a suitably intelligent pilot support system might get to learn the preferred methods and tolerances of particular individuals. The system could then advise when that person was falling short of their own standards. This has been referred to as the Guardian Angel support paradigm (Grant 1990).
This paper suggests that one good reason why we do not have useful user models predicting behaviour and information use in many tasks (especially complex ones) is a lack of clarity about the nature both of the modularity of the user's task knowledge and of the mechanisms of transition between those modules. Addressing these issues by modelling cognition in the way described may therefore be an important contribution to the future practical applications of user modeling.
I would like to thank Lisanne Bainbridge and Rick Cooper for fruitful general discussion of the field, and Alistair Sutcliffe, Gordon Rugg, Peter Faraday and two anonymous referees for commenting on draft versions of this paper.