
SimulACRUM: A cognitive architecture for modelling complex task performance and error

INTRODUCTION

Analysis of currently available cognitive architectures shows that there are gaps in the treatment of human errors in complex task performance. On the one hand, well-known 'architectures' such as Soar (Newell 1990) and ACT-R (Anderson 1993) have been used predominantly to model learning or skill acquisition, which are rarely the most salient factors in errors found in complex systems. Rather, we see mode errors, fixation errors and other kinds of phenomena that are often clearer to the domain practitioner than to the cognitive modeller. There has been extensive treatment of such practical themes in the literature, for instance by Reason (1990), Hollnagel (1993), and Woods, Johannesen, Cook, & Sarter (1994). Decision making under time constraints is important in this literature, and the concept of situation awareness is prominent, notably in tasks such as aviation.

On the other hand, the foremost work in modelling and simulating complex tasks, e.g., (Cacciabue et al. 1992), though it takes theory from cognitive science, draws the underlying architecture more from computer science, and thus lacks detailed correspondence between the mechanisms of the architecture and plausible mechanisms implementing effects recognised by cognitive psychology. The philosophical point of view also gets little attention.

The challenge is to devise a more realistic cognitive architecture that can provide the basis for meeting the practical modelling requirements of cognitive ergonomics. We shall first consider some examples from the literature of the kind of human error that needs to be modelled.

EXAMPLES OF ERRORS FOR MODELLING

Brief discussion of three examples of human error taken from the literature indicates issues which are important for the cognitive architecture underlying any adequate model of the cognitive processes responsible for these errors.

An Unexpected Maintenance Event

Woods (1994, p.40) describes the incident of the missing O-rings in the Eastern L1011 in May 1983 in more detail than is given here. Mechanics usually replaced the aircraft engines' 'master chip detectors' complete with O-rings that had already been fitted by someone else - helpfully, but against the exact definition of the procedure. When the usual place to get detectors was empty, they obtained others, but these did not have O-rings ready fitted. They did not notice that the O-rings were missing, and checks were inadequate to catch the mistake. This led to all the engines malfunctioning during the next flight.

This suggests that the relevant skill of an operator is not identical to the procedure manual, and that an analysis of error limited to deviations from written procedures is inadequate. The operator's error in this case (and many others) is not a deficiency in routine knowledge, for the situation in which the mechanics normally found themselves was adequately covered by their skill. The error arises from the combination of a particular level of skilled knowledge with some kind of unexpected or unusual situation.

Three Mile Island

Reason (1990, p.251; pp.189-191) gives a useful summary covering the main chain of events, active errors, contributing conditions, and latent failures. The accident is so well-known that there is no need to describe the scenario here.

A cognitive explanation of the accident depends critically on a model of how the operators' limited knowledge of, and limited access to, information led them to suspect that the problem was other than it actually was. The relief valve (PORV) was stuck open, but the indicator said closed: we need to model the fact that, based on the observations likely to be made at the time (which itself depends on how the situation was being perceived), the diagnosis they made was plausible. We should also model the fact that the observed evidence did not produce a timely shift of view - in this case, to a more realistic recognition of the plant state, including a misleading indicator.

There were important failures of management, regulation, design, training and maintenance, pointed out by Reason, without which the accident could have been avoided. However, from the point of view of analysing the immediate causes of the accident, these wider organisational (or latent) factors can be seen as either affecting the availability of information to the operators, or affecting (or limiting) their operationally effective task skill to respond in appropriate ways to the information provided.

A Well-Studied Air Accident

The official report of the Mont Sainte-Odile aircraft accident (on the approach to Strasbourg, France) was produced by the commission of enquiry (Monnier et al. 1993). One clearly possible, and commonly acknowledged, interpretation of the accident is that one of the pilots mistook the mode of the flight management system, entering a descent rate of 3300 feet per minute instead of a descent angle of 3.3 degrees, the two settings being easily confusable. (The design has since been changed to be less confusing in this respect.) Perhaps the most surprising aspect is that neither pilot noticed the effects of the mistake, which we might imagine to have been fairly obvious. Their failure to notice was probably aided by the fact that the flight was in the dark, and that complications in the approach demanded abnormally high attention and workload.

To get the error in perspective, it is helpful to remember that no more than about one minute elapsed between the presumed wrong setting and the crash itself. This minute was one in which they were concentrating on another task, presumably not suspecting that they had any need to check on the progress of what was being taken care of automatically by the flight management system.

A cognitive model of this accident needs to take into account the effect of high workload, and relative inexperience, on which variables would be noticed, and on the inferences drawn from those variables as to whether anything was amiss with the progress of the flight. One common factor in air accidents of the type "controlled flight into terrain" (CFIT) is that there is no awareness, at least until it is too late, that there is anything wrong with the flight path. The crew's situation awareness has diverged from what it should be, namely a view consistent with reality.

General points

A key feature needed to model this kind of accident (or similar incidents or events) is the combination of a rule-based approach to normal situations, together with the ability to predict what will happen when the situation is changed in some unexpected way, in terms of the humans' situation awareness (that is, how the human understands what is happening). In order to do this, one needs to go below the level of the complete rule. The approach that emerges from the examples given is to consider what the human is likely to observe, and what known situation will be inferred by the human as being the most likely one given what is observed, and given the other factors in the situation.

The easier step from there is to make a reasonable explanation of any particular accident. This needs an intuitive understanding of how human cognition is likely to work in the given situation. But the harder step of making a predictive model, or even a cognitive simulation of the behaviour noted, needs those insights on the workings of human cognition to be formulated and implemented.

CONSTRUCTING A COGNITIVE MODEL

The features that are necessary for a model of this kind centre on the types of situation in terms of which events are interpreted by the operator, and on the variables noticed at different times, which guide both the interpretation of what the situation is and any action that may be called for in it.

In a complex system, where many more variables are involved than can be monitored at all times, the selection of variables to monitor at any moment is largely dictated by the particular situation. Thus, in a modern, glass-cockpit aircraft, the pilots will select the particular information display on the screens that most closely matches their needs. However, it is also clear that some other variables, not necessarily recognised as directly required, are also salient. Thus, had the Mont Sainte-Odile scenario arisen in clear daylight, the pilots would more likely have noticed their approach to the ground, which could have interrupted their concentration on the lesser task of reaching the runway centre-line.

As in the discussion above, the organisational and latent factors enter this cognitive analysis by affecting either the content and extent of the pilot's skill, or the availability of information. A mismatch between the task, the operator's skill, and the information provided can then often be seen as a responsibility of the management. Thus a cognitive model may provide another view of, and approach to, organisational factors.

The importance of the cognitive factors is common to the accidents described, and to many other examples of human error in complex tasks with complex systems, including air traffic control, the medical profession and the financial industry. This gives an incentive for building a modelling system which has built-in mechanisms capable of handling these phenomena, rather than trying to build, from scratch, a cognitive model of each different domain and each different accident. The common mechanisms amount to what is called the cognitive architecture.

DEVELOPING AN ARCHITECTURE

When considering developing an architecture, it is as well to remember that an architecture itself can never be proved or disproved. This is because any normally rich computational architecture can in principle be made to model the same range of phenomena as any other. The relative merits of different architectures must therefore be discussed on more practical grounds, such as convenience for practical modelling, or the amount of model content code that must be written to represent certain observed phenomena. Thus, for instance, Anderson (1993) discusses the relative numbers of productions necessary to get ACT-R and Soar to do similar things.

The justification for proposing a new architecture here is that there needs to be a clearer mapping between the mechanisms suggested in the architecture and the phenomena pointed out in the examples above. Other architectures do address some of these issues, and developments should therefore take previous insights into account, to ensure that a new architecture is at least as useful as former ones. To discuss the merits of other architectures is, however, a long study, and this paper confines itself to proposing a new one, in the hope that the points of influence will be apparent.

As well as accommodating insights from other architectures, a new one should be philosophically reasonable, and internally coherent. One example of a philosophical discussion would be about the way in which commonly used words map onto concepts of the architecture. This is quite demanding for any architecture with a large scope.

However difficult the philosophy is, to be of practical worth, the architecture must be used for making models of real-life skill, and in particular, 'human error' arising directly from the exercise of that same skill. This may be expected to follow on from what is presented below.

SimulACRUM

SimulACRUM stands for Simulation Architecture for Cognition-Related User Modelling. The acronym is rather broad in meaning, and this reflects the general-purpose intention behind the architecture.

Background

SimulACRUM starts from the concept of contextual modularity (Grant 1994) in being based on cognitive task units. These are in the tradition starting with Bartlett's (1932) schemata and going through frames, scripts, and other such modular divisions of the knowledge underlying competence in a task or activity.

A task unit is a very general concept, not limited to situations we normally call tasks. It is not just the observed human behaviour, or 'activity', that defines the task unit, but the relationship of that unit of behaviour with other similar units, whether they be parts of actual tasks, games, pastimes, habits of mind, or any pattern of response to the world that is identified by the cogniser as relevantly similar to another pattern of response. Calling them 'task units' helps to focus on the relatively clear case of a task, where there is likely to be a goal. However, a goal is not necessary for a task unit in SimulACRUM: it can more simply be a pattern of response to conditions.

Most cognitive architectures focus on some kind of task unit, be it a production rule system like ACT-R (Anderson 1993), where the unit of procedural knowledge is the production, or a model such as COSIMO (Cacciabue et al. 1992), which uses a frame system to represent task knowledge. Unlike the case in other architectures, in SimulACRUM task units are seen as similar to, but complementary to, entities, which are the things that are taken to exist, physically or conceptually. The interplay between task unit and entity is central to the architecture, and differs, for example, from ACT-R's relationship between productions and the 'chunks' of declarative memory.

Theory

We are all familiar with classes of entity, such as animals, dogs, humans, or aircraft, and also with particular individual members, called instances, of those classes: you, me, perhaps someone's dog called Spot, or the aircraft F-GGED (the one in the Mont Sainte-Odile accident). SimulACRUM adopts these straightforward ideas of entity class and instance, which have similarities with (and some differences from) object-oriented ideas. Classes of task unit are perhaps not so obvious, but in SimulACRUM a task unit class is somewhat like a script, in that it holds the general methods for observation, evaluation, decision, action, etc. for a particular kind of situation, such as what to do at traffic lights when driving a car. The instances of these task units are the episodic representations of particular events, such as the times when one has encountered traffic lights.

Thus, procedural knowledge is represented as task unit classes; episodic knowledge as task unit instances; entity classes represent much of declarative knowledge (thinking of the European verbs savoir, wissen, sapere) about things (with an indefinite article) and entity instances represent knowledge (connaître, kennen, conoscere) of particulars (with the definite article).
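To make this fourfold division concrete, a minimal sketch is given below, assuming Python data classes and invented field names that mirror components mentioned elsewhere in the text (rules, links to entities, normal state envelopes, tension and charge); it is an illustration only, not a specification of the architecture.

```python
# Illustrative sketch only: one possible rendering of SimulACRUM's four
# knowledge types. The class and field names are assumptions chosen to
# mirror components mentioned in the text, not a definitive specification.
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class EntityClass:
    """Declarative knowledge about a kind of thing (savoir): 'a car'."""
    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)   # typical attribute values
    charge: float = 0.0                                         # perceived risk attached to the kind


@dataclass
class EntityInstance:
    """Knowledge of a particular thing (connaître): 'my own car'."""
    of_class: EntityClass
    attribute_values: Dict[str, Any] = field(default_factory=dict)


@dataclass
class TaskUnitClass:
    """Procedural knowledge: what to observe, evaluate, decide and do in a
    kind of situation, e.g. what to do at traffic lights when driving."""
    name: str
    rules: List[str] = field(default_factory=list)                 # named rules for observation, decision, action
    entity_links: List[EntityClass] = field(default_factory=list)  # entities the task unit refers to
    normal_envelope: Dict[str, Tuple[float, float]] = field(default_factory=dict)  # variable -> normal bounds
    tension: float = 0.0                                           # readiness to be instantiated when worrying


@dataclass
class TaskUnitInstance:
    """Episodic knowledge: one particular occasion of exercising the task
    unit, which later becomes part of episodic memory."""
    of_class: TaskUnitClass
    state_vector: Dict[str, float] = field(default_factory=dict)   # the values observed on this occasion
```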

Behaviour

For a cognitive model built with the SimulACRUM architecture, observation of the world situation leads, by a pattern-matching process, to identification of 'what is happening', that is, the current task unit, which is instantiated from its class into a current instance (which will later become part of episodic memory). Once a task unit has been identified, it governs action as long as the situation remains within the bounds, recorded with the task unit, that represent the normal range of situations appropriate to it. If some part of the environment is noticed going outside these bounds, this triggers a rematching of the changed conditions to a new task unit, if a more suitable one is available.

These pattern-matching processes in some ways resemble Reason's (1990) similarity matching and frequency gambling, as embodied in COSIMO (Cacciabue et al. 1992). They need a mechanism equivalent to an envelope defining where the task unit state vector (that is, the set of relevant variables with values) is considered 'normal'. It is impossible to take into account all possible observables in a real-world context, but the state vector is likely to have more variables than those needed explicitly to feed decision-making and action rules. Any attribute can be added that has been associated with the usual situations in which the task has been practised, without the extra computational cost associated with adding extra rules.
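As a rough sketch of this envelope mechanism (the variable names and bound values below are invented for illustration, not taken from any of the cases), the 'normal' region can be held as per-variable bounds attached to the task unit and checked against the current state vector.

```python
# Sketch only: a normal state envelope as per-variable bounds over the task
# unit's state vector. The variables and bound values are invented examples.
from typing import Dict, List, Tuple

Envelope = Dict[str, Tuple[float, float]]   # variable -> (lower, upper) bound of 'normal'
StateVector = Dict[str, float]              # variable -> currently observed value


def envelope_violations(state: StateVector, envelope: Envelope) -> List[str]:
    """Return the variables whose observed values fall outside the range
    recorded as normal for the current task unit; a non-empty result
    triggers rematching to a more suitable task unit."""
    return [var for var, (low, high) in envelope.items()
            if var in state and not low <= state[var] <= high]


# Invented example: an approach task unit whose practised conditions never
# included a descent rate anywhere near 3300 feet per minute.
approach_envelope: Envelope = {"descent_rate_fpm": (0.0, 1500.0),
                               "flap_setting": (0.0, 3.0)}
observed: StateVector = {"descent_rate_fpm": 3300.0, "flap_setting": 2.0}
print(envelope_violations(observed, approach_envelope))   # ['descent_rate_fpm']
```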

SimulACRUM's approach to stress and emotional factors is to allow 'tension' associated with a task unit, and 'charge' with an entity. This may be related to the degree of risk perceived in the situation or object. Task units with a high 'tension' will be instantiated more readily than ones with a low 'tension': this reflects the common experience of being preoccupied with a worrying situation. More attention will be given to monitoring the variable attributes of charged entities - the position of a wasp, or of a conflicting aircraft, for example.

In SimulACRUM, this pattern-matching form of task unit change occurs when the conditions are unusual in the sense of not having been practised, and not if a task is running according to usual, well-known conditions. In well-practised cases, the human will have learned that certain conditions may be taken to indicate reliably the appropriateness of the next task unit. This means that in task units we have rules for 'moving' to another task unit, given certain conditions - or, to be more precise, rules for instantiating a new task unit.
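A minimal sketch of the pattern-matching route to instantiation described in this section is given below. The scoring scheme (shared conditions plus a tension bonus) is purely an assumption for illustration, since no particular mechanism is committed to above; the learned transition rules of well-practised cases would bypass this matching by naming their successor task unit directly.

```python
# Sketch only: choosing which task unit class to instantiate when conditions
# change. The score (shared conditions plus a tension bonus) is an assumed
# scheme for illustration, not a committed mechanism of the architecture.
from typing import Dict, List


def match_score(observed: Dict[str, str], typical: Dict[str, str], tension: float) -> float:
    """Similarity between observed conditions and those typically associated
    with a task unit class, biased upwards by the class's tension."""
    if not typical:
        return tension
    shared = sum(1 for k, v in typical.items() if observed.get(k) == v)
    return shared / len(typical) + tension


def select_task_unit(observed: Dict[str, str], candidates: List[dict]) -> dict:
    """Instantiate the best-scoring candidate; a worrying (high-tension)
    task unit is instantiated more readily than a calm one."""
    return max(candidates, key=lambda c: match_score(observed, c["typical"], c["tension"]))


candidates = [
    {"name": "continue_approach", "typical": {"phase": "approach", "alarms": "none"}, "tension": 0.0},
    {"name": "terrain_escape",    "typical": {"phase": "approach", "alarms": "gpws"}, "tension": 0.4},
]
print(select_task_unit({"phase": "approach", "alarms": "none"}, candidates)["name"])
# 'continue_approach' here; with alarms == 'gpws' the tense escape unit would win
```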

Structures of the architecture

The basic components of the SimulACRUM architecture are the entity and task unit classes and instances. The details will be tuned as a result of the experience of applying the architecture in the modelling of real-life cognitive skill and error.

The way that the architectural components are used as modelling elements will depend on the particular human's skill at the particular task and their degree of expertise. Thus for illustration only, the example of an ordinary motor car is given for the entity, and making dough (for bread) for the task unit. Where practical, one example case is given in parentheses along with the component: many are often possible. Entity classes may have the following components, each of which also has a variable strength of connection with the entity class.

Entity instances (e.g., my own car) then may have:

Task unit classes may have at least:

The rules and links are also connected with a variable strength to the task unit. Task unit instances (such as the time I made dough for pizza on 14th June 1996) while they are being executed, may have:

The normal state envelopes are perhaps the most difficult component to describe directly. For an entity class, the envelope concerns the question of what changes in the entity would make it be a member of a different class, or at least a dubious member of the class in question. The envelope may have some hard edges, linked to attributes that are considered (by the individual human) to be defining of the class. Something that doesn't have wheels simply isn't a car. But the envelope can also have less clear boundaries: would something without seats be a car?

For task units, the normal state envelope determines when the situation is no longer appropriate for that unit of task knowledge. Again, this can have sharp or hazy boundaries. The hazy, or disputed, boundaries of normal state envelopes go along with the idea that they are acquired generally by experience and not by explicit rules, and that they are used in pattern-matching rather than rule manipulating mechanisms.
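One way of rendering the distinction between hard edges and hazier boundaries, again as a sketch only (the attribute names extend the car example above; nothing here is prescribed by the architecture), is to separate defining attributes, whose violation excludes the item from the class outright, from merely typical attributes that contribute to a graded score.

```python
# Sketch only: a class envelope with 'hard edges' (defining attributes) and
# hazy boundaries (typical attributes). Attribute names extend the paper's
# car example and are purely illustrative.
from typing import Dict


def class_membership(observed: Dict[str, bool],
                     defining: Dict[str, bool],
                     typical: Dict[str, bool]) -> float:
    """0.0 if any defining attribute is violated (hard edge); otherwise a
    graded score based on how many typical attributes are present."""
    for attribute, required in defining.items():
        if observed.get(attribute) != required:
            return 0.0                               # no wheels: simply not a car
    if not typical:
        return 1.0
    matches = sum(1 for a, v in typical.items() if observed.get(a) == v)
    return matches / len(typical)                    # hazy: a dubious member scores low


car_defining = {"has_wheels": True}
car_typical = {"has_seats": True, "has_engine": True}
print(class_membership({"has_wheels": True, "has_seats": False, "has_engine": True},
                       car_defining, car_typical))   # 0.5: a car, but a dubious one
```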

The size of the classes and instances, in terms of the number of rules, links, etc., is limited in some way by the capacity of what is often called working memory, and SimulACRUM will have a working memory mechanism to impose this limit.

Another important feature is the global activation threshold, connected to general stress, or arousal, which works together with the strength of connection of rules, and with tension and charge, in the selection of rules for firing at every level.
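To illustrate how these quantities might interact (the additive combination below is an assumption, not a commitment of the architecture), a rule can be taken to compete for firing only when its connection strength, boosted by the task unit's tension, reaches the current global threshold; when stress raises the threshold, only short, strongly connected methods remain available, which is the situation discussed for Three Mile Island below.

```python
# Sketch only: rule selection against a global activation threshold. Adding
# connection strength and tension is an assumed scheme for illustration;
# the rule names are invented examples.
from typing import List, NamedTuple


class Rule(NamedTuple):
    name: str
    connection_strength: float    # how strongly the rule is tied to its task unit


def rules_available(rules: List[Rule], tension: float, threshold: float) -> List[str]:
    """A rule competes for firing only if its connection strength, boosted by
    the task unit's tension, reaches the global activation threshold."""
    return [r.name for r in rules if r.connection_strength + tension >= threshold]


evidence_rules = [Rule("read_valve_indicator", 0.9),           # quick, strongly connected
                  Rule("cross_check_downstream_reading", 0.4)]  # slower, weakly connected

print(rules_available(evidence_rules, tension=0.1, threshold=0.4))
# calm conditions: both evidence rules are available
print(rules_available(evidence_rules, tension=0.1, threshold=0.95))
# alarms raise the threshold: only the short, strongly connected rule survives
```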

APPLICATION OF SIMULACRUM TO EXAMPLES

Having sketched some basics of the architecture, we now outline how models of the examples presented earlier could be made in terms of SimulACRUM. This is intended to give the beginning of a feel for the elegance and practicality of the match between the architectural mechanisms and the modelled phenomena, which is a key factor in the evaluation of an architecture.

The O-ring incident could be modelled in terms of normal state envelopes of entity and task unit classes. The usual task units would refer to the entity class 'master chip detector', with a rule about where these were to be found. Since O-rings were always fitted, the entity class 'O-ring' would not feature in the usual task units, and the detectors would be recognised perhaps using only a few characteristics. When there were none in the usual place, the question of locating some would naturally have arisen. A default rule was to get them from the stockroom, but this was not a usual procedure and therefore it would not be clear exactly what had to be done in this case. The obvious assumption for the mechanics was that the detectors obtained from the store were the same as the detectors normally found in the foreman's cabinet. There was clearly no explicit active rule to check the O-rings' presence, and since the mechanics had not come across detectors without O-rings, the normal state envelope would not be crossed.

The normal task unit state envelope can be illustrated further by a feature which would be more likely to be part of it: the feel that a mechanic gets when fitting a component into place. In unexceptional conditions, one might expect that if a part fitted badly, or made an unusual noise when being fitted, or something of that nature, this would have crossed the normal state envelope for the task unit of fitting the part, provoking at least a moment's reflection of "that's strange". This presumably did not happen in the case cited.

The case of Three Mile Island could be represented firstly as one of evidence rules. When the PORV stuck open and the indicator showed closed, that information was mistakenly believed by the operators, when there could have been more detailed rules for ascertaining the actual state of the PORV. One could say that there was a deficient evidence rule. However, even if the operators had had a more sophisticated rule, one would have to take into account the possibility that under stressful conditions, such as alarms sounding, the global activation threshold will rise, and short, strongly connected methods of evaluation will take priority. Thus the information may be less reliable than would be ideal, and the behaviour less predictable.

The operators were presumably already in the process of diagnosing the fault. Diagnosis, where the symptoms are not identical from one incident to the next, seems to consist of a particular kind of task unit, where those task units that are instantiated have not been experienced widely before, and where therefore there is little detail in the expectations of what to do. Operator reasoning may follow basic or primitive patterns, or use 'weak' methods. A SimulACRUM model, through built-in architectural limitations on attention range and memory capacity, would be able to suggest what problem-solving strategies would be plausible, but unlike architectures such as Soar (Newell 1990) it would not rely on just one normative problem-solving method, as these can vary greatly between individuals. Instead, the problem-solving method would be an emergent characteristic of the task unit rules.

The Mont Sainte-Odile accident could be modelled in a number of steps. The initial supposed error could be a lack of care over verification of a procedure, due to time pressure. Here again the strengths of connection of the rules come into play: when there is time pressure, the less strongly connected rules are left out, as is common in everyday experience of time pressure. Perhaps more importantly, the fact that this was a real danger appears not to have been known by the pilots. Knowing about the dangers could be modelled either by stronger connections of the feedback evidence rule with the execution rule, or by a higher 'tension' of the task unit. Higher tension of the task unit, or higher charge of the entity, is what one may expect when a pilot has had near-accident experiences (possibly in simulator training) connected with that task unit or entity.

Then there was the failure to notice the abnormal indications that showed that the aircraft was not behaving as it should. This loss of situation awareness could be attributed to task pressure, and needs to be modelled in terms of attentional resources, salience of cues, etc. It seems likely that a ground proximity warning device would have succeeded in calling the attention of the crew to the loss of situation awareness.

Some practical points can be made in response to the analysis of these kinds of problem. All of them could be helped by more operator experience of the problem itself. The difficulty here is to know what experience to give in training, and to select the problems to be worked on in simulation. One factor that emerges from the modelling architecture is that it is an important matter for training to get the operators' normal state envelopes right - that is, they must be alert to the changes in situation variables which signify something being amiss. SimulACRUM suggests that this is different from training in the rules themselves, or even in explicit rules for diagnosis of problems, because it uses a different mechanism. This could be investigated experimentally.

For practical applications, very detailed modelling is needed if all these mechanisms are to be represented. A full-scale study of a practical task would need very substantial resources, and this would be necessary to develop the architecture in a valuable way.

WHAT SimulACRUM IS NOT

SimulACRUM at present does not have a model of learning, as Soar and ACT-R do. While learning is vital to complex task training, it is rarely central to operation. A model of learning needs an underlying model of knowledge, and so developments of SimulACRUM will provide the basis for modelling learning in a more cognitively realistic way.

Also by way of disclaimer, SimulACRUM does not deal directly with interpersonal systems, but it represents a basis in individual cognition for doing so.

CONCLUDING COMMENTS

Architectural mechanisms emphasised in the applications above particularly include those that are not simply rule-based in a straightforward way: the normal state envelopes; the variable strength of connection of rules, modulated by the global activation threshold; and tension and charge.

The rule-based aspects of the architecture are needed to give the observed highly predictable behaviour in usual situations. It is the way in which these two separate aspects are articulated that is the basis of the outline of the SimulACRUM architecture that has been presented, and will remain central to future developments.

One of the great difficulties in research into cognitive architecture is that one cannot definitively prove one architecture to be better than another. What has been presented here needs developing and using in the practical modelling of complex tasks, and only after that has been done, also using other architectures, will it become apparent which architecture is more useful. The considerations given in this paper are indications of potential value.

It is rather like new products on the marketplace, or new programming languages. One has to use a number of them before it becomes apparent which is most useful for what. The value of presenting architectures in development is to stimulate and prepare the ground for future modelling projects, in which models and architecture can be developed together.

ACKNOWLEDGEMENTS

Anne-Laure Amat's comments have been very helpful.

