©1990, 1995

6.4 Further discussion

A number of issues arose in the previous section that will be further expanded here. This leads on to a review of what was learnt from this experiment, and arguments pointing towards what needed doing next.

6.4.1 Uncertainty in scores

The reliability of the total score as a measure of experience was compromised by the random number and distribution of the mines. The number was random to ensure that the search could not be broken off without covering all corners of the minefield; breaking off early would have allowed an unfairly quick return, as well as being unrealistic. In an attempt to counter this problem, the scoring system allotted points for each mine found. However, the scoring was fixed before a great deal of experience had been gained, and it was subsequently discovered that experienced players gained more points by dealing with a mine than they lost through the extra time taken. Hence higher scores could be obtained when there were more mines, and a player's highest score depended not only on their skill, but also on their luck in the allocation of mines.

A further problem with the reliability of the scoring comes from the catastrophic nature of an accidental mine explosion. A subject could be performing very well, but such an explosion would cause the total score to be highly negative. Thus good scores would be mixed in with very bad ones. For these reasons, it was felt that any graph of raw scores over time would be of little value.

The psychological impact of the scoring system is difficult to evaluate, and this will not be attempted. It may be noted that the task would change depending on whether the subjects were instructed to achieve the single highest score, or to achieve the best overall average score, or somewhere in between these two extremes. The emphasis in the experiment was only on achieving the highest single score, and this meant that when a subject accidentally blew up a mine, or did something that led to a long delay, the game was sometimes abandoned at that point, presumably on the grounds that a high score could not be obtained.

6.4.2 Types of action

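Detailed consideration of the task, a priori, suggests several possible types of action that the player might perform. Correct identification of the type or types of action performed is potentially important to any analysis of this kind, since an analysis designed to find certain kinds of action might fail to find other kinds. The types could include:

  1. slips, i.e., unintended actions;
  2. actions that follow a rule;
  3. actions arising from knowledge-based processes;
  4. information-seeking actions;
  5. exploratory actions;
  6. whimsical actions.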

Of these, the methods of the current study can only deal with those actions that fall into the second category, i.e., those that follow a rule. Therefore the success of the rule-induction depends, as well as on the quality of the representation, on the extent to which the actions analysed belong to this class, as opposed to one of the other classes.

Dealing first with slips, we may note that some of the unintended key-presses have no effect. These can be taken out in the process of analysis, by actrep. Other slips will contribute noise to the data, with the result that the induced rules will be less accurate and perform worse on prediction.

With knowledge-based processes, we could expect in theory to be able to induce rules if we know all the factors that are taken into account, and the intermediate concepts involved, in the knowledge-based process. This would be comparable with defining the terminology with which to construct an expert system, and would involve defining appropriate higher-level concepts in terms of lower-level ones: there is nothing in general to prevent this being done, but it may require much knowledge or theory about the knowledge-processing mechanisms. We are unlikely to be able to capture much of this level with the relatively straightforward methods that are used in the present study.

Information-seeking actions could be of two types: either actions directly altering the selection or presentation of information, or actions affecting the simulation itself. The information selection actions may reveal something about the information being used or considered at a particular time; however, in the first version of the simulation game, there was still so much information present concurrently (especially graphical) that our knowledge of the player's information usage is advanced only slightly, if at all. This approach to understanding the player had yet to be explored.

More difficult to formalise are the actions which may be characterised thus: ``give it a nudge to see how much it moves, and then you'll know how much to push it''. If this kind of action were being used, it would tend to obscure rules about how large an action to make in differing circumstances, since the initial nudge might be similar in the different situations, with only the following action differing; and that action would differ not on the basis of the static quantities, but on the dynamic response of the thing that was nudged.

However these information-seeking actions are dealt with, there are likely to be fewer of them the more practiced the subject is, since the desired quantities will be more likely to be known. For exploratory actions, again, the more practiced the player is, the less likely they are to occur. This reinforces the desirability of concentrating on well-learnt situations.

There is also a philosophical aspect to the question of the nature of actions, i.e., how we are to represent actions in general. This has a large effect on the methods of analysis. Firstly, we could consider an action as directly corresponding to the state that it brought about. An analysis on this basis will work if every situation has corresponding unique control settings appropriate to that situation. For example, if the ship is moving forwards at a reasonable speed, and the desired direction is more than (say) 15 degrees to port, then the desired rudder setting is hard port. This fits into the paradigm of pattern recognition and means-ends analysis: knowing how things ought to be leads to appreciation of the discrepancies between the actual and the desired state, and thence to steps to reduce the difference.

A second approach is to consider all actions as interventions, not necessarily determined by the objective state that is brought about. One can characterise the above example in this way, by adding that if the rudder setting is not hard port, then set it to be so. In this second approach, the dependency of actions on the current control setting is emphasised. It may be that this is more appropriate for serial actions, and explicit rules; while the first may be more appropriate for parallel actions without conscious attention.

Which approach is taken has implications for treating null actions. It is evident that at times, an operator is consciously not intervening, because everything is within the operator's limits of acceptance. If a `desired state' approach is taken, the concept of action has no default: there is always some desired control state; every situation has some appropriate response, even if this does not entail altering the controls. If the correct response is not known, some measure of closeness will give a situation which is similar, and whose action can fill the unknown. With an `intervention' approach, on the other hand, there is a default action of `do nothing', in just those cases where there is no appropriate intervention. This does, however, raise the problem of granularity of actions, in that it is far from clear how many null actions to attribute to any given space of time free from positive actions.

Choosing exclusively one or the other approach seems over-rigid. However, purely for ease of analysis, we may note that one can always express desired-state actions in terms of interventions contingent on the current state of the controls, as well as the outside world; whereas one cannot always express interventions in terms of desired states. For this reason, the analysis in this study is in terms of interventions.
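The relation between the two approaches can be illustrated with the rudder example. The following is only a minimal sketch: the function names, the numeric rudder value and the speed threshold are hypothetical, chosen for illustration, and only the 15 degree figure comes from the example in the text. It shows a desired-state rule being re-expressed as an intervention contingent on the current control setting, with `do nothing' as the default.

```python
from typing import Optional

HARD_PORT = -35  # hypothetical rudder value; negative means port


def desired_rudder(speed: float, desired_turn_deg: float) -> Optional[int]:
    """Desired-state form: the rudder setting appropriate to the situation.

    desired_turn_deg is negative when the desired direction lies to port.
    Returns None where this single example rule does not apply.
    """
    # 2.0 is an assumed stand-in for "a reasonable speed".
    if speed > 2.0 and desired_turn_deg < -15:
        return HARD_PORT
    return None


def intervention(speed: float, desired_turn_deg: float,
                 rudder: int) -> Optional[int]:
    """Intervention form: the same rule, contingent on the current setting.

    Returns the new rudder setting to apply, or None for the null action.
    """
    target = desired_rudder(speed, desired_turn_deg)
    if target is not None and rudder != target:
        return target
    return None  # default action: do nothing
```

The wrapping only works in one direction: `intervention` can be built from `desired_rudder`, but an intervention that depends on the current control setting in some other way has no corresponding desired-state function.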

The choice of approach with respect to actions is to some extent a pragmatic rather than a theoretical one. Constructing a complete theory incorporating all these types of actions would be a huge enterprise, encompassing a great deal of cognitive psychology. The choice of rule-governed actions may be justified as a starting point firstly by considering the relevance of regularities to the kinds of applications we are considering (and the relatively lesser relevance of other actions); secondly by recognising that knowledge-based actions have been the subject of much investigation, both in AI and in learning systems (e.g., [41]); and thirdly by discounting the practicality of investigating information-seeking, exploratory and whimsical actions as being a much more difficult place to start.

6.4.3 Evaluating the information provided by an interface

The information displayed at the interface falls into two sections. The `sensor' section contains only numeric data displayed as numbers, and this clearly defines a set of primitives which we can take as the basic representation of these quantities. But for the `graphic' sections, it was much more difficult to decide what was being displayed. One view might take the content of what is displayed to be the system variables that are used in the construction of the graphical display. However, the inference of other quantities is so immediate and intuitive that it is difficult to avoid the idea that this information is also being presented in the display.

A simple example concerns the ROV's heading. One numerical sensor gives, in whole degrees, the heading of the ROV in the conventional way (000 to 359 clockwise from North). Another sensor gives the bearing of the closest unknown or unsafe target. There was no explicit offering of the bearing of the closest target relative to the ROV's head, and yet this was obviously going to be a significant quantity, and it was one which was immediately apparent (though in an unscaled form) from the ROV graphic display, as long as the target was within the viewing region. A very close parallel exists with the ship's heading and associated quantities. In the case of the ship, the relative bearing can be immediately seen from the general position indicator.
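The derived quantity in this example is simple to state. As a minimal sketch (the function name and the sign convention are assumptions, not taken from the simulation), the relative bearing can be computed from the two numerical sensor values as follows:

```python
def relative_bearing(heading_deg: int, target_bearing_deg: int) -> int:
    """Bearing of the target relative to the vehicle's head, in degrees.

    Both inputs are compass values (000 to 359, clockwise from North).
    The result lies in the range -180 to 179: negative is to port,
    positive is to starboard (an assumed convention).
    """
    return (target_bearing_deg - heading_deg + 180) % 360 - 180
```

For instance, with a heading of 350 and a target bearing of 010, the target is 20 degrees to starboard. The wrap-around at North is the step that the graphic display spares the player from performing mentally.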

Thus, it is a real difficulty with graphic displays to achieve any degree of objectivity about what information is being provided, and hence, for any higher-level representation, how much information processing is being done by the interface and how much is being done by the human. The uncertainty of interface design remains unclarified in this case, because there is no a priori way of being sure that the information one wishes to present has been presented effectively.

One possibility for formalising some graphic information is to focus on significant events, and allow that the display effectively gives a rough idea of time-until-the-event. Of course, this need not be displayed as such, but the combination of perceived distance and motion can easily be seen as giving a time measure. Such time measures have an established history in theories of mariners' actions in collision avoidance. For a discussion of the ``RDRR'' criterion (Range to Domain over Range Rate) and its use in an intelligent collision avoidance system, see [15, 27, 28, 30]. A slightly simpler concept, ``TCPA'' (Time to Closest Point of Approach) is also used in many places (e.g., [111, 129]).
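To make the idea of a time measure concrete, the sketch below computes TCPA under the usual textbook assumption that both vessels hold constant velocity. It is not taken from the cited systems, and the function names and the flat Cartesian coordinates are assumptions made for illustration. Positions and velocities must be in consistent units.

```python
import math
from typing import Optional, Tuple

Vec = Tuple[float, float]


def tcpa(rel_pos: Vec, rel_vel: Vec) -> Optional[float]:
    """Time to closest point of approach, assuming constant velocities.

    rel_pos: target position minus own position.
    rel_vel: target velocity minus own velocity.
    Returns None when there is no relative motion (the range never changes).
    A negative value means the closest approach is already in the past.
    """
    speed_sq = rel_vel[0] ** 2 + rel_vel[1] ** 2
    if speed_sq == 0.0:
        return None
    dot = rel_pos[0] * rel_vel[0] + rel_pos[1] * rel_vel[1]
    return -dot / speed_sq


def range_at_cpa(rel_pos: Vec, rel_vel: Vec) -> float:
    """Separation at the closest point of approach."""
    t = tcpa(rel_pos, rel_vel)
    if t is None:
        return math.hypot(rel_pos[0], rel_pos[1])
    return math.hypot(rel_pos[0] + rel_vel[0] * t,
                      rel_pos[1] + rel_vel[1] * t)
```

A target 100 units ahead and 50 units abeam, closing at 10 units per time step along the ahead axis, gives a TCPA of 10 time steps and a closest range of 50 units.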

6.4.4 Other difficulties with representation

Another difficulty with representation arises in connection with the manoeuvring of the ship. The general position indicator (the upper graphic display present at all times) sometimes confronts the player with a pattern of targets that have complex implications. What is the best place to stop the ship, so that the most targets can be dealt with at once, and leaving the ship in an advantageous position to proceed? To come to a decision on this clearly requires an overall view of the disposition of the targets, and since the precise pattern of targets repeats itself extremely rarely, any routinizing of these decisions could not be linked to precise identity of the conditions.

It is plausible to consider this as an example of knowledge-based behaviour, since in the time available people are likely to still be trying out different approaches, and developing ways of categorising arrangements of targets into groups indicating the best action to take. In the author's experience, a considerable amount of conscious thinking goes on in the consideration of where to stop the ship, though this thinking may not be verbal. Alternatively, one could consider it as a visual pattern-matching process. An attempt to analyse this in symbolic terms would inevitably involve many pattern and shape concepts, which would be difficult to derive from data such as those in the present study, because this experiment was not designed to address pattern issues. In the longer term, we might be able to ascertain which attributes were relevant to ship positioning decisions, and we might be able to devise a method of learning how to recognise values of these attributes from the original Cartesian data of the simulation. These questions are difficult enough to constitute independent problems, and since there is little necessary overlap with the present lines of enquiry, issues involving the processing and use of patterns are not followed here.

6.4.5 Limited nature of interesting results

Reviewing the state of results at the end of the first experiment:

  1. we had interesting evidence that rule-induction reveals important things about human task performance, particularly about learning and differences in representation.
  2. we had a reasonable method of dealing with the representation of actions (though far from perfect).
  3. we had discovered higher-level representations of turning actions for the ROV which appeared to fit humans better than the lowest-level representations.
  4. the studies pointed towards the possibility of cross-comparisons of one player's rules with another's actions, perhaps leading to the ability to distinguish representative examples of different players' actions.
  5. the way was in principle also open to selecting and refining rules and examples iteratively: selecting for the next training set those games where the rules perform best (the most `ruly' games), and selecting those rules that perform best on the best games.
  6. it appeared possible, though extremely laborious, to select attributes for representing situations by introducing them one at a time, and observing the effect on the performance of the rules.
  7. a yet more laborious possibility would be to select landmark values for the data as a whole, mapping the numeric data onto a small number of values for each attribute. The best position of these landmarks could be found by moving them gradually, watching the effect on the performance of the rules.
  8. these last three possibilities would only become practical if some automated tools were produced to help. Some of these ideas will therefore be taken up in the `further work' section (§ 8.3).
  9. there was no good principled method of generating representations of situations close to those that we might assume people have.
  10. graphically displayed information appeared the hardest to represent, and it was difficult to envisage dealing with it properly.
  11. the performance of derived rules suggested that we were still a long way from any full representation including all the factors which come into a human's decisions.
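The mapping onto landmark values mentioned in item 7 can be sketched as follows. This is an illustration only: the attribute names and landmark positions are hypothetical, and the gradual search for the best landmark positions is not shown, since it depends on re-running the rule induction after each move.

```python
from bisect import bisect_right
from typing import Dict, Sequence


def to_landmark_interval(value: float, landmarks: Sequence[float]) -> int:
    """Map a numeric value to the index of the interval it falls in.

    landmarks must be sorted in ascending order; n landmarks give
    n + 1 intervals. A value equal to a landmark falls in the interval
    above it.
    """
    return bisect_right(landmarks, value)


def discretise(situation: Dict[str, float],
               landmarks: Dict[str, Sequence[float]]) -> Dict[str, int]:
    """Replace each numeric attribute by its landmark interval index."""
    return {name: to_landmark_interval(value, landmarks[name])
            for name, value in situation.items()}


# Hypothetical attributes and landmark positions, for illustration only.
example_landmarks = {"relative_bearing": [-15.0, 15.0], "speed": [2.0]}
example = discretise({"relative_bearing": -40.0, "speed": 3.5},
                     example_landmarks)
```

Here `example` maps the relative bearing to interval 0 (below the lower landmark) and the speed to interval 1 (above its single landmark). Moving a landmark changes only the lists in `example_landmarks`, which is what would make an automated search over landmark positions feasible.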

6.4.6 Need for further experiments

The most salient need was therefore to overcome the problem of generating better representations of situations, in the absence of automated methods. Three ideas had clear merit.
  1. Cutting out the graphic displays would drastically limit the uncertainty of how the presented information was to be represented. However, removing them altogether might make it far more difficult for the task to be learnt in the first place.
  2. Costing the information, enabling and encouraging players to turn off what they are not using, would give a great deal of help towards knowing what information any player was using at any time, and therefore would help to provide representations capable of supporting the induction of rules which performed better. Graphical information would be priced highly, thus encouraging players to do without it. As soon as they had `got the general idea', they would attempt to find strategies which did not need the graphical information.
  3. Using data from a well-practiced subject would minimise the learning activities performed (knowledge-based behaviour), and if the player had enough practice to be clear about what information was necessary, there might be fewer information-seeking actions that affected the simulation. This implies the maximisation of the time spent by each subject.
Also, having discovered something about the turning of the ROV, new higher-level controls could be made, which could make the task easier. To compensate for this, weather could be introduced, as was originally planned. These steps would change the task substantially; but since the idea of the task in the first place was only to provide a sufficiently complex and interesting task in the chosen field, this should not be detrimental to the experiment as a whole.
