8 December 1998
MUSE - a Successful Case-Study of Engineering Human Factors Design Methods?
Meeting SIGCHI.NL - 8 December 1998 - CIBIT Utrecht 19:00 - 21:30
Professor John Long
Ergonomics & HCI Unit
University College London, UK
IPO, Center for Research on User-System Interaction
Eindhoven University of Technology, Eindhoven, the Netherlands
The presentation focussed on three aspects of MUSE:
- HCI as an Engineering Discipline
- the use of Method Engineering
- the use of Case Histories

The talk itself was structured as follows:
- the original problem
- an overview of the method and its stages
- method engineering
- applications of MUSE
- comparison to other HF methods
- the future and conclusions
- loose ends
1- The Problem
The MoD (UK Ministry of Defence) used JSD (Jackson System Development) as a standard but was dissatisfied with its human factors input: within JSD, HF input only comes into focus during the detailed design or implementation phase, which resulted in ineffective designs and in the need for extensive redesigns after HF evaluation. There was therefore a need to add stages or components alongside JSD as a vehicle for getting HF input into JSD, and to improve JSD designs for applications in which human factors are important.
2- An Overview of the Method and its Stages
- Information Elicitation and Analysis Phase
  - Extant Systems Analysis
  - Generalized Task Model
- Design Synthesis Phase
  - Statement of User Needs
  - Composite Task Model --> JSD functions list
  - System and User Task Model <-- JSD information flow
- Design Specification Phase --> JSD implementation model
  - Interaction Task Model
  - Interface Model
  - Display Design
Information Elicitation and Analysis Phase
Using such a wide definition of extant systems is also meant to address the criticism that design methods relying on some form of extant systems analysis are unsuitable for designing completely new types of product: designs are never so "new" that they do not share any components with other products. It is thus a matter of degree.
The main result of the analysis is the Generalized Task Model (GTM), which lists all functions and components that will be considered in the design. The GTM records what people (experts, presumably) say about the design domain; as such, it is rather a model of the domain than a domain model, although that is up for debate.
Design Synthesis Phase
The Composite Task Model (CTM) is a conceptual specification of the system in general, that is, without a differentiation between user and software system. The CTM also feeds the creation of the functions list within the JSD design. How exactly to create these models is "jolly difficult", and it depends very much on experience with design in general and with MUSE in particular, because it demands choices about how much (and which) human factors knowledge to put into the design relative to domain modelling.
The System and User Task Models are the result of good old task decomposition and allocation. In this phase tasks are decomposed and allocated between system and user, and between online and offline tasks.
The offline tasks are not considered any further within the design of the specific product, but they are included here because of the mutual dependence between these tasks, for example to address possible error sites and problems after specifying an initial error-free system. According to the structure of MUSE, no further exchange of information takes place during the Synthesis Phase except for the functions list; in actual design this will often be different.
Design Specification Phase
From the User Interface Specification, input is given to the Software Engineering Specification of JSD, exchanging things like events, display specifications, etc. Unlike in the Synthesis Phase, this information does not come from a single (sub-)specification but from all the models that together make up this phase.
In general, it should be clear that the method does not do the design; it only provides grounds and a rationale for HF decisions. As such, the result of MUSE, or rather of a first iteration, will only be a first prototype and not a perfect solution.
MUSE seems to aim at designs that are already rolling, not at completely new types of design. According to Newman (1994), the majority of CHI papers are about "new" solutions, but designing new solutions does not constitute everyday practice. Also, in regular design, "locking into a design" is more influenced by aspects of the organization than by the particular method that is used.
MUSE does not specifically aim at designs in which computer technology is abandoned altogether. Within MUSE, the choice of technology belongs in the user requirements stage, but the method probably does not sufficiently provide for this choice.
3- Method Engineering
MUSE has been applied in a number of cases, including:
- domestic emergency management (e.g. Hill and Long, 1998)
- a recreational booking system
- a distance tutoring system (using only the categories)
- Human-Factors guides to Air Traffic Control (using only the structure)
There was a clear choice to have developers use MUSE: first, because it is they who provide the cases to validate MUSE, and secondly, because otherwise nobody (read: academic researchers) would do this.
To validate a method like MUSE, one might do comparative studies to establish under which circumstances it performs better or worse than other methods. Instead, MUSE was validated by treating it as just another artifact that has to meet certain requirements; more specifically, the requirements to classify as a design method.
A list of requirements was devised both from claims about MUSE in the book and from abstract formal design requirements such as: "the artifact is consistent across all levels of application ...".
Doing so does not yield a score that describes to what extent MUSE does its job, but rather an idea of its strong and weak sides, such as complexity, observability or well-definedness. One of the conclusions, apart from those concerning MUSE as a method to structure design, is that MUSE also serves to delineate which human factors requirements are considered relevant and should be used. For example, complexity plays a different role in military applications, with plenty of opportunities for training, than in end-consumer 'walk up and use' interfaces. In this way, MUSE functions to define the viability space for, e.g., user requirements and for design application areas.
4- MUSE Applications
A first group of metrics consists of objective measures. For example, calculating cost/benefit in terms of the number of nodes in the specification (as in Function Point Analysis) showed that this number increases during the GTM stage, indicating the need for tool support there.
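As a rough illustration of such a node-count metric, the following sketch counts the nodes of a hierarchical task model; the model fragment and function name are invented for illustration, not taken from MUSE itself.

```python
# Hypothetical sketch of a node-count metric over a hierarchical task model.
# The task names and the dict representation are invented for illustration.

def count_nodes(node):
    """Recursively count a task-model node plus all of its sub-tasks."""
    return 1 + sum(count_nodes(child) for child in node.get("subtasks", []))

gtm = {  # a toy Generalized Task Model fragment
    "name": "book resource",
    "subtasks": [
        {"name": "identify resource"},
        {"name": "check availability",
         "subtasks": [{"name": "query system"}, {"name": "read result"}]},
        {"name": "confirm booking"},
    ],
}

print(count_nodes(gtm))  # 6
```

Tracking this count per stage would show where the specification grows fastest, and hence where tool support pays off most.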
A second group of metrics consists of subjective ratings, such as of the adequacy of design documents. These were, for example, rather low for the interaction and display models; it turned out that this was because the design teams were used to a different format of specification.
A third group of metrics is concerned with clues on what to do next, including factors such as the degree of redundancy, completeness, the ease of use of specifications for human factors reviews, and the ability to transfer design ideas into the software engineering design.
The ESSI study in particular was concerned with finding the factors that make elements of MUSE work better or worse in one company than in another, indicating how a company and/or MUSE might be improved. This study used a 'Wizard of Oz' client (a researcher posing as the client) who worked with design teams from Philips Medical Systems (on an X-ray application, with strong SE and weaker HF expertise), EDS (on a military aircraft management system, with weak SE and strong HF expertise) and EWS (on a human error assessment application, with the local guru in the role of SE method).
Some common findings of the ESSI study about MUSE are that it is good as a communication tool and that it supports split-site work, changing design teams, and things like the separation into different levels of design and the recording of design decisions. A major finding was that applying MUSE leads to a more efficient use of time during meetings. Unfortunately, because meetings always last as long as there is time for them, the formal conclusion is that MUSE requires a similar or lesser amount of time than not using it.
For those who plan to involve industrial partners, there may be a lesson to be learned from EDS, who 'stole' parts of MUSE and subsequently sold them as part of their own proprietary methodology. This may be seen as a reward, or as an example of how public-domain knowledge disappears into commercialism.
5- Comparisons to other Human Factors methods.
Similarities between the methods are listed, such as stages, notations, support (hints, tips and procedures) and design activities, as are differences, such as ease of use, validation, etc. The next step is to create structure models or GTMs of each of the methods to determine where the unique, overlapping and missing elements of each method lie. At that point it is known where MUSE might be improved, provided that such improvements are important or make sense.
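Finding the unique, overlapping and missing elements of two methods amounts to simple set comparison; the sketch below illustrates the idea with invented element names (they are not the actual elements of MUSE or any other method).

```python
# Hypothetical sketch: comparing the elements of two design methods as sets.
# Element names are invented for illustration only.

muse = {"task analysis", "composite task model", "display design", "style guide"}
other = {"task analysis", "prototyping", "display design", "usability testing"}

shared = muse & other             # elements both methods cover
unique_to_muse = muse - other     # elements only MUSE has
missing_from_muse = other - muse  # elements MUSE might adopt

print(sorted(shared))             # ['display design', 'task analysis']
print(sorted(unique_to_muse))     # ['composite task model', 'style guide']
print(sorted(missing_from_muse))  # ['prototyping', 'usability testing']
```

The `missing_from_muse` set is where candidate improvements would come from, subject to the proviso above that such additions must actually make sense.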
6- The Future of MUSE and Conclusions
Three conclusions apply to the development of MUSE thus far:
- MUSE is a successful case of (methods) engineering HCI
- Case histories, although hardly ever published, are critical for the validation of HCI
- HCI as an engineering discipline may be advancing (albeit slowly)
7- Loose ends (questions, discussion points, etc.)
There may be a need to explore different options before choosing a final specification, but this is not treated within the method. In this respect John Middlemass remarked that, because MUSE looks 'a bit waterfally', it would be nice to make it look more like a spiral, even though MUSE is not a waterfall method.
It would, however, be intellectually false to change the looks of a method just to make it more appealing. On the one hand, MUSE as a framework may sometimes fail, just as there is no design without some tweaking; expert MUSE users know much better how to use it because they know beforehand where the difficulties are and can anticipate them. On the other hand, it is important to realize that MUSE is no more than a framework. Nowadays everybody seems to be in design (e.g. marketing, customer relations, etc.) in the sense of design project management, but MUSE is a design framework and not something that tells you how to run a project.
Applying MUSE, in comparison to not using a method or framework at all, will yield better results, because a method (presumably any method) makes people take the same route and helps pinpoint where things go wrong; and since any method allows for iterative improvement and the incremental accumulation of knowledge, it will tell how to do things better next time.
References
Cummaford, S. and Long, J. (1998). Towards a Conception of HCI Engineering Design Principles. In: Green, T.R.G., Bannon, L., Warren, C.P. and Buckley, J. (eds.) Proceedings ECCE-9: Cognition and Co-operation. Limerick, Ireland, August 24-26, 1998, pp. 79-84.
Downs, E., Clare, P. and Cole, I. (1988). Structured Systems Analysis and Design Method: Application and Context. Prentice Hall, London.
Hill, B. and Long, J. (1998) Diagnosing Ineffective Performance in the Domain of Emergency Management. In: Green, T.R.G., Bannon, L., Warren, C.P. and Buckley, J. (eds.) Proceedings ECCE-9: Cognition and Co-operation. Limerick, Ireland, August 24-26, 1998, pp. 159-162.
Middlemass, J. and Long, J. (1998). Explicit, Informal Cooperative Work: Integrating Human Factors and Software Engineering. In: Green, T.R.G., Bannon, L., Warren, C.P. and Buckley, J. (eds.) Proceedings ECCE-9: Cognition and Co-operation. Limerick, Ireland, August 24-26, 1998, pp. 171-174.
Newman, W. (1994). A Preliminary Analysis of the Products of HCI Research, using Pro Forma Abstracts. In: Adelson, B., Dumais, S. and Olson, J. (eds.) Proceedings CHI '94. ACM, New York, pp. 278-284.
Redmond-Pyle, D. and Moore, A. (1995). Graphical User Interface Design and Evaluation: A Practical Process. Prentice Hall, London.
Wesson, J., de Kock, G. and Warren, P. (1997). Designing for Usability: a case study. In: Howard, S. Hammond, J. and Lindgaard, G. (eds.) Proceedings Interact '97. Chapman and Hall, London, pp. 31-38.
(ECCE-9 references can be found at: http://www.cs.vu.nl/~eace/ECCE9/indec.html)