Evidence Based Experience Design

Peter Jones Design for Care, Design for Practice, Service Design

Architecture, interior design, and clinical device design have adopted evidence-based design (EBD), and these fields actively contribute to its development through major projects, journal articles, and conferences. Evidence-based design is the design equivalent of the careful application of scholarly evidence to inform care decisions. It is a healthcare term of art and has a specific meaning in that sector. It is not the gathering of user data as research “evidence” to inform design decisions in digital design. EBD generally involves:

  • Reviewing current and retrospective research literatures to identify precedents, mature findings, and prescriptive guidance from relevant studies.
  • Prioritizing and balancing the literature basis with primary data collected from patients, subject matter experts, and professional observations.
  • Advancing theories and hypotheses to support observations and structuring evaluations to test outcomes of design decisions (e.g. architecture, facility design, wayfinding, room design).
  • Measuring outcomes following implementation and assessing theory validity and any gap between observations and hypotheses.

How are other design disciplines positioned with respect to evidence? Does it make sense for UX and experience design to adopt evidence-based principles, especially in healthcare?

Other design fields are not in the same risk position as architecture and device design. It doesn’t help UX to make claims for evidence that cannot be supported by some type of peer review. Design trade publications and user experience blogs show growing interest in EBD. Unfortunately, the typical claims being made for (largely qualitative) evidence are not helping the UX field gain credibility, at least in healthcare.

Except for relatively few domain-focused specialists and industrial design firms, most claims to evidence-based design are not supported by the necessary level of research, transparency in sharing data and findings, and multidisciplinary skills. If EBD is claimed in low-risk or non-critical applications, it is probably not really EBD. Most projects (e.g. websites) do not require this standard of research design. When lives, liability, and tens or hundreds of thousands of dollars are at risk, the due diligence of quantified, measurable evidence is necessary to ensure the decisions are supportable.

The assertion that a firm employs evidence-based design should not be made in a healthcare context without being able to justify a validated research and design process and to endorse personnel capable of leading such a project.  If we honestly consider the maturity level of evidence-based design in UX and service design, based on known and published literature, a charitable assessment would be Level 2, Repeatable (pegging it to the SEI Capability Maturity Model):

Level 1 – Initial (Chaotic)

It is characteristic of processes at this level that they are (typically) undocumented and in a state of dynamic change, tending to be driven in an ad hoc, uncontrolled and reactive manner by users or events. This provides a chaotic or unstable environment for the processes.

Level 2 – Repeatable

It is characteristic of processes at this level that some processes are repeatable, possibly with consistent results. Process discipline is unlikely to be rigorous, but where it exists it may help to ensure that existing processes are maintained during times of stress.

Level 3 – Defined

It is characteristic of processes at this level that there are sets of defined and documented standard processes established and subject to some degree of improvement over time. These standard processes are in place (i.e., they are the AS-IS processes) and used to establish consistency of process performance across the organization.

Level 4 – Managed

It is characteristic of processes at this level that, using process metrics, management can effectively control the AS-IS process (e.g., for software development). In particular, management can identify ways to adjust and adapt the process to particular projects without measurable losses of quality or deviations from specifications. Process capability is established from this level.

Level 5 – Optimizing

It is characteristic of processes at this level that the focus is on continually improving process performance through both incremental and innovative technological changes and improvements.

I’ll provide an example from a current project: Procedures Consult, an online multimedia training and learning management system used for learning medical and surgical procedures, an Elsevier Health product. While it was not actually designed as an EBD product (we have no consistent access to internal outcome measures), the design process was rigorous and repeatable. The processes used to design and evaluate Procedures Consult were developed over time through many field observations and interactive trials. The test protocols were standardized, applied in institutions in Columbus, Boston, Philadelphia, and Cleveland, and repeated for every new feature set.

The design and evaluation process is also Defined (Level 3), meaning that other products and projects have reused the process and protocols, and the process can be improved across the organization. Can a design consultancy help an organization improve its capability maturity? Of course, but a long-term relationship is necessary if the shop is still at Level 1, or if the consultancy is working with a product team rather than senior product management. The process change and its benefits are not seen within one project or one release cycle. Evaluation procedures, skills, and standards for user evidence are established over time, not in one product. Learning must be documented and shared across the organization, which requires trials with new products and communicating best practices between shops.

Evidence that Counts

EBD is one of many methodologies that should be understood and used in appropriate settings. Not every software interface or health website requires the same methodology, and not every institutional project requires such a robust approach. However, meeting the consensus standard for evidence does support publication, which advances the credibility of design and its contribution to health science.

What counts as evidence in medical practice, in scholarship, in care planning, and in design decisions differs significantly. Not only the types of evidence, but also the definition of evidence and its collection, quality evaluation, controls, presentation, and publication differ between fields and applications. In most fields, a bad design decision will not aggravate morbidity and mortality. In healthcare, poor design sensitivity and insufficient evaluation can lead to harm. Just as levels of evidence are defined in medical research, appropriate levels of evidence might also be suggested for healthcare design research. In clinical decision making, available evidence ranges from randomized controlled trials to expert studies, and includes a variety of types of evidence (observations, imaging, measured variables) relevant to diseases, biological responses, and applications to procedures, interventions, and public health. The UK’s National Health Service classifies levels of evidence as follows:

Level A: Consistent Randomized Controlled Clinical Trial, cohort study, clinical decision rule validated in different populations.
Level B: Consistent Retrospective Cohort, Exploratory Cohort, Ecological Study, Outcomes Research, case-control study; or extrapolations from level A studies.
Level C: Case-series study or extrapolations from level B studies.
Level D: Expert opinion without explicit critical appraisal, or based on bench research or first principles.

Since these levels of evidence are applicable to health care decisions, equivalent human-centered design evidence might be expected to relate to design decisions, which affect more than just patient outcomes. Design and experience research evidence must be defined to meet the needs of a wide range of different applications in healthcare. Some of these we know well:

  • Software user experience
  • Interactive systems
  • Medical devices

And for more complex or emerging applications, I would observe we do not have generally accepted units of analysis and evidence:

  • Service systems
  • Organizational and Administrative processes
  • Work practices and workflow
  • Wayfinding systems

For these applications there are fewer experience design studies to draw on.  Traditional operational studies rely on “outcome measures,” but the measures taken often assume a causality between interventions and outcomes – a causality which may not exist in a complex reality.

Evidence by type – roughly from more to less rigorous – might include:

  1. Controlled human interaction experiments, mixed-method studies
  2. Patient observations, physiological measures, field experiment data
  3. Robust sampled ethnographic data, controlled usability interaction studies
  4. Small-sample interviews, hard case studies, extrapolations from field research
  5. Expert opinion, heuristic or multi-perspective assessment
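
As a rough illustration only, the ordering above could be encoded so that the evidence behind a given design decision can be graded and compared against a required level. This is a minimal sketch, not an accepted standard: the tier names, the `strongest_tier` and `meets_standard` helpers, and the idea of a per-project threshold are all assumptions introduced here for illustration.

```python
from enum import IntEnum


class DesignEvidence(IntEnum):
    """Hypothetical tiers mirroring the list above; a lower value means more rigorous."""
    CONTROLLED_EXPERIMENT = 1  # controlled human interaction experiments, mixed-method studies
    FIELD_MEASURES = 2         # patient observations, physiological measures, field experiment data
    SAMPLED_STUDY = 3          # robust sampled ethnography, controlled usability studies
    SMALL_SAMPLE = 4           # small-sample interviews, case studies, extrapolations
    EXPERT_OPINION = 5         # expert opinion, heuristic or multi-perspective assessment


def strongest_tier(sources):
    """Return the most rigorous tier among the sources supporting a decision."""
    return min(sources)


def meets_standard(sources, required):
    """True if at least one source is as rigorous as the required tier, or more so."""
    return bool(sources) and strongest_tier(sources) <= required


# Example: a decision backed only by interviews and a heuristic review.
decision = [DesignEvidence.EXPERT_OPINION, DesignEvidence.SMALL_SAMPLE]
print(strongest_tier(decision).name)
print(meets_standard(decision, DesignEvidence.SAMPLED_STUDY))
```

Here the strongest support is `SMALL_SAMPLE`, so the decision would not meet a requirement of `SAMPLED_STUDY`. Where the threshold sits for a given application (a health website versus a clinical device) is exactly the kind of question the field has not yet settled.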

Why does this matter?

If “everyone is a designer” and “everyone does research,” there is little hope for distinguishing a standard of ethical practice that might lead to a reliable contribution to healthcare. With the huge growth of the Web and of the number of people needed to build it, the user experience field has expanded well beyond the original human factors community that started the field. Many of us older guys were educated in experimental research at the post-graduate level, and worked on large-scale information systems long before the Web. Early usability testing methods (at IBM, AT&T) were often conducted in accordance with experimental design standards and with at least descriptive or inferential statistical support.

The widespread adoption of the label “user experience design” has glossed over many of the original distinctions between practices, and research professionals constitute an ever-shrinking proportion of the field. While this merger has perhaps gained the field broader acceptance, it may reduce credibility in high-hazard, high-reliability settings. A different “standard of care” is necessary when designing a system for clinical professionals, or patients, than for consumers.

The user experience field right now does not have a single professional society or advocacy organization. There is no clearinghouse for agreement on validation of practices and standards. I actually think we have a legitimation issue regarding meaningful evidence and accepted, if not standard, measures for high-risk and complex applications. And as a long-time practitioner (my first million-dollar usability lab project was in 1989), today I am not sure how these advances in practice are best contested and resolved. By followers on Twitter?

A self-assessment across design fields should be conducted by a panel of representatives from the primary design disciplines to clarify the standard of evidence, the range of research methods, and a credible and ethical representation of the state of practice. This conversation within the larger field of design and user experience disciplines is needed to state explicitly the expected value, and the values, understood by practitioners in those fields.