February 11, 2004
book: computers and cognition
It seems we've all been getting phenomenological these days, so now is no time to stop. I just finished reading one of the better things to come out of the 1980's (my little brother and Metallica being two other notable exports) -- Winograd and Flores' monograph "Understanding Computers and Cognition".
The book is the retelling of an intellectual journey, philosophically examining the failure of Artificial Intelligence to achieve its lofty goals and directing the insights gained from this exploration towards a new approach to the design of computer systems. Or, more simply, how Heidegger and friends led an AI researcher to the study of human-computer interaction.
The authors begin by challenging what they call the "rationalistic" tradition (what today might be referred to as positivism?) stretching throughout most of Western thought. This tradition's problem solving approach consists of identifying relevant objects and their properties, and then finding general rules that act upon these. The rules can then be applied logically to the situation of interest to determine desired conclusions. Under this tradition, the question of achieving true artificial intelligence on computers, while daunting, holds the glimmer of possibility.
Winograd and Flores instead argue for a phenomenological account of being. The authors pull from a variety of sources to make their claims, but rest primarily on Heidegger's Being and Time and the work of biologist Humberto Maturana. One of the important implications is the notion of a horizon, background, or pre-understanding, making it impossible to completely escape our own prejudices or interpretations. Much of our existence is ready-to-hand, operating beneath the level of recognition and subject-object distinction, and this cannot, in its entirety, be brought into conscious apprehension (i.e. made present-at-hand). AI programs at the time, however, were largely representational. The program's "background" is merely the encoding of the programmer's apprehension and assumptions about the program's domain. While this approach can certainly create useful programs, they exhibit the decontextualized, desituated character commonly attributed to computer interaction and are a far cry from human intelligence.
The authors further delve into the issue of language, arguing that "...the essence of language as a human activity lies not in its ability to reflect the world, but in its characteristic of creating commitment. When we say a person understands something, we imply that he or she has entered into the commitment implied by that understanding." Thus, the authors argue that computers, by their very nature, are incapable of commitment and therefore prevented from entering into language on the same terms as humans.
The authors' conclusion? Move from AI to HCI. There is an error in assuming that success will follow the path of artificial intelligence. The key to design lies in understanding the readiness-to-hand of the tools being built, and in anticipating the breakdowns that will occur in their use. A system that provides a limited imitation of human facilities will intrude with apparently irregular and incomprehensible breakdowns. On the other hand, we can create tools that are designed to make the maximal use of human perception and understanding without projecting human capacities onto the computer.
Other thoughts and notes are in the extended entry.
The design section at the end of the book discusses the Coordinator system, which explicitly represents different speech acts as a way of attempting better coordination of organizational communication, in particular supporting the formation and evaluation of commitments. I'm not familiar with the literature on this system, but colleagues of mine have referred to it as a known failure of early CSCW (computer-supported cooperative work). The explicit encoding of otherwise "ready-to-hand" communication seems potentially dangerous and limiting of social nuance. For example, if a commitment is encoded formally, how much room for ambiguity (or delaying, or weaseling, or whatever) is left without making it present-at-hand? It is similar to one of the projects discussed in my friend Scott's thesis, in which by trying to leverage a theory of human behavior (in this case Goffman's notion of different fronts or faces), he encoded formally what people practice unconsciously with high degrees of nuance, thus creating a disconnect between actual human behavior and the well-intentioned mechanisms of the interface.
How would more recent AI developments be treated through the lens of this book? Modern statistical techniques can incorporate probabilistic logic and learning from example data, but they still revolve around the statistical model (e.g. specific graphical models) and training techniques (e.g. the EM algorithm) used. These are still representational (primarily in the choice of statistical model), but less strictly so. How far can we extrapolate this, loosening the representation?
Do we have any of our own 'hard-coded' models (e.g. Chomskian grammar)? Where do our own representational structures lie on the spectrum of nature (genetics, evolution) and nurture (socially learned and negotiated meaning)?
The question here is at the heart of modern cognitive neuroscience - at what representational level, if any, can we understand human functioning, cognition, and experience (at varying levels of consciousness)? Physics? Chemistry? Neuronal interaction? At what level should we look for the organization (or perhaps better stated, embodiment) of a structure-determined, autopoietic system that allows for experience, intelligence and a background to arise? In short, where and how do science and phenomenology dovetail?
In the meantime, it is argued that the design of computer programs should steer clear of these pretensions. The lesson from above teaches us that even as we understand mechanisms of thought, language, experience, etc, the way we naturally perceive and act in the world is not experienced or conceptualized in the terms of these mechanisms.
The big challenge left for us after reading this book: How do we determine the readiness-to-hand of the tools being built (or the desired 'invisibility' of ubiquitous computing environments)? How do we design for it, how do we measure it, evaluate it, and value it? Furthermore, how do we look beyond just 'tools'? How do we build things that appropriately shift between ready-to-hand and present-at-hand, and that are designed to evoke emotional as well as rational responses? (e.g. a nuclear missile launch control interface should be anything BUT ready-to-hand, requiring conscious deliberation). We've had almost 20 years of HCI research since this book was published, with numerous successes in various (often constrained) domains, but these are still the core theoretical and methodological motivations pushing us forward.
Ready-to-hand: the world in which we are always acting unreflectively. The ready-to-hand is taken as part of the background, taken for granted without explicit recognition or identification.
Present-at-hand: the world in which we are consciously reflective, identifying, labeling, and recognizing artifacts and ideas as such.
Breakdown: the event of the ready-to-hand becoming present-at-hand
Thrownness: the condition of understanding in which our actions find some resonance or effectiveness in the world
Properties of thrownness
The Biology of Cognition: Humberto Maturana
Autopoiesis. An autopoietic system is defined as: "...a network of processes of production (transformation and destruction) of components that produces the components that: (i) through their interactions and transformations continuously regenerate the network of processes (relations) that produced them; and (ii) constitute it (the machine) as a concrete unity in the space in which they (the components) exist by specifying the topological domain of its realization as such a network..." -Maturana and Varela, Autopoiesis and Cognition (1980), p.79
A plastic, structure-determined system (i.e., one whose structure can change over time while its identity remains) that is autopoietic will by necessity evolve in such a way that its activities are properly coupled to its medium.
Structural coupling is the basis not only for changes in an individual during its lifetime (learning) but also for changes carried through reproduction (evolution). In fact, all structural change can be viewed as ontogenetic (occurring in the life of an individual). A genetic mutation is a structural change to the parent which has no direct effect on its state of autopoiesis until it plays a role in the development of an offspring.
A cognitive explanation is one that deals with the relevance of action to the maintenance of autopoiesis. It operates in a phenomenal domain (domain of phenomena) that is distinct from the domain of mechanistic structure-determined behavior.
For Maturana the cognitive domain is not simply a different (mental) level for providing a mechanistic description of the functioning of an organism. It is a domain for characterizing effective action through time. It is essentially temporal and historical.
The sources of perturbation for an organism include other organisms of the same and different kinds. In the interaction between them, each organism undergoes a process of structural coupling due to the perturbations generated by the others. This mutual process can lead to interlocked patterns of behavior that form a consensual domain.
Five categories of illocutionary point:
The failures of AI
...the essence of language as a human activity lies not in its ability to reflect the world, but in its characteristic of creating commitment. When we say a person understands something, we imply that he or she has entered into the commitment implied by that understanding. But how can a computer enter into a commitment?
October 27, 2003
book: philosophy of punk
First things first, let's just be clear that when we use the word philosophy here, we're not talking Kant (thank god), and we're not talking Wittgenstein... but we are talking about a possibly fascinating look at a largely misunderstood sub-culture... one with often conflicting views from its own members. Unfortunately this book is not quite there.
In true punk spirit, however, O'Hara's book is a DIY (do-it-yourself) effort from the ground up. Fanzines (e.g., Maximum Rock n' Roll, Profane Existence, ...) and album liner notes form the primary sources of the book, which chronicles punk viewpoints on media misrepresentation, zines, anarchism, gender, sexuality, and environmentalism. For the most part little new is gleaned, though the book does a nice job of taking various snapshots of the (primarily early 90's) punk world. Skinheads (even the steadfastly anti-racist breed) and straight-edgers, however, are given particularly scathing treatment, as the author characteristically sways between a pseudo-objective tone and unrestrained vitriolic opinion. The same style, if you ask me, that so often characterizes punk.
I did appreciate the book's chapter on anarchism, as it was one of the few sections where I encountered some new perspectives, and set me on a path to discover some interesting readings such as this one. I also discovered that true punks, according to O'Hara's view, are utopians: "anarchy does not just mean no laws, it means no need for laws."
What really struck me, though, was how deeply the rhetoric of rebellion is woven into punk philosophy as presented. It seems that most punk causes can be formulated to always begin with the prefix "anti-". In so doing, punk runs the risk of only ever being a counter-culture, defined largely by resistance and therefore existing as a reactive movement, its identity dependent on the larger culture it lashes back against. As such, punk is limited, willingly or unwillingly, to merely modifying the culture it would like to see obliterated. This observation is an overgeneralization, of course: punk acts continue to promote more egalitarian financial models (e.g., the wonderful folks from Fugazi), and other trends in the sub-culture, particularly gender and environmental issues, tend to promote a more proactive outlook. If punk truly still exists in this day and age (it always seems to be pronounced dead or dying), it will be interesting to see how it further evolves.
In the end, I wouldn't recommend going out of your way to get ahold of this book. But if you're either interested in or completely ignorant of punk, and like me, find this book sitting on a friend's bookshelf, pick it up and give it a read. At the very least, it will get you thinking. Or, if you want to expose yourself to one of the more beautiful (and for my young teenage self, life-changing) expressions of punk philosophy, buy this album and learn all the lyrics by heart.
September 09, 2003
paper: interface metaphors
Interface Metaphors and User Interface Design
This paper examines the use of metaphor as a device for framing and understanding user interface designs. It reviews operational, structural, and pragmatic views on metaphor and proposes a metaphor design methodology. In short the operational approach concerns the measurable behavioral effects of applying metaphor; structural analyses attempt to define, formalize, and abstract metaphors; and the pragmatic approach views metaphors in context – including the goals motivating metaphor use and the affective effects of metaphor. The proposed design methodology consists of 4 phases: identifying candidate metaphors, elaborating source and target domain matches, identifying metaphorical mismatches, and finally designing fixes for these mismatches. Strangely, this paper makes absolutely no mention whatsoever of George Lakoff’s influential work on conceptual metaphor, which I’m almost certain had been published prior to this article. My outlined notes follow below.
September 08, 2003
paper: hci and disabilities
Human Computer Interfaces for People with Disabilities
Human computer interface engineers should seriously consider the problems posed by people with disabilities, as this will lead to a more widespread understanding of the true nature and scope of human computer interface engineering in general.
September 06, 2003
paper: contextual inquiry
Contextual Design: Contextual Inquiry (Chapter 3)
This article discusses in depth the contextual inquiry phase of the contextual design methodology. Contextual inquiry emphasizes interacting directly with workers at their place of work within the constructs of a master/apprentice relationship model in order for designers to gain a real insight into the needs and work practices of their users.
paper: contextual design
Contextual Design: Introduction (Chapter 1)
This book chapter introduces the difficulties of customer centered design in organizations, and proposes the methodology of Contextual Design as a set of processes for overcoming these difficulties and achieving successful designs that benefit both the customer and the business.
paper: rapid ethnography
Rapid Ethnography: Time-Deepening Strategies for HCI Field Research
HCI has come to highly regard ethnographic research as a useful and powerful methodology for understanding the needs and work practices of a user population. However, full ethnographies are also notoriously time and resource heavy, making it hard to fit into a deadline-driven development cycle. This paper presents techniques for rapid, targeted ethnographic work, in the hopes of accruing much of the benefit of field work while still fitting within acceptable time bounds.
The paper organizes its suggestions around three core themes:
paper: 2D fitts' law
Extending Fitts’ Law to Two-Dimensional Tasks
This paper extends the famous Fitts’ Law for predicting human movement times to work accurately in two-dimensional scenarios, in particular rectangular targets. The main finding is that two models, one which computes target width by projecting along the vector of approach and another which uses the minimum of the target’s width and height, achieved equal statistical fits and showed a significant benefit over models using (width+height), (width*height), and (width-as-horizontal-distance-only).
For those who don’t know, Fitts’ Law is an empirically validated law that describes the time it takes for a person to perform a physical movement, parameterized by the distance to the target and the size of the target. Its formula is one-dimensional: it only considers movement along a straight line between the start and the target. The preferred formulation of the law is the Shannon formulation, so named because it mimics the underlying theorem from Information Theory --
MT = a + b log_2 (A/W + 1)
where MT is the movement time, A is the target distance or amplitude, W is the target width, and a and b are constants empirically determined by linear regression. The log term is known as the Index of Difficulty (ID) of the task at hand and is in units of bits (note the typo in the paper).
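As a quick illustration, the formulation is trivial to compute; the constants below are made-up values for the sketch, not coefficients reported in the paper.

```python
import math

def fitts_mt(A, W, a=0.1, b=0.15):
    """Predicted movement time under the Shannon formulation.

    A: distance (amplitude) from the start point to the target center
    W: target width along the line of motion
    a, b: regression constants (hypothetical values here)
    """
    ID = math.log2(A / W + 1)  # Index of Difficulty, in bits
    return a + b * ID

# Doubling the distance at a fixed width adds at most one bit of
# difficulty, so predicted time grows logarithmically with distance.
print(fitts_mt(A=512, W=32))   # ID = log2(17), about 4.09 bits
print(fitts_mt(A=1024, W=32))  # ID = log2(33), about 5.04 bits
```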
The Shannon formulation is preferred for a number of reasons
This paper then considers two-dimensional cases. Clearly you can cast the movement along a one-dimensional line between the start and the center of the target, with the amplitude as the Euclidean distance between these points. But what to use as the width term? Historically, the horizontal width was just used, but this seems like an unintuitive choice in a number of situations, particularly when approaching the target from directly above or below. This paper studies five possibilities: using the minimum of the target's width and height (“smaller-of”), using the projected width along the angle of approach (“w-prime”), using the sum of the dimensions (“w+h”), using the product of the dimensions (“w*h”), and using the historical horizontal width (“status quo”).
The study varied amplitude and target dimensions crossed with 3 approach angles (0, 45, and 90 degrees). Twelve subjects were used, who performed 1170 trials each over four days of experiments. The results found the following ordering among the models in terms of model fit: smaller-of > w-prime > w+h > w*h > status quo. Notably, the smaller-of and w-prime cases were quite close – their differences were not statistically significant.
The w-prime case is theoretically attractive, as it cleanly retains the one-dimensionality of the model. The smaller-of model is attractive in practice, as it doesn’t depend on the angle of approach and so requires one fewer parameter than w-prime. The w-prime model, however, doesn’t require that the targets be rectangular, as the smaller-of model assumes. Finally, it should be noted that these results may be slightly inaccurate in the case of big targets, as the target point is assumed to be in the center of the target object. In many cases users may click near the edge, decreasing the amplitude.
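A small sketch of the two leading width terms, in my own encoding (assuming a rectangular target of width w and height h, approached at angle theta measured from the horizontal; this is not code from the paper):

```python
import math

def smaller_of(w, h):
    # "smaller-of": ignore the approach angle entirely and use the
    # lesser of the target's width and height.
    return min(w, h)

def w_prime(w, h, theta):
    # "w-prime": extent of the rectangle along the approach vector
    # through its center (theta in radians, 0 = horizontal approach).
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    candidates = []
    if c > 1e-12:
        candidates.append(w / c)  # chord limited by vertical edges
    if s > 1e-12:
        candidates.append(h / s)  # chord limited by horizontal edges
    return min(candidates)

def index_of_difficulty(A, W):
    return math.log2(A / W + 1)

# At 0 degrees w-prime reduces to the width; at 90 degrees, the height.
print(w_prime(40, 20, 0.0))          # 40.0
print(w_prime(40, 20, math.pi / 2))  # 20.0
print(smaller_of(40, 20))            # 20
```

Note how the sketch makes the trade-off concrete: smaller_of needs no theta, while w_prime needs the approach angle but generalizes beyond rectangles if the chord computation is swapped out.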
September 05, 2003
paper: other ways to program
Drawing on Napkins, Video-Game Animation, and Other Ways to Program Computers
This article describes a number of visual, interactive approaches to programming. The main thesis is that visual programming environments have failed to date because they are not radical enough. Programs exhibit dynamic behavior that static visuals do not always convey appropriately, and so dynamic visuals, or animation, should be applied. Furthermore, visual programming can avoid explicit abstraction (i.e. when visuals become just another stand-in for symbols in a formal system) without necessarily sacrificing power and expressiveness. Put more abstractly, a programming language can be designed to use one of many possible syntactical structures. It then becomes the goal of the visual programming developer to find the appropriate syntax that can be mapped to the desired language semantics. To map an existing computer language (e.g., C or LISP) into a visual form would require the use of a visual syntax isomorphic to the underlying language. Doing so in a useful, intuitive, and learnable manner proves quite difficult.
Kahn describes a number of previous end-user programming systems. This includes AlgoBlocks, which allow users to chain together physical blocks representing some stage of computation. The blocks support parameterizations on them, and afford collaborative programming. Another system is Pictorial Janus, which uses visual topological properties such as containment, touching, and connection to (quite abstractly, in my view) depict program performance.
He goes on to describe a (quite imaginative) virtual programming "world", ToonTalk, which can be used to construct rich, multi-threaded applications using a video-game interaction style. The ToonTalk world maps houses to threads or processes, and the robots that can inhabit houses are the equivalent of methods. Method bodies are determined by actually showing the robot what to do. Data such as numbers and text are represented as number pads, text pads, or pictures that can be moved about, put into boxes (arrays or tuples), or operated upon with "tools" such as mathematical functions. Communication is represented using the metaphor of birds -- hand a bird a box, and they will take it to their nest at the other house, making it available for the robots of that abode to work with the data.
Kahn argues that while such an environment may be slower to use for the adept programmer, it is faster to learn, and usable even by young children. It also may be more amenable to disabled individuals. Furthermore, its interactive animated nature (you can see your program "playing out" in the ToonTalk world) aids error discovery and debugging. In conclusion, Kahn suggests that these techniques and others (e.g. speech interfaces) could be integrated into the current programming paradigm to create a richer, multimodal experience that plays off different media for constructing the appropriate aspects of software.
Inspiring, yes, but quite difficult to achieve. My biggest question of the moment is: what happened to Ken Kahn? The article footer says he used to work at PARC until 92, and then focused on developing ToonTalk full-time. I'll have to look him up on the Internet to discover how much more progress he made. While I'm skeptical of these techniques being perfected and adopted in production-level software engineering in the near future, I won't be surprised if they experience a renaissance in ubiquitous computing environments, in which everyday users attempt to configure and instruct their "smart" environs. If nothing else, VCRs could learn a thing or two...
paper: prog'ing by example
Tonight I read a block of papers on end-user programming, aka Programming by Example (PBE), aka Programming by Demonstration (PBD). Very fun stuff, and definitely got me thinking about the kind of toys I would want any future children of mine to be playing with.
Eager: Programming Repetitive Tasks By Example
This paper introduces Cypher's Eager, a programming by example system designed for automating routine tasks in the HyperCard environment. It works by monitoring users' actions in HyperCard and searching for repetitive tasks. When one is discovered it presents an icon, and begins highlighting what it expects the user's next action to be - an interaction technique Cypher dubs "anticipation". This allows the user to interactively - and non-abstractly - understand the model the system is building of user action. When the user is confident that Eager understands the task being performed, the user can click on the Eager icon and let it automate the rest of the iteration. For example, it can recognize the action of copying and pasting each name in a rolodex application into a new file, and completely automate the task.
Eager was written in LISP, and communicated to HyperCard over interprocess communication. When a recognized pattern is executed, Eager actually constructed the corresponding HyperCard program (in the language HyperTalk) and passed it back to the HyperCard environment for execution.
There are a couple of crucial things that make Eager successful. One is that Eager tries only to perform simple repetitive tasks... there are no conditionals, no advanced control structures. This simplifies both the generalization problem and the presentation of system state to the user. Second, Eager uses higher-level domain knowledge. Instead of low-level mouse data, Eager gets semantically useful data from the HyperCard environment, and furthermore has domain knowledge about HyperCard, allowing it to better match usage patterns. Finally, Eager has the appropriate pattern matching routines programmed in, including numbering and iteration conventions, days of the week, as well as non-strict matching requirements for lower-level events, allowing it to recognize higher-level patterns (ones with multiple avenues of accomplishment) more robustly.
The downside, as I see it, however, is that for such a scheme to generalize across applications, either (a) the system must be reprogrammed for every application, or (b) designers must equip each program not only with the ability to report high-level events in a standardized fashion, but to communicate application semantics to the pattern matcher. Introducing more advanced applications with richer control structures muddies this further. That being said, such a feature could be invaluable in integrated, high-use applications such as Office or popular development environments. Integrating such a system into the help, tutorial, and mediation features already existing in such systems could be very useful indeed.
September 04, 2003
paper: charting ubicomp research
Charting Past, Present, and Future Research in Ubiquitous Computing
This paper reviews ubiquitous computing research and suggests future directions. The authors present four dimensions of scale for characterizing ubicomp systems: device (inch, foot, yard), space (distribution of computation in physical space), people (critical mass acceptance), and time (availability of interaction). Historical work is presented and categorized under three interaction themes: natural interfaces (e.g. speech, handwriting, tangible UIs, vision), context-aware applications (e.g. implicit input of location, identity, activity), and automated capture and access (e.g. video, event logging, annotation). The authors then suggest a fourth, encompassing research theme of everyday computing, characterized by diffuse computational support of informal, everyday activities. This theme suggests a number of new pressing problems for research: continuously present computer interfaces, information presentation at varying levels of the periphery of human attention, bridging events between physical and virtual worlds, and modifying traditional HCI methods for informal, peripheral, and opportunistic behavior. Additional issues include how to evaluate ubicomp systems (for which the authors suggest CSCW-inspired, real-world deployment and long-term observation of use) and how to cope with the various social implications, both due to privacy and security and to behavior adaptation.
In addition to the useful synopsis and categorization of past work, I thought the real contribution of this paper was the numerous suggestions for future research, many of which are quite important and inspiring. I was also very happy to see that many of the lessons of CSCW, which are particularly relevant to ubicomp, were influencing the perspective of the authors.
However, on the critical side a couple things struck me. One is that many of the research suggestions lack any notion of how "deep" the research problem runs. For example, the research problems in capture and access basically summarize both the meta-data and retrieval problems, long-standing fundamental issues in the multimedia community. However, this depth and extent of the research issue, or how we might skirt the fundamental issues through domain-specificity, is not mentioned. Another issue I had was that the everyday computing scenario could have used some fleshing out. I wanted the authors to provide me with the compelling scenario they say such research mandates. Examples were provided, so perhaps I am being overly critical, but I wanted a more concrete exposition, perhaps along the lines of Weiser's Sal scenario.
See the extended entry for a more thorough summary
September 03, 2003
paper: why and when five users aren't enough
Why and When Five Test Users aren’t Enough
This paper argues that Nielsen’s assertion that “Five Users are Enough” to determine 85% of usability problems does not always hold up. In the end, we walk away with the admonition that five users may or may not be enough. Richer statistical models are needed, as well as good frequency and severity data. What does this mean for evaluators? Certainly this shouldn’t dissuade the use of usability evaluations, but it does imply that one should avoid false confidence and keep an eye to user/evaluator variability.
The paper starts by attacking the formula
ProblemsFound(i) = N ( 1 – ( 1 – lambda ) ^ i ),
in particular, the straightforward use of parameter (lambda = .31). Generalizing the formula shows we should actually expect, for n participants, that
ProblemsFound(n) = sum(j=1…N) ( 1 – ( 1 – lambda_j) ^ n ),
where lambda_j is the probability of discovering usability problem j. Nielsen and Landauer’s formula assumes this probability is equal for all such problems (computed as lambda = the average of such empirically observed probabilities).
However, other studies, such as that by Spool and Schroeder, have found an average lambda of 0.081, showing that a study with ecologically valid tasks (in this case an unconstrained online shopping task with high N) can still miss many usability issues. Thus Nielsen’s claim that five is enough is only true under certain assumptions of problem discovery.
But other issues also abound. For instance, Nielsen’s model doesn’t take into account the variance between users, which can strongly affect the number of users needed. Further complications abound when considering severity ratings, as the authors found huge shifts in severity ratings based on different selections of five users. Other problems include which tasks are used for the evaluation (changes of task revealed undiscovered usability issues) and issues with usability issue extraction, determining the true value of N.
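To make the critique concrete, here is a sketch with made-up problem probabilities (not data from either study) showing how per-problem variation in lambda changes the expected yield of five users even when the average lambda stays at .31:

```python
def problems_found(lambdas, n):
    """Expected number of problems found by n users, allowing a
    separate discovery probability lambda_j for each problem."""
    return sum(1 - (1 - lam) ** n for lam in lambdas)

# With a uniform lambda of 0.31, five users find roughly 84% of
# 100 hypothetical problems...
uniform = [0.31] * 100
print(problems_found(uniform, 5))  # roughly 84.4

# ...but split the same *average* probability between easy-to-find
# and very hard-to-find problems, and five users find far fewer.
skewed = [0.60] * 50 + [0.02] * 50
print(problems_found(skewed, 5))   # roughly 54.3
```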
paper: heuristic evaluation
This paper describes the famous (in HCI circles) technique of Heuristic Evaluation, a discount usability method for evaluating user interface designs. HEs are conducted by having an evaluator walk through the interface, identifying and labeling usability problems with respect to a list of heuristics (listed below). It is usually recommended that multiple passes be made through the interface, so that evaluators can get a larger, contextual view of the interface, and then focus on the nitty-gritty details.
Revised Set of Usability Heuristics
The evaluators also go through a round of assigning severity ratings to all discovered usability problems, allowing designers to prioritize fixes. The severity is a mixture of frequency, impact, and persistence of an identified problem, and as presented forms a spectrum from 0-4, where 0 = Not a usability problem, 1 = Cosmetic problem only, 2 = Minor problem, 3 = Major problem, 4 = Usability catastrophe. Nielsen performs an analysis to show that inter-evaluator ratings have better-than-random agreement, and so ratings can be aggregated to get reliable estimates of severity.
Heuristic evaluation is cheap and can be done by user interface experts (i.e., they can be performed without bringing in outside users). Best results are experienced by evaluators that are familiar both with usability testing and the application domain of the evaluated interfaces. HE is faster and less costly than typical user studies, with which it can be used in conjunction (i.e. use HE first to filter out problems, then run a real user study to find remaining deeper-seated issues). Lacking real user input, however, HE can run the risk of missing, or misestimating, usability infractions.
Nielsen found over multiple studies that the typical evaluator found only 31 percent (lambda = .31) of known usability problems in an interface. Using the model that
ProblemsFound(i) = N ( 1 – ( 1 – lambda ) ^ i ),
where i is the number of evaluators and N is the total number of problems, we can arrive at the conclusion that 5 evaluators are enough to find 84% of usability problems. Nielsen also performs a cost-benefit analysis that finds 4 to be the optimal number. Read the summary of the Woolrych and Cockton paper for a dissenting opinion.
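The discovery curve is easy to sketch in a few lines (lambda = .31 and the 84% figure come from the paper; this computes only the curve, not the cost-benefit side):

```python
# Nielsen's problem-discovery model: each evaluator independently finds
# a fraction lambda of the N problems, so i evaluators are expected to
# find N * (1 - (1 - lambda)^i) of them.
def proportion_found(i, lam=0.31):
    """Expected proportion of usability problems found by i evaluators."""
    return 1 - (1 - lam) ** i

for i in range(1, 8):
    print(f"{i} evaluators -> {proportion_found(i):.0%} of problems found")

# With lambda = 0.31, five evaluators find roughly 84% of known problems.
```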
paper: your place or mine?
Finishing off my block of CSCW papers is Dourish, Bellotti, et al.'s article on the long-term use and design of media spaces.
Your Place or Mine? Learning from Long-Term Use of Audio-Video Communication
This article reviews over 3 years of experience using an open audio-video link between the authors' offices to explore media spaces and remote interaction. The paper details the evolution of new behaviors in response to the communication medium, at both the individual and social levels. For example, users initially learned to stare at the camera to initiate eye contact, but later learned to establish attention without it. Also, colleagues would come to an office to speak to the remote participant.
I saw some important take home lessons here:
My full outlined summary follows...
September 02, 2003
paper: computers, networks, and work
Computers, Networks, and Work
This article describes the early adoption of networked communication (e.g. e-mail) into the workplace. The often surprising social implications of networking began with the ARPANET, precursor of the modern internet. E-mail was originally considered a minor additional feature, but rapidly became the most popular feature of the network. We see immediately an important observation regarding social technologies: they are incredibly hard to predict.
In organizations that provided open access to e-mail (i.e., without managerial restrictions in place), some thought that electronic discussion would improve the decision-making process, as conversations would be “purely intellectual… less affected by people’s social skills and personal idiosyncrasies.” The actual results were more complicated. Text-only conversation carries fewer contextual cues (including appearance and manner) and weakens inhibitions. This has made decision making more difficult, fostering a more democratic style in which strong personalities and hierarchical relationships are eroded. While giving a larger voice to typically quieter individuals, the lowered social inhibitions of electronic conversation also invite more extreme opinions and anger venting (e.g., “flaming”). One study even shows that people who consider themselves unattractive report higher confidence and liveliness over networked communication.
Given these observations, the authors posit a hypothesis: when cues about social context are weak or absent, people ignore their social situation and cease to worry about how others evaluate them. In one study, people self-reported many more illegal or undesirable behaviors over e-mail than when given the same study on pen and paper. In the same vein, traditional surveys of drinking account for only half of known sales, yet results from an online survey matched the sales data more closely than face-to-face reports did. The impersonality of this electronic medium ironically seems to engender more personal responses.
Networked communication has also been known to affect the structure of the work place. A study found that a networked work group, compared to a non-networked group, created more subcommittees and had multiple committee roles for group members. These networked committees were also designed in a more complex, overlapping structure. Networked communication also presents new opportunities for the life of information. Questions or problems can be addressed by other experienced employees, often from geographically disparate locations, allowing faster response over greater distance. Furthermore, by creating a protocol for saving and categorizing such exchanges, networked media can remember this information, increasing the life of the information and making it available to others.
As the authors illustrate, networked communication showed much promise at an early age. However, its benefits don’t always come as expected or for free. The authors note the issue of incentive… shared communication must benefit all those who would be using it for adoption to be successful. Also, it may be the case that managers will end up managing people they have never met… hinting at the common ground problem described by the Olsons [Olson and Olson, HCI Journal, 2000]. Coming back to the authors’ hypothesis also raises one exciting fundamental question: as networked communication becomes richer, social context will begin to reappear, modifying the social impact of the technologies. As this richer design space emerges, how can we use it to achieve desired social phenomena in a realm that is so prone to unpredictability?
paper: distance matters
This paper examines and refutes the myth that remote cooperative technology will remove distance as a major factor affecting collaboration. While technologies such as videoconferencing and networking allow us to communicate and collaborate more effectively across great distances, the authors argue that distance will remain an important factor for the foreseeable future, regardless of how sophisticated the technology becomes.
This paper reviews results of studies concerning both collocated and distant collaborative work, and extracts four concepts through which to understand collaborative processes and the adoption of remote technologies: common ground, coupling, collaboration readiness, and technology readiness. The case is then made that because of these issues and their interactions, distance will continue to have a strong effect on collaborative work processes.
paper: groupware and social dynamics
Kicking off a batch of papers on Computer-Supported Cooperative Work (CSCW) is Grudin's list of challenges to the CSCW developer...
Groupware and Social Dynamics: Eight Challenges for Developers
Groupware is introduced as software that lies in the middle of the spectrum between single-user applications and large organizational information systems. Examples include e-mail, instant messaging, group calendaring and scheduling, and electronic meeting rooms. The developers of groupware today come predominantly from a single-user background, and hence many do not appreciate the social and political factors crucial to developing groupware. Grudin outlines eight major issues confronting groupware development and gives some proposed solutions.
The Disparity Between Who Does the Work and Who Gets the Benefit
Critical Mass and Prisoner’s Dilemma Problems
Social, Political, and Motivational Factors
Exception Handling in Workgroups
Designing for Infrequently Used Features
The Underestimated Difficulty of Evaluating Groupware
The Breakdown of Intuitive Decision-Making
Managing Acceptance: A New Challenge for Product Developers
Take home messages from the paper: groupware should strive to directly benefit all group members, build off of existing successful apps if possible, develop thoughtful adoption strategies, and be rooted in an understanding of the [physical|social|political] environment of use.
August 05, 2003
I just read this paper by my research group's principal scientist, Peter Pirolli, and former PARC employee Wai-Tat Fu. The paper, entitled "SNIF-ACT: A Model of Information Foraging on the World Wide Web", recently won the Best Theoretical Paper Award at the 9th International Conference on User Modeling.
The paper extends the existing ACT-R cognitive modeling infrastructure to computationally simulate users surfing the web, at a fairly fine-grained psychological level. The system models the user's goals, knowledge, memory, and abilities (in the form of production rules) and combines these with the findings of Information Foraging theory to create a successful model of web surfing and decision making. Information Foraging theory applies the metaphor of animals foraging for food to the task of humans seeking information. In previous work, Pirolli and Card have found that the equations governing the cost structures of the two are the same.
The SNIF-ACT model works by extracting the content and links of a web page and then using a technique known as spreading activation to propagate "activity" through an associative memory network of individual words. Activation proceeds from the modeled user goals through the terms in working memory and out to the currently observed web content. Link weightings between word associations are determined by using word occurrence and co-occurrence rates extracted from AltaVista. By finding the highest mutual activity between user goals and available links, the system can compute an estimate of the information scent (much like the scent tracked down by animals in the wild), and use this to construct a probability distribution of the likelihood of following different links. Drop-offs in scent measures are also used to predict when a user will leave the current web site to look elsewhere for a richer information patch (analogous to an animal moving on to greener pastures or hunting easier prey). The SNIF-ACT model is psychologically richer than previous foraging-influenced systems like Bloodhound, which primarily uses techniques from the information retrieval (IR) field and earlier flawed cognitive approaches.
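The core scent computation can be caricatured in a few lines. This is only a hypothetical sketch: the words, weights, and the softmax choice rule below are my own stand-ins, whereas SNIF-ACT derives its association strengths from AltaVista occurrence and co-occurrence statistics and uses the full ACT-R activation machinery.

```python
import math

# Hypothetical association strengths between goal terms and link terms;
# in the real model these come from word (co-)occurrence statistics.
weights = {
    ("medical", "health"): 2.1,
    ("medical", "hospital"): 1.8,
    ("treatment", "health"): 1.5,
    ("treatment", "sports"): 0.1,
}

def scent(goal_words, link_words):
    """Total activation flowing from the goal terms to a link's terms."""
    return sum(weights.get((g, l), 0.0)
               for g in goal_words for l in link_words)

def link_choice_probs(goal_words, links):
    """Softmax over scent scores -> probability of following each link."""
    scores = {name: scent(goal_words, words) for name, words in links.items()}
    z = sum(math.exp(s) for s in scores.values())
    return {name: math.exp(s) / z for name, s in scores.items()}

goal = ["medical", "treatment"]
links = {"Health A-Z": ["health", "hospital"], "Sports News": ["sports"]}
probs = link_choice_probs(goal, links)   # strongly favors "Health A-Z"
```

A drop-off rule would then compare the best available scent against the expected scent of starting over at another site, deciding when the simulated user abandons the current patch.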
For more details about information foraging theory and its applications, check out this essay by pixelcharmer (it even cites my first research paper!), this copy of a talk by Pete Pirolli, and my research group's publication archives.
There are at least two interesting avenues for this work to follow. One is applications: successful user models can drive better automated usability metrics and could learn from individual behavior to create personalized search and surfing tools. The second is to move beyond the web, building user models for other content-rich domains (e.g., information visualization). Down the road, I think the integration of content-based and perception-based (e.g., computer vision and audition) analyses will be the next big research leap, creating richer, more realistic models of user behavior and furthering artificial intelligence research.
July 29, 2003
paper: designing for usability
My prelims are coming sooner than I'd like to admit, and so I need to get hopping on a bunch of papers. Fortunately, over the past semester our reading group read over half the assigned papers, but for those that are left I will be posting my summaries here for my own archival purposes (including some back-posts for previously read papers). Perhaps they will be of use to someone else as well, so I might as well make these public...
First up is "Designing for Usability: Key Principles and What Designers Think" by John D. Gould and Clayton Lewis. This paper was originally published in 1985 in the Communications of the ACM, and outlines the iterative design philosophy that is central to modern Human-Computer Interaction. The paper describes three central design principles (early focus on users, empirical measurement, and iterative design) and includes a survey of designers trying to ascertain how common and/or obvious these principles are. The paper also rebuts arguments against the use of these principles and presents a case study of these principles in action.
My biggest problem with this article is that the authors are too unsympathetic to the demands that a deadline-driven project can make. They seem to advocate iterating "as long as it takes", which while desirable is not particularly feasible. To be fair, they acknowledge these pressures and give some cogent arguments for why the costs of iteration are not as high as one might otherwise suspect. But what is missing in the methodology are strategies and techniques for optimizing the design as much as possible within bounded resources. Later work has attempted to address some of these issues, including discount usability methods (e.g., heuristic evaluation) and rapid ethnography techniques (e.g., David Millen's paper).
Designing for Usability: Key Principles and What Designers Think
July 24, 2003
talk: jan pedersen
Today Jan Pedersen, former PARC researcher and current Chief Scientist of AltaVista, spoke at the PARC Forum. His talk was entitled Internet Search: Past, Present, and Future. It seems particularly relevant given my recent exposure to personalized search start-up Kaltix. Jan primarily covered the developmental and economic history of search engines and spoke about current search technologies. Read on for my notes from the talk.
Notes: PARC Forum, July 24, 2003
Internet Search: Past, Present, and Future
July 10, 2003
paper: animation support in a UI toolkit
Here's a back-post for a prelims paper: "Animation Support in a User Interface Toolkit", by Hudson and Stasko. I thought this paper particularly relevant, as I'm currently working in interactive graph visualization, which includes a heavy animation component. This paper got me considering higher level primitives I might use in the graph viz toolkit we are developing.
Animation Support in a User Interface Toolkit: Flexible, Robust, and Reusable Abstractions
In this paper, the authors present extensions to the Artkit user interface toolkit to support animation. The toolkit offers basic support for simple motion blur, "squash and stretch", use of arcing trajectories, controlled timing of actions, anticipation + follow-through, and slow-in / slow-out transitions. It also supports a robust scheduling system that helps deal with unpredictable performance from the windowing system... very important since this was running on X-Windows.
The main abstraction used is the transition, which consists of a pointer to the UI component that is moving, the trajectory the component will take, and the time interval over which to animate. The UI component can be any interactor object implemented in Artkit. The trajectory consists of the curve traveled (parameterized from 0 to 1) and a pacing function to determine velocities over the curve (e.g. using a line with slope 1 for uniform animation and an arctan or sigmoidal function to create slow-in / slow-out transitions). The times in the time interval can be expressed as absolute times, as a delay from the present time, or parameterized by the starting or ending of other transitions.
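As a rough sketch of that abstraction (the class and function names here are mine, not Artkit's), a transition might look like:

```python
import math

def linear_pacing(t):
    """Uniform velocity: normalized time maps straight through."""
    return t

def slow_in_slow_out(t):
    """Sigmoid-style pacing (an arctan curve), easing in and out."""
    return math.atan(8 * (t - 0.5)) / (2 * math.atan(4)) + 0.5

class Transition:
    def __init__(self, component, curve, pacing, start, end):
        self.component = component  # any interactor with a position
        self.curve = curve          # maps s in [0, 1] -> screen position
        self.pacing = pacing        # warps normalized time before the curve
        self.start, self.end = start, end

    def step(self, now):
        """Place the component at its position for wall-clock time `now`."""
        t = (now - self.start) / (self.end - self.start)
        t = min(max(t, 0.0), 1.0)   # clamp outside the time interval
        self.component.pos = self.curve(self.pacing(t))
```

A straight-line move from (0, 0) to (100, 0) with slow-in / slow-out would then be `Transition(obj, lambda s: (100 * s, 0.0), slow_in_slow_out, start, end)`.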
Robust animation and event-relative transitions are achieved using an animation dispatch agent. All that is assumed is that the toolkit can ask what the current time is and that the window system will pass control back to the toolkit periodically. The agent constructs a scheduling queue of transitions and attempts to estimate when the next draw cycle will appear on the screen using a measure of past updates. Using this estimated redraw end time, the set of active transitions for the current cycle is selected. Each active transition is started or stopped as appropriate, and current parameter values are passed through its pacing function and mapped to screen positions using the trajectory.
This scheme will animate smoothly when the agent is given control at regular intervals, but it will also properly handle delays, correctly delivering animation steps at larger intervals.
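In condensed (and hypothetical) form, the dispatch agent's main idea might be sketched as below; the real Artkit details surely differ, and the transition objects are assumed to expose `start`, `end`, and a `step(time)` method:

```python
class DispatchAgent:
    """Steps active transitions toward an estimated on-screen time."""

    def __init__(self):
        self.transitions = []   # scheduling queue of pending transitions
        self.frame_costs = []   # durations of recent draw cycles

    def estimated_redraw_end(self, now):
        """Predict when the current redraw will actually hit the screen,
        using a running average of past cycle costs."""
        if not self.frame_costs:
            return now
        return now + sum(self.frame_costs) / len(self.frame_costs)

    def tick(self, now):
        """Called whenever the window system passes control back."""
        target = self.estimated_redraw_end(now)
        for tr in self.transitions:
            if tr.start <= target:   # active (or starting) by then
                tr.step(target)      # animate to the estimated time
        # Finished transitions drop out of the queue.
        self.transitions = [t for t in self.transitions if t.end > target]
```

Because each `tick` steps transitions to the *estimated* on-screen time rather than by a fixed increment, a late callback simply produces a larger animation step, which is exactly the delay-handling behavior described above.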
Criticism: The first thing that struck me is that no mention of scale is given. How many objects can I animate at once? What are the bottlenecks? Obviously rendering time is a major factor, but overhead also accrues through scheduling and through mapping each object through its own pacing and trajectory functions. In most cases I'd expect this to be a constant-time overhead per object, but this isn't really discussed. Also, cool animated effects like squash and stretch are mentioned multiple times, but their implementation is not discussed.
Today, 10 years later, we have vastly more powerful processors and graphics cards, enabling much richer animation possibilities. This paper was ahead of its time, and today's popular toolkits - Swing, MFC, etc. - are definitely behind the times. While toolkits like Java2D provide much of the rendering and geometric capability needed, animation managers like the kind presented here and in Xerox PARC's Information Visualizer have yet to become common. Hopefully, as graphics power continues to grow and the drive for more powerful interactive technologies gains momentum (e.g., more widespread use of information visualization), these more powerful tools and abstractions will become commonplace.