
February 11, 2004

book: computers and cognition

It seems we've all been getting phenomenological these days, so now is no time to stop. I just finished reading one of the better things to come out of the 1980s (my little brother and Metallica being two other notable exports) -- Winograd and Flores' monograph "Understanding Computers and Cognition".

The book is the retelling of an intellectual journey, philosophically examining the failure of Artificial Intelligence to achieve its lofty goals and directing the insights gained from this exploration towards a new approach to the design of computer systems. Or, more simply, how Heidegger and friends led an AI researcher to the study of human-computer interaction.

The authors begin by challenging what they call the "rationalistic" tradition (what today might be referred to as positivism?) stretching throughout most of Western thought. This tradition's problem-solving approach consists of identifying relevant objects and their properties, and then finding general rules that act upon these. The rules can then be applied logically to the situation of interest to determine desired conclusions. Under this tradition, the question of achieving true artificial intelligence on computers, while daunting, holds the glimmer of possibility.

Winograd and Flores instead argue for a phenomenological account of being. The authors pull from a variety of sources to make their claims, but rest primarily on Heidegger's Being and Time and the work of biologist Humberto Maturana. One of the important implications is the notion of a horizon, background, or pre-understanding, making it impossible to completely escape our own prejudices or interpretations. Much of our existence is ready-to-hand, operating beneath the level of recognition and subject-object distinction, and this cannot, in its entirety, be brought into conscious apprehension (i.e. made present-at-hand). AI programs at the time, however, were largely representational. The program's "background" is merely the encoding of the programmer's apprehension and assumptions of the program's domain. While this approach can certainly create useful programs, they exhibit the decontextualized, desituated character commonly attributed to computer interaction and are a far cry from human intelligence.

The authors further delve into the issue of language, arguing that "...the essence of language as a human activity lies not in its ability to reflect the world, but in its characteristic of creating commitment. When we say a person understands something, we imply that he or she has entered into the commitment implied by that understanding." Thus, the authors argue that computers, by their very nature, are incapable of commitment and therefore prevented from entering into language on the same terms as humans.

The authors' conclusion? Move from AI to HCI. There is an error in assuming that success will follow the path of artificial intelligence. The key to design lies in understanding the readiness-to-hand of the tools being built, and in anticipating the breakdowns that will occur in their use. A system that provides a limited imitation of human facilities will intrude with apparently irregular and incomprehensible breakdowns. On the other hand, we can create tools that are designed to make the maximal use of human perception and understanding without projecting human capacities onto the computer.

Other thoughts and notes are in the extended entry.

The design section at the end of the book discusses the Coordinator system, which explicitly represents different speech acts as a way of attempting better coordination of organizational communication, in particular supporting the formation and evaluation of commitments. I'm not familiar with the literature on this system, but colleagues of mine have referred to it as a known failure of early CSCW (computer-supported cooperative work). The explicit encoding of otherwise "ready-to-hand" communication seems potentially dangerous and limiting of social nuance. For example, if a commitment is encoded formally, how much room for ambiguity (or delaying, or weaseling, or whatever) is left without making it present-at-hand? It is similar to one of the projects discussed in my friend Scott's thesis, in which, by trying to leverage a theory of human behavior (in this case Goffman's notion of different fronts or faces), he formally encoded what people practice unconsciously with high degrees of nuance, thus creating a disconnect between actual human behavior and the well-intentioned mechanisms of the interface.

How would more recent AI developments be treated through the lens of this book? Modern statistical techniques can incorporate probabilistic logic and learning from example data, but still revolve around the statistical model (e.g. specific graphical models) and training techniques (e.g. the EM algorithm) used. These are still representational (primarily in the choice of statistical model), but less strictly so. How far can we extrapolate this, loosening the representation?

Do we have any of our own 'hard-coded' models (e.g. Chomskian grammar)? Where do our own representational structures lie on the spectrum of nature (genetics, evolution) and nurture (socially learned and negotiated meaning)?

The question here is at the heart of modern cognitive neuroscience - at what representational level, if any, can we understand human functioning, cognition, and experience (at varying levels of consciousness)? Physics? Chemistry? Neuronal interaction? At what level should we look for the organization (or perhaps better stated, embodiment) of a structure-determined, autopoietic system that allows for experience, intelligence and a background to arise? In short, where and how do science and phenomenology dovetail?

In the meantime, it is argued that the design of computer programs should steer clear of these pretensions. The lesson from above teaches us that even as we understand mechanisms of thought, language, experience, etc, the way we naturally perceive and act in the world is not experienced or conceptualized in the terms of these mechanisms.

The big challenge left for us after reading this book: How do we determine the readiness-to-hand of the tools being built (or the desired 'invisibility' of ubiquitous computing environments)? How do we design for it, how do we measure it, evaluate it, and value it? Furthermore, how do we look beyond just 'tools'? How do we build things that appropriately shift between ready-to-hand and present-at-hand, and that are designed to evoke emotional as well as rational responses? (e.g. a nuclear missile launch control interface should be anything BUT ready-to-hand, requiring conscious deliberation). We've had almost 20 years of HCI research since this book was published, with numerous successes in various (often constrained) domains, but these are still the core theoretical and methodological motivations pushing us forward.

--NOTES--

Heideggerian Philosophy
- Our implicit beliefs and assumptions cannot all be made explicit
- Practical understanding is more fundamental than theoretical understanding
- We do not relate to things primarily by having representations of them
- Meaning is fundamentally social and cannot be reduced to the meaning-giving activity of individual subjects.

Ready-to-hand: the world in which we are always acting unreflectively. The ready-to-hand is taken as part of the background, taken for granted without explicit recognition or identification.

Present-at-hand: the world in which we are consciously reflective, identifying, labeling, and recognizing artifacts and ideas as such.

Breakdown: the event of the ready-to-hand becoming present-at-hand

Thrownness: the condition of understanding in which our actions find some resonance or effectiveness in the world

Properties of thrownness
- You cannot avoid acting
- You cannot step back and reflect on your actions
- The effects of actions cannot be predicted
- You do not have a stable representation of the situation
- Every representation is an interpretation
- Language is action

------------------------------------------

The Biology of Cognition: Humberto Maturana

p.43
The structure of the organism at any moment determines a domain of perturbations--a space of possible effects the medium could have on the sequence of structural states that it could follow.

Autopoiesis. An autopoietic system is defined as: "...a network of processes of production (transformation and destruction) of components that produces the components that: (i) through their interactions and transformations continuously regenerate the network of processes (relations) that produced them; and (ii) constitute it (the machine) as a concrete unity in the space in which they (the components) exist by specifying the topological domain of its realization as such a network..." -Maturana and Varela, Autopoiesis and Cognition (1980), p.79

A plastic, structure-determined system (i.e., one whose structure can change over time while its identity remains) that is autopoietic will by necessity evolve in such a way that its activities are properly coupled to its medium.

Structural coupling is the basis not only for changes in an individual during its lifetime (learning) but also for changes carried through reproduction (evolution). In fact, all structural change can be viewed as ontogenetic (occurring in the life of an individual). A genetic mutation is a structural change to the parent which has no direct effect on its state of autopoiesis until it plays a role in the development of an offspring.

A cognitive explanation is one that deals with the relevance of action to the maintenance of autopoiesis. It operates in a phenomenal domain (domain of phenomena) that is distinct from the domain of mechanistic structure-determined behavior.

For Maturana the cognitive domain is not simply a different (mental) level for providing a mechanistic description of the functioning of an organism. It is a domain for characterizing effective action through time. It is essentially temporal and historical.

The sources of perturbation for an organism include other organisms of the same and different kinds. In the interaction between them, each organism undergoes a process of structural coupling due to the perturbations generated by the others. This mutual process can lead to interlocked patterns of behavior that form a consensual domain.

---------------------------------------------------

Speech Acts

Five categories of illocutionary point:
- Assertives
- Directives
- Commissives
- Expressives
- Declarations
Each can be applied with varying illocutionary force.

------

The failures of AI

p.123
[A] program's claim to understanding is based on the fact that the linguistic and experiential domains the programmer is trying to represent are complex and call for a broad range of human understanding. As with the other examples however, the program actually operates within a narrow micro-world that reflects the blindness of that representation.

...the essence of language as a human activity lies not in its ability to reflect the world, but in its characteristic of creating commitment. When we say a person understands something, we imply that he or she has entered into the commitment implied by that understanding. But how can a computer enter into a commitment?


p.133
In order to produce a set of rules for [an] ... 'expert' system, it is first necessary to pre-select the relevant factors and thereby cut out the role of the background. But as we have been arguing throughout this book, this process by its very nature creates blindness. There is always a limit set by what has been made explicit, and always the potential of breakdowns that call for moving beyond this limit.

p.137
There is an error in assuming that success will follow the path of artificial intelligence. The key to design lies in understanding the readiness-to-hand of the tools being built, and in anticipating the breakdowns that will occur in their use. A system that provides a limited imitation of human facilities will intrude with apparently irregular and incomprehensible breakdowns. On the other hand, we can create tools that are designed to make the maximal use of human perception and understanding without projecting human capacities onto the computer.

Posted by jheer at 10:15 AM | Comments (0)

October 27, 2003

book: philosophy of punk

I just finished reading an interesting little book from self-proclaimed radical publisher AK Press: The Philosophy of Punk by Craig O'Hara.

First things first, let's just be clear that when we use the word philosophy here, we're not talking Kant (thank god), and we're not talking Wittgenstein... but we are talking about a possibly fascinating look at a largely misunderstood sub-culture... one with often conflicting views from its own members. Unfortunately this book is not quite there.

In true punk spirit, however, O'Hara's book is a DIY (do-it-yourself) effort from the ground up. Fanzines (e.g., Maximum Rock n' Roll, Profane Existence, ...) and album liner notes form the primary sources of the book, which chronicles punk viewpoints on media misrepresentation, zines, anarchism, gender, sexuality, and environmentalism. For the most part little new is gleaned, though the book does a nice job of taking various snapshots of the (primarily early '90s) punk world. Skinheads (even the steadfastly anti-racist breed) and straight-edgers, however, are given particularly scathing treatment, as the author characteristically sways between a pseudo-objective tone and unrestrained vitriolic opinion. The same style, if you ask me, that so often characterizes punk.

I did appreciate the book's chapter on anarchism, as it was one of the few sections where I encountered some new perspectives, and set me on a path to discover some interesting readings such as this one. I also discovered that true punks, according to O'Hara's view, are utopians: "anarchy does not just mean no laws, it means no need for laws."

What really struck me, though, was how deeply the rhetoric of rebellion is woven into punk philosophy as presented. It seems that most punk causes can be formulated to always begin with the prefix "anti-". In so doing, punk runs the risk of only ever being a counter-culture, defined largely by resistance and therefore existing as a reactive movement, its identity dependent on the larger culture it lashes back against. As such, punk is limited, willingly or unwillingly, to merely modifying the culture it would like to see obliterated. This observation is an overgeneralization, of course: punk acts continue to promote more egalitarian financial models (e.g., the wonderful folks from Fugazi), and other trends in the sub-culture, particularly gender and environmental issues, tend to promote a more proactive outlook. If punk truly still exists in this day and age (it always seems to be pronounced dead or dying), it will be interesting to see how it further evolves.

In the end, I wouldn't recommend going out of your way to get ahold of this book. But if you're either interested in or completely ignorant of punk and, like me, find this book sitting on a friend's bookshelf, pick it up and give it a read. At the very least, it will get you thinking. Or, if you want to expose yourself to one of the more beautiful (and for my young teenage self, life-changing) expressions of punk philosophy, buy this album and learn all the lyrics by heart.

Posted by jheer at 10:21 PM | Comments (2)

September 09, 2003

paper: interface metaphors

Interface Metaphors and User Interface Design
John M. Carroll, Robert L. Mack, Wendy A. Kellogg
Handbook of Human-Computer Interaction, 1988

This paper examines the use of metaphor as a device for framing and understanding user interface designs. It reviews operational, structural, and pragmatic views on metaphor and proposes a metaphor design methodology. In short the operational approach concerns the measurable behavioral effects of applying metaphor; structural analyses attempt to define, formalize, and abstract metaphors; and the pragmatic approach views metaphors in context – including the goals motivating metaphor use and the affective effects of metaphor. The proposed design methodology consists of 4 phases: identifying candidate metaphors, elaborating source and target domain matches, identifying metaphorical mismatches, and finally designing fixes for these mismatches. Strangely, this paper makes absolutely no mention whatsoever of George Lakoff’s influential work on conceptual metaphor, which I’m almost certain had been published prior to this article. My outlined notes follow below.

  • Introduction
    • Design interface actions, procedures and concepts to exploit specific prior knowledge that users have of other domains
    • Metaphors as alternative to reducing the absolute complexity
    • Metaphors, by definition, must provide imperfect mappings to their target domains
      • (otherwise, it would be the item it mapped to)
    • Inevitable mismatches are a source of new complexities for users
    • Metaphors often apply unevenly within a software domain
    • Composite metaphors are common
      • Desktop metaphor x Direct manipulation
    • Learning by analogy, one of the most basic approaches to learning
  • Operational Approaches to Metaphor
    • Focus on demonstrating measurable behavioral effects from employing metaphor
    • Raise questions of precisely how metaphor operates in the mind
    • Offers no principles that predict “good” and “bad” metaphors extensibly
    • Offers no principled definition of what a metaphor actually is
  • Structural Approaches to Metaphor
    • Develop representational descriptions of metaphors
      • … as relations among primitives in the source and the target domain
    • Douglas and Moran - 1983
      • Structural analysis of typewriter metaphor
      • Domain operators as structural primitives
        • type character -> type character
        • space bar -> insert blank space
      • Problems caused by mismatches – 62/105 collected errors
    • Gentner’s structure mapping theory
      • Holistic mappings between graph theoretic representations
      • Doesn’t consider individual entities piece-meal
      • Rutherford example – planet/sun -> electron/nucleus
      • Attribute predicates generally fail to map
    • Characteristics of structural formulation
      • Base specificity – how well understood is source domain (bounds usefulness)
      • Clarity – precision of node correspondences across mapping (e.g. 1-1 or 1-many)
      • Richness – density of predicates
      • Abstractness – level at which relations comprising the mapping are defined (node mappings vs. predicate mappings)
      • Systematicity – extent mapped relations are mutually constrained by membership in structure of relations
      • Exhaustiveness – directional surjectivity
      • Transparency – how easy to tell which source relations get mapped to target
      • Scope – extensibility of the mapping
    • Relate metaphors to cognitive aspects of use
      • Expressive (literary) metaphors
      • Explanatory (scientific) metaphors
      • Both rated better when clarity is higher, but richness more important to expressive metaphors
    • What is appropriate grain for mapping?
    • What operators, what relations should be defined for a particular metaphorical mapping?
    • Structure-mapping analysis can prove insufficiently objective
    • Misses external consequences of metaphor
      • E.g. interpersonal attraction -> ionic bonding in chemistry may make chemistry more interesting
  • Pragmatic Approaches to Metaphor
    • Focus on use in the context and complexity of real-world situations
    • Emphasize the intentional use of metaphor to an end (i.e. aid users learning less familiar domain)
    • Context
      • Goals associated with metaphor
      • Characteristics beyond similarity basis
        • Incompleteness
        • Involvement in compositions
    • Acts as a filter on structural analyses
    • Suggest how inevitable structural flaws can play a useful role
    • Context of metaphor in use
      • Analysis of metaphors must rest on empirical task analysis of what users actually do
      • Metaphors can be used to draw attention to specific features of target domain
        • Thereby motivate further thought about comparison with source
        • Often primary purpose of literary metaphor
      • Interface metaphors can pose questions and open new possibilities
    • Metaphor mismatches
      • Mismatch and its resolution can elaborate an accurate conceptual understanding of the system
        • As negative exemplars can help to clarify a new concept
    • Composite metaphors
      • Mismatched or incomplete correspondences are sometimes addressed by composite metaphors
      • Useful beyond increasing coverage of a target domain
        • May help generate more and different kinds of inferences about target domain
        • May aid quick convergence of integrated understanding of target domain
  • Toward a Theory of Metaphor
    • Competence theories of metaphor: operational, structural
    • Performance theories of metaphor: pragmatic analyses
    • Suggests need for integrated theory of metaphor
      • 3 phases of metaphorical reasoning
        • instantiation
          • recognition or retrieval of something known (potential source analog)
          • automatic and holistic activation process, analytically incomplete
        • elaboration
          • generation of inferences about how source can be applied
          • pragmatically guided structure mapping, identifying relevant predicates
        • consolidation
          • consolidate elaborated metaphor into a mental model of the target domain
          • integrates partial mappings into a single representation of target domain
      • Integrated understanding of target is not couched in a metaphor, but a mental model
        • “Distinction between models and metaphors one of open-endedness, incompleteness, and inconsistent validity of metaphoric comparisons versus the explicitness and comprehensiveness and validity of the models which the successful learner will ultimately obtain.”
  • Designing with Metaphors
    • Use of metaphors is currently haphazard. Can we systematize it?
    • Structured methodology for interface metaphors
      • Identify candidate metaphor or composite metaphors
        • Sources include
          • Predecessor tools and systems
          • Human propensities
          • Sheer invention
        • Levels for metaphor
          • Tasks – what people do (goals and subgoals)
          • Methods – how tasks are accomplished
          • Appearance – look and feel
      • Detail metaphor/software matches w.r.t. representative user scenarios
        • Match metaphors against the levels listed above
        • Can help to enumerate the objects of the metaphor domain
        • Can begin assessing ‘goodness’
      • Identify likely mismatches and their implications
        • Discrepancy in the software domain must be interpretable
          • Mismatch should be isolable
          • Salient alternative course in the target domain should be available
      • Identify design strategies to help users manage mismatches
        • Creating interface designs that encourage and support exploration is a key approach
        • Users must be able to “recover” from metaphor mismatches
        • Progressive disclosure of advanced functions
        • Iterative design
        • Composite metaphors can be a design solution for resolving mismatches
        • Help system can anticipate mismatches as well
  • Conclusion
    • Metaphors draw correspondences, and have motivational and affective consequences for users
    • They interact with and frame users’ problem-solving efforts in learning the target domain
    • Ultimate problem is for user to develop a mental model of the target domain itself.
    • There is no predictive theory of metaphor – they must be designed on a case by case basis, and carefully analyzed and evaluated
    • This is endemic not just to interface metaphors, but to user interface design itself
Posted by jheer at 03:53 PM | Comments (1)

September 08, 2003

paper: hci and disabilities

Human Computer Interfaces for People with Disabilities
Alan F. Newell and Peter Gregor
Handbook of Human-Computer Interaction, 1988

Human computer interface engineers should seriously consider the problems posed by people with disabilities, as this will lead to a more widespread understanding of the true nature and scope of human computer interface engineering in general.

  • Why HCI and disabilities?
    • While HCI keynotes, workshops, and tutorials acknowledge the need for a focus on disabled users, little is found in the scientific focus of HCI.
    • Statistics (commonly accepted figures in the “developed world”)
      • 1/10 have sig. hearing impairment, 1/125 are deaf
      • 1/100 have visual disabilities, 1/475 legally blind, 1/2000 totally blind
      • 1/250 are wheelchair users
      • 6 million mentally retarded people in the US, 2 million are in institutions
      • Estimated 20% of population has difficulty performing one or more basic physical activities
    • Americans with Disabilities Act of 1992
      • Title One: Employers’ responsibility to accommodate disabilities of employees and applicants. Illegal to discriminate when > 24 workers.
      • Title Two: Government facilities and services be accessible to the disabled.
  • Why do HCI engineers consider the disabled?
    • HCI Engineering is
      • Of high theoretical and practical value
      • "high tech" and leading edge research
      • important and academically respectable discipline
    • Designing systems for the disabled, however, is seen as
      • Having little or no intellectual challenge
      • Charitable rather than professional
      • At most, of fringe interest to researchers
      • Requiring individualized designs
      • Involving small unprofitable markets
      • Needing simple, low cost solutions
      • Dominated by home-made systems
    • Rarely is motivation the same as for joining mainstream science
    • Dangers: lack of quality control and commitment, disappointed users, deleterious effect on the commercial sector.
    • Reality: Designing for the disabled is
      • Intellectually challenging, with greater scope for inventiveness
      • Achievements can be much greater and obviously worthwhile
      • The market for such innovation is not small
  • HCI in danger of ignoring a large market segment, but also of missing designs more widely useful than inventors originally intended.
    • E.g. curb-cuts, cassette tape recorders, remote controls, ballpoint pen
  • Who and what are people with disabilities?
    • Binary division of society (abled and disabled) is deeply flawed
    • Many designers/developers do not understand the narrowness of their vision of the human race
    • High-dimensional model of human ability
      • Physical, perceptual, mental abilities
      • Want to maximize the hyper-volume in this space of useful interfaces
      • People MOVE about in this space, abilities are not static
    • Contexts of use also often ignored (not JUST in an office environment)
    • Environment can induce "disabilities" in people as well
      • Assumption that designers are designing for fit human beings
      • Good design ought to be robust to changes in environment
    • Addressing the problems of extremes can provide impetus for better designs overall
      • Designing for disabled can improve performance for the abled in high-stress or extra-ordinary environments
      • E.g. flight deck of aircraft, air traffic control
    • Promising avenues of research
      • Predictive technologies
      • Multi-modal interaction
  • HCI is missing out...
    • Increased market share
    • Demographic trends (think baby-boomers)
    • ** Extra-ordinary needs are only exaggerated ordinary needs
    • Most people have a mix of such needs
    • Temporary disabilities are common
    • ** Environmental conditions can handicap users
    • ** Deeper problem of increasing communication bandwidth
    • Greater inventiveness, innovation
    • Improved use of truly user-centric design methodologies
Posted by jheer at 12:02 PM | Comments (0)

September 06, 2003

paper: contextual inquiry

Contextual Design: Contextual Inquiry (Chapter 3)
Hugh Beyer and Karen Holtzblatt
Contextual Design: Defining Customer Centered Systems, 1998

This article discusses in depth the contextual inquiry phase of the contextual design methodology. Contextual inquiry emphasizes interacting directly with workers at their place of work within the constructs of a master/apprentice relationship model in order for designers to gain a real insight into the needs and work practices of their users.

  • Contextual Inquiry
    • Design processes work when they build on natural human behavior
    • Use existing relationship models to interact with the customer
  • Master / Apprentice Model
    • When you're watching the work happen, learning is easy
    • Seeing the work reveals what matters
    • Seeing the work reveals details
    • Seeing the work reveals structure
    • Every current activity recalls past instances
    • Contextual Inquiry is apprenticeship compressed in time
  • Four Principles of Contextual Inquiry
    • Contextual Inquiry tailors apprenticeship to the needs of design teams
    • Context
      • Go where the work is to get the best data
      • Avoid summary data by watching the work unfold
      • Avoid abstractions by returning to real artifacts and events
      • Span time by replaying past events in detail
      • Keep the customer concrete by exploring ongoing work
    • Partnership
      • Help customers articulate their work experience
      • Alternate between watching and probing
      • Teach the customer how to see the work by probing work structure
      • Find the work issues behind design ideas
      • Let the customer shape your understanding of the work
      • Avoid other relationship models
        • Interviewer/Interviewee
          • You aren't there to get a list of questions answered
        • Expert/Novice
          • You aren't there to answer questions either
        • Guest/Host
          • It's a goal to be nosy
        • Partnership creates a sense of shared quest
    • Interpretation
      • Determine what customer words and actions mean together
      • Design ideas are the product of a chain of reasoning
      • Design is built upon interpretation of facts - so the interpretation better be right
      • Sharing interpretations with customers won't bias the data
      • Sharing interpretations teaches customers to see structure in the work
      • Customers fine-tune interpretations
      • Nonverbal cues confirm interpretations
    • Focus
      • Clear focus steers the conversation
      • Focus reveals detail
      • Focus conceals the unexpected
      • Internal feelings guide how to interview
      • Commit to challenging your assumptions, not validating them
  • Contextual Interview Structure
    • Conventional interview
      • Get to know customers and their issues
    • Transition
      • Explain the new rules of a contextual interview
    • Contextual interview proper
      • Observe and probe ongoing work
    • Wrap-up
      • Feedback a comprehensive interpretation
    • Context, Partnership, Interpretation, and Focus!
Posted by jheer at 01:32 PM | Comments (2)

paper: contextual design

Contextual Design: Introduction (Chapter 1)
Hugh Beyer and Karen Holtzblatt
Contextual Design: Defining Customer Centered Systems, 1998

This book chapter introduces the difficulties of customer centered design in organizations, and proposes the methodology of Contextual Design as a set of processes for overcoming these difficulties and achieving successful designs that benefit both the customer and the business.

  • Introduction
    • The challenge of system design is to fit into the fabric of everyday life
    • Contextual Design is a backbone for organizing a customer-centered design process
  • Challenges for Design
    • Collect and manage complex customer data w/o losing detail
    • Design a response that is good for business and customers
    • Foster agreement and cooperation between stakeholders
    • Make the process practical given time constraints
  • Challenge of Fitting into Everyday Life
    • Support the way users want to work
    • Don’t increase work and frustration with automation
  • Creating an Optimal Match to the Work
    • Innovate through step-by-step introduction of new work practice
  • Keeping in Touch with the Customer
    • Organizational growth isolates developers from customers
    • Sitting with the users makes cross-departmental projects hard
  • Challenge of Design in Organizations
    • Breaking up work across groups creates communication problems
    • Different organizational functions focus on different parts of a coherent process
    • Every function needs customer data, but it has to be the right kind of data
    • Data showing what is wrong is frustrating if builders can’t fix it
    • Cross-functional design teams create a shared perspective
  • Teamwork in the Physical Environment
    • Organizations have no real spaces for continuing team work
  • Managing Face-to-Face Design
    • Face-to-face work depends on managing the interpersonal
    • Disagreements can lead to an incoherent design
    • Customer-centered design keeps user work coherent by creating a well-working team
  • Challenge of Design from Data
    • Learn how to see the implications of customer data
    • Recognize that designing from customer data is a new skill
    • Don’t expect to find requirements littering the landscape at the customer site
  • Complexity of Work
    • The complexity of work is overwhelming, so people oversimplify
  • Maintaining a coherent response
    • A systemic response --- not a list of features --- keeps user work coherent.
    • Diagrams of work and the system help a team think systemically
  • Contextual Design
    • Contextual Design externalizes good design practice for a team
    • Contextual Inquiry
      • Talk to the customers while they work
    • Work modeling
      • Represent people’s work in diagrams
    • Consolidation
      • Pull individual diagrams together to see the work of all customers
    • Work redesign
      • Create a corporate response to the customers’ issues
    • User Environment Design
      • Structure the system work model to fit the work
    • Mock-up and test with customers
      • Test your ideas with users through paper prototypes
    • Putting into practice
      • Tailor Contextual Design to your organization
Posted by jheer at 01:27 PM | Comments (0)

paper: rapid ethnography

Rapid Ethnography: Time-Deepening Strategies for HCI Field Research
David R. Millen
Proceedings of DIS'00

HCI has come to highly regard ethnographic research as a useful and powerful methodology for understanding the needs and work practices of a user population. However, full ethnographies are also notoriously time and resource heavy, making them hard to fit into a deadline-driven development cycle. This paper presents techniques for rapid, targeted ethnographic work, in the hopes of accruing much of the benefit of field work while still fitting within acceptable time bounds.

The paper organizes its suggestions around three core themes:

  • Narrow the field of focus before entering the field
    • GOAL: Zoom in on important activities
      • Move from 'wide-angle lens' metaphor to 'telephoto lens'
    • Identify an informant to use as a field guide
      • People with broad access
      • Can discuss interesting behaviors and social tensions up front
    • Find liminal informants
      • Fringe members of groups
      • Usually have free movement about group
      • Not so engrained that work patterns and relationships have become taken for granted
    • Find corporate informants
      • Employees of the organization in question
        • Colleagues, field staff such as service reps
    • Use fringe sampling methods to identify potential informants of interest
    • Plan at onset to develop long-term informant relationships
  • Use multiple interactive observation techniques
    • GOAL: Increase likelihood of exceptional and useful behavior
    • More than one observer onsite.
      • Parallelize observations
      • Get multiple perspectives
      • Careful to avoid changing work environment by additional presence
    • Identify opportunistic times of observation
      • Maximize learning rate
      • E.g. look at electronic logs to determine times of peak activity
    • More interactive approaches
      • Interactive feature conceptualization
      • Group Elicitation Method
    • Observer-participation
      • Understanding through personal experience
      • Understand affective issues around field activities
  • Use collaborative and computerized iterative data analysis methods
    • GOAL: More efficiently analyze collected data
    • Use computer assisted analysis tools
      • Text (FolioViews, AskSAM)
      • Video, Audio coding and playback software
    • Collaborative analysis techniques
      • Cognitive (concept) mapping
      • Pictorial storytelling
      • Scenario analysis
Posted by jheer at 01:23 PM | Comments (0)

paper: 2D fitts' law

Extending Fitts’ Law to Two-Dimensional Tasks
I. Scott MacKenzie and William Buxton
Proceedings of CHI’92

This paper extends the famous Fitts’ Law for predicting human movement times to work accurately in two-dimensional scenarios, in particular rectangular targets. The main finding of the paper is that two models, one which measures target width by projecting it along the vector of approach and another which uses the minimum of the width and height, achieved equal statistical fits and showed a significant benefit over models which used (width+height), (width*height), and (width-as-horizontal-distance-only).

For those who don’t know, Fitts’ Law is an empirically validated law that describes the time it takes for a person to perform a physical movement, parameterized by the distance to the target and the size of the target. Its formula is one-dimensional: it only considers movement along a straight line between the start and the target. The preferred formulation of the law is the Shannon formulation, so named because it mimics the underlying theorem from Information Theory --

MT = a + b log_2 (A/W + 1)

Where MT is the movement time, A is the target distance or amplitude, W is the target width, and a and b are constants empirically determined by linear regression. The log term is known as the Index of Difficulty (ID) of the task at hand and is in units of bits (note the typo in the paper).

The Shannon formulation is preferred for a number of reasons

  • Provides the best statistical fit
  • Mimics the underlying information theory
  • Always provides a positive value for the ID

This paper then considers two-dimensional cases. Clearly you can cast the movement along a one-dimensional line between the start and the center of the target, and the amplitude is the Euclidean distance between these points. But what to use as the width term? Historically, the horizontal width was just used, but this seems like an unintuitive choice in a number of situations, particularly when approaching the target from directly above or below. This paper studies five possibilities: using the minimum of the width and height (“smaller-of”), using the projected width along the angle of approach (“w-prime”), using the sum of the dimensions (“w+h”), using the product of the dimensions (“w*h”), and using the historical horizontal width (“status quo”).

The study varied amplitude and target dimensions crossed with 3 approach angles (0, 45, and 90 degrees). Twelve subjects were used, who performed 1170 trials each over four days of experiments. The results found the following ordering among the models in terms of model fit: smaller-of > w-prime > w+h > w*h > status quo. Notably, the smaller-of and w-prime cases were quite close – their differences were not statistically significant.

The w-prime case is theoretically attractive, as it cleanly retains the one-dimensionality of the model. The smaller-of model is attractive in practice as it doesn’t depend on the angle of approach, and so requires one fewer parameter than w-prime. The w-prime model, however, doesn’t require that the targets be rectangular, as the smaller-of model assumes. Finally, it should be noted that these results may be slightly inaccurate in the case of big targets, as the target point is assumed to be in the center of the target object. In many cases users may click on the edge, decreasing the amplitude.
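For concreteness, here is a minimal sketch (my own illustration, not code from the paper) of how the smaller-of and w-prime indices of difficulty might be computed for a rectangular target under the Shannon formulation; the regression constants and the through-the-center approach geometry are assumptions.

```python
import math

def id_smaller_of(distance, width, height):
    """Index of difficulty (bits) using the smaller of the target's width and height."""
    return math.log2(distance / min(width, height) + 1)

def id_w_prime(distance, width, height, approach_angle):
    """Index of difficulty (bits) using the target extent projected along the angle
    of approach (radians); assumes an axis-aligned rectangle approached through its center."""
    c, s = abs(math.cos(approach_angle)), abs(math.sin(approach_angle))
    w_prime = min(width / c if c > 1e-9 else float("inf"),
                  height / s if s > 1e-9 else float("inf"))
    return math.log2(distance / w_prime + 1)

def movement_time(a, b, index_of_difficulty):
    """Shannon formulation: MT = a + b * ID."""
    return a + b * index_of_difficulty

# A 2 x 1 cm target approached from 45 degrees at a distance of 10 cm,
# with made-up regression constants a = 0.05 s and b = 0.1 s/bit.
print(movement_time(0.05, 0.1, id_smaller_of(10, 2, 1)))                 # smaller-of
print(movement_time(0.05, 0.1, id_w_prime(10, 2, 1, math.radians(45))))  # w-prime
```

Note how at a 45-degree approach the two indices differ, which is exactly the regime the experiment probed.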

Posted by jheer at 01:18 PM | Comments (0)

September 05, 2003

paper: other ways to program

Drawing on Napkins, Video-Game Animation, and Other Ways to Program Computers
Ken Kahn
Communications of the ACM, Vol. 39, No. 6, 1996

This article describes a number of visual, interactive methods of programming. The main thesis is that visual programming environments have failed to date because they are not radical enough. Programs exhibit dynamic behavior that static visuals do not always convey appropriately, and so dynamic visuals, or animation, should be applied. Furthermore, visual programming can avoid explicit abstraction (i.e. when visuals become just another stand in for symbols in a formal system) without necessarily sacrificing power and expressiveness. Put more abstractly, a programming language can be designed to use one of many possible syntactical structures. It then becomes the goal of the visual programming developer to find the appropriate syntax that can be mapped to the desired language semantics. To map an existing computer language (e.g., C or LISP) into a visual form would require the use of a visual syntax isomorphic to the underlying language. Doing so in a useful, intuitive, and learnable manner proves quite difficult.

Kahn describes a number of previous end-user programming systems. This includes AlgoBlocks, which allow users to chain together physical blocks representing some stage of computation. The blocks support parameterizations on them, and afford collaborative programming. Another system is Pictorial Janus, which uses visual topological properties such as containment, touching, and connection to (quite abstractly, in my view) depict program performance.

He goes on to describe a (quite imaginative) virtual programming "world", ToonTalk, which can be used to construct rich, multi-threaded applications using a video-game interaction style. The ToonTalk world maps houses to threads or processes, and the robots that can inhabit houses are the equivalent of methods. Method bodies are determined by actually showing the robot what to do. Data such as numbers and text are represented as number pads, text pads, or pictures that can be moved about, put into boxes (arrays or tuples), or operated upon with "tools" such as mathematical functions. Communication is represented using the metaphor of birds -- hand a bird a box, and they will take it to their nest at the other house, making it available for the robots of that abode to work with the data.

Kahn argues that while such an environment may be slower to use for the adept programmer, it is faster to learn, and usable even by young children. It also may be more amenable to disabled individuals. Furthermore, its interactive animated nature (you can see your program "playing out" in the ToonTalk world) aids error discovery and debugging. In conclusion, Kahn suggests that these techniques and others (e.g. speech interfaces) could be integrated into the current programming paradigm to create a richer, multimodal experience that plays off different media for constructing the appropriate aspects of software.

Inspiring, yes, but quite difficult to achieve. My biggest question of the moment is: what happened to Ken Kahn? The article footer says he used to work at PARC until 92, and then focused on developing ToonTalk full-time. I'll have to look him up on the Internet to discover how much more progress he made. While I'm skeptical of these techniques being perfected and adopted in production-level software engineering in the near future, I won't be surprised if they experience a renaissance in ubiquitous computing environments, in which everyday users attempt to configure and instruct their "smart" environs. If nothing else, VCRs could learn a thing or two...

Posted by jheer at 12:01 AM | Comments (0)

paper: prog'ing by example

Tonight I read a block of papers on end-user programming, aka Programming by Example (PBE), aka Programming by Demonstration (PBD). Very fun stuff, and definitely got me thinking about the kind of toys I would want any future children of mine to be playing with.

Eager: Programming Repetitive Tasks By Example
Allen Cypher
Proceedings of CHI'91

This paper introduces Cypher's Eager, a programming by example system designed for automating routine tasks in the HyperCard environment. It works by monitoring users' actions in HyperCard and searching for repetitive tasks. When one is discovered, it presents an icon and begins highlighting what it expects the user's next action to be - an interaction technique Cypher dubs "anticipation". This allows the user to interactively - and non-abstractly - understand the model the system is building of user action. When the user is confident that Eager understands the task being performed, the user can click on the Eager icon and let it automate the rest of the iteration. For example, it can recognize the action of copying and pasting each name in a rolodex application into a new file, and completely automate the task.

Eager was written in LISP, and communicated to HyperCard over interprocess communication. When a recognized pattern is executed, Eager actually constructed the corresponding HyperCard program (in the language HyperTalk) and passed it back to the HyperCard environment for execution.

There are a couple of crucial things that make Eager successful. One is that Eager tries only to perform simple repetitive tasks... there are no conditionals, no advanced control structures. This simplifies both the generalization problem and the presentation of system state to the user. Second, Eager uses higher-level domain knowledge. Instead of low-level mouse data, Eager gets semantically useful data from the HyperCard environment, and furthermore has domain knowledge about HyperCard, allowing it to better match usage patterns. Finally, Eager has the appropriate pattern matching routines programmed in, including numbering and iteration conventions, days of the week, as well as non-strict matching requirements for lower-level events, allowing it to recognize higher-level patterns (ones with multiple avenues of accomplishment) more robustly.
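As a rough, hypothetical illustration of the kind of inference involved (Eager itself works over HyperCard events and richer conventions like dates and numbering), the sketch below spots a constant-step numeric pattern in a sequence of recorded actions and anticipates the next one.

```python
def predict_next(actions):
    """Toy stand-in for Eager's pattern matcher: given recorded actions of the
    form (verb, numeric argument), guess the next action by looking for a
    constant step between successive arguments. Returns None if no pattern is found."""
    if len(actions) < 2:
        return None
    if len({verb for verb, _ in actions}) != 1:   # all actions must share one verb
        return None
    args = [arg for _, arg in actions]
    steps = {b - a for a, b in zip(args, args[1:])}
    if len(steps) != 1:                           # arguments must change by a constant step
        return None
    return (actions[-1][0], args[-1] + steps.pop())

# After watching the user copy cards 1, 2, and 3, anticipate card 4:
print(predict_next([("copy-card", 1), ("copy-card", 2), ("copy-card", 3)]))  # ('copy-card', 4)
```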

The downside, as I see it, however, is that for such a scheme to generalize across applications you either have to (a) reprogram it for every application or (b) have designers equip each program not only with the ability to report high-level events in a standardized fashion, but to communicate application semantics to the pattern matcher. Introducing more advanced applications with richer control structures muddies this further. That being said, such a feature could be invaluable in integrated, high-use applications such as Office or popular development environments. Integrating such a system into the help, tutorial, and mediation features already existing in such systems could be very useful indeed.

Posted by jheer at 12:00 AM | Comments (0)

September 04, 2003

paper: charting ubicomp research

Charting Past, Present, and Future Research in Ubiquitous Computing
Gregory D. Abowd and Elizabeth D. Mynatt
ACM TOCHI, Vol. 7, No. 1, March 2000

This paper reviews ubiquitous computing research and suggests future directions. The authors present four dimensions of scale for characterizing ubicomp systems: device (inch, foot, yard), space (distribution of computation in physical space), people (critical mass acceptance), and time (availability of interaction). Historical work is presented and categorized under three interaction themes: natural interfaces (e.g. speech, handwriting, tangible UIs, vision), context-aware applications (e.g. implicit input of location, identity, activity), and automated capture and access (e.g. video, event logging, annotation). The authors then suggest a fourth, encompassing research theme of everyday computing, characterized by diffuse computational support of informal, everyday activities. This theme suggests a number of new pressing problems for research: continuously present computer interfaces, information presentation at varying levels of the periphery of human attention, bridging events between physical and virtual worlds, and modifying traditional HCI methods for informal, peripheral, and opportunistic behavior. Additional issues include how to evaluate ubicomp systems (for which the authors suggest CSCW-inspired, real-world deployment and long-term observation of use) and how to cope with the various social implications, both due to privacy and security and to behavior adaptation.

In addition to the useful synopsis and categorization of past work, I thought the real contribution of this paper was the numerous suggestions for future research, many of which are quite important and inspiring. I was also very happy to see that many of the lessons of CSCW, which are particularly relevant to ubicomp, were influencing the perspective of the authors.

However, on the critical side a couple of things struck me. One is that many of the suggestions for research lack some notion of how "deep" the research problem runs. For example, the research problems in capture and access basically summarize both the meta-data and retrieval problems, long-standing fundamental issues in the multimedia community. However, this depth and extent of the research issue, or how we might skirt the fundamental issues by domain-specificity, is not mentioned. Another issue I had was that I felt the everyday computing scenario might have used some fleshing out. I wanted the authors to provide me with the compelling scenario they say such research mandates. Examples were provided, so perhaps I am being overly critical, but I wanted a more concrete exposition, perhaps along the lines of Weiser's Sal scenario.

See the extended entry for a more thorough summary

  • Introduction
    • Ubicomp as proliferation of computing into the physical world
    • Historically three primary interaction themes:
      • Natural Interfaces
      • Context-Aware applications
      • Automated capture and access
    • Everyday computing: a new interaction theme:
      • Focused on scaling interaction with respect to time
        • Addressing interruption and resumption of interaction
        • Representing passages of time
        • Providing associative storage model
      • Informal and unstructured activities
    • All themes have difficult issues w.r.t. social implications
      • Privacy, Security, Visibility, Control
      • Social phenomena (behavior modification)
  • Evolutionary Path
    • Devices: PARCTab + Liveboard
    • Input: Unistroke
    • Infrastructure: Active Badge
    • Applications: Tivoli, Wearables as memory assistant + implicit info sharing
  • Different dimensions of scale
    • Device – the physical scale of the device (palm, laptop, whiteboard)
    • Space – distribution of computation into physical space
    • People – reaching critical mass acceptance
    • Time – availability of interaction
  • Proliferation of devices of varying scale has indeed occurred
  • Current ubicomp success has been in physical mobility (but NOT physical awareness!), suggesting increased focus on issues of time – continuous interaction.
  • Applications research as the ultimate purpose for ubicomp research
  • Natural Input
    • Examples
      • Speech UIs
      • Pen Input
      • Computer Vision
      • Tangible Interfaces
      • Multimodal Interfaces
    • Requirements for rapid development
      • First-Class Natural Data Types + Operations
      • Handling Error
        • Error reduction
        • Error discovery
        • Reusable infrastructure for error correction
  • Context-Aware Computing
    • Context
      • Information characterizing the physical and social environment
      • Who, What, Where, When, Why
      • Context as Implicit Input
    • Representing Context
    • Context Fusion
      • Merging results of multiple context services
    • Augmented reality
      • Closing the loop b/w context and the world
  • Automated Capture and Access to Live Experiences
    • Augment inefficiency of human record-taking
    • Challenges in Capture and Access
      • Capture
        • Meta-data problem
        • Merging multiple sources (manageability vs. info overload)
      • Access
        • Search and retrieval problem
        • Handling versioning and annotation
      • Privacy management
  • Everyday Computing
    • Supporting informal, daily activities
    • Characterization of Activities
      • They rarely have a clear beginning or end
      • Interruption is expected
      • Multiple activities operate concurrently
      • Time is an important discriminator
      • Associative models of information are needed
    • Synergy Among Themes
      • Natural Input + Context-Awareness + Capture/Access
    • Research Directions
      • Design a continuously present computer interface
        • Information appliance, agents, wearables
      • Presenting information at varying levels of attentional periphery
        • CSCW, Ambient displays
      • Connecting events between physical and virtual worlds
        • Must understand how to combine such information such that the presentation matches user conceptual models
      • Modify traditional HCI methods for informal, peripheral, and opportunistic behavior
  • Additional Challenges
    • Evaluation
      • Finding and Addressing a Human Need
        • Compelling scenario underlying research
        • Feasibility studies – both technical and user-centric
      • Evaluate in the Context of Authentic Use
        • Effective evaluation requires a realistic deployment
        • Long-term study of usage
      • Task-centric evaluation techniques are inappropriate
        • Scenario is of informal, everyday use – not formalized, specific tasks
        • Challenge – how to apply qualitative or quantitative metrics
    • Social Issues
      • Dangers of privacy violation in ubicomp
        • Who can access and modify contents?
        • Security of data and data transmission
        • Users must know what is being sensed and collected
        • User control over recording and (at least) distribution
        • Acceptable policies for erasing or forgetting memory over time
      • Behavior modification
        • How does activity change in face of known computation surveillance?
  • Conclusion
    • The real goal for ubicomp is to provide many single-activity interactions that together promote a unified and continuous interaction between humans and computational services.

Posted by jheer at 02:50 PM | Comments (0)

September 03, 2003

paper: why and when five users aren't enough

Why and When Five Test Users aren’t Enough
Alan Woolrych and Gilbert Cockton
Proceedings of IHM-HCI 2001

This paper argues that Nielsen’s assertion that “Five Users are Enough” to determine 85% of usability problems does not always hold up. In the end, we walk away with the admonition that five users may or may not be enough. Richer statistical models are needed, as well as good frequency and severity data. What does this mean for evaluators? Certainly this shouldn’t dissuade the use of usability evaluations, but it does imply that one should avoid false confidence and keep an eye to user/evaluator variability.

The paper starts by attacking the formula

ProblemsFound(i) = N ( 1 – ( 1 – lambda ) ^ i ),

in particular, the straightforward use of parameter (lambda = .31). Generalizing the formula shows we should actually expect, for n participants, that

ProblemsFound(n) = sum(j=1…N) ( 1 – ( 1 – lambda_j) ^ n ),

Where lambda_j is the probability of discovering usability problem j. Nielsen and Landauer’s formula assumes this probability is equal for all such problems (computed as lambda = the average of such empirically observed probabilities).

However, other studies, such as that by Spool and Schroeder, have found an average lambda of 0.081, showing that a study with ecologically valid tasks (in this case an unconstrained online shopping task with high N) can still miss many usability issues. Thus Nielsen’s claim that five is enough is only true under certain assumptions of problem discovery.
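
To make the difference concrete, here is a minimal sketch (mine, not the paper's) comparing the homogeneous-lambda model against the generalized per-problem formula. The individual discovery probabilities are invented for illustration; their average is roughly 0.3, close to Nielsen's lambda = 0.31, yet the predicted discovery rates diverge sharply.

    # Minimal sketch: homogeneous-lambda model vs. per-problem formula.
    # The lambdas below are made-up illustration values, not empirical data.

    def found_homogeneous(n_users, n_problems, lam):
        """ProblemsFound(i) = N * (1 - (1 - lambda)^i), single average lambda."""
        return n_problems * (1 - (1 - lam) ** n_users)

    def found_heterogeneous(n_users, lambdas):
        """Generalized form: sum over problems j of 1 - (1 - lambda_j)^n."""
        return sum(1 - (1 - lj) ** n_users for lj in lambdas)

    # Hypothetical mix: a few very visible problems, many hard-to-spot ones.
    lambdas = [0.9] * 5 + [0.2] * 5 + [0.05] * 10
    avg = sum(lambdas) / len(lambdas)

    for n in (1, 3, 5, 10):
        print(f"{n:2d} users: homogeneous model predicts "
              f"{found_homogeneous(n, len(lambdas), avg):4.1f} problems found, "
              f"per-problem model predicts {found_heterogeneous(n, lambdas):4.1f} "
              f"of {len(lambdas)}")

With five users the averaged model predicts roughly 84% of problems found, while the per-problem model (under these assumed lambdas) predicts only about half, which is exactly the gap the authors are warning about.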

But other issues abound as well. For instance, Nielsen’s model doesn’t take into account the variance between users, which can strongly affect the number of users needed. Further complications arise when considering severity ratings, as the authors found huge shifts in severity ratings based on different selections of five users. Other problems include the choice of tasks used for the evaluation (changing tasks revealed previously undiscovered usability issues) and difficulties in usability problem extraction, which determines the true value of N.

Posted by jheer at 07:16 PM | Comments (4)

paper: heuristic evaluation

Heuristic Evaluation
Jakob Nielsen
Chapter 2, Usability Inspection Methods, 1994

This paper describes the famous (in HCI circles) technique of Heuristic Evaluation, a discount usability method for evaluating user interface designs. HEs are conducted by having an evaluator walk through the interface, identifying and labeling usability problems with respect to a list of heuristics (listed below). It is usually recommended that multiple passes be made through the interface, so that evaluators can get a larger, contextual view of the interface, and then focus on the nitty-gritty details.

Revised Set of Usability Heuristics

  • Visibility of system status
  • Match between system and the real world
  • User control and freedom
  • Consistency and standards
  • Error prevention
  • Recognition over recall
  • Flexibility and efficiency of use
  • Aesthetic and minimalist design
  • Help users recognize, diagnose, and recover from errors
  • Help and documentation

The evaluators also go through a round of assigning severity ratings to all discovered usability problems, allowing designers to prioritize fixes. The severity is a mixture of frequency, impact, and persistence of an identified problem, and as presented forms a spectrum from 0-4, where 0 = Not a usability problem, 1 = Cosmetic problem only, 2 = Minor problem, 3 = Major problem, 4 = Usability catastrophe. Nielsen performs an analysis to show that inter-evaluator ratings have better-than-random agreement, and so ratings can be aggregated to get reliable estimates of severity.

Heuristic evaluation is cheap and can be done by user interface experts (i.e., evaluations can be performed without bringing in outside users). Best results come from evaluators who are familiar with both usability testing and the application domain of the evaluated interfaces. HE is faster and less costly than typical user studies, with which it can be used in conjunction (i.e., use HE first to filter out problems, then run a real user study to find the remaining deeper-seated issues). Lacking real user input, however, HE runs the risk of missing or misestimating usability infractions.

Nielsen found over multiple studies that the typical evaluator found only 31 percent (lambda = .31) of known usability problems in an interface. Using the model that

ProblemsFound(i) = N ( 1 – ( 1 – lambda ) ^ i ),

where i is the number of evaluators and N is the total number of problems, we arrive at the conclusion that 5 evaluators are enough to find roughly 84% of usability problems. Nielsen also performs a cost-benefit analysis that finds 4 to be the optimal number of evaluators. Read the summary of the Woolrych and Cockton paper for a dissenting opinion.
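
As a rough illustration of that cost-benefit reasoning, here is a small sketch. The dollar figures are assumptions chosen for the example, not Nielsen's actual numbers; with these particular assumptions the optimum happens to land at four evaluators, echoing the shape of his analysis.

    # Rough sketch of the cost-benefit reasoning. The figures below are
    # assumptions for illustration only -- they are NOT Nielsen's numbers.
    LAMBDA = 0.31              # average per-evaluator problem discovery rate
    N_PROBLEMS = 40            # assumed number of problems in the interface
    VALUE_PER_PROBLEM = 1000   # assumed value of finding one problem
    FIXED_COST = 3000          # assumed fixed cost of running the evaluation
    COST_PER_EVALUATOR = 4000  # assumed cost per additional evaluator

    def proportion_found(i):
        """Nielsen/Landauer curve: share of problems found by i evaluators."""
        return 1 - (1 - LAMBDA) ** i

    def net_benefit(i):
        return (N_PROBLEMS * proportion_found(i) * VALUE_PER_PROBLEM
                - FIXED_COST - i * COST_PER_EVALUATOR)

    for i in (1, 3, 4, 5, 10):
        print(f"{i:2d} evaluators: {proportion_found(i):5.0%} found, "
              f"net benefit {net_benefit(i):8.0f}")

    best = max(range(1, 16), key=net_benefit)
    print("Optimum under these assumptions:", best, "evaluators")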

Posted by jheer at 06:49 PM | Comments (0)

paper: your place or mine?

Finishing off my block of CSCW papers is Dourish, Belotti, et al's article on the long-term use and design of media spaces.

Your Place or Mine? Learning from Long-Term Use of Audio-Video Communication
Paul Dourish, Annette Adler, Victoria Belotti, and Austin Henderson
Computer-Supported Cooperative Work, 5(1), 33-62, 1996

This article reviews over 3 years of experience using an open audio-video link between the authors' offices to explore media spaces and remote interaction. The paper details the evolution of new behaviors in response to the communication medium, both at the individual and social levels. For example, the users initially learned to stare at the camera to initiate eye contact, but with greater familiarity learned to establish attention without doing so. Also, colleagues would come to an office to speak to the remote participant.

I saw some important take home lessons here:

  • People will adapt to their environments over time in multiple ways -- long-term usage data is invaluable in social technologies.
  • Do not confuse social and physical mechanisms with the accomplishments they allow.
  • Analysis should be mindful of the duality of technology and practice -- being mindful of this duality implies that design should not attempt to eliminate it, or to encode the social in the technical.

My full outlined summary follows...

  • Introduction
    • Over 3 years use of a shared open audio-video link (media space) between users
    • Analyze from 3 non-traditional positions
      • Face-to-face communicative behavior in the real world not always an appropriate baseline for evaluation
      • Practices tailored to the nature of the medium arise over time as familiarity increases
      • Use, influence, and importance extend beyond the individuals who are directly engaged with it
    • Look at video as part of the real world, rather than comparison between video and the real world
  • Perspectives on Mediated Interaction
    • Individual
      • Interaction between a single user and technology
      • Experiences
        • Equipment takes up desk space, placement is significant and constrained
        • Ability to appropriately place equipment key to ability to manage video interaction as part of everyday activity
        • Directions
          • Equipment becomes surrogate for video partner
          • Gestures (e.g. directions) misleading due to orientation mismatch
          • Participants learned to point "through" the connection correctly
        • Noises Off
          • Camera field of view - some parts of room in view, others aren't
          • Over time users develop understanding of field of view
          • Others come and greet remote user, but can't be seen
          • Remote user becomes accustomed to such disembodied voices
    • Interactional
      • Focus on individuals at the ends of a media connection, and their communication through it
      • Experiences
        • Open Audio
          • Enables lightweight initiation of conversation, short bursts of interaction
          • Act of turning audio on and off would be more intrusive than audio itself
          • Audio access lends peripheral awareness of each other's activities
        • Gaze Awareness
          • Over time, learned to stare at camera instead of monitor to create eye contact
          • With greater familiarity, abandoned this!
          • Lesson: don't confuse action with what it intends to accomplish
    • Communal
      • Connections reach beyond direct users, drawing in others who are physically or socially proximate
      • Experiences
        • Communication
          • Colleagues would come to office to talk to remote participant
          • Users and colleagues think of users as "sharing an office"
        • Presence and Telepresence
          • Hear sounds of typing coming from office - confuse remote work with local work
          • Functional space is no longer isomorphic to physical space
        • Virtual Neighbourhood
          • Sounds reaching one end of a connection that do not originate in either office
          • Inverse notion - user wants camera to also face towards the door to get a sense of remote activity
        • Projecting Audio
          • Projects sound of mediated conversation into area beyond remote office
          • Possibly troublesome during private conversation
          • Effectively used to attract attention of remotely-observed passers-by
          • Speaking softly close to mic produces an intimate yet loud effect
    • Societal
      • Connections can affect the relationships between individuals and larger social groups
      • Experiences
        • Colleagues and Visitors
          • The presence of connections can reveal or highlight delineations between various groups
          • Not simply a comment on familiarity and expectations
          • Becomes more significant when observing how people's reactions and understanding serve to act as determinants of social grouping
        • Public Affirmations
          • Connection use can be seen as explicit demonstration of cultural norms or of status within wider groups
          • Video connection is not socially or politically inert - can have strong effects on perceived groupings, membership relations, or even perceptions of the existence of particular groups.
  • Encompassing Issues
    • Ownership
      • Ownership of Technology
        • Shared communication link engenders shared ownership or responsibility for enabling technology
        • When no individuals see themselves as jointly owning a long-term connection, less use and responsibility
        • Technology itself plays a role
          • Can camera be moved around and adapted or fixed, immutable
      • Ownership of Space
        • Users thought of a single, shared office space - shared property of both occupants
        • Belotti reorganized office to better support mutual orientation
    • Evolution
      • Evolution of orientation towards the technology of the media space
      • Evolution of communicative practices in two-way communication
      • Evolution of understandings of the way in which media spaces disrupt the communal resource of "space"
      • Emergence of video-specific mechanisms of interaction
      • Development of new behaviours tailored to the nature of the medium
      • As familiarity increases, so does the range of activities that can be effectively performed with relative ease
      • Analysis should be mindful of the duality of technology and practice
        • Being mindful of this duality implies that design should not attempt to eliminate it, or to encode the social in the technical
      • Evolution is larger groups of understandings not about video communication but understandings including video communication
  • Designing Media Spaces
    • Crucial implication is the most general: Over time, adaptations take place as partners in long-term communication in media space environments learn effective ways to use the system. The sorts of problems people typically encounter, especially with respect to ability to manage and regulate conversation, lessen with time as new sets of resources to regulate interaction are learned.
    • Do not confuse mechanisms of face-to-face interaction (e.g. eye contact) for the accomplishments they support.
      • Otherwise we waste time unnecessarily in design, and fail to look beyond the inevitably flawed simulation of copresence.
    • Linking Spaces, Not Just People
      • Media spaces link spaces, not just people
      • Spaces are foci of communal activity, emphasizing linkage of spaces enhances ability to participate in a wider space
      • Design decisions evaluated purely against the criteria of face-to-face communication may no longer be appropriate
    • Audio
      • Supports lightweight interactions, peripheral monitoring of activity and remote space
      • Flattening of audio space
        • Impairs individuals ability to filter audio stream and listen selectively
      • Microphone headsets
        • Keeps communication private, removing ability of media space to reach out and draw in others
        • Distances wearer from local environment
      • Directional issues
        • Omnidirectional mics make it easier to maintain consistent audio environment, but lose directional information
    • Digital Transmission and Shared Media
      • Analog vs. Digital systems
      • Analog systems: individual actions not dependent, use affects levels of service offered to others
    • No Sense of Place
      • "Space is the opportunity, place is the understood reality"
      • Opportunity to flexibly organize activities and structures, giving the meanings for place to emerge from space, and the mutually recognizable orientation towards spaces which carries with it a sense of appropriate behaviour and expectations... a (shared) sense of place
      • Ability to appropriate, transform, and reuse space is rooted in the flexible switching which media spaces afford
      • Rigid and explicit geography is not a prerequisite for the emergence of a sense of place - it's community orientation that is critical
      • Use of geographical metaphor engenders the emergence of a shared understanding of the varied appropriate uses of spaces
      • Important that we allow for the way that place-orientations emerge out of the flexible, exploratory, and creative use of the space by its occupants
  • Conclusions
    • Experiences lead us to question basic assumptions:
      • The use of a real-world baseline
      • Person-to-person view of media spaces
Posted by jheer at 06:03 PM | Comments (0)

September 02, 2003

paper: computers, networks, and work

Computers, Networks, and Work
Lee Sproull and Sara Kiesler
Scientific American, September 1991

This article describes the early adoption of networked communication (e.g. e-mail) into the workplace. The often surprising social implications of networking began with the ARPANET, precursor of the modern internet. E-mail was originally considered a minor additional feature, but rapidly became the most popular feature of the network. We see immediately an important observation regarding social technologies: they are incredibly hard to predict.

In organizations that provided open access to e-mail (i.e., without managerial restrictions in place), some thought that electronic discussion would improve the decision-making process, as conversations would be “purely intellectual… less affected by people’s social skills and personal idiosyncrasies.” The actual results were more complicated. Text-only conversation carries fewer contextual cues (including appearance and manner) and weakens inhibitions. This has made decision making more difficult, owing to a more democratic style in which strong personalities and hierarchical relationships are eroded. While giving a larger voice to typically quieter individuals, the lowered social inhibitions of electronic conversation also invite more extreme opinions and anger venting (e.g., “flaming”). One study even shows that people who consider themselves unattractive report higher confidence and liveliness over networked communication.

Given these observations, the authors posit a hypothesis: when cues about social context are weak or absent, people ignore their social situation and cease to worry about how others evaluate them. In one study, people self-reported far more illegal or undesirable behaviors over e-mail than when given the same study on pen and paper. In the same vein, traditional surveys of drinking account for only half of known sales, yet an online survey's results matched the sales data more accurately than face-to-face reports did. The impersonality of this electronic medium ironically seems to engender more personal responses.

Networked communication has also been known to affect the structure of the work place. A study found that a networked work group, compared to a non-networked group, created more subcommittees and had multiple committee roles for group members. These networked committees were also designed in a more complex, overlapping structure. Networked communication also presents new opportunities for the life of information. Questions or problems can be addressed by other experienced employees, often from geographically disparate locations, allowing faster response over greater distance. Furthermore, by creating a protocol for saving and categorizing such exchanges, networked media can remember this information, increasing the life of the information and making it available to others.

As the authors illustrate, networked communication showed much promise at an early age. However, it doesn’t always come as expected or for free. The authors note the issue of incentive… shared communication must be beneficial to all those who would be using it for adoption to be successful. Also, it may be the case that managers will end up managing people they have never met… hinting at the common ground problem described by the Olsons [Olson and Olson, HCI Journal, 2000]. Coming back to the authors’ hypothesis also raises one exciting fundamental question. As networked communication becomes richer, social context will begin to re-appear, modifying the social impact of the technologies. As this richer design space emerges, how can we utilize it to achieve desired social phenomena in a realm that is so prone to unpredictability?

Posted by jheer at 10:22 PM | Comments (0)

paper: distance matters

Distance Matters
Olson and Olson
Human-Computer Interaction, 2000

This paper examines and refutes the myth that remote cooperative technology will remove distance as a major factor affecting collaboration. While technologies such as videoconferencing and networking allow us to communicate and collaborate more effectively across great distances, the authors argue that distance will remain an important factor for the foreseeable future, regardless of how sophisticated the technology becomes.

This paper reviews results of studies concerning both collocated and distant collaborative work, and extracts four concepts through which to understand collaborative processes and the adoption of remote technologies: common ground, coupling, collaboration readiness, and technology readiness. The case is then made that because of these issues and their interactions, distance will continue to have a strong effect on collaborative work processes.

  • Collocated Work Today
    • Put people in a collaborative workspace (e.g. large conference room)
    • Showed a doubling in productivity metrics
      • Important to note: work was at a stage appropriate for intense effort
    • Spatiality of human interaction
      • People and objects exist in space, roles can be indexed by location
    • Not explicitly mentioned: the interactive space must be of the right size to support the collaborative work. Otherwise would experience “thrashing”
    • Key characteristics of collocated synchronous interactions
      • Rapid feedback
      • Multiple channels
      • Personal information
      • Nuanced information
      • Shared local context
      • Informal “hall” time before and after
      • Coreference
      • Individual control
      • Implicit cues
      • Spatiality of reference
  • Remote Work Today
    • Successes
      • Collaboratory of Space Physicists
        • Simultaneous access to real-time data from throughout the world
        • User-centered design – 10 major redesigns over 7-year period
      • NetMeeting at Boeing
        • Moderator capable of debugging technology and eliciting interaction from remote participants
      • Telecom software maintenance and enhancement
        • Supported by email, a/v conferencing, transferred files, fax
      • Contributors to success
        • Known structure, boundaries
          • Who owns what, who can change what, what causes problems
        • Detailed process shared across sites
          • Common language about work
        • These take about 2 years for a novice to learn
    • Failures
        • Process of work changes, requiring more clarification and management overhead
        • Remote work is reorganized to fit the location and technology constraints
          • Partition work into regional units
          • Many affordances of collocation are lost
        • Technological barriers
          • Set-up time, working the camera (missing speaker from picture), not heard (bad microphone setup)
        • Leads to new behaviors
          • Always identifying oneself
          • More formal turn-taking
          • Discourse rules
          • => Increased effort of communication
        • For unambiguous tasks involving people with much in common, video (vs. audio) has been shown not to affect the outcome performance, though it often changes the process
        • For more ambiguous tasks, video has been correlated with significant improvements in performance
          • Gesture and viewing the speaker increased comprehension
          • More channels for communication / disambiguation
        • Users unaware of difficulty encountered with communication channel
          • Often change behavior rather than fix the technology
            • E.g. shouting due to bad volume on videophone
        • Proxemics (apparent distance)
          • Zoomed-out image of person less effective for conversation
        • Motivation
          • Individuals not compensated according to their competitive talents
          • Sharing of data and ideas improved by compensation and attribution
          • Example: using groupware so as to be seen by managers to be using groupware – that was the real motivation for adoption
        • Caveat: interesting behaviors can emerge when tools are used for long time (Dourish et al paper, 1996)
  • Findings Integrated: Four Concepts
    • Common Ground
      • Knowledge that the participants have in common, and they are aware they have in common
      • Established from both general knowledge about a person’s background and through specific knowledge gleaned from the person’s appearance and behavior during conversation
      • Cues provided by media (Clark and Brennan 1991)
        • Copresence
        • Visibility
        • Audibility
        • Cotemporality
        • Simultaneity
        • Sequentiality
        • Reviewability
        • Revisability
      • Prescription: focus on the importance of common ground. If not yet established, help develop it. Video is a great improvement over audio.
    • Coupling in Work: A Characteristic of Work Itself (<- untrue assertion?)
      • The extent and kind of communication required by the work
      • Tightly coupled: depends strongly on talents of collections of workers and is nonroutine, even ambiguous
      • Loosely coupled: has fewer dependencies OR is more routine
      • Coupling is associated with the nature of the task, with some interaction with the common ground of the participants
      • => Tightly coupled work is very difficult to do remotely
      • Prescription: design work organization so that tightly coupled work is collocated. The more formal the procedures, the more likely the success.
    • Collaboration Readiness
      • Using shared technology assumes that coworkers need to share information and are rewarded for it.
      • Prescription: one should not attempt to introduce groupware if there is no culture of sharing and collaboration. If needed, one has to align the incentive structure with the desired behavior.
    • Technology Readiness
      • Organizations (infrastructure, expertise) and individuals (know-how, habits) must be ready to use the technology.
      • Prescription: advanced technologies should be introduced in small steps.
  • Distance Work in the New Millennium
    • Though useful technologies will emerge and improve, there will always be a gap between remote and collocated collaboration.
    • Common Ground, Context, and Trust
      • Local events, holidays, weather, and social interchange
      • Common ground as pre-cursor to trust
      • Suggests collocated team-building trips, followed by the remote work
    • Different Time Zones
      • Differing work hours, circadian rhythms
    • Culture
      • Local conventions (e.g. Silicon Valley casual attire vs. the suits)
      • Difference in process
        • Task orientation (e.g. tasks (American) vs. relationships (Southern European))
        • Power distance – acceptance and/or questioning of authority
        • Management style
          • “Hamburger” style
            • Sweet-talk – top bun
            • Criticism – the meat
            • Encouragement – the bottom bun
          • American : hamburger, German : just meat, Japanese : just buns (“one has to smell the meat”)
        • Turn-taking (add pauses in conversation to open it up)
    • Interactions among these Factors and with Technology
      • Time-Zone vs. culture
        • Multi-time-zone compatibility vs. holidays and accepted hours of work
      • Culture vs. culture
        • cost vs. maintaining relations over video conferencing
  • Conclusion
    • Despite the limitations, distance technologies will continue to be useful and work their way into social and organizational life, in some cases effecting profound changes in social and organizational behavior. However, through all this, distance will continue to matter.
Posted by jheer at 10:20 PM | Comments (0)

paper: groupware and social dynamics

Kicking off a batch of papers on Computer-Supported Cooperative Work (CSCW) is Grudin's list of challenges to the CSCW developer...

Groupware and Social Dynamics: Eight Challenges for Developers
Jonathan Grudin
Communications of the ACM

Groupware is introduced as software which lies in the midst of the spectrum between single-user applications and large organizational information systems. Examples include e-mail, instant messaging, group calendaring and scheduling, and electronic meeting rooms. The developers of groupware today come predominantly from a single-user background, and hence many do not realize the social and political factors crucial to developing groupware. Grudin outlines 8 major issues confronting groupware development and gives some proposed solutions.

The disparity between who does the work and who gets the benefit
Consider a shared calendaring system: A meeting organizer gets the benefit of automatically scheduled meetings, but it is all the involved employees who must do the work to maintain their calendars, and who might have no inclination to do so otherwise. Hence such features are rarely used.
Solution: Design processes that create benefits for all group members.

Critical Mass and Prisoner’s Dilemma Problems
Most groupware is only useful if a high percentage of group members use it. How useful is workplace IM if no one at work uses it…
Solution: Reduce work required of users, build incentives for use, suggest benefit-enhancing processes of use.

Social, Political, and Motivational Factors
Groupware will be resisted if it interferes with group social dynamics, e.g. a computer mistakenly schedules a manager’s unscheduled “free” time.
Solution: Contextual inquiry, domain familiarity / understanding.

Exception Handling in Workgroups
Much human problem-solving is ad hoc, often giving rise to exceptions and deviations from “standard process”. Systems that impede this are well situated for failure.
Solution: Again, contextual inquiry. Also, good configurability, though difficult, can help.

Designing for Infrequently Used Features
Groupware often focuses on communication and collaboration, but more is not necessarily better, and can lead to inefficiency, drawing away from the more frequently used single-user features. Groupware features should be known and accessible but not obtrusive.
Solution: Add groupware features to already successful applications if possible. Then create awareness and access to infrequently used features.

The Underestimated Difficulty of Evaluating Groupware
Evaluation of groupware is difficult: it is distributed, can span substantial time periods, and determining the factors of success/failure can be quite difficult.
Solution: “Development managers must enlist the appropriate skills, provide the resources, and disseminate the results”. In other words, Grudin doesn’t really know. :)

The Breakdown of Intuitive Decision-Making
Groupware is often targeted for the benefit of managers (at least, this is what development decision-makers are drawn to), leaving out important end users of the system. Conversely, decision-makers do not recognize the value of apps that primarily benefit non-managers. This leads to more instances of Problem 1.
Solution: Recognition of this problem by development management. Again, contextual inquiry or domain understanding.

Managing Acceptance: A New Challenge for Product Developers
For groupware to be accepted it must be introduced carefully and deliberately. You have to attract a critical mass. Clear understanding by the users is crucial – a site visit and step-by-step learning can help.
Solution: Adding groupware to existing apps dodges the bullet. Other systems require good design (by understanding the environment of use, yet again contextual inquiry) and a developed adoption strategy.

Take home messages from the paper: groupware should strive to directly benefit all group members, build off of existing successful apps if possible, develop thoughtful adoption strategies, and be rooted in an understanding of the [physical|social|political] environment of use.

Posted by jheer at 09:49 PM | Comments (0)

August 05, 2003

paper: SNIF-ACT

I just read this paper by my research group's principal scientist, Peter Pirolli, and former PARC employee Wai-Tat Fu. The paper, entitled "SNIF-ACT: A Model of Information Foraging on the World Wide Web", recently won the Best Theoretical Paper Award at the 9th International Conference on User Modeling.

The paper extends the existing ACT-R cognitive modeling infrastructure to computationally simulate users surfing the web, at a fairly fine-grained psychological level. The system models the user's goals, knowledge, memory, and abilities (in the form of production rules) and combines these with the findings of Information Foraging theory to create a successful model of web surfing and decision making. Information Foraging theory applies the metaphor of animals foraging for food to the task of humans seeking information. In previous work, Pirolli and Card have found that the equations governing the cost structures of the two are the same.

The SNIF-ACT model works by extracting the content and links of a web page and then using a technique known as spreading activation to propagate "activity" through an associative memory network of individual words. Activation proceeds from the modeled user goals through the terms in working memory and out to the currently observed web content. Link weightings between word associations are determined by using word occurrence and co-occurrence rates extracted from AltaVista. By finding the highest mutual activity between user goals and available links, the system can compute an estimate of the information scent (much like the scent tracked down by animals in the wild), and use this to construct a probability distribution of the likelihood of following different links. Drop-offs in scent measures are also used to predict when a user will leave the current web site to look elsewhere for a richer information patch (analogous to an animal moving on to greener pastures or hunting an easier prey). The SNIF-ACT model is psychologically richer than previous foraging-influenced systems like Bloodhound, which primarily uses techniques from the information retrieval (IR) field and earlier flawed cognitive approaches.
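
As a toy illustration of the scent idea (my own sketch, not the SNIF-ACT implementation), one can score each link by the activation flowing from its text back to the goal terms and turn those scores into a choice distribution. The association strengths below are invented stand-ins for the AltaVista occurrence/co-occurrence statistics the paper actually uses, and the goal and link words are hypothetical.

    import math

    # Toy sketch of scent-based link choice -- NOT the SNIF-ACT implementation.
    # Association strengths would really come from large-scale word occurrence
    # and co-occurrence statistics; the numbers below are invented.
    strength = {                       # activation from (goal word, link word)
        ("hotel", "lodging"): 2.1, ("hotel", "rooms"): 1.7,
        ("paris", "france"): 2.4, ("paris", "flights"): 0.9,
        ("hotel", "maps"): 0.2, ("paris", "maps"): 0.4,
    }

    goal = ["hotel", "paris"]          # the modeled user's information goal
    links = {                          # link label -> words in the link text
        "Lodging and rooms": ["lodging", "rooms"],
        "Flights to France": ["flights", "france"],
        "City maps": ["maps"],
    }

    def scent(goal_words, link_words):
        """Total activation flowing from the link text back to the goal terms."""
        return sum(strength.get((g, w), 0.0) for g in goal_words for w in link_words)

    # Convert scent scores into a probability of following each link.
    scores = {label: scent(goal, words) for label, words in links.items()}
    z = sum(math.exp(s) for s in scores.values())
    for label, s in scores.items():
        print(f"{label:20s} scent = {s:4.2f}   P(follow) = {math.exp(s) / z:.2f}")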

For more details about information foraging theory and its applications, check out this essay by pixelcharmer (it even cites my first research paper!), this copy of a talk by Pete Pirolli, and my research group's publication archives.

There are at least two interesting avenues for this work to follow. One is in applications, as successful user models can create better automated usability metrics and could learn from individual behavior to create personalized research and surfing tools. The second is to simultaneously move from the web to other domains, building user models for other content-rich domains (e.g., information visualization). Down the road, I think the integration of content-based and perception-based (e.g. computer vision and audition) analyses will be the next big research leap - creating richer, more realistic, models of user behavior and furthering artificial intelligence research.

Posted by jheer at 02:15 PM | Comments (1)

July 29, 2003

paper: designing for usability

My prelims are coming sooner than I'd like to admit, and so I need to get hopping on reading a bunch of papers. Fortunately, over the past semester our reading group read over half the assigned papers, but for those that are left I will be posting my summaries here for my own archival purposes (including some back-posts for previously read papers). Perhaps they will be of use to someone else as well, so I might as well make these public...

First up is "Designing for Usability: Key Principles and What Designers Think" by John D. Gould and Clayton Lewis. This paper was originally published in 1985 in the Communications of the ACM, and outlines the iterative design philosophy that is central to modern Human-Computer Interaction. The paper describes three central design principles (early focus on users, empirical measurement, and iterative design) and includes a survey of designers trying to ascertain how common and/or obvious these principles are. The paper also rebuts arguments against the use of these principles and presents a case study of these principles in action.

My biggest problem with this article is that the authors are too unsympathetic to the demands that a deadline-driven project can make. They seem to advocate iterating "as long as it takes", which while desirable is not particularly feasible. To be fair, they acknowledge these pressures and give some cogent arguments for why the costs of iteration are not as high as one might otherwise suspect. But what is missing in the methodology are strategies and techniques for optimizing the design as much as possible within bounded resources. Later work has attempted to address some of these issues, including discount usability methods (e.g., heuristic evaluation) and rapid ethnography techniques (e.g., David Millen's paper).

Designing for Usability: Key Principles and What Designers Think
John D. Gould and Clayton Lewis
Communications of the ACM, March 1985 - Volume 28, Number 3

  • The Principles
    • Early Focus on Users and Tasks
      • Understand who the users will be
      • Study cognitive, behavioral, anthropometric, and attitudinal characteristics
      • Study the nature of the work to be accomplished
    • Empirical Measurement
      • User test prototypes on intended users
      • Observe, record, and analyze performance and reactions
    • Iterative Design
      • Cycles of discovering and fixing problems
      • Design, Prototype, Evaluate, Re-Design, ...
  • Principles are NOT obvious
    • Survey discovered that most designers at a human factors conference missed most of the principles
      • 62% - Early focus on users
      • 40% - Empirical measurement
      • 20% - Iterative design
  • Principles in more detail
    • Early Focus on Users
      • Understanding potential users (as opposed to identifying, describing, etc)
      • Bringing design team into direct contact with potential users
      • Make contact prior to system design
      • Participatory design
        • Have typical users participate in early formulation stages
    • Empirical Measurement
      • Behavioral measurements of learnability and usability
      • Conducting studies very early in the development process
      • Designed not to study a prototype but how people use and react to the prototype
    • Iterative Design
      • A process to ultimately ensure goals are met
  • Why the principles are undervalued
    • Not Worth Following
    • Confusion with Similar but Critically Different Ideas
    • Value of Interaction with Users is Misestimated
      • User diversity is underestimated
      • User diversity is overestimated
      • Belief that users do not know what they need
        • A priori vs. knowing it when they see it
      • Belief that my job does not require it or need it
    • Competing Approaches
      • Belief in the power of reason
        • Reason alone is likely to miss true work practices and cost structures
      • Belief that design guidelines should be sufficient
        • Guidelines ill-suited for highly context-dependent choices
      • Belief that good design means getting it right the first time
        • Humans unpredictable, necessitating an empirical approach
    • Impractical
      • Belief that the development process will be lengthened
        • Assumptions
          • Usability work must be added to the end of the development cycle
          • Responding to tests must be time consuming
        • Rebuttals
          • User testing can start before a system is built
            • Discount methods - paper prototyping, wizard of oz studies
            • Helps bootstrap project - something tangible to motivate, stimulate thought
          • Modular design, multi-tiered designs - decouple UI from system internals
        • Still has price, but not as high as supposed
        • Poor design brings costs of its own (support, vendor costs, updates)
      • Belief that iteration is just expensive fine-tuning
        • No, it is a basic design philosophy
      • Belief in the power of technology to succeed
        • From user's perspective, user interface is the product
  • Elaboration of Principles
    • Initial Design Phase
      • Preliminary specification of the user interface
      • Collect critical information about users
      • Develop testable behavioral goals
        • Description of intended users (demographic)
        • The tasks to be performed and circumstances of user
        • The measurements of interest
      • Organize the work
        • Software, manuals, etc should be built by the same group
    • Iterative Development Phase
      • Test behavioral goals, continuous evaluation
      • Modification of the interface
      • Fast, flexible prototyping
      • Highly modular implementations
      • Be prepared for results that dictate radical change
      • User comments and think-aloud protocol can help point the way for designing fixes
Posted by jheer at 09:55 PM | Comments (0)

July 24, 2003

talk: jan pedersen

Today Jan Pedersen, former PARC researcher and current Chief Scientist of AltaVista, spoke at the PARC Forum. His talk was entitled Internet Search: Past, Present, and Future. It seems particularly relevant given my recent exposure to personalized search start-up Kaltix. Jan primarily covered the developmental and economic history of search engines and spoke about current search technologies. Read on for my notes from the talk.

Notes: PARC Forum, July 24, 2003

Internet Search: Past, Present, and Future
Jan Pedersen, Chief Scientist, AltaVista

  • Search Engine Timeline
    • Pre-Cursors
      • Information Retrieval research
      • Discovery that free text queries win over Boolean queries (Salton)
    • 1st Generation
      • 1993 NCSA Mosaic
        • Webcrawler
        • Yahoo!
        • Lycos (400k indexed pages)
        • Infoseek
      • Power Players
        • 1994 AltaVista
          • DEC labs, advanced query syntax, large index
          • Actually a showcase for DEC Alpha machines
        • 1996 Inktomi
          • Berkeley Systems Lab, Eric Brewer
          • Massively parallel solution
    • 2nd Generation
      • Relevance
        • 1998 DirectHit
          • re-ranking results using user click-through rates
        • 1998 Google
          • re-ranking results using link authority
      • Size
        • 1999 FAST/AllTheWeb
          • scalable architecture
      • User Matters
        • 1996 AskJeeves
          • Users ask questions, natural language input
      • Money
        • 1997 Goto/Overture
          • Pay-for-performance, pay for search rankings
    • 3rd Generation
      • Consolidation
        • 2002 Yahoo! Purchases Inktomi
        • 2003 Overture purchases AltaVista, AllTheWeb
        • 2003 MSN announces intention of own search engine
        • 2003 Yahoo! Purchases Overture
      • Maturity
        • $2B market, $6B by 2005
        • Requires large capital investment, limiting newcomers
          • Although Gigablast is an exception (2 years private development, mid-size search index)
        • Traffic focused on Yahoo!, Google, AOL
        • Consumer use driven by brand marketing
  • Economics
    • Overview
      • Popularity
        • Search is the most used Internet application after email
        • 400M queries / day
      • High bar for quality in search results
        • Users spend 1.5 hours / week searching
        • Experience search rage after 12 minutes
      • Expensive centralized service
        • Indices cover billions of documents
          • The FAST index is 30TB large!
        • Query service is high performance application
          • Google claims 50K machines
      • Cost: $0.001 per query
        • Amortizes capital, operations, and engineering costs
    • Business Models
      • Early Monetization Models
        • Subscription services
          • Infoseek, Northern Lights
          • Failed: users can find equal results for free
        • Advertising
          • Invented by Infoseek, Netscape
            • Untargeted ads (banners, sponsorships)
            • Limited keyword targeting (low keyword coverage)
        • Portalitis
          • Search not profitable enough, need stickier services
          • Email, shopping, content channels
          • Tried by Excite, Infoseek, AltaVista -> disastrous
          • Led to lack of focus on core technology that opened the door for 2nd generation search engines.
      • Performance Search Market
        • Goto/Overture – keyword auction
        • With 80k+ advertisers get good keyword coverage, currently exceeds 40%
        • Pay per click revenue
          • Marketers easily project to conversions
          • Search engine projects to CPM
        • Triple Win
          • Consumer: relevant ads
          • Marketer: qualified traffic
          • Search Engine: high-monetized impressions
        • Successful
          • Overture makes ~$1B/year
          • Strategy adopted by Google
        • Current Evolution
          • Greater automation
            • self-serve sign up, automated bidding
          • Increased competition
            • Google splits market with Overture
            • 19% of Yahoo! Revenue from paid listings
            • MSN Search most profitable MS product group by headcount (50 people)
    • Trends available online
      • SearchEngineWatch.com (Nielsen / NetRatings)
      • Traffic concentration
        • Google > Yahoo! > MSN…
      • Loyalty
        • AOL > Google > IS > Yahoo!
  • Technology
    • WWW Size
      • Dynamic pages -> effectively infinite pages
      • Domains: .com (23M), .net (4M), .org (2.5M)
    • Crawling
      • Index parameterized by size and freshness
      • Batch (discover, grab, index) and Incremental (mixed) approaches to crawling
    • Relative Size
      • Google – 3B, FAST– 2.5B, AltaVista – 1B
      • Anchor text only index (discovered links that are not yet crawled)
        • FAST 1.2B fully indexed pages (rest anchor text only)
        • Google 1.5B fully indexed pages
    • Freshness
      • Graph from (G. Notess)
      • Note use of hybrid indices
        • Subindices with differing update rates
    • Ranking
      • 2.4 query terms -> 2B documents -> 10 highly relevant pages. All in 300ms.
      • Trouble queries: Travel, Cobra, John Ellis
      • Ingredients
        • Keyword match
        • Anchor text
        • Link authority
        • Click-through rates
    • SPAM – An Arms Race
      • Manipulate content purely to influence ranking
      • Dictionary spam, link sharing, domain hijacking, link farms
      • Robotic use of search results
        • Meta-search engines
        • Search engine optimizers
        • Fraud
    • UI
      • Ranked results lists
        • Document summaries are critical
        • Hit highlight, dynamic abstract
        • NO RECENT INNOVATIONS!
      • Blending
        • Pre-defined segmentation (e.g. paid listing)
        • Intermixed results from multiple sources
  • Future
    • Question Answering
      • Natural Language Processing
      • Dumais, SIGIR 2002 paper
      • WWW as language model
    • New Contexts
      • Ubiquitous searching
      • Implicit searching
    • New Tasks
      • Local / community search
  • Questions…
    • Personalization
      • Currently searchers are anonymous
      • Personalized search requires some form of user model
        • How much does the engine need to know?
        • Geographic location
        • Use context of surfing behavior
    • Personal Search Agents
      • Technical challenges to this
      • My idea: have distributed agents coupled with access to large, centralized indices.
      • Most importantly: what is the big advantage??
        • Need qualitative change in searching experience
        • Interesting, but not shown useful yet
      • My idea: have agents be pre-fetchers to automatically hunt for content for which you have a high probability of interest
        • e.g. citation mining to collect all research papers within a particular domain
Posted by jheer at 05:42 PM | Comments (0)

July 10, 2003

paper: animation support in a UI toolkit

Here's a back-post for a prelims paper: "Animation Support in a User Interface Toolkit", by Hudson and Stasko. I thought this paper particularly relevant, as I'm currently working in interactive graph visualization, which includes a heavy animation component. This paper got me considering higher level primitives I might use in the graph viz toolkit we are developing.

Animation Support in a User Interface Toolkit: Flexible, Robust, and Reusable Abstractions
Scott E. Hudson, John T. Stasko
UIST'93 (User Interface and Software Technology 1993)

In this paper, the authors present extensions to the Artkit user interface toolkit to support animation. The toolkit offers basic support for simple motion blur, "squash and stretch", use of arcing trajectories, controlled timing of actions, anticipation + follow-through, and slow-in / slow-out transitions. It also supports a robust scheduling system that helps deal with unpredictable performance from the windowing system... very important since this was running on X-Windows.

The main abstraction used is the transition, which consists of a pointer to the UI component that is moving, the trajectory the component will take, and the time interval over which to animate. The UI component can be any interactor object implemented in Artkit. The trajectory consists of the curve traveled (parameterized from 0 to 1) and a pacing function to determine velocities over the curve (e.g. using a line with slope 1 for uniform animation and an arctan or sigmoidal function to create slow-in / slow-out transitions). The times in the time interval can be expressed as absolute times, as a delay from the present time, or parameterized by the starting or ending of other transitions.

Robust animation and event-relative transitions are achieved using an animation dispatch agent. All that is assumed is that the toolkit can ask what the current time is and that the window system will pass back control to the toolkit periodically. The agent constructs a scheduling queue of transitions, and attempts to estimate when the next draw cycle will appear on the screen using a measure of past updates. Using this redraw end time, the set of active transitions for the current cycle is selected. Each active transition is started or stopped as appropriate, and its current parameter values are passed through the pacing function and mapped to screen positions using the trajectory.

This scheme animates smoothly when the agent is given control at regular intervals, but it also properly handles delays, correctly delivering animation steps at larger intervals.
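
To make the abstraction concrete, here is a minimal sketch in Python (my own reconstruction, not the Artkit API): a trajectory pairs a curve parameterized over [0, 1] with a pacing function, and a transition maps wall-clock time onto a screen position for a component. The component class and its move_to method are hypothetical stand-ins for an Artkit interactor.

    import math

    # Minimal reconstruction of the transition abstraction -- not Artkit code.
    def linear_curve(p0, p1):
        """Straight-line trajectory from p0 to p1, parameterized over [0, 1]."""
        return lambda t: (p0[0] + t * (p1[0] - p0[0]),
                          p0[1] + t * (p1[1] - p0[1]))

    def slow_in_slow_out(t, sharpness=8.0):
        """Sigmoidal pacing: slow near the endpoints, fast in the middle."""
        sig = lambda x: 1.0 / (1.0 + math.exp(-sharpness * (x - 0.5)))
        return (sig(t) - sig(0.0)) / (sig(1.0) - sig(0.0))   # normalized to [0, 1]

    class Transition:
        def __init__(self, component, curve, pacing, start, duration):
            self.component, self.curve, self.pacing = component, curve, pacing
            self.start, self.duration = start, duration

        def step(self, now):
            """Move the component to its position for wall-clock time `now`."""
            t = min(max((now - self.start) / self.duration, 0.0), 1.0)
            self.component.move_to(self.curve(self.pacing(t)))

    class DummyComponent:              # hypothetical stand-in for an interactor
        def move_to(self, pos):
            print(f"moved to ({pos[0]:6.1f}, {pos[1]:6.1f})")

    # The dispatch agent would call step() once per estimated redraw cycle;
    # here we just poke the transition at a few sample times.
    trans = Transition(DummyComponent(), linear_curve((0, 0), (100, 50)),
                       slow_in_slow_out, start=0.0, duration=1.0)
    for now in (0.0, 0.25, 0.5, 0.75, 1.0):
        trans.step(now)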

Criticism: The first thing that struck me is that no mention of scale is given. How many objects can I animate at once? What are the bottlenecks? Obviously rendering time is a major factor, but overhead is also accrued through scheduling and through mapping each object through its own pacing and trajectory functions. In most cases I'd expect this to be a constant-time overhead, but this isn't really discussed. Also, cool animated effects like squash and stretch are mentioned multiple times, but their implementation is never discussed.

Today, 10 years later, we have incredibly more powerful processors and graphics cards, enabling much richer animation possibilities. This paper was ahead of its time, and today's popular toolkits - Swing, MFC, etc. - are definitely behind the times. While toolkits like Java2D provide much of the rendering and geometric capability needed, animation managers like the kind presented here and in Xerox PARC's Information Visualizer have yet to become common. Hopefully, as graphics power continues to grow and the drive for more powerful interactive technologies gains momentum (e.g. more widespread use of information visualization), these more powerful tools and abstractions will become commonplace.

Posted by jheer at 10:49 PM | Comments (0)

    jheer@acm.ørg