Monthly Archive for October, 2007

Gestural communication over video stream: supporting multimodal interaction for remote collaborative physical tasks

Ou, J., Fussell, S. R., Chen, X., Setlock, L. D., and Yang, J. Gestural communication over video stream: supporting multimodal interaction for remote collaborative physical tasks. In ICMI ’03: Proceedings of the 5th international conference on Multimodal interfaces (New York, NY, USA, 2003), ACM Press, pp. 242–249. [pdf]

—————–

This paper presents the DOVE system, which supports multimodal communication during collaborative physical tasks. In particular, the system lets a user convey gestures to a remote partner, from simple pointing to more complex gestures.

The authors tested three conditions: video only, DOVE with manual erase, and DOVE with automatic erase, and found the best performance in the last condition. Their findings suggest that collaborators perform best when the gestures disappear automatically, much like ordinary hand gestures disappear once people have completed them (similarly to ConcertChat).
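The difference between manual and automatic erase can be sketched in a few lines of Python. This is only an illustration of the idea, not the DOVE implementation: the class names, the `erase_after_s` parameter and its 3-second default are all my own assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Stroke:
    points: list        # [(x, y), ...] pen positions drawn over the video
    created_at: float   # time the stroke was finished, in seconds


@dataclass
class GestureOverlay:
    # Hypothetical lifetime; the paper summary does not give the real delay.
    erase_after_s: float = 3.0
    strokes: list = field(default_factory=list)

    def add(self, points, now):
        """Store a finished stroke together with its completion time."""
        self.strokes.append(Stroke(list(points), now))

    def visible(self, now):
        """Automatic erase: drop strokes older than erase_after_s, return the rest."""
        self.strokes = [s for s in self.strokes
                        if now - s.created_at < self.erase_after_s]
        return self.strokes

    def clear(self):
        """Manual erase: remove every stroke at once."""
        self.strokes = []
```

With automatic erase the renderer simply calls `visible(now)` on every frame, so old pointing gestures fade out of the shared view without any action from the helper.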

They also observed complex drawings involving several pen strokes. In this case, neither automatic recognition nor automatic erase helps. The article also contains a good review of the literature.

Wicked Problems

Today, I had an interesting discussion with Khaled on this family of human problems:

A wicked problem is one for which each attempt to create a solution changes the understanding of the problem. Wicked problems cannot be solved in a traditional linear fashion, because the problem definition evolves as new possible solutions are considered and/or implemented. The term was originally coined by Horst Rittel. Wicked problems always occur in a social context. The wickedness of the problem reflects the diversity among the stakeholders in the problem.

According to Rittel and Webber [1], wicked problems have 10 characteristics:

  1. Wicked problems have no definitive formulation. Formulating the problem and the solution is essentially the same task. Each attempt at creating a solution changes your understanding of the problem.
  2. Wicked problems have no stopping rule. Since you can’t define the problem in any single way, it’s difficult to tell when it’s resolved. The problem-solving process ends when resources are depleted, stakeholders lose interest or political realities change.
  3. Solutions to wicked problems are not true-or-false, but good-or-bad. Since there are no unambiguous criteria for deciding if the problem is resolved, getting all stakeholders to agree that a resolution is “good enough” can be a challenge, but getting to a “good enough” resolution may be the best we can do.
  4. There is no immediate or ultimate test of a solution to a wicked problem. Since there is no singular description of a wicked problem, and since the very act of intervention has at least the potential to change that which we deem to be “the problem,” there is no one way to test the success of the proposed resolution.
  5. Every implemented solution to a wicked problem has consequences. Solutions to such problems generate waves of consequences, and it’s impossible to know, in advance and completely, how these waves will eventually play out.
  6. Wicked problems don’t have a well-described set of potential solutions. Various stakeholders have differing views of acceptable solutions. It’s a matter of judgment as to when enough potential solutions have emerged and which should be pursued.
  7. Each wicked problem is essentially unique. There are no “classes” of solutions that can be applied, a priori, to a specific case. “Part of the art of dealing with wicked problems is the art of not knowing too early what type of solution to apply.”
  8. Each wicked problem can be considered a symptom of another problem. A wicked problem is a set of interlocking issues and constraints that change over time, embedded in a dynamic social context. But, more importantly, each proposed resolution of a particular description of “a problem” should be expected to generate its own set of unique problems.
  9. The causes of a wicked problem can be explained in numerous ways. There are many stakeholders who will have various and changing ideas about what might be a problem, what might be causing it and how to resolve it. There is no way to sort these different explanations into sets of “correct/incorrect.”
  10. The planner (designer) has no right to be wrong. Scientists are expected to formulate hypotheses, which may or may not be supportable by evidence. Designers don’t have such a luxury—they’re expected to get things right. People get hurt, when planners are “wrong.” Yet, there will always be some condition under which planners will be wrong.

EXAMPLE: consider what it would take to “solve” terrorism, where even the term terrorism is highly controversial and difficult to define.

[1] Rittel, H., & Webber, M. (1973). “Dilemmas in a General Theory of Planning,” Policy Sciences, 4, 155-169.

Using linguistic features to measure presence in computer-mediated communication

Kramer, A. D. I., Oh, L. M., and Fussell, S. R. Using linguistic features to measure presence in computer-mediated communication. In CHI ’06: Proceedings of the SIGCHI conference on Human Factors in computing systems (New York, NY, USA, 2006), ACM Press, pp. 913–916. [pdf]

———

This paper reports an interesting study on how linguistic features of the communication between collaborators might account for people’s sense of presence. The authors’ logic behind this measure is that, to the extent that people talk about a remote space in the same way they talk about local space, we can infer that they feel immersed in that remote space.

They used a Helper-Worker paradigm and tested four communication conditions: audio, video, video+drawing and face-to-face. The four conditions gave rise to different levels of self-reported presence. Presence was highest in the face-to-face condition, lowest in the audio-only condition and intermediate in the video conditions.

Presence scores were also highly correlated with the use of local deixis (e.g., “this”, “here”), confirming that when people feel present in a remote environment, they talk about it in the same way they talk about their physical environment.

The paper shows an interesting application of regression analysis to test how well these linguistic features predict participants’ sense of presence.
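To make the idea concrete, here is a minimal Python sketch of the two steps involved: turning a transcript into a local-deixis rate, and fitting a one-predictor least-squares line. It is not the authors’ analysis. The word list (beyond “this” and “here”), the function names and the single-predictor model are assumptions of mine.

```python
import re

# Assumed word list; the post only gives "this" and "here" as examples.
LOCAL_DEIXIS = {"this", "here", "these"}


def local_deixis_rate(text):
    """Fraction of words in `text` that are local deictic terms."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(w in LOCAL_DEIXIS for w in words) / len(words)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In use, one would compute `local_deixis_rate` for each participant’s transcript, pair it with that participant’s self-reported presence score, and pass the two lists to `fit_line`; a positive slope would be in line with the correlation reported above.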

Kramer Robot-Presence



Kramer Linguistic-Regression

Fragmented interaction: establishing mutual orientation in virtual environments

Hindmarsh, J., Fraser, M., Heath, C., Benford, S., and Greenhalgh, C. Fragmented interaction: establishing mutual orientation in virtual environments. In CSCW ’98: Proceedings of the 1998 ACM conference on Computer supported cooperative work (New York, NY, USA, 1998), ACM Press, pp. 217–226. [pdf]

———-

The authors of this work report a detailed analysis of the interactions of participants in Collaborative Virtual Environments (CVE). One of the most important limitations of such interaction spaces is that individuals cannot easily determine what another participant is referring to. The problem derives from the difficulty of re-connecting an image of the other person with the image of the object they are referring to.

Object-focused discussions are problematic due to the ‘fragmentation’ of different elements of the workspace. In co-present interaction, when an individual asks a co-participant to look at an object at which they are pointing, that co-participant can usually see them in relation to their surroundings.  This is problematic in virtual interactions as participants have to re-assemble the relations between body and object.

Participants observed by the authors tended to overcome these limitations by making implicit references more explicit. Instead of saying “what do you know about this”, they would say “See this sofa here?”.

Major problems of this technology are a limited horizontal field of view; a lack of information about others’ actions; slow movements; and a lack of parallelism for actions.

Speech in their hands

There was speech in their dumbness, language in their very gesture.

The Winter’s Tale (First Gentleman at V, ii)

Shakespeare

Turn it this way: grounding collaborative action with remote gestures

Kirk, D., Rodden, T., and Fraser, D. S. Turn it this way: grounding collaborative action with remote gestures. In CHI ’07: Proceedings of the SIGCHI conference on Human factors in computing systems (New York, NY, USA, 2007), ACM Press, pp. 1039–1048. [pdf]

———-

This paper presents a fundamental study for my thesis work. The authors investigated the ability of remote gesturing tools to improve the performance of distance collaboration. They report a suggestion from previous work that complex gestures, rather than simple deixis, are responsible for the performance enhancement (Fussell et al., 2004).

Their specific aim was to understand how complex remote gestures interact with language, taking into account the temporal nature of the grounding process. In particular, the performance benefit derived from a remote gesture tool rests on its ability to affect the process of developing common ground (Kirk, D. S., and Stanton Fraser, D., 2006).

Complex gestures can also serve a variety of other purposes in collaborative discourse, such as helping to marshal turn-taking and to signal understanding.

The authors use a helper-worker paradigm in which the task is to reconstruct a Lego model from diagrammatic instructions. “The system was constructed such that both participants would be in the same room during the study, but only had visual access to each other and each other’s desks through the mediating technology – partitions ensuring that direct visual access was blocked. This enabled us to retain full audio in all conditions without having to use any audio communications technology. Participants were allowed to speak to one another at all times during the study.”

Kirk Experimental-Setting

They demonstrated that the performance benefits of remote gesture tools appear to be strongest during the early stages of an interaction, when remote gestures have the potential to reduce the amount of Workers’ speech. Independently of the phase, Workers’ questioning behavior is slightly reduced by gesturing. Gesturing is also associated with fewer speech overlaps.

Their findings show that the performance improvements already demonstrated by Fussell et al. (2004) still hold when the remote gesture format is changed from a digital sketch to an unmediated representation of hands.

Finally, the authors discuss some implications for the deployment of remote gesturing technologies. They suggest three cooperative arrangements that are ideal applications for this form of technology.

(1) Non-routine physical manipulations, where the nature of the task and the settings vary considerably and each cooperative iteration requires significant effort to ground the interaction.

(2) Regular changes in the participants, where a common vocabulary and frames of reference have to be re-established frequently.

(3) Rapid cooperative diagnosis settings, where rapid coordination is required in order to decide the best possible actions.

Office collar

This is one of those interesting links I could expect to find in P&V. Office collar is the answer if you work in an open space:

Office Collar has been designed in response to the open plan, working environment. The collars act as spatial isolators, narrowing the field of vision, therefore enabling their wearer to focus on the tasks in front of them. The 15 individual hand made, white leather masks are to be worn on the head; the variation between each model explores the different actions undertaken whilst working and thinking.

[more]

Simone-Brewster-Collar-7

Copyright notice: the present content was taken from the following URL, the copyrights are reserved by the respective author/s.

They actually love black sheeps

Antonio Scarponi sent me this nice picture as a form of graphical protest against a poster published by the Swiss UDC party that was judged racist by public opinion. I like this form of non-violent protest. Thanks Antonio!

Pecora Nera

Silk Maps

I got interested in this topic while talking with a friend. He said that in rural India geographical maps are printed on silk because it makes them more durable. Another advantage of cloth over paper is that the map is easier to fold back up after use. So I googled around and found that silk maps were used extensively during WWII [2], when they were called Escape Maps:

During WWII hundreds of thousands of maps were produced by the British on thin cloth and tissue paper.  The idea was that a serviceman captured or shot down behind enemy lines should have a map to help him find his way to safety if he escaped or, better still, evade capture in the first place.  A map like this could be concealed in a small place (a cigarette packet or the hollow heel of a flying boot), did not rustle suspiciously if the captive was searched and, in the case of maps on cloth or mulberry leaf paper, could survive wear and tear and even immersion in water.  The scheme was soon extended to cover those who had already been captured, although a certain amount of ingenuity was required to get the maps into the POW camps.

 Catalog Images Access Silkmap1

I found this American company that sells some reproductions, and this one, which sells silk scarves with European maps for tourists.

Disambiguating complex visual information: Towards communication of personal views of a scene

Pomplun, M., Ritter, H., and Velichkovsky, B. Disambiguating complex visual information: Towards communication of personal views of a scene. Perception, 25 (1996), 931–948. [pdf]

——–

This paper reports two experiments on perception and eye-movement scanning of a set of 6 overtly ambiguous pictures. The first experiment showed that specific perceptual interpretations of an ambiguous picture usually correlate with parameters of the gaze-position distributions. In the second experiment, these distributions were used to process the initial pictures so that the brightness of all elements was lowered in regions that attracted fewer fixations. The pre-processed pictures were then shown to a group of 150 naive subjects for identification. The results demonstrated that in 4 out of 6 pictures it was possible to influence other people’s perception in the predicted way.
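The pre-processing step of the second experiment can be illustrated with a small Python sketch. It is not the authors’ image-processing procedure: I assume the picture is divided into a grid of regions, that brightness scales linearly with the fixation count, and that a hypothetical `floor` parameter keeps unfixated regions from going completely black.

```python
def dim_by_fixations(brightness, fixations, floor=0.3):
    """Scale each region's brightness by how many fixations it attracted.

    brightness: 2D list of values in [0, 1], one per image region.
    fixations:  2D list of fixation counts with the same shape.
    floor:      assumed minimum scale applied to regions with no fixations.
    """
    peak = max(max(row) for row in fixations)
    if peak == 0:
        # No gaze data at all: return an unchanged copy.
        return [row[:] for row in brightness]
    result = []
    for b_row, f_row in zip(brightness, fixations):
        result.append([
            b * (floor + (1.0 - floor) * f / peak)
            for b, f in zip(b_row, f_row)
        ])
    return result
```

The most-fixated region keeps its original brightness, and every other region is dimmed in proportion to how rarely it was looked at, which is the effect described above.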

Manual and gaze input cascaded (magic) pointing

Zhai, S., Morimoto, C., and Ihde, S. Manual and gaze input cascaded (magic) pointing. In CHI ’99: Proceedings of the SIGCHI conference on Human factors in computing systems (New York, NY, USA, 1999), ACM Press, pp. 246–253. [pdf]

———–

This paper presents an experimental setup in which 3 different input mechanisms were compared: pure manual, pure gaze, and a mixed approach. The authors’ first claim is that a pure gaze interaction mechanism is unnatural because it overloads a perceptual channel.

The authors tested the different input mechanisms with 36 subjects. Subjects using gaze-only pointing performed worse than those using the pure manual pointing mechanism. The best performance was achieved with the mixed approach.

This work explores a new direction in utilizing eye gaze for computer input. Gaze tracking has long been considered as an alternative or potentially superior pointing method for computer input. We believe that many fundamental limitations exist with traditional gaze pointing. In particular, it is unnatural to overload a perceptual channel such as vision with a motor control task. We therefore propose an alternative approach, dubbed MAGIC (Manual And Gaze Input Cascaded) pointing. With such an approach, pointing appears to the user to be a manual task, used for fine manipulation and selection. However, a large portion of the cursor movement is eliminated by warping the cursor to the eye gaze area, which encompasses the target. Two specific MAGIC pointing techniques, one conservative and one liberal, were designed, analyzed, and implemented with an eye tracker we developed. They were then tested in a pilot study. This early-stage exploration showed that the MAGIC pointing techniques might offer many advantages, including reduced physical effort and fatigue as compared to traditional manual pointing, greater accuracy and naturalness than traditional gaze pointing, and possibly faster speed than manual pointing. The pros and cons of the two techniques are discussed in light of both performance data and subjective reports.
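The cascade described in the abstract (gaze for the coarse jump, hand for the fine adjustment) can be sketched in Python as follows. This is my own simplification, not the authors’ implementation: the function name, the 120-pixel `warp_threshold` default and the `require_manual_motion` flag are assumptions, the flag being one plausible way to make the warping more conservative.

```python
import math


def magic_step(cursor, gaze, manual_delta, warp_threshold=120.0,
               require_manual_motion=False):
    """Return the new cursor position after one input sample.

    cursor, gaze:   (x, y) positions in pixels.
    manual_delta:   (dx, dy) reported by the manual device for this sample.
    warp_threshold: assumed distance beyond which the cursor jumps to the gaze area.
    require_manual_motion: if True, warp only while the manual device is moving.
    """
    x, y = cursor
    dx, dy = manual_delta
    moving = (dx, dy) != (0, 0)
    far = math.dist(cursor, gaze) > warp_threshold
    if far and (moving or not require_manual_motion):
        x, y = gaze              # coarse jump: gaze covers the long travel
    return (x + dx, y + dy)      # fine adjustment stays manual
```

For example, `magic_step((0, 0), (500, 300), (2, -1))` jumps to the gaze point and then applies the small manual offset, while a gaze point within the threshold leaves the cursor under purely manual control.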

Inspecting pictures for information to verify a sentence: Eye movements in general encoding and in focused search

Underwood, G., Jebbett, L., and Roberts, K. Inspecting pictures for information to verify a sentence: Eye movements in general encoding and in focused search. The Quarterly Journal of Experimental Psychology 57A, 1 (2004), 165–182. [pdf]

—————

This article sheds some light on the following question: when we see combinations of text and graphics, such as photographs and their captions in printed media, how do we compare the information in the two components? The authors employed a sentence verification task in which subjects observed a picture with a caption and decided whether the sentence correctly described the scene.

They interpreted longer fixations as an indication of more difficult processing. The characteristic inspection pattern, or scanpath, started with a fixation near the center of the picture. Typically within three fixations, participants’ eyes would saccade to the sentence; they then read the sentence completely before inspecting the picture again, and made the decision immediately after this second visit to the picture (p. 173).

The participants moved their eyes a number of times between the picture and the sentence, but the decision about the validity of the sentence was made not while reading the sentence but while viewing the picture. Sentences attracted more fixations than pictures.
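The measures mentioned here (fixations per region, fixation durations, and movements between picture and sentence) are easy to compute once a scanpath is labeled by region. The Python sketch below shows one way to do it. The record format of `(region, duration_ms)` pairs and the function name are assumptions of mine, not the authors’ analysis pipeline.

```python
from collections import defaultdict


def summarize_scanpath(fixations):
    """Summarize a scanpath given as (region, duration_ms) pairs in viewing order.

    Returns (fixation count per region, mean duration per region,
    number of switches between regions).
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    switches = 0
    previous = None
    for region, duration_ms in fixations:
        counts[region] += 1
        totals[region] += duration_ms
        if previous is not None and region != previous:
            switches += 1
        previous = region
    means = {r: totals[r] / counts[r] for r in counts}
    return dict(counts), means, switches
```

Feeding it a trial that starts on the picture, moves to the sentence and returns to the picture yields two switches, matching the characteristic pattern described above.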

The authors also briefly discuss the priming effect (Sanocki and Epstein, 1997): the perception of a scene can be facilitated by prior presentation of a priming scene that makes the layout available early.

In the discussion, the authors question the assumption that performing the sentence verification task requires the construction of comparable abstract propositional forms from the sentence and the picture. Larkin and Simon (1987) have argued that although they may contain the same information, the processing operations required to extract the information will not necessarily be equivalent: pictures and diagrams have advantages over textual descriptions. This ease of recognizing relationships in a picture was not reflected in fixation durations. The authors therefore conclude that the richer representations of information in pictures require extensive encoding durations, comparable to those needed to encode information from text.

Underwood Sentence-Verification-Task