Blog Category: Ubiquitous computing

Ego-Centric vs. Exo-Centric Tracking and Interaction in Smart Spaces

In a recent paper published at SUI 2014, “Exploring Gestural Interaction in Smart Spaces using Head-Mounted Devices with Ego-Centric Sensing”, co-authored with Barry Kollee and Tony Dunnigan, we studied a prototype head-mounted device (HMD) that lets users interact with external displays through spatial gestures.

In the paper, one of our goals was to expand the scope of interaction possibilities on HMDs, which are currently severely limited if we consider Google Glass as a baseline. Glass only has a small touch pad, which is placed at an awkward position on the device’s rim, at the user’s temple. The other input modalities Glass offers are eye blink input and voice recognition. While eye blink can be effective as a binary input mechanism, in many situations it is rather limited and could be considered socially awkward. Voice input suffers from recognition errors for non-native speakers of the input language and has considerable lag, as current Android-based devices, such as Google Glass, perform speech recognition in the cloud. These problems were also observed in the main study of our paper.

We thus proposed three gestural selection techniques in order to extend the input capabilities of HMDs: (1) a head nod gesture, (2) a hand movement gesture and (3) a hand grasping gesture.

The following mock-up video shows the three proposed gestures used in a scenario depicting a material selection session in a (hypothetical) smart space used by architects:

EgoSense: Gestural Interaction in Smart Spaces using Head Mounted Devices with Ego-Centric Sensing from FX Palo Alto Laboratory on Vimeo.

We discounted the head nod gesture after a preliminary study showed a low user preference for such an input method. In a main study, we found that the two remaining hand gesture techniques achieved performance similar to a baseline technique using the touch pad on Google Glass. However, we hypothesize that the spatial gestural techniques using direct manipulation may outperform the touch pad for larger numbers of selectable targets (in our study we had 12 targets in total), as secondary GUI navigation activities (e.g., scrolling a list view) are not required when using gestures.

In the paper, we also present some possibilities for ad-hoc control of large displays and automated indoor systems:

Ambient light control using spatial gestures tracked via an HMD.

Considering the larger picture, our paper touches on the broader question of ego-centric vs. exo-centric tracking: past work in smart spaces has mainly relied on external (exo-centric) tracking techniques, e.g., using depth sensors such as the Kinect for user tracking and interaction. As wearable devices get increasingly powerful and as depth sensor technology shrinks, it may, in the future, become more practical for users to bring their own sensors to a smart space. This has advantages in scalability: more users can be tracked in larger spaces, without additional investments in fixed tracking systems. Also, a larger number of spaces can be made interactive, as the users carry their sensing equipment from place to place.

LoCo: a framework for indoor location of mobile devices

Last year, we initiated the LoCo project on indoor location.  The LoCo page has more information, but our central goal is to provide highly accurate, room-level location information to enable indoor location services to complement the location services built on GPS outdoors.

Last week, we presented our initial results on the work at Ubicomp 2014.  In our paper, we introduce a new approach to room-level location based on supervised classification.  Specifically, we use boosting in a one-versus-all formulation to enable highly accurate classification based on simple features derived from Wi-Fi received signal strength (RSSI) measures.  This approach offloads the bulk of the complexity to an offline training procedure, and the resulting classifier is sufficiently simple to be run on a mobile client directly.  We use a simple and robust feature set based on pairwise RSSI margin to address Wi-Fi RSSI volatility.

h_m(X) = \begin{cases} 1 & X(b_m^{(1)}) - X(b_m^{(2)}) \geq \theta_m \\ 0 & \text{otherwise} \end{cases}

The equation above shows an example weak learner, which simply looks at two elements in an RSSI scan and compares their difference against a threshold.  The final strong classifier for each room is a weighted combination of a set of weak learners greedily selected to discriminate that room.  The feature is designed to express the ordering of RSSI values observed for specific access points, with a flexible reliance on the difference between them; the threshold \theta_m is determined in training.  An additional benefit of this choice is that processing only the subset of the RSSI scan used by the selected weak learners further reduces the required computation.  Comparing against the kNN matching approach used in RedPin [Bolliger, 2008], our results show competitive performance with substantially reduced complexity.  The table below shows cross-validation results from the paper for two data sets collected in our office.  The classification time appears in the rightmost column.
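To make the weak learner concrete, here is a minimal Python sketch of how a per-room classifier of this form might be evaluated on a client. It is not the implementation from the paper. The access point names, weights, thresholds, the MISSING_RSSI fallback for access points absent from a scan, and the rule of picking the room with the highest score are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed placeholder RSSI (in dBm) for an access point missing from a scan.
MISSING_RSSI = -100.0


@dataclass
class WeakLearner:
    ap_a: str     # b_m^(1): first access point in the pair
    ap_b: str     # b_m^(2): second access point in the pair
    theta: float  # threshold on the RSSI margin, fixed during training
    alpha: float  # weight of this learner in the strong classifier (assumed name)

    def predict(self, scan: Dict[str, float]) -> int:
        # h_m(X) = 1 if X(b1) - X(b2) >= theta_m, else 0
        margin = scan.get(self.ap_a, MISSING_RSSI) - scan.get(self.ap_b, MISSING_RSSI)
        return 1 if margin >= self.theta else 0


def room_score(learners: List[WeakLearner], scan: Dict[str, float]) -> float:
    # Strong classifier: weighted combination of the room's weak learners.
    return sum(l.alpha * l.predict(scan) for l in learners)


def classify(rooms: Dict[str, List[WeakLearner]], scan: Dict[str, float]) -> str:
    # One-versus-all decision, assumed here to be the highest-scoring room.
    return max(rooms, key=lambda room: room_score(rooms[room], scan))


# Hypothetical trained model and scan, made up for illustration only.
rooms = {
    "kitchen": [WeakLearner("ap1", "ap2", 5.0, 0.8),
                WeakLearner("ap3", "ap1", -10.0, 0.4)],
    "lab": [WeakLearner("ap2", "ap1", 5.0, 0.9)],
}
scan = {"ap1": -40.0, "ap2": -60.0, "ap3": -70.0}
predicted_room = classify(rooms, scan)
```

Only the access points named by the selected weak learners are ever read from the scan, which is the source of the reduced computation mentioned above.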

We are excited about the early progress we’ve made on this project and look forward to building out our indoor location system in several directions in the near future.  But more than that, we look forward to building new location-driven applications exploiting this technique, which can leverage existing infrastructure (Wi-Fi networks) and devices (cell phones) we already use.

A User’s Special Touch


Yesterday Volker Roth came back for a visit and to give us a preview of the talk he will give next week at UIST 2010 on his work with Philipp Schmidt and Benjamin Güldenring on The IR Ring: Authenticating users’ touches on a multi-touch display. The work supports multiple users interacting with the same screen at the same time with different access and control permissions. For example, you may want to show me a document on a multi-touch display, but that does not mean you want me to be able to delete that document. Similarly, I may want to show you a particular e-mail I received, without giving you the ability to access my other e-mail messages, or to send one in my name. Roth et al. implemented hardware and software add-ons for a multi-touch display that restrict certain actions to the user wearing the IR ring emitting the appropriate signal. Users wearing different rings have different access and control privileges. In this way, only you can delete your document, and only I can access my other e-mail messages.

Roth and his coauthors frame their work as preventing “pranksters and miscreants” from carrying out “their schemes of fraud and malice.” To me, the work is most compelling as a means to avoid mistakes and to frustrate human curiosity.

Virtual Factory at IEEE ICME 2010, Singapore


Happy to note that our overview paper on the Virtual Factory work, “The Virtual Chocolate Factory: Building a mixed-reality system for industry”, has been accepted at IEEE’s ICME 2010. The conference is in Singapore in July; I’ll be there, co-chairing a session that focuses on workplace use of virtual realities, augmented reality, and telepresence. You can see more on the Virtual Factory work here.

Linking Digital Media to Physical Documents: Comparing Content- and Marker-Based Tags


There are generally two types of tags for linking digital content and paper documents. Marker-based tags and RFIDs employ a modification of the printed document. Content-based solutions remove the physical tag entirely and link using features of the existing printed matter. Chunyuan, Laurent, Gene, Qiong, and I recently published a paper in IEEE Pervasive Computing magazine that explores the two tag types’ use and design trade-offs by comparing our experiences developing and evaluating two systems that use marker-based tagging (DynamInk and PapierCraft) with two systems that use content-based tagging (Pacer and ReBoard). In the paper, we situate these four systems in the design space of interactive paper systems and discuss lessons we learned from creating and deploying each technology.

Take a look!

pCubee: an interactive cubic display

Our friend Takashi Matsumoto (who built the Post-Bit system with us here at FXPAL) built a cubic display called Z-agon with colleagues at the Keio Media Design Laboratory. Takashi points us at this video of a very nicely realized cubic display (well, five-sided, but still). It’s called pCubee: a Perspective-Corrected Handheld Cubic Display and it comes from the Human Communications Technology Lab at the University of British Columbia. Some of you may have seen a version of this demoed at ACM Multimedia 2009; it will also be at CHI 2010. A longer and more detailed video is here.

Reintroducing ReBoard

ReBoard is a system we built at the lab to automatically capture whiteboard images and make them accessible and sharable through the web. A technical description of the system is available here. At CHI 2010, Stacy Branham will present an evaluation of ReBoard that she conducted over the summer as an intern at FXPAL [1].

Until then, check out our dorky demonstration video!

And be sure to watch the other videos of the latest and greatest FXPAL technologies.

1. The paper is “Let’s go from the whiteboard: Supporting transitions in work through whiteboard capture and reuse” by Stacy Branham, Gene Golovchinsky, Scott Carter, and Jacob Biehl.

mobile. very mobile.


Developers have built applications for mobile phones to support a wide swath of activities, but I would argue that there is no better use for a mobile phone than for those tasks that are fundamentally mobile. And what is more mobile than running? While there have been a variety of research projects (such as UbiFit) designed to encourage exercise, I am more interested here in those applications that support folks who’ve already bought in. For us, smart phones that make it easy to track pace, distance, and even elevation (such as RunKeeper, SportsTracker, and MotionXGPS) have been killer apps. Research projects (such as TripleBeat) are also exploring how to increase competition using past personal results as well as results from other users. Other work has explored using shared audio spaces to allow runners to compete over distances.

How else might we use mobile technologies to improve the running experience?

ARdevcamp: Augmented Reality unconference Dec. 5 in Mountain View, New York, Sydney…

We’re looking forward to participating in ARdevcamp the first weekend in December. It’s being organized in part by Damon Hernandez of the Web3D Consortium, Gene Becker of Lightning Labs, and Mike Liebhold of the Institute for the Future (among others – it’s an unconference, so come help organize!) So far, there are ~60 people signed up; I’m not sure what capacity will be, but I’d sign up soon if you’re interested. You can add your name on the interest list here.

From the wiki:

The first Augmented Reality Development Camp (AR DevCamp) will be held in the SF Bay Area December 5, 2009.

After nearly 20 years in the research labs, Augmented Reality is taking shape as one of the next major waves of Internet innovation, overlaying and infusing the physical world with digital media, information and experiences. We believe AR must be fundamentally open, interoperable, extensible, and accessible to all, so that it can create the kinds of opportunities for expressiveness, communication, business and social good that we enjoy on the web and Internet today. As one step toward this goal of an Open AR web, we are organizing AR DevCamp 1.0, a full day of technical sessions and hacking opportunities in an open format, unconference style.

AR DevCamp: a gathering of the mobile AR, 3D graphics and geospatial web tribes; an unconference:
# Timing: December 5th, 2009
# Location: Hacker Dojo in Mountain View, CA

Looks like there will be some simultaneous ARdevcamp events elsewhere as well – New York and Manchester events are confirmed; Sydney, Seoul, Brisbane, and New Zealand events possible but unconfirmed.

Designing User Friendly Augmented Work Environments


We’re happy to note that the book “Designing User Friendly Augmented Work Environments” (edited by Saadi Lahlou) has been published by Springer, in hardcover with an online version available. We have a chapter in it on our USE smart conference room system: “Designing an Easy-to-Use Executive Conference Room Control System.” The chapter starts with some of the field work we did to understand the work flows of the stakeholders, and then describes the evolution of the system we built to support the executive, his assistant, and others who used the meeting room. The system developed during this project was the precursor to the DICE system.

The process of writing and publishing this chapter took a considerable amount of time, and thus it is interesting to look back on some of our early designs to see how they have evolved. One aspect that changed was the name of the project: we started out calling the system USE (Usable Smart Environment) and that terminology is used in the book chapter. By the time we completed this project and moved on to the larger conference room, we had changed the name to DICE (Distributed Intelligent Conferencing Environment). DICE now runs in both rooms, and USE is the name of Gene’s group, just to add to the confusion.

For more information on this work, check out the video, some before/after pictures, and the CHI 2009 paper. We’re also working on a journal article that extends the CHI findings. Look for it in a few years!