Blog Author: Scott Carter

Introducing cemint


At FXPAL we have long been interested in how multimedia can improve our interaction with documents, from using media to represent documents and help navigate them on different display types, to digitizing physical documents and linking media to them.


In an ACM interactions piece published this month we introduce our latest work in multimedia document research. Cemint (for Component Extraction from Media for Interaction, Navigation, and Transformation) is a set of tools to support seamless intermedia synthesis and interaction. In our interactions piece we argue that authoring and reuse tools for dynamic, visual media should match the power and ease of use of their static textual media analogues. Our goal with this work is to allow people to use familiar metaphors, such as copy-and-paste, to construct and interact with multimedia documents.

Cemint applications will span a range of communication methods. Our early work focused on support for asynchronous media extraction and navigation, but we are currently building a tool using these techniques that can support live, web-based meetings. We will present this new tool at DocEng 2014 — stay tuned!

mVideoCast: Mobile, real-time ROI detection and streaming


In the past, media capture and access suffered primarily from a lack of storage and bandwidth. Today, networked multimedia devices are ubiquitous, and the core challenge is less how to transmit more information than how to capture and communicate the right information. Our first application to explore intelligent media capture was NudgeCam, which supports guided capture to better document problems, discoveries, or other situations in the field. Today we introduce another intelligent capture application: mVideoCast. mVideoCast lets people communicate meaningful video content from mobile phones while semi-automatically removing extraneous details. Specifically, the application can detect, segment, and stream content shown on screens or boards, faces, or arbitrary, user-selected regions. This allows anyone to stream task-specific content without needing to develop hooks into external software (e.g., screen recording software).
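To make the pipeline concrete, here is a minimal sketch of the detect-segment-stream loop described above, reduced to the simplest case of an arbitrary, user-selected rectangular region. All names (`ROI`, `crop`, `stream_roi`) are illustrative assumptions, not mVideoCast's actual API, and a frame is modeled as a plain grid of pixel values rather than a camera buffer.

```python
# Hypothetical sketch of a per-frame ROI pipeline: select a region (here a
# user-chosen rectangle), segment it out of each captured frame, and hand
# the cropped frames to a sender callback ("streaming").
from dataclasses import dataclass

@dataclass
class ROI:
    x: int  # left edge (column)
    y: int  # top edge (row)
    w: int  # width in pixels
    h: int  # height in pixels

def crop(frame, roi):
    """Return the sub-grid of a frame (rows of pixel values) inside the ROI."""
    return [row[roi.x:roi.x + roi.w] for row in frame[roi.y:roi.y + roi.h]]

def stream_roi(frames, roi, send):
    """Crop each captured frame to the ROI and pass it to the sender."""
    for frame in frames:
        send(crop(frame, roi))

# Usage: one 4x4 "frame"; stream only the central 2x2 region.
frames = [[[r * 4 + c for c in range(4)] for r in range(4)]]
out = []
stream_roi(frames, ROI(x=1, y=1, w=2, h=2), out.append)
# out[0] == [[5, 6], [9, 10]]
```

In the real application the "detect" step would come from a screen/board or face detector rather than a fixed rectangle, but the per-frame crop-and-send structure is the same.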

Check out the video demonstration below and read the paper for more details.

the problem with paper


A writer for the TC blog, Erick Schonfeld, recently posted a description of an encounter with a Stanford student who was trying to recruit users at a drug store to test a paper prototype. The prototype and study were a requirement for an HCI course the student was taking. The TC writer, in short, found the whole experience ridiculous, especially compared with the whiz-bang, interactive demos he is used to seeing. While, as many have pointed out, paper prototyping is a standard technique in HCI, that does not mean it always works or is always appropriate. In fact, in my experience with early-stage prototypes I was on the whole underwhelmed with paper prototyping. But I realized over time that this did not reflect a problem with the prototyping tool per se, but rather a lack of understanding of the user's context.


Linking Digital Media to Physical Documents: Comparing Content- and Marker-Based Tags


There are generally two types of tags for linking digital content and paper documents. Marker-based tags and RFIDs employ a modification of the printed document. Content-based solutions remove the physical tag entirely and link using features of the existing printed matter. Chunyuan, Laurent, Gene, Qiong, and I recently published a paper in IEEE Pervasive Computing magazine that explores the two tag types’ use and design trade-offs by comparing our experiences developing and evaluating two systems that use marker-based tagging — DynamInk and PapierCraft — with two systems that utilize content-based tagging — Pacer and ReBoard. In the paper, we situate these four systems in the design space of interactive paper systems and discuss lessons we learned from creating and deploying each technology.

Take a look!

ipad redux


For an article I’m writing for a well-known magazine I needed to get my hands on one of the new iPads for a few moments, pre-release. I went bottom-up, top-down, pretended to be a reporter, employed vague threats, etc. All to no avail. I suppose the powers-that-be have a good reason for this, but it is a mystery to me. I mean, at this point the cat is out of the bag! On the other hand, I’m not really in the target market (like these guys, I find Apple’s mobile devices far too restrictive — my particular pet peeve is having to subvert the OS just to mount the device as a drive). So maybe I’m not meant to understand.


Summer Intern position in Data Mining and Visual Search


Update: This intern slot has been filled.

This is one in a series of posts advertising internship positions at FXPAL for the summer of 2010. A listing of all internship positions currently posted is available here.

Making a decision can be difficult. From choosing the right camera to finding a place to live, people are faced with a dizzying array of choices on one hand and commentary (in the form of blog posts, reviews, etc.) about their different options on the other. But little scaffolding connects the two. We are interested in how to make those connections in order to help people make decisions, using innovative data mining and search techniques integrated with rich, interactive visualizations.

Specifically, this project will involve building a data mining system capable of extracting useful summaries and metadata from consumer reviews, and a walk-up-and-use visual interface that makes use of these data to help users browse collections. The intern will be responsible for a subset of the system tailored to their interests, and will be expected to contribute to a paper suitable for IUI or a similar conference describing the system and their experience designing it.

Prospective candidates should be enrolled in a PhD program and should have some experience with data mining and GUI design. Experience with information retrieval is a plus. Please contact Scott Carter if you are interested in this position. For more information on the FXPAL internship program, please visit our web site.

Reintroducing ReBoard


ReBoard is a system we built at the lab to automatically capture whiteboard images and make them accessible and sharable through the web. A technical description of the system is available here. At CHI 2010, Stacy Branham will present an evaluation of ReBoard that she conducted over the summer as an intern at FXPAL [1].

Until then, check out our dorky demonstration video!

And be sure to watch the other videos of the latest and greatest FXPAL technologies.

[1] The paper is “Let’s go from the whiteboard: Supporting transitions in work through whiteboard capture and reuse” by Stacy Branham, Gene Golovchinsky, Scott Carter, and Jacob Biehl.

mobile. very mobile.


Developers have built applications for mobile phones to support a wide swath of activities, but I would argue that there is no better use for a mobile phone than for those tasks that are fundamentally mobile. And what is more mobile than running? While there have been a variety of research projects (such as UbiFit) designed to encourage exercise, I am more interested here in applications that support folks who’ve already bought in. For us, smartphone apps that make it easy to track pace, distance, and even elevation (such as RunKeeper, SportsTracker, and MotionXGPS) have been killer apps. Research projects (such as TripleBeat) are also exploring how to increase competition using past personal results as well as results from other users. Other work has explored using shared audio spaces to allow runners to compete over distances.
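The pace-and-distance tracking these apps do reduces to familiar math: accumulate great-circle distance between successive GPS fixes and divide elapsed time by distance. Here is a rough sketch using the standard haversine formula; the function names and the `(time_sec, lat, lon)` fix format are assumptions for illustration, not any app's actual internals.

```python
# Sketch: total distance and average pace from a sequence of GPS fixes,
# using the haversine great-circle distance between consecutive points.
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def run_stats(fixes):
    """Total distance (km) and average pace (min/km) from (t_sec, lat, lon) fixes."""
    dist = sum(haversine_km(a[1], a[2], b[1], b[2])
               for a, b in zip(fixes, fixes[1:]))
    elapsed_min = (fixes[-1][0] - fixes[0][0]) / 60.0
    return dist, (elapsed_min / dist if dist else 0.0)
```

Real apps layer noise filtering and elevation (from the GPS fix or a terrain database) on top of this, but the core bookkeeping is this simple.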

How else might we use mobile technologies to improve the running experience?

Open-access publishing


Laurent and I recently published an article (SeeReader: An (Almost) Eyes-Free Mobile Rich Document Viewer) in the special issue on Pervasive Computing in the International Journal of Computer Science Issues (IJCSI). The IJCSI is open-access, meaning that the content is not hidden behind a paywall. Open-access journals are still seen as dubious by many, and perhaps rightly so: they are almost all new, and tend to carry less prestige and lower quality than mainstream journals. In return, though, they offer fast turnaround times and wide indexing.


Ada Lovelace Day (2)


Today is Ada Lovelace Day. Given that I’ve named my child after Ms. Lovelace, I feel obligated and honored to take part in the pledge to “highlight [a] woman in technology” whom I look up to.

While I’ve had many fabulous female colleagues, past and present, if I’m to choose one to write about, it’s a no-brainer.

Jennifer Mankoff is an associate professor at the Human Computer Interaction Institute (HCII) at Carnegie Mellon University. Jen was my graduate advisor at Berkeley, seeing me through a master’s and PhD. Perhaps “nurse” is a better word, as she not only worked tirelessly with me to improve my abilities but at times literally cared for me when I was ill.

Jen is a whirling dervish. A good Samaritan. A force of nature.
