
Copying and Pasting from Video


This week at the ACM Conference on Document Engineering, Laurent and Scott are presenting new work on direct manipulation of video. The ShowHow project is our latest effort around the creation and use of expository, or "how to," video. While watching videos of this genre, it is helpful to annotate the frames or shots you find useful, either directly with ShowHow's annotation capability or in a separate multimedia notes document. The primary purpose of such annotations is later reference, or incorporation into other videos or documents. Browser history might get you back to a video you watched previously, but it won't efficiently take you to a specific portion of a much longer source video, nor will it preserve the broader context in which you found that portion noteworthy. ShowHow therefore lets users create rich annotations around expository video that can include images, audio, or text to capture this contextual information.
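Concretely, each annotation can be thought of as a small structured record that ties a note to a video and a time range, along with any captured media. The sketch below is purely illustrative; the field names are invented for this post, not ShowHow's actual schema.

```javascript
// Hypothetical ShowHow-style annotation record (illustrative only).
const annotation = {
  videoUrl: "https://example.com/howto/fix-a-flat.mp4", // source video
  startTime: 72.4,   // start of the noteworthy portion, in seconds
  endTime: 81.0,     // end of the portion, in seconds
  text: "The tire lever goes under the bead here; easy to miss.",
  image: null,       // optional data URL of a captured (sub)frame
  audio: null,       // optional blob URL of a recorded voice note
  created: new Date().toISOString()
};
```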

Creating these annotations calls for copy-and-paste functionality from the source video: selecting a (sub)frame as an image, or even selecting text shown in the video. We also demonstrate capturing dynamic activity across frames as a simple animated GIF that can be copied from the video to the clipboard. There are interaction design challenges here, and as more content is viewed on mobile and touch devices, direct manipulation provides a natural means of fine-grained control over selection.
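As a rough illustration of the still-frame case, here is a minimal browser-side sketch, not ShowHow's actual code, that copies a selected sub-rectangle of the current frame to the clipboard as a PNG. It assumes a same-origin, non-DRM video element, a secure context, a user gesture, and a browser whose asynchronous Clipboard API accepts image payloads.

```javascript
// Copy a w-by-h region of the video's current frame to the clipboard.
async function copyFrameRegion(video, x, y, w, h) {
  const canvas = document.createElement("canvas");
  canvas.width = w;
  canvas.height = h;
  // Draw only the selected sub-rectangle of the current frame.
  canvas.getContext("2d").drawImage(video, x, y, w, h, 0, 0, w, h);
  const blob = await new Promise(done => canvas.toBlob(done, "image/png"));
  await navigator.clipboard.write([new ClipboardItem({ "image/png": blob })]);
}

// Usage (hypothetical coordinates), e.g. from a selection gesture handler:
// copyFrameRegion(document.querySelector("video"), 40, 60, 320, 240);
```

Animated GIF capture works similarly in spirit: sample several frames over a time window and hand them to a GIF encoder, though clipboard support for GIFs is far less uniform across browsers.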

Under the hood, content analysis is required to identify events in the video that help drive the user interaction. In this case, the analysis is implemented in JavaScript and runs in the same browser in which the video is played, so efficient implementations of standard image analysis tools such as region segmentation, edge detection, and region tracking are required. There is a natural tradeoff here between robustness and efficiency that constrains the choice of content processing techniques.
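As one example of the kind of lightweight, in-browser analysis involved, the sketch below computes Sobel edge magnitudes for the current frame. This is the textbook operator, not our actual pipeline, and it skips border pixels for brevity.

```javascript
// Compute per-pixel Sobel gradient magnitudes for the current video frame.
function edgeMagnitudes(video) {
  const w = video.videoWidth, h = video.videoHeight;
  const canvas = document.createElement("canvas");
  canvas.width = w;
  canvas.height = h;
  const ctx = canvas.getContext("2d");
  ctx.drawImage(video, 0, 0, w, h);
  const px = ctx.getImageData(0, 0, w, h).data; // RGBA bytes

  // Convert to grayscale once, to avoid redundant work in the main loop.
  const gray = new Float32Array(w * h);
  for (let i = 0; i < w * h; i++) {
    gray[i] = 0.299 * px[4 * i] + 0.587 * px[4 * i + 1] + 0.114 * px[4 * i + 2];
  }

  // Sobel gradients over interior pixels; threshold mag[] for an edge map.
  const mag = new Float32Array(w * h);
  for (let y = 1; y < h - 1; y++) {
    for (let x = 1; x < w - 1; x++) {
      const i = y * w + x;
      const gx = -gray[i - w - 1] - 2 * gray[i - 1] - gray[i + w - 1]
               +  gray[i - w + 1] + 2 * gray[i + 1] + gray[i + w + 1];
      const gy = -gray[i - w - 1] - 2 * gray[i - w] - gray[i - w + 1]
               +  gray[i + w - 1] + 2 * gray[i + w] + gray[i + w + 1];
      mag[i] = Math.hypot(gx, gy);
    }
  }
  return mag;
}
```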

The interaction enabled by the system is probably best described in the video below:

Video Copy and Paste Demo

Go find Scott or Laurent in Florence or contact us for more information.

Looking ahead


It is reasonably well-known that people who examine search results often don't go past the first few hits, perhaps stopping at the "fold" or at the end of the first page. It's a habit we've acquired thanks to the high-quality results search engines return for precision-oriented information needs. Google has trained us well.

But this habit may not serve us well when we confront uncommon, recall-oriented information needs, that is, when doing research. Looking only at the top few documents places too much trust in the ranking algorithm. In our SIGIR 2013 paper, we investigated what happens when a lightweight preview mechanism gives searchers a glimpse of the distribution of documents (new, re-retrieved but not yet seen, and already seen) in the results of the query they are about to execute.
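The three-way categorization itself is simple set arithmetic. As a sketch (with invented names, not our study code), a preview can bucket the results a candidate query would return against the document IDs already retrieved and already seen:

```javascript
// Classify each result as new, re-retrieved but unseen, or already seen.
function previewDistribution(resultIds, retrievedBefore, seenBefore) {
  const dist = { fresh: 0, reRetrieved: 0, seen: 0 };
  for (const id of resultIds) {
    if (seenBefore.has(id)) dist.seen++;
    else if (retrievedBefore.has(id)) dist.reRetrieved++;
    else dist.fresh++;
  }
  return dist; // e.g. { fresh: 12, reRetrieved: 5, seen: 3 }
}
```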


In Defense of the Skeuomorph, or Maybe Not…


Jony Ive is a fantastic designer. As a rule, his vision for a device sets the trend for that entire class of devices. Apparently, Jony Ive hates skeuomorphic design elements. Skeuomorphs are those sometimes corny bits of realism some designers add to user interfaces. These design elements reference an application's analog embodiment. Apple's desktop and mobile interfaces are littered with them. Their notepad application looks like a notepad. Hell, the hard drive icon on my desktop is a very nice rendering of the hard drive that is actually in my desktop.


Client-side search


When we rolled out the CHI 2013 previews site, we got a couple of requests for keyword search over the site. Interfaces for search are one of my core research interests, of course, so the request got me thinking: how could we add search to this site? The conventional approach requires server-side code to do the searching and return results to the client. That wouldn't work for our simple site because, from the server's perspective, it was static: just a few HTML files, a little bit of JavaScript, and about 600 videos. Using Google to search the site wouldn't work either, because most of the searchable content lives on two pages, with hundreds of items on each. So what to do?
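To foreshadow the answer: one plausible client-side approach, sketched here with an invented item format rather than the site's real data, is to build a small in-memory inverted index when the page loads and intersect posting sets per query term.

```javascript
// Build a term -> Set(itemId) inverted index over the page's items.
function buildIndex(items) { // items: [{ id, title, abstract }, ...]
  const index = new Map();
  for (const item of items) {
    const terms = (item.title + " " + item.abstract)
      .toLowerCase().match(/[a-z0-9]+/g) || [];
    for (const t of terms) {
      if (!index.has(t)) index.set(t, new Set());
      index.get(t).add(item.id);
    }
  }
  return index;
}

// AND-semantics keyword search: intersect the posting sets of all terms.
function search(index, query) {
  const terms = query.toLowerCase().match(/[a-z0-9]+/g) || [];
  let hits = null;
  for (const t of terms) {
    const posting = index.get(t) || new Set();
    hits = hits === null
      ? new Set(posting)
      : new Set([...hits].filter(id => posting.has(id)));
  }
  return hits ? [...hits] : [];
}
```

At the scale of a few hundred items, both indexing at load time and per-keystroke lookups are comfortably fast.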


Details, please


At a PARC Forum a few years ago, I heard Marissa Mayer mention the work they did at Google to pick just the right shade of blue for link anchors to maximize click-through rates. It was an interesting, if somewhat bizarre, finding that shed more light on Google's cognitive processes than on human ones. I suppose this stuff only really matters when you're operating at Google scale; at ordinary scale the effect, even if statistically significant, is practically meaningless. But I digress.

I am writing a paper in which I would like to cite this work. Where do I find it? I tried a few obvious searches in the ACM DL and found nothing. I searched in Google Scholar, and I believe I found a book chapter that cited a Guardian article from 2009, which mentioned this work. But that was last night, and today I cannot re-find that book chapter, either by searching or by examining my browsing history. The Guardian article is still open in a tab, so I am pretty sure I didn’t dream up the episode, but it is somewhat disconcerting that I cannot retrace my steps.


Slow down!


The prolific Jaime Teevan has decided to blog, as evidenced by the creation of "Slow Searching" a few weeks ago. In a recent post, Jaime wrote about some ways in which Twitter search differs from web search, among them monitoring behavior: running "the same query over and over again just to see what is new." Putting on my Lorite hat for a minute, this seems quite similar (albeit on a different timescale) to the "pre-web" concept of routing or standing queries. Some time later, Google introduced Alerts, which seemed to be its reinvention of the same concept. And of course, tools like TweetDeck make it much easier to keep up with particular Twitter topics.


Social media mining intern


We are looking for an intern to work with us this summer in the area of social media analysis. The project will involve understanding and mining patterns in Twitter data, in both text and images. An ideal candidate is a PhD student with strong machine learning skills. Prior experience in image understanding, text data mining, social network analysis, or statistical modeling is a plus. If you are interested in this project, please send your CV to Dhiraj (dhiraj@fxpal.com) or Francine (chen@fxpal.com).

HCIR 2012 papers published!


One of the things we did slightly differently at this year's HCIR Symposium was to introduce full-length, peer-reviewed papers held to top-tier conference quality. We received a number of submissions, each of which was read and discussed by three reviewers. We then rejected some of the papers and sent several back for a rewrite-and-resubmit cycle. In the end, we accepted four papers, which have now been published in the ACM Digital Library.


HCIR 2012 keynote


Last week we held the HCIR 2012 Symposium in Cambridge, Mass. This is the sixth in a series that we have organized. We expanded the format of this year's meeting to a day and a half, and in addition to the posters, search challenge reports, and short talks, we introduced full papers reviewed to first-tier conference standards. I will write more about these later; for details on other events at the Symposium, I refer you to the excellent blog post by one of the other co-organizers, Daniel Tunkelang.

In this post, I wanted to record my impressions of the keynote talk by Marti Hearst from UC Berkeley.

Opening slide from Marti Hearst's keynote address


Open sourcing DisplayCast


Open source plays an important role in a research laboratory like FXPAL. It allows our researchers to focus their energy on their own innovations and to build on the efforts of the community. Open source projects thrive when many people openly contribute their work for the common good. However, FXPAL also has a business imperative to protect its innovations. We believe we have struck a balance between contributing back to the open source community and protecting those innovations.

Thus we are happy to announce that we have open sourced DisplayCast under the liberal New BSD license. DisplayCast is a high-performance screen sharing system designed for intranets. It supports real-time, multi-user screen sharing across Windows 7, Mac OS X (10.6+), and iOS devices. The technical details of our screen capture and compression algorithms will be presented at the upcoming ACM Multimedia 2012 conference. The source code is hosted on GitHub in two repositories: an Objective-C based screen capture, playback, and archive component that targets Mac OS X and iOS, and a .NET/C# based screen capture and real-time playback component that targets Windows 7.

We hope others find DisplayCast useful and that they will release their own innovations back to the open source community. FXPAL will continue to open source relevant projects in the future.