Talking with Twitter


I’ve been messing with the Twitter search API, and I am here to whine about it. Overall, it’s a great feature, but it’s interesting that it imposes costs on the third-party client that the Twitter interface seemingly doesn’t share. For example, I can run a search and get back a bunch of results. When I do it from the Twitter web page, it gives me the option of drilling down and showing conversations when they come up in search results.

When I execute the same query using the API, however, there is no indication that a particular message was related to some other message in any way. Sure, I know who sent what to whom, but that’s not enough! Not only does the search API not tell me when a message is a reply, it doesn’t provide useful information to indicate a retweet, either.

Consider the following search result:

[Image: Search results, showing conversation]

How does Twitter know that this conversation took place? It must look at the metadata for each status separately, since that is where conversations are tracked; the search results themselves contain only the matching tweets. Here’s the relevant JSON for a single posting, as returned by Twitter:

"created_at":"Mon, 01 Feb 2010 04:51:44 +0000",
"text":"@emancan Or are you saying he's just enough of a landscape
feature that pop should have recognized him somehow at the Grammys?",
"source":"<a href="">web</a>"}

The “inReplyToStatusId” field is nowhere to be found. So not only must the client go back to the server to get the user info (and it must use the screen name rather than an id because of a bug reported more than a year ago), but it must also, separately, get the status to understand whether it was part of a conversation. And it must do this one user or status message at a time, further decreasing the efficiency of evaluating the query.
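
To make the cost concrete, here is a minimal sketch of what a client ends up doing. It assumes each search result carries an "id" key and that the per-status metadata exposes the reply link under "inReplyToStatusId" (the name used above); `fetch_status` is a hypothetical caller-supplied function standing in for the per-status request, not a real library call.

```python
def annotate_replies(results, fetch_status):
    """Attach reply information to search results, one lookup per result.

    `fetch_status` is a hypothetical callable: it takes a status id and
    returns that status's metadata as a dict. The "id" and
    "inReplyToStatusId" keys are assumptions made for illustration.
    """
    annotated = []
    requests_made = 0
    for result in results:
        # One round trip per search result: this is the inefficiency.
        status = fetch_status(result["id"])
        requests_made += 1
        annotated.append({**result, "reply_to": status.get("inReplyToStatusId")})
    return annotated, requests_made
```

For a page of N search results this makes N extra requests, before any user lookups are counted.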

It is interesting to note, however, that Twitter’s conversation feature does not appear to rely on “inReplyToStatusId” fields in a status object, but rather seems to bring together posts exchanged between two people within a certain amount of time. I found this out by manually fetching the metadata for each tweet in the above conversation: some of the tweets didn’t have “inReplyToStatusId” fields set. To replicate the conversation in a third-party client, you would have to issue two queries, one “from: a to: b” and the other “from: b to: a”, and then merge the results by time. This part seems reasonable, although it would be nice to be able to issue the disjunction in a single query: “(from: a to: b) OR (from: b to: a)”. Unfortunately, Twitter returns no results for such queries.
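
The two-query workaround can be sketched roughly as follows. This assumes the operators are written without spaces ("from:a to:b") and that each result has a "created_at" string in the format shown in the JSON above; actually sending the queries is left out, so the merge step takes lists of already-fetched results.

```python
from email.utils import parsedate_to_datetime


def conversation_queries(a, b):
    """Build the two one-directional queries for a conversation between a and b."""
    return [f"from:{a} to:{b}", f"from:{b} to:{a}"]


def merge_by_time(*result_lists):
    """Combine several result lists into one, ordered oldest first.

    Timestamps like "Mon, 01 Feb 2010 04:51:44 +0000" follow the RFC 2822
    date format, which parsedate_to_datetime understands.
    """
    merged = [tweet for results in result_lists for tweet in results]
    return sorted(merged, key=lambda t: parsedate_to_datetime(t["created_at"]))
```

Sorting the concatenated lists keeps the sketch short; a client that needs only a time window would still have to filter the merged list itself.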

Building up a graph of who sent what to whom would thus be quite time-consuming. While Twitter may not want to bear the cost of large bulk transfers, it ought to be possible for clients to communicate a bit more efficiently. Perhaps letting the client optionally specify which fields it is interested in would be a reasonable compromise. For example, I may not be interested in the URLs, the language, or the source of a tweet, but would like to have refers-to data instead. Furthermore, since search results often contain multiple tweets by the same user, including a collection of user records in the search results would be more efficient than having to fetch all the extra data in independent requests.
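
As a rough illustration of the graph in question, the sketch below groups messages by sender and recipient and collects the distinct users that would each need a separate lookup. The "from_user" and "to_user" keys are assumed names for the sender and recipient fields; they do not appear in the JSON excerpt above.

```python
from collections import defaultdict


def build_message_graph(results):
    """Map (sender, recipient) pairs to the texts sent in that direction.

    "from_user" and "to_user" are assumed field names for illustration.
    Tweets with no recipient are skipped.
    """
    graph = defaultdict(list)
    for tweet in results:
        recipient = tweet.get("to_user")
        if recipient:
            graph[(tweet["from_user"], recipient)].append(tweet["text"])
    return dict(graph)


def unique_users(results):
    """Collect every distinct user named as sender or recipient."""
    users = set()
    for tweet in results:
        users.add(tweet["from_user"])
        if tweet.get("to_user"):
            users.add(tweet["to_user"])
    return users
```

The size of `unique_users(results)` is the number of user requests a client needs today; with user records embedded in the search response, that number would drop to zero.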

Overall, Twitter’s API is an excellent piece of infrastructure that is partly responsible for the large diversity of clients and applications based on Twitter. In some cases, however, it falls short of what seems reasonably useful.


