Twitter recently released some of its tweet-related code as open source. This is great news for those building applications on top of Twitter, as it reduces the need to write the same code over and over. The released code includes parser and HTML markup generator classes, and a Regex class that holds a collection of Pattern instances. Code is available in Java and Ruby.
The examples seem straightforward to use, which means I will be using them!
I was a bit disappointed, however, that the code focuses only on mentions and hashtags, and that no mention was made of parsing retweets. Mentions and hashtags are reasonably easy to pull out; RT syntax is a bit more varied (see boyd et al.’s paper for some variants), and it would be nice to have some help here. A further bonus would be a way of unifying old-style and new-style retweets. Nonetheless, it’s a good first step toward making it easier to deal with Twitter data.
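To give a sense of what retweet parsing might look like, here is a rough Java sketch. It is not part of the released code: the class name RetweetSketch, the method retweetedUsers, and the pattern itself are my own assumptions, and the pattern covers only a handful of markers (RT, retweet, retweeting, via, HT) rather than the full range of conventions in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RetweetSketch {
    // Hypothetical pattern, not taken from Twitter's released code:
    // a retweet marker that does not follow a word character, an optional
    // colon, optional whitespace, then an @username.
    private static final Pattern RETWEET = Pattern.compile(
        "(?i)(?<![A-Za-z0-9_])(?:RT|retweet(?:ing)?|via|HT):?\\s*@([A-Za-z0-9_]+)");

    // Returns the usernames (without the @) credited by a retweet marker,
    // in the order they appear in the tweet.
    public static List<String> retweetedUsers(String tweet) {
        List<String> users = new ArrayList<>();
        Matcher m = RETWEET.matcher(tweet);
        while (m.find()) {
            users.add(m.group(1));
        }
        return users;
    }

    public static void main(String[] args) {
        System.out.println(retweetedUsers("RT @alice: twitter-text is out"));
        System.out.println(retweetedUsers("twitter-text is out (via @bob)"));
        System.out.println(retweetedUsers("Thanks @carol for the link"));
    }
}
```

A plain mention such as the one in the last example is not treated as a retweet, since no marker precedes it. New-style retweets carry no marker in the text at all, so a pattern like this can only ever cover the old-style half of the problem.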
I would also love to see a good encapsulation of the API for communicating with Twitter that deals gracefully with communication failures and call quotas, and of course improvements to the way search results are returned.
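As a small illustration of the "deals gracefully with communication failures" part, here is a minimal sketch of a retry helper with exponential backoff. Everything in it (the class RetrySketch, the method callWithRetry, and its parameters) is hypothetical and not tied to any real Twitter client library. It also does nothing about call quotas, which would require reading rate-limit information from the API responses.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetrySketch {
    // Runs the call, retrying on IOException up to maxAttempts times in total.
    // The wait between attempts starts at initialDelayMs and doubles each time.
    // Any other exception type is passed straight through to the caller.
    public static <T> T callWithRetry(Callable<T> call, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the last failure
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

A caller would wrap each request in a Callable and pass it in, so the retry policy lives in one place instead of being repeated around every API call.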