Blog Archive: 2010

When is one>two and seven==eight?


So Google recently released the Google Books Ngram Viewer along with the underlying datasets.

There’s been plenty of press about it, and the Science article based on this data is an interesting read.

I was trying to come up with a simple, yet insightful query. My initial trial was modernism,postmodernism which immediately had me wondering about hyphenation or the lack thereof…  In any case, the upshot seems to be that the use of the term postmodernism started 1978ish. Neat, though I think I won’t need to clear space for my Nobel Prize anytime soon.

I toyed a little bit with other terms like generation X, which has an odd sort of bump in the graph around 1970. Not sure what’s up with that, though perhaps there’s some data collection artifacting as discussed in this article. I wasn’t inclined to go off the deep end on this and was happy enough to have my prior knowledge confirmed by noting that the use of “generation X” took off in the mid-1990s.

My final trial was a bit more on the minimal side: one,two,three,four,five,six,seven,eight,nine,ten. There shouldn’t be any surprise here that “one” is more common than “two”, which is more common than “three”, which is more common than “four”. It probably shouldn’t be a surprise either that each succeeding number is less frequent by roughly a factor of 2.

[Figure: Occurrence of numbers in the Google Books N-gram viewer]

Less intuitive (to me anyway) is that “ten” squeezes in front of “seven” and “eight” (OK, so maybe it’s a round number) and that “seven” and “eight” are basically tied. Even more odd is that before 1790 or so, the putative occurrences of “six” and “seven” were virtually non-existent.

Detail on number occurrences

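It turns out to be the same issue with the “medial S” that Danny Sullivan describes in greater detail in his post: older books print a long “ſ” inside words, and OCR tends to read it as an “f”, so “six” and “seven” presumably get counted as “fix” and “feven”. In other words, it’s an artifact of OCR and an indication of the evolution of typography rather than the evolution of language.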
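To make the medial S effect concrete, here is a minimal sketch of how such a misreading would split the counts. It assumes a deliberately simplified rule (every “s” except a word-final one was printed as a long s, and the OCR reads every long s as “f”); the function name is made up for illustration and has nothing to do with how Google actually processes its scans.

```python
def ocr_long_s_misread(word: str) -> str:
    """Simulate OCR reading a long s as 'f'.

    Simplifying assumption: every 's' except a word-final one was printed
    as a long s, and the OCR engine reads every long s as 'f'.
    """
    if not word:
        return word
    # Leave the final letter alone; a word-final 's' was printed as a round s.
    body, last = word[:-1], word[-1]
    return body.replace("s", "f") + last


for number in ["one", "two", "six", "seven", "eight", "ten"]:
    print(number, "->", ocr_long_s_misread(number))
```

Under that rule only “six” and “seven” change (to “fix” and “feven”), which matches the two numbers that seem to vanish before 1790 or so, while “eight” and “ten” pass through untouched.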

One mystery solved; now why are “seven” and “eight” tied in frequency?

Kudos to Google for releasing the viewer and data.

What’s in your database?


If you work for a small or medium business, someone in your office needs to buy things. Paperclips, computers, mailing envelopes, office furniture, etc. If you work for a small or medium research lab, someone in your office needs to buy these same things, but someone also needs to buy more unusual stuff. Twenty pounds of modeling clay. A Sony Aibo. Make that two. Lots of different types of video encoding software and hardware. Stuff like that.

At our research lab, I am often the person who does the actual purchasing of the strange items. If I’m buying a computer from HP, I expect the process to be pretty straightforward. If I’m buying industrial laser elements from Bob’s House-o’-Lasers, I expect complications. Reality is often the other way around. Because I’ve been doing this since the mid-1990s, I’ve seen how technology has often made it easier, and sometimes much harder, to buy things, use things, and deal with problems. I’m going to describe a few examples in this and later posts. Just a warning that my bias is somewhat anti-technology – I joke that I’m a neo-luddite.

Continue Reading

Spam comments


I was going to wait until we hit 100 legitimate comments to make the math easier, but we are close, and many of our usual posters are at CHI, so I’ll report now.

Along with the 89 real comments, we’ve gotten 2537 spam comments. I’ll be generous and call that 3.4% real comments out of everything received.
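For anyone who wants to check the math, here is a quick sketch of the calculation, assuming the percentage is real comments divided by all comments received (real plus spam). The function name is just for illustration.

```python
def real_comment_share(real: int, spam: int) -> float:
    """Return real comments as a percentage of all comments received."""
    total = real + spam
    return 100.0 * real / total


share = real_comment_share(89, 2537)
print(f"{share:.1f}% real comments")  # prints "3.4% real comments"
```

The unrounded figure is a shade under 3.4%, which is where the generosity comes in.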

Continue Reading

Creating an iAbbreviation: SFMI


Fairly recently I became one of those iPhone types. You know the ones – gaze ever downwards, fingers poised to pinch or pick or tap-tap. I love the thing, though I’m not sure I love what I’ve become with it.

Continue Reading

Advice for researchers


The Princeton Companion to Mathematics, which came out just a few months ago, contains a wonderful short section entitled “Advice to a Young Mathematician” with advice from five eminent mathematicians. I was in need of inspiration this weekend, and found some in these personal statements. Below the fold you will find a few excerpts applicable to any researcher of any age.

Readers: Please help me and other readers of this blog by posting pointers to your favorite sources of research advice in the comments section.

Continue Reading

Kumo – searching for a name?


A while back, I saw some reports that Microsoft was using Kumo as the name for an experimental search system.  Recently there have been more reports that this is the case, or that perhaps the name will be used for some other product.  It has been my experience that the deployment of Kumos, whether they be clouds or spiders, needs to be carefully planned.

Continue Reading