Sunday, 9 August 2009

Too Many Mammals!

Disclaimer: No mammals were hurt in the writing of this blog post!

For quite a while I've been looking for the opportunity to have a little fun with the papers I write at work. They usually have very boring titles and the content is often not of interest to anyone outside the small research community that attends the conference. So the chance to brighten up a paper was not to be missed.

I did some work late last year with three colleagues on improving the diversity of the output of automatic term recognition systems -- sounds boring, right? Basically, we were applying a re-ranking approach to a list of terms so that the top x terms were more diverse than before (clear as mud, I know). Anyway, we were using a corpus of Wikipedia documents about wildlife, so many of the terms were the names of animals, birds, fish, etc. In the first version of the software the list seemed to be heavily weighted towards whales, and so I was going to be really controversial and suggest that there were Too Many Whales. Unfortunately, after some fine-tuning of the baseline, things settled down a bit, and so the paper title became Too Many Mammals -- to be completely truthful, it became Too Many Mammals: Improving the Diversity of Automatically Recognized Terms, which will be presented at RANLP in September.

I'm still aiming to improve on this, as I know that it is nowhere near as good as the classic paper on the Big Bang Theory by Alpher, Bethe and Gamow or the fantastic question-answering paper A Sys Called Qanda.
