Smart reading assistant for RSS feeds

Sunday, March 12, 2006

Now that my RSS feeds are routed to my email, I thought I'd run an experiment and set up a smart reading assistant.

The idea is to use POPFile, a Bayesian mail filter, to classify my feed messages into three categories: high interest, medium interest, and boring. High interest messages are ones I'd probably like to read as they come in; medium interest messages are those I'd like to read when I have time; boring messages are ones I won't miss reading at all.
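To make the idea concrete, here is a rough sketch of how a Bayesian filter can sort messages into three buckets. This is not POPFile's actual code: the class name, the bucket names, the tokenizer, the smoothing, and the sample messages are all made up for illustration, and a real filter does far more (headers, HTML, corrections over time).

```python
import math
import re
from collections import Counter


class BucketClassifier:
    """Toy multinomial naive Bayes classifier (hypothetical, not POPFile)."""

    def __init__(self, buckets):
        self.buckets = list(buckets)
        self.word_counts = {b: Counter() for b in self.buckets}
        self.message_counts = Counter()
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Lowercase the text and split it into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, bucket):
        # Record the words of one message under the bucket the user chose.
        words = self.tokenize(text)
        self.word_counts[bucket].update(words)
        self.message_counts[bucket] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = self.tokenize(text)
        total_messages = sum(self.message_counts.values())
        # One extra slot stands in for words never seen during training.
        vocab_size = len(self.vocabulary) + 1
        best_bucket, best_score = None, float("-inf")
        for bucket in self.buckets:
            # Log prior, smoothed so an untrained bucket never gives log(0).
            score = math.log(
                (self.message_counts[bucket] + 1)
                / (total_messages + len(self.buckets))
            )
            bucket_total = sum(self.word_counts[bucket].values())
            for word in words:
                # Add-one smoothed log likelihood of each word in this bucket.
                count = self.word_counts[bucket][word]
                score += math.log((count + 1) / (bucket_total + vocab_size))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket


# Invented training data, two messages per bucket.
assistant = BucketClassifier(["high", "medium", "boring"])
assistant.train("python release new compiler optimization", "high")
assistant.train("compiler internals deep dive", "high")
assistant.train("conference schedule announced", "medium")
assistant.train("weekly link roundup", "medium")
assistant.train("celebrity gossip update", "boring")
assistant.train("sports scores weekend", "boring")

top_pick = assistant.classify("new compiler release")
skip_pick = assistant.classify("celebrity sports gossip")
```

With this toy data, a message about a new compiler release lands in the high bucket and celebrity gossip lands in boring. The middle bucket is the hard one, since its vocabulary overlaps with both of its neighbors.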

POPFile learns very fast when dealing with just two buckets, inbox and spam. You can start getting decent results within just a couple of days. Of course, the distinction between good messages and spam is usually pretty clear.

Any time you add new buckets to POPFile, accuracy is going to go down at first as it adjusts to the new scheme. It also seems that my distinction between highly interesting and mildly interesting messages is going to take longer to train. After about a week of training, POPFile has pretty much got the "boring" category working.

I'm not sure yet if this will work for the long term, but it seems like a worthwhile experiment to see if my reading list can be automatically prioritized.

Posted by murt at 11:44 AM