


Smart reading assistant for RSS feeds

Sunday, March 12, 2006

Since I am now having RSS feeds routed to my email, I thought I'd run an experiment to set up a smart reading assistant.

The idea is to use POPFile, a Bayesian mail filter, to classify my feed messages into three categories: high interest, medium interest, and boring. High interest messages are ones I'd probably like to read as they come in; medium interest messages are those I'd like to read when I have time; boring messages are ones I won't miss reading at all.
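To make the idea concrete, here is a minimal sketch of how a Bayesian filter can sort messages into several buckets. This is not POPFile's actual code: the class name BucketClassifier, the whitespace tokenizer, and the add-one smoothing are all assumptions chosen for illustration, and the bucket names simply mirror the three categories above.

```python
import math
from collections import Counter


class BucketClassifier:
    """Toy multinomial naive Bayes filter (illustrative only, not POPFile)."""

    def __init__(self, buckets):
        self.buckets = list(buckets)
        # Per-bucket word frequencies learned from training messages.
        self.word_counts = {b: Counter() for b in self.buckets}
        # Number of training messages seen per bucket (used for the prior).
        self.message_counts = Counter()
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase whitespace splitting is enough for a sketch.
        return text.lower().split()

    def train(self, text, bucket):
        words = self.tokenize(text)
        self.word_counts[bucket].update(words)
        self.message_counts[bucket] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        total_messages = sum(self.message_counts.values())
        vocab_size = len(self.vocabulary)
        best_bucket, best_score = None, -math.inf
        for bucket in self.buckets:
            if self.message_counts[bucket] == 0:
                continue  # skip buckets that have no training data yet
            # Log prior: how often this bucket has been seen so far.
            score = math.log(self.message_counts[bucket] / total_messages)
            bucket_total = sum(self.word_counts[bucket].values())
            for word in self.tokenize(text):
                count = self.word_counts[bucket][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((count + 1) / (bucket_total + vocab_size))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket


clf = BucketClassifier(["high", "medium", "boring"])
clf.train("python release notes new features", "high")
clf.train("weekly link roundup", "medium")
clf.train("celebrity gossip sale coupon", "boring")
print(clf.classify("new python features"))
```

The more the vocabulary of two buckets overlaps, the closer their scores are, so more training messages are needed before the filter can tell them apart reliably.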

POPFile learns very fast when dealing with just two buckets, inbox and spam. You can start getting decent results within just a couple of days. Of course, the distinction between good messages and spam is usually pretty clear.

Any time you add new buckets to POPFile, accuracy goes down at first as it adjusts to the new scheme. It also seems that my distinction between highly interesting and mildly interesting messages is going to take longer to train. After about a week of training, POPFile has pretty much got the "boring" category working.

I'm not sure yet if this will work for the long term, but it seems like a worthwhile experiment to see if my reading list can be automatically prioritized.

Posted by murt at 11:44 AM


Anonymous said...
so, what were your results?

I'm considering doing this myself and am wondering if there's any use in trying it.
9/07/2006 6:41 PM  

murt said...
The results were somewhat mixed. My initial attempt at having three categories (high interest, medium interest, and boring) did not go so well. It may have been different if I had kept training longer, but the differences between high and medium interest were too subtle to get much accuracy.

When I changed it to simply interesting and boring, this was more effective, in that I got closer to the accuracy I was expecting.

However, my final decision was that it just wasn't worth it for me at this time. Instead I focused on just cutting out (unsubscribing) any feeds that didn't have a high level of content that interested me.
9/07/2006 7:25 PM  
