rblefftest: simple RBL false-positive test

drh.net

David Harris, dharris@

Posted September 27, 1999.

Abstract
This document describes a simple test done to determine whether the RBL spam-prevention utilities returned too many false positives to be deployed into production on my server. I present the data. You may make your own choices.




Purpose of test

The purpose of the test was to determine how many false positives the various RBL spam-prevention filters produce. I captured copies of the e-mail that the RBL filtering services would have filtered and examined each message to see whether it was really spam. No attempt was made to measure how many spam messages slipped through the filters, which would be a useful datum in determining how effective the filters are at catching spam.
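For readers unfamiliar with the mechanism, an RBL-style check is an ordinary DNS lookup: the octets of the connecting host's IPv4 address are reversed, the list's zone is appended, and any answer means the address is listed. The following is a minimal Python sketch of that lookup. It is not the software used in this test, and the function names rbl_query_name and is_listed are hypothetical names chosen for illustration.

```python
import ipaddress
import socket


def rbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    octets = str(addr).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers for this address (requires network access).

    Any resolution failure is treated as "not listed", which is an assumption
    of this sketch: a DNS outage then lets mail through rather than blocking it.
    """
    try:
        socket.gethostbyname(rbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For example, rbl_query_name("192.0.2.1", "rbl.maps.vix.com") builds the name 1.2.0.192.rbl.maps.vix.com. To reproduce a capture-only test like this one, a result from is_listed would be logged alongside the message instead of being used to reject it. The zones named in this document date from 1999 and may no longer answer queries.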

Data

Here is the data I collected over seven days and 5,000 e-mail messages accepted through SMTP by my machine:

rbl.maps.vix.com

  • filtered 3 borderline spam messages. By borderline spam, I mean that they claimed to be corporate opt-in announcements, but the user may not have opted in. One was from RealAudio and the other two were from PowerQuest Corporation.
relays.radparker.com
  • filtered 19 spam messages
  • filtered 26 legitimate non-spam messages, which break down as follows:
    • 8 typical legitimate non-spam messages
    • 11 messages from a list-serv using an open relay (the user did subscribe)
    • 7 messages from an about.com opt-in list-serv (this is not borderline spam because I know that the user opted in)
dul.maps.vix.com
  • filtered 10 spam messages
  • filtered 12 legitimate non-spam messages
I admit that this test does not have a lot of datapoints, but I was able to look at the data and decide whether I wanted to deploy an RBL system.
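The counts above can be turned into rates with a little arithmetic. The sketch below is an illustration, not part of the original test software. It assumes that 5,000 is the exact number of accepted messages and treats each list separately, since the data does not say whether a message could be caught by more than one list. The name summarize is hypothetical.

```python
# Counts per list, copied from the Data section above.
counts = {
    "rbl.maps.vix.com": {"spam": 0, "borderline": 3, "legit": 0},
    "relays.radparker.com": {"spam": 19, "borderline": 0, "legit": 26},
    "dul.maps.vix.com": {"spam": 10, "borderline": 0, "legit": 12},
}

TOTAL_ACCEPTED = 5000  # assumption: the stated 5,000 is taken as exact


def summarize(c):
    """Return (messages filtered, legit share of filtered, legit share of all mail)."""
    filtered = c["spam"] + c["borderline"] + c["legit"]
    return filtered, c["legit"] / filtered, c["legit"] / TOTAL_ACCEPTED


for zone, c in counts.items():
    filtered, of_filtered, of_all = summarize(c)
    print(f"{zone}: {filtered} filtered, "
          f"{of_filtered:.0%} of those legitimate, "
          f"{of_all:.2%} of all accepted mail")
```

Under those assumptions, 26 of the 45 messages caught by relays.radparker.com (about 58%) and 12 of the 22 caught by dul.maps.vix.com (about 55%) were legitimate, which is 0.52% and 0.24% of all accepted mail respectively. rbl.maps.vix.com caught no legitimate mail in this sample.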

Other people's data

  • Qmail list discussion
    Some people on the qmail list volunteered their data about how many messages RBL was rejecting. This data does not show anything about the number of false positives received, but I've included it here because it is interesting.
If you have some data that you think would be useful here, please send it to me.

Conduct your own test

If you'd like to conduct your own test, here is the software that I created to run this test.

If you run your own test, I'll include your results here if you like.