Testing "nannyware" tools for filtering URLs:
the expensive (Websense) and the obscure (SmartFilter).
To test for false positives, I have a Perl script that behaves like an HTTP client: it reads a file of URLs, attempts each one with realistic browser-like headers, then examines the result to see whether the request succeeded or was blocked.
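The author's script is Perl; as a rough illustration of the same idea, here is a Python sketch. The block-page markers and headers are assumptions for illustration -- real filters return their own telltale pages, so you would tune these to what your filter actually emits.

```python
import urllib.request
import urllib.error

# Strings that suggest a filter's block page (illustrative guesses, not
# the actual markers any particular product uses).
BLOCK_MARKERS = (b"access denied", b"blocked by", b"websense", b"smartfilter")

# Realistic browser-like headers so the filter sees a normal client
# (values are illustrative).
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "text/html,application/xhtml+xml,*/*;q=0.8",
}

def classify(status, body):
    """Classify one fetch as 'ok' or 'blocked'.

    A filtered request usually shows up as an HTTP error code or as a
    bogus block page, so check for both."""
    if status >= 400:
        return "blocked"
    if any(marker in body.lower() for marker in BLOCK_MARKERS):
        return "blocked"
    return "ok"

def check_urls(path):
    """Read a file of URLs (one per line) and attempt each one."""
    results = {}
    with open(path) as fh:
        for line in fh:
            url = line.strip()
            if not url:
                continue
            try:
                req = urllib.request.Request(url, headers=HEADERS)
                with urllib.request.urlopen(req, timeout=10) as resp:
                    results[url] = classify(resp.status, resp.read())
            except urllib.error.HTTPError as err:
                results[url] = classify(err.code, err.read() or b"")
            except OSError:
                # A timeout or connection reset also reads as "filtered".
                results[url] = "blocked"
    return results
```

Any URL the filter should *not* block that comes back "blocked" is a false positive.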
If you have a budget, there are a number of good tools for running
load tests against web sites. The best are actually meant for QA'ing your web server and site backend, but they work just as well for testing web filtering -- a web page being blocked by a filter looks a lot like a web server failure: timing out, returning an HTTP error result code, or returning a bogus "blocked" web page.
URL filtering is a lot easier to implement and test when you force all
desktop clients to make their HTTP/HTTPS requests via an explicitly
configured proxy. Clients can't go out directly to Internet IP addresses on TCP/80 or TCP/443 or any other port -- they MUST make the request via the proxy, and the proxy knows how to check with the URL filtering software.
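A sketch of how that "force the proxy" rule might look on a Linux gateway with iptables, assuming the proxy lives at 10.0.0.5 (a hypothetical address -- substitute your own):

```shell
# Let the proxy box itself reach the Internet on the web ports.
iptables -A FORWARD -s 10.0.0.5 -p tcp -m multiport --dports 80,443 -j ACCEPT

# Everyone else gets no direct web access; they must go via the proxy.
iptables -A FORWARD -p tcp -m multiport --dports 80,443 -j REJECT
```

REJECT (rather than DROP) makes a misconfigured client fail fast instead of hanging until timeout, which also makes your test results easier to read.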
- SmartFilter hooks directly into the proxy, so it can only be as fast as the proxy itself. Since it is only doing URL lookups, it is quite fast.
- DansGuardian wants to inspect the page content itself, and can become very slow under load. It is possible to throw hardware at a software problem: run it on a fast enough machine and you won't notice the lag quite so much.
- Websense is normally deployed as a sniffer, where it just inspects traffic passing by. This is useful if you need a "fail open" environment where a crashed filter doesn't just kill all web access. More on Websense later this week.