April 2007

You may turn over your papers and start now.

You are a product tester and frequently bring your work home. Yesterday, while dressed in a flame resistant suit (up to 3,000 degrees) and carrying the latest model fire extinguisher, you discover your neighbor’s house is on fire. As the flames quickly spread, you stand and watch your neighbor’s new baby burn to death. Which of the following best describes your behavior?

  1. All-powerful
  2. All-knowing
  3. All-loving
  4. Mysterious

One of the better questions from Religion 101: Final Exam.

For a while now, I’ve been getting comments on my LiveJournal which apparently aren’t spam, but rather are questions which are totally out of context. For instance, I got one the other day which said “Hi. I find forum about work and travel. Where can I to see it?”

I recently got some more comment spam advertising something called XRumer, a clever and nasty program for spamming bulletin boards and other forums (like LJ), which is brought to us by some evil Russians (“No Meester Bond, I expect you to die”). One of the things the authors claim it can do is a crude form of astroturfing. They say you can configure it to post a comment asking about something, followed by a response, apparently from another user, mentioning the site you actually want to advertise. It looks like this feature doesn’t quite work, and that the questions I’ve been seeing are examples of it misfiring. Mystery solved.

The spammers seem to favour certain entries of mine, so I’m screening anonymous comments on those entries (and on this one too, since I imagine it might attract undesirables). I don’t want to do that for my entire journal, as I get comments from people who aren’t on LJ but who say worthwhile things. In an ideal world, the way round this would be OpenID, but that’s not in widespread use yet, possibly because people who have an OpenID often don’t know they do. [Attention LJ users: you have an OpenID. Congrats. You’ve got a Jabber instant messaging account, too. See how good bradfitz is to you?]

A system which allows easy communication between two people who have no previous connection to each other is susceptible to spam. The trick is to keep this desirable feature while not being buried in junk (you could go the other way and remove this feature, of course, as many IM users have, or make a virtue of it with social networking sites, but that’s not really an option for public blogs). Anything an ordinary user might do to create an identity, a spammer can do too, so cryptographic certificates aren’t a magical solution. Legislation doesn’t help, because the police don’t care and anyhow, spammers are in Wild West states like China or Russia, or at least run front operations there.

Most spam is still sent via email. Email spammers have been subject to an evolutionary arms race. The remaining effective spammers are bright and totally amoral. They’ll hijack millions of other people’s computers to send their spam or even to host the website they’re advertising, making it hard for blacklists to keep up (and they’ll use these computers to flood centralised blacklist sites with traffic in an attempt to knock them off the net). They’ll vary the text they use, to defeat schemes which detect the same posting lots of times. They’ll use images rather than text, or simply links to those images, to defeat textual analysis. You can bet that blog spammers will learn from this (some of them are probably email spammers too).

What’s working for email spam, and will similar ideas work for blog spam?

  • Banning mail sent directly from consumer ISP connections is the single most effective thing I do (you can do this with the Spamhaus PBL and with a few checks for generic rDNS to catch what the PBL misses). You can’t do that with blog comments, as spam or not, they almost all come from consumer ISP connections.

  • Banning mail sent from IPs which are known sources of spam is also effective. You can do that with blog comments, but you either need to be big enough to generate your own list (as LJ might be) or have the resources to run a centralised list like Spamhaus (which will be attacked by spammers). There are currently no IP blacklists devoted to blog spamming, as far as I know, although some spam comments I’ve seen came from IPs which were in the Spamhaus XBL.

  • Filtering on ways in which spamming programs differ from legitimate SMTP clients (greylisting, greet pause) is currently effective, but only as long as these methods don’t become so widespread that it’s worth the spammers’ while to look more like a legitimate sender. Still, this doesn’t seem that likely. Incompetent admins aren’t in short supply, and I don’t have to outrun the bear, only outrun them. This sounds promising against blog spammers. Apparently, simple-minded schemes are pretty effective.
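
For anyone wondering what the first two checks look like in practice, here is a minimal Python sketch, not a description of my actual mail setup. A DNS blacklist is queried by reversing the octets of the IP address, appending the list's zone (pbl.spamhaus.org and xbl.spamhaus.org for the lists mentioned above) and doing an ordinary DNS lookup; by the usual convention an answer in 127.0.0.0/8 means "listed", though each list documents its own return codes. The function names and the GENERIC_RDNS pattern are my own inventions for illustration, and the pattern is only a crude guess at what "generic rDNS" looks like.

```python
import ipaddress
import re
import socket

# Assumption: a hypothetical pattern for "generic" consumer rDNS names, i.e.
# hostnames which embed the IP's octets or words like "dsl" or "dhcp".
GENERIC_RDNS = re.compile(
    r"(\d{1,3}[-.]){3}\d{1,3}|dsl|dial|dyn|dhcp|pool|cable|ppp", re.IGNORECASE
)


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the name to look up: the reversed octets, then the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def dnsbl_lookup(ip: str, zone: str):
    """Return the list's answer address, or None if the lookup fails.

    Any resolver failure (including a timeout) is treated as "not listed",
    which is a simplification.
    """
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return None


def looks_generic(hostname: str) -> bool:
    """Crude heuristic: does this reverse DNS name look like a consumer line?"""
    return GENERIC_RDNS.search(hostname) is not None
```

A blog could call dnsbl_lookup(commenter_ip, "xbl.spamhaus.org") before accepting an anonymous comment. The rDNS heuristic is no use there, for the reason given in the first bullet: nearly all commenters are on consumer connections.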


What else can we do with a website that we can’t do on email?

  • CAPTCHAs are popular, but a bit of a bugger if you’re blind. The evil Russians claim to have defeated most of the deployed ones which use obscured letters, though that still leaves the “click on the picture of a cat” variant.

  • Proof-of-work or hashcash schemes are currently very effective, suggesting that blog spammers don’t yet have the huge amounts of stolen computing resources available to email spammers, or that they don’t have the knowledge to implement the hashcash algorithm in their spamming software. By using proof-of-work, we can at least drive the weak blog spammers to the wall.

    You can think of proof-of-work as a variant on the tactic of differentiating spam programs from real humans. Spammers can defeat simple-minded checks on how long a user has been reading a page before commenting without slowing their spamming rate by much (how to do this is left as an exercise to the prospective spammer), but if a web browser has to do a computation which takes a fixed time and send the result along with the comment, the spammers have to slow down or do the work in parallel on many computers. If you can work out a way of doing the calculation in the background as the user looks at your page and writes their comment, so much the better. If you can dynamically generate the code you send to the browser to make it prove it’s done some work, you stop the spammers writing something equivalent in a real programming language and force them to run it in Java or Javascript. That’d really show them who’s boss.

    This hurts people who’ve turned off Javascript or Java, but it’s time for those dinosaurs to join the web 2.0 world, right?
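
To make the proof-of-work idea concrete, here is a rough hashcash-style sketch in Python. It is an illustration under several assumptions rather than any deployed scheme: the server hands out a random challenge with the comment form, the client hunts for a nonce such that the hash of "challenge:nonce" starts with a given number of zero bits, and the server checks the answer with a single hash. I've used SHA-256 (the original hashcash used SHA-1), and the function names and the "challenge:nonce" format are made up for the example. In real life solve() would be Javascript running in the browser; it's in Python here only to show what the client has to compute. Note that the work is predictable only on average (about 2**difficulty hashes), not a strictly fixed time.

```python
import hashlib
import secrets


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def new_challenge() -> str:
    """Server side: a random challenge to embed in the comment form."""
    return secrets.token_hex(16)


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force search, about 2**difficulty hashes on average."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce
```

The server would also need to remember which challenges it issued and accept each one only once, otherwise a spammer solves one puzzle and reuses the answer for every comment.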

I guess most people on LiveJournal saw their proposal to turn LJ into MySpace (lj_dirtycache is particularly good fun for anyone who’s ever looked at bands’ sites on MySpace). What’s funny about LJ’s effort is that LJ clearly understand what is going to provoke their users to apoplectic rage until they realise they’ve been had. By comparison, Facebook was a bit lame, merely offering to send someone round to physically poke the people you “poked” on Facebook. They should have announced some variant on the Facebook feed to get all the “OMG UR HELPING STALKERS” people up in arms again.

Google announced TISP, their IP-round-the-U-bend service, as well as Gmail Paper, for those who prefer their email on paper. Slashdot had a collection of unconvincing stories. Poor show.

Disappointingly, the IETF don’t seem to have done anything very exciting lately, at least nothing to match the seminal Standard for the transmission of IP datagrams on Avian Carriers.

Finally, robhu announced he’d reconverted to Christianity. It initially seemed he’d converted to a fluffy sort of Christianity in which God is a metaphor for the good which, in a very real sense, is in us all. However, in the discussion thread which followed, it soon became clear he’d reverted to his old evangelical habits, informing me that I was blinded by the devil and was “just as much of a fundamentalist as Richard Hawkings”. His later post contains the de-brief, in which it is revealed that I was in on it from shortly after he’d posted the entry. robhu used some excellent observational humour to convincingly impersonate evangelical responses to my ultra-atheist straight man.

In summary, burr86 and robhu jointly win the Internet. Tonight, we dine in Hell.