August 2007

Channel 4 screened Enemies of Reason, another Dawkins mini-series, on Monday night. Slaves to Superstition was the first of a two-parter; the second, The Irrational Health Service, is on Monday night at 8pm. If you missed the first one, or are foreign, you can see it on Google Video, or get it from BitTorrent.

Having dealt with religion in Root of All Evil?, Dawkins has turned his attention to astrology, spiritualism, dowsing and suchlike; the sort of stuff which regular readers of Ben Goldacre’s Bad Science column know as “woo-woo”.

As Charlie Brooker’s excellent review says, this time round Dawkins seems to have toned down the outspokenness which gets him a bad reputation in some quarters. Sometimes I found myself wishing he’d been a little more direct, but his tactic of sitting quietly while someone tried to give him a psychic reading (or whatever) and politely pointing out where they were getting it wrong made his opponents look silly without making him look mean-spirited, so perhaps it was for the best. As a commenter on James Randi’s forum said, “There should be at least one program a week where Dawkins stares at people while they try to explain their woo.”

The first part of the programme mostly consists of that sort of thing, of one charlatan (or sincerely deluded practitioner) after another facing Dawkins’s quiet questioning. I thought of it as shooting fish in a barrel, but maybe there really are people out there who don’t know that there’s bugger all evidence for woo-woo. scribb1e pointed out that most of those people probably weren’t watching, but he did have a prime-time slot. We can but hope, I suppose.

Dawkins talks to Derren Brown about mediums and cold reading. You can see Brown’s classic illustration of how psychics work on YouTube, although as ever with Brown, be aware that sometimes his “explanations” of how he did a trick are themselves misdirection. Nevertheless, Brown claims no special powers and yet is able to do this sort of thing. Brown rightly points out that there’s something particularly sleazy about the medium industry, as it feeds off the grief of the bereaved.

Dawkins is genuinely concerned that woo-woo is supplanting science, and intersperses his examination of the woo with paeans to science and to the wonders of the natural world. He talks about the decline in people studying science at A-level and university, and of the closure of science departments at some universities (does anyone know how common this is? It’s a worrying trend, if it’s true). Perhaps responding to critics who call him a fundamentalist, he says “I’m often asked how I know that there isn’t a spirit world or psychic clairvoyance. Well, I don’t. It seems improbable, but unlike the fixed worldviews of mystical faith, science is always open to new possibilities.” He follows this up with the story of the discovery of echo-location in bats, a relatively recent example of evidence causing scientific theories to change.

To illustrate the sort of evidence he’s after, Dawkins shows a double-blind trial of dowsers, who are asked to identify which of some sealed boxes contain bottles of water and which contain bottles of sand. After they all fail to do better than random guesses would, their denial in the face of evidence leads into the final part of the programme, where Dawkins questions why these people continue to believe in their abilities.

He settles on the same sort of explanations which some evolutionists have advanced for religions, namely that we are good at spotting patterns and sometimes do so when the patterns aren’t real. Skinner’s superstitious pigeons are an example of the sort of thing he means. We have cognitive and perceptual glitches (see the “Sleight of Mind” section in the endnotes to Peter Watts’s Blindsight, for example). These make us vulnerable to conspiracy theories of the sort which, Dawkins points out, find their natural home on the Internet, in the many pages which insist that Armstrong never went to the Moon, or that “Jews did WTC”. In the face of this, how can we know anything at all? Dawkins seems to get close to tripping over something like C.S. Lewis’s arguments on the rationality of naturalism (as a character in Blindsight says, our brains may delude us if that has more survival value than showing us the truth).

In the end though, Dawkins is a pragmatist. He points to the successes of the scientific method as evidence that it works, and to the MMR scandal as an example of what happens when the careful gathering of evidence is ignored in favour of personal feelings. Our glitches may cause us to make mistakes, but we have to do the best we can. Dawkins speaks of the gradual build-up of evidence for echo-location in bats, contrasting it with the fleeting evidence for the paranormal. The careful steps of science may be frustratingly slow, but make us less likely to fall into the cracks in our minds.

Oh my. The feminist bloggers have taken on the Internet Hate Machine known as Anonymous. Encyclopedia Dramatica (very NSFW and extremely offensive, don’t blame me if you get fired) has the scoop on the post which might have been from Biting Beaver that started it all, as well as the on-going aftermath.

Some of the commenters on the feminist blogs get it, and actually tell them what’s going on and how to weather the raids (ilyka, or Holly in this thread). Luckily for Anonymous, the rest of the commenters either ignore them or jump on them and accuse them of misogyny, while beginning the countdown which will end in them reaching Defcon 1 and launching the e-lawyers against the Patriarchy. Hint: the only winning move is not to play.

It’s like the Internet perfect storm. Who brought popcorn?

This is an article about the sort of thing I spend my days doing. Usually, I can’t talk about that, for reasons of commercial confidentiality. However, this particular case is completely unrelated to anything my employer sells, so I should be OK. I’ve tried to explain things sufficiently well that someone non-technical can get it. Hopefully it’s not too dull or incomprehensible.

First, we need some technical background…

The Naming of the Parts

A graph is a bunch of things connected by lines. The CDC Snog Graph is an example of what I mean. The things (representing people, in this case) are known as vertices or nodes, and the lines (representing joyous sharing experiences of some sort, in this case) are known as edges.

The lines on the Snog Graph don’t have a direction, so we call it an undirected graph. If you add a direction to each edge, you get a directed graph, which we can represent by putting arrows on the lines. Friendship on Facebook can be represented as an undirected graph (where the nodes are people and the edges mean “is friends with”), because all friendships are mutual, but LiveJournal friendships need a directed graph to represent them, since I can be your friend without you being mine, and vice versa.

A graph is cyclic if there’s a way to walk along edges starting at one node (following the arrows if the edges are directed) and get back to that node again without walking the same edge more than once. The Snog Graph is cyclic, as is the graph of friendships on Facebook or LJ (trivially so on LJ, where it’s possible to friend yourself. Regular dancers might consider how this applies to their graph, vis-a-vis people who consider avoiding other dancers an optional extra). Graphs where you cannot do this are called acyclic.
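For the techies, here is a minimal sketch in Python of a directed graph and a test for whether it is cyclic. The function name has_cycle and the people in the example are made up for illustration, and the sketch only handles the directed case (an undirected graph would also need the "don't walk the same edge twice" rule).

```python
def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs is cyclic."""
    # Build an adjacency list: node -> list of nodes its arrows point to.
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    # WHITE = not visited, GREY = on the current walk, BLACK = finished.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                # We have walked back to a node already on the current path.
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)


# LJ-style directed friendships (hypothetical users).
print(has_cycle([("alice", "bob"), ("bob", "carol")]))  # no way back to the start
print(has_cycle([("alice", "bob"), ("bob", "alice")]))  # mutual friends
print(has_cycle([("me", "me")]))                        # friending yourself
```

The walk is depth-first: a node is grey while we are still exploring what lies beyond it, so meeting a grey node means we have gone round in a circle.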

Usenet news: you tell kids today that, and they don’t believe you

Usenet is an electronic discussion system which pre-dates all this newfangled World Wide Web nonsense. It distributes messages which are known as articles. It has some desirable features that web-based discussion forums often lack, like comment threading and remembering what you’ve already read. These days, people think Usenet is owned by Google, but in fact it’s a distributed system, with no central server (another advantage over LJ). When you want to talk to Usenet without using Google, you run a client program on your own machine, which talks to your local server. Your local server forwards your article to servers it knows about, which then forward it to servers they know about, and so on (the path of an article through the servers forms a directed acyclic graph, in fact). When you want to read other people’s articles, your client program fetches them from your local server.

Each article is posted to one or more groups, which are like communities on LJ. Note that, unlike LJ, the same article can be posted to more than one group without having to cut and paste it: the same article exists in each group it is cross-posted to.

On each server, articles have a number within each group. The first article to arrive in a group has number 1, the second article number 2, and so on. Cross-posted articles have more than one number, one for each group the article appears in.

Article numbers differ between servers, because the order of arrival depends on the path the article has taken to reach the server, but since your client program only talks to your local Usenet server, it usually refers to articles by their number (there’s also a unique string of letters and numbers which identifies the article, which is how servers know which ones they’ve seen already, but that’s not important right now). Remembering which articles you have read is then just a matter of storing some ranges of numbers for each group (so your client might remember that you have read articles 1-100,243-299 and 342-400, say).
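To make the "ranges of numbers" idea concrete, here is a small Python sketch. The comma-separated string format and the function names parse_ranges and is_read are assumptions made for illustration, not a description of any particular client.

```python
def parse_ranges(text):
    """Turn a string like "1-100,243-299,342-400" into (low, high) pairs."""
    ranges = []
    for part in text.split(","):
        low, _, high = part.partition("-")
        # A bare number such as "7" is treated as the one-article range 7-7.
        ranges.append((int(low), int(high or low)))
    return ranges


def is_read(ranges, number):
    """Return True if the article number falls inside any stored range."""
    return any(low <= number <= high for low, high in ranges)


read = parse_ranges("1-100,243-299,342-400")
print(is_read(read, 250))  # inside 243-299
print(is_read(read, 150))  # in the gap between 100 and 243
```

Note that the client stores nothing but numbers, so if the server were to hand out different numbers to the same articles, these ranges would point at the wrong ones.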

The problem

We wanted to de-commission a Usenet server and move its articles to another server. The servers run different and incompatible software, so the most obvious way to get articles from one server to another is to post them like a client or another server would.

The new server is supposed to be a drop-in replacement for the old one, so we can’t change the numbers or the existing client programs will get confused and think you’ve read articles you haven’t, or vice versa. So you can’t just grab all the articles from the old server any old how and post them to the new one, because they’ll be jumbled up. Unfortunately, the old server has no way of directly telling us the precise order that articles arrived in, though it will tell us article numbers within each group.

“Aha!”, we think. Since the order of arrival matters, we’ll grab the articles in order from one group on the old server, post them in that order on the new server, and move on to the next group, where we’ll do the same, and so on until we’ve done all the groups.

This idea is ruined by cross-posts, because they have more than one article number associated with them. If a cross-posted article is number 10 in one group and number 3 in another, you’d better post the first 9 articles to the first group, the first 2 to the second group, and then you can make the cross-post. But maybe there’s a cross-post to a third group in those 9 articles, so you’ll need to get that group up to date before you can post one of those. How do you work out what order to post the articles in?

Obscure Unix tools to the rescue

You might have guessed that the wibbling about graphs wasn’t entirely tangential to all this. You can draw a graph of the problem. Each article is a node. For each group, connect each article to the one with the next number up by a directed edge (in this case, the direction of the arrow means “must be posted before”). You’ve drawn yourself a directed graph, and it’s acyclic, since the article numbers only increase. The cross-posted articles are then nodes with more than one edge coming into them.

One of the wizards at work realised this, and also pointed out that there’s a standard Unix tool for converting such graphs into a list of nodes whose order preserves the order implied by the arrows, a procedure which is known as a topological sort. The tool’s called tsort. From there, it’s just a matter of representing each article in the way tsort understands. When you do that, tsort gives you an order in which you can post the articles from the old server to the new server so they’ll be given the same numbers on the new server as they had on the old one.
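Here is a rough Python sketch of that idea, not the script we actually used. tsort reads pairs of items, each pair meaning "the first comes before the second", and a pair of identical items just says the item exists. The group names, the single-letter article identifiers and the function names below are all made up for illustration, and the sorting is done in-process with the standard library's graphlib module (Python 3.9 or later) rather than by running tsort itself.

```python
from graphlib import TopologicalSorter


def tsort_pairs(groups):
    """Yield (earlier, later) pairs in the form tsort expects.

    `groups` maps a group name to its article identifiers in article
    number order on the old server.
    """
    for ids in groups.values():
        if len(ids) == 1:
            # A lone article has no neighbour; an identical pair declares it.
            yield ids[0], ids[0]
        for earlier, later in zip(ids, ids[1:]):
            yield earlier, later


def posting_order(groups):
    """Return one order of posting that respects every group's numbering."""
    sorter = TopologicalSorter()
    for earlier, later in tsort_pairs(groups):
        if earlier == later:
            sorter.add(earlier)
        else:
            sorter.add(later, earlier)  # `earlier` must come before `later`
    return list(sorter.static_order())


def replay(order, groups):
    """Simulate posting in `order` and return the arrival order per group."""
    member_of = {}
    for group, ids in groups.items():
        for article in ids:
            member_of.setdefault(article, []).append(group)
    arrived = {group: [] for group in groups}
    for article in order:
        for group in member_of[article]:
            arrived[group].append(article)
    return arrived


# Hypothetical old server: article "x" is cross-posted to two groups.
old_server = {
    "uk.test": ["a", "b", "x", "c"],
    "alt.test": ["d", "x"],
    "misc.test": ["e"],
}
for earlier, later in tsort_pairs(old_server):
    print(earlier, later)  # these lines could be fed to tsort
order = posting_order(old_server)
print(replay(order, old_server) == old_server)
```

There is usually more than one valid order, so the sketch checks the result by replaying it rather than by comparing against one fixed list.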

In which I’m not as knowledgeable as the wizards

The way I thought of to do it was to write my own program. You pick a group, and start trying to post articles from the old to the new server in article number order. If they’re not cross-posted, you just keep going. When you hit a cross-posted article, you switch to each group it is cross-posted to in turn, and try to post all the articles before the cross-post to each of those groups. If while you’re doing that, you hit another cross-post, you remember where you got up to and do exactly the same thing with the groups that second cross-post is posted to, and when you’ve done that, you switch back to the groups for the first cross-post. I had a working version of this at about the same time as someone pointed out that tsort does the job 🙂

(Aside for the techies: this can be done using recursion. It looks like this is pretty much equivalent to one of the ways of doing a topological sort, namely the depth-first search mentioned on the Wikipedia page).
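For the curious, the recursive approach looks roughly like this in Python. This is a reconstruction of the idea described above rather than the program I wrote at work; the names and the example groups are invented, and it assumes the numbering on the old server is consistent (if it weren't, the recursion would never finish).

```python
def posting_order_recursive(groups):
    """Work out a posting order by catching groups up before each cross-post.

    `groups` maps a group name to its article identifiers in article
    number order on the old server.
    """
    # Which groups does each article appear in?
    member_of = {}
    for group, ids in groups.items():
        for article in ids:
            member_of.setdefault(article, []).append(group)

    next_index = {group: 0 for group in groups}  # where we got up to
    order = []

    def catch_up(group, stop_before):
        """Post everything in `group` that comes before `stop_before`."""
        ids = groups[group]
        while ids[next_index[group]] != stop_before:
            post(ids[next_index[group]])

    def post(article):
        # Bring every group the article is in up to date first. Hitting
        # another cross-post on the way simply recurses.
        for group in member_of[article]:
            catch_up(group, article)
        order.append(article)
        for group in member_of[article]:
            next_index[group] += 1

    for group, ids in groups.items():
        while next_index[group] < len(ids):
            post(ids[next_index[group]])
    return order


# Hypothetical old server: article "x" is cross-posted to two groups.
old_server = {
    "uk.test": ["a", "b", "x", "c"],
    "alt.test": ["d", "x"],
    "misc.test": ["e"],
}
print(posting_order_recursive(old_server))
# prints ['a', 'b', 'd', 'x', 'c', 'e']
```

When the walk through uk.test reaches the cross-post x, it breaks off to post d to alt.test, then posts x to both groups and carries on where it left off.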

So I learned about topological sorts by re-inventing one. Most of the problems I deal with aren’t in fact cases of well-known problems (although I would say that if I was merely bad at recognising the fact, wouldn’t I?), but when they are, recognising that can save a lot of time.

And that’s what I do when I’m not on holiday.

I have a message between two people who aren’t me (and aren’t known to me, don’t worry!) sat in both my Facebook Inbox and Sent Messages. The message was sent at 3:04 pm today, apparently.

This does not appear to be the problem mentioned in The Register recently, whose symptoms were that people would see whole pages belonging to other users. I can see my Inbox with messages people have sent to me, but I can see a message between these two people in it. I’ve sent them a message to ask whether they meant to message me, but right now, that looks unlikely.

A while back I wrote about some of the advantages of centralisation for keeping out spam and making new features available quickly. The downside, as livredor pointed out, is that Facebook is a single point of failure.

Could this happen with standard Internet email? Yes: I could mis-address the mail (less likely if I use an address book rather than typing an address by hand), or the recipient’s server could mis-deliver it (usually, if my outbound server hands my mail to the wrong remote server, the remote end will reject it). Are popular mail servers more reliable than Facebook? Almost certainly, I’d say. Lots of people are on Facebook, but I reckon the volume of Internet email is still orders of magnitude greater than that of Facebook messages. The email servers handling that volume are so reliable that I’ve never heard of a case of mis-delivered (as opposed to mis-addressed or lost) email. Google Groups doesn’t seem to have heard of one either, or at least, the evidence is uncertain. The Usenet postings I found talking about mis-delivered mail seemed to be explained by the little-known fact that Internet email is like a letter: there’s an envelope destination address used to deliver it, as well as the “Dear Fred” salutation you see in the To: header or Cc: header. I had a friend at university who used to send out party invites which looked as if they had been addressed to president@whitehouse.gov and god@heaven.org. Anyway…

Don’t send anything sensitive in Facebook messages, will you?

Edited to add: The message has gone again now. I’ve used the help form to tell Facebook about it, so we’ll see what they say.