
Here’s a C technique which I’d not seen before starting my current job, and which I think deserves to be better known.

Let’s say you have a bunch of things, to which you’ve given sequential integer IDs using an enum.

  enum my_things {
    thing_this,
    thing_that,
    thing_other,
    num_things
  };


For each of those things, you want to store some other information in various arrays, and index into the array using the enumerated value to pull that information out. Examples might be a parser (where the enums might identify operators, and an array might contain pointers to functions to implement them) or a message handling system (where the enums might identify types of message, and the array might contain pointers to functions to handle the messages).

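  /* The handlers must be declared before the table refers to them. */
  void do_this(void);
  void do_that(void);
  void do_other(void);

  /* One handler per enum member, in the same order as the enum. */
  void (*thing_fns[num_things])(void) = {
    do_this,
    do_that,
    do_other
  };

  void do_thing(enum my_things current_thing)
  {
    thing_fns[current_thing]();
  }

  void do_this(void)
  {
    /* wibble */
  }
  /* etc, etc. */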
  


The problem comes when you want to add a new thing. You’re sizing the array using the enum, so it’s hard to get that wrong. But if you have a lot of things, sooner or later someone will add a new my_things member to the middle of the list (perhaps you’re listing your things in some order which makes sense in the context), go to update thing_fns, miscount how far down the definition of thing_fns they’ve gone, and screw up the ordering of the function pointers. Now the wrong function handles some (but not all) of the things, with hilarious results.

The solution is to maintain a single list of your things, and transform it into definitions for the enum and the associated arrays. You don’t need to write fancy code generator scripts for this: you can use the pre-processor. Wikipedia explains how.

(Note that Wikipedia’s example does away with sizing the array using the enum. They need to have a final NULL entry with no comma on the end to keep the compiler happy, so they’d end up writing the clunky num_things + 1. Now their array is sized correctly anyway.)

This chap goes into it in more detail, and also illustrates that your list of things doesn’t have to be in a separate file.

Re-writing my example to use X-macros is left as an exercise for the reader. 🙂
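
For anyone who’d rather peek at the answer, here’s a minimal sketch of how the example might look with X-macros. It’s my own illustration rather than the Wikipedia version: the MY_THINGS list macro and the AS_ENUM, AS_PROTOTYPE and AS_TABLE_ROW helpers are names made up for this sketch, the handler bodies are placeholders, and the designated initialisers assume a C99 compiler.

```c
#include <assert.h>

/* The single list of things: each row is (enum member, handler function). */
#define MY_THINGS(X)            \
    X(thing_this,  do_this)     \
    X(thing_that,  do_that)     \
    X(thing_other, do_other)

/* Different ways of expanding one row of the list. */
#define AS_ENUM(id, fn)      id,
#define AS_PROTOTYPE(id, fn) void fn(void);
#define AS_TABLE_ROW(id, fn) [id] = fn,

/* The enum, with num_things counting the members as before. */
enum my_things { MY_THINGS(AS_ENUM) num_things };

/* Prototypes for every handler named in the list. */
MY_THINGS(AS_PROTOTYPE)

/* The table: each pointer lands in the slot named by its enum member. */
void (*thing_fns[num_things])(void) = { MY_THINGS(AS_TABLE_ROW) };

/* Placeholder handlers which just record who was called. */
static int last_called = -1;
void do_this(void)  { last_called = thing_this; }
void do_that(void)  { last_called = thing_that; }
void do_other(void) { last_called = thing_other; }

void do_thing(enum my_things current_thing)
{
    /* Refuse to index outside the table. */
    assert((int)current_thing >= 0 && (int)current_thing < num_things);
    thing_fns[current_thing]();
}
```

Adding a new thing is now a one-line change to MY_THINGS, and the enum, the prototypes and the table can’t drift out of step with each other.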

Edited To Add: gjm11 points out that two of his friends have posted about this recently. gareth_rees has a more detailed example and some discussion.

I recently finished Andrew Hodges’s Alan Turing: the Enigma. The book is a definitive account of Turing’s life and work. In some places I found the level of detail overwhelming, but in others I admired the way Hodges uses his obviously extensive research to evoke the places and people in Turing’s life. The book is well worth reading for the perspective it gives on Turing, something which is absent from other, purely technical, accounts of his work.

Hodges portrays Turing as a man ahead of his time, conceiving of the Turing machine as a thought experiment before the invention of the general purpose electronic computer, and inventing the Turing test when computing was in its infancy. Turing’s naivete was reflected in his refusal to accept what other people said could be done, but also in a lack of interest in the politics of his post-war work on computers and of his own homosexuality. A proto-geek, Turing was prickly, odd, and seemed to expect that the facts alone, when shown to people, would lead them to the same conclusions as he had reached.

Turing’s suicide is placed in the context of a move from regarding homosexuality as criminal to regarding it as a medical problem, and an increasing suspicion of homosexuals in classified government work. Hodges seems to conclude that Turing felt he had nowhere else to go.

You can’t help but wonder what else Turing might have accomplished had he not committed suicide. Greg Egan’s short, Oracle, is an entertaining what-if story, which also features a character very obviously based on C.S. Lewis. What if Turing had received help from a friend? It’s a pity that in reality there was no-one to lead him out of his cage.

Some of you might have played with emulators which let your PC pretend to be the classic computers of your youth (the BBC Micro, Acorn Electron, Commodore 64 and so on). The emulators work by simulating the processors in those machines on your PC, so you end up with Chuckie Egg running on a virtual BBC Micro running on a PC running on whatever the underlying physics of the universe happen to be.
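
To make “simulating the processor” concrete, here’s a minimal sketch of the fetch, decode and execute loop at the heart of such an emulator. The machine is entirely made up for illustration (one accumulator, 256 bytes of memory, two-byte instructions) and isn’t the BBC Micro’s processor or any other real chip; toy_cpu, toy_run and the OP_ names are all hypothetical.

```c
#include <stdint.h>

/* Opcodes for an invented accumulator machine (not a real CPU). */
enum toy_op { OP_HALT, OP_LOAD, OP_ADD, OP_STORE, OP_JNZ };

struct toy_cpu {
    uint8_t mem[256];   /* program and data share one memory */
    uint8_t pc;         /* program counter; wraps at 256, so it can't leave mem */
    uint8_t acc;        /* the single accumulator register */
};

/* Run until HALT (or an unknown opcode), or until max_steps instructions
 * have been executed. Returns the number of instructions executed. */
int toy_run(struct toy_cpu *cpu, int max_steps)
{
    int steps = 0;

    while (steps < max_steps) {
        /* Fetch: every instruction is an opcode byte plus an argument byte. */
        uint8_t op  = cpu->mem[cpu->pc++];
        uint8_t arg = cpu->mem[cpu->pc++];
        steps++;

        /* Decode and execute. */
        switch (op) {
        case OP_LOAD:  cpu->acc = arg;                        break;
        case OP_ADD:   cpu->acc = (uint8_t)(cpu->acc + arg);  break;
        case OP_STORE: cpu->mem[arg] = cpu->acc;              break;
        case OP_JNZ:   if (cpu->acc != 0) cpu->pc = arg;      break;
        case OP_HALT:
        default:       return steps;
        }
    }
    return steps;
}
```

A real emulator has many more opcodes and has to model the screen, keyboard and timing as well, but the loop has the same shape.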

robhu found a puzzle based on a similar idea. The International Conference on Functional Programming have an annual contest. Last year’s was particularly fun:

In 1967, during excavation for the construction of a new shopping center in Monroeville, Pennsylvania, workers uncovered a vault containing a cache of ancient scrolls. Most were severely damaged, but those that could be recovered confirmed the existence of a secret society long suspected to have been active in the region around the year 200 BC… Among the documents found intact in the Monroeville collection was a lengthy codex, written in no known language and inscribed with superhuman precision.

You’re given the codex, and the specification for a processor used by the Cult of the Bound Variable. Once you’ve written an emulator for that processor, you’re off (unless, like me, you write the emulator in a few hours and spend half a day corrupting the files from the contest website with your editor and wondering why it isn’t working). There’s an ancient Unix system, complete with ancient spam, and various puzzles which I’ve not had a chance to look at yet.

According to reports on the contest, lots of people tried to write the emulator in high level languages where it ran like treacle. A C implementation like mine (which may count as a spoiler for people who want to do it themselves) without any namby-pamby array bounds checking nonsense works fast enough that you can’t tell you’re not talking to a real ancient Unix system. There’s life in the C language yet.

I’m lost in admiration for the people who produced the codex. It required writing a compiler backend targeting the fictional processor, and implementing various other languages on top of that, as well as coming up with the puzzles. I’m not sure how far I’ll get with the puzzles before my ability or patience expires, as I’m a hacker rather than a computer scientist, but still, the thing is amazingly cool. It’s a time sink for geeks, so I pass it on to those of you who, like me, will waste many days on it. Have fun.

A couple of shiny new bits of software have come out in the last few weeks. Both of them are at version 7, for some reason.

Inform 7

Inform 7 is the latest version of Inform, the language for creating interactive fiction. The interesting thing about it is that Inform 7 programs are written in a subset of English:

The wood-slatted crate is in the Gazebo. The crate is a container. Instead of taking the crate, say “It’s far too heavy to lift.”

Inform is not capable of understanding arbitrary written English, but has a set of sentence forms it understands, and some inference rules built in (for example, if you tell it that “Mr Brown wears a hat”, it will infer that Mr Brown is a person).

scribb1e pointed out that this makes the work of writing the story similar to playing it. That could turn out to be a bad thing: most programming languages are so stylised and full of random punctuation symbols that programmers realise they’re not writing English and don’t try writing arbitrary English text in the hope of being understood by the computer. Even for people who understand Inform isn’t actually intelligent and that they have to write in Inform’s dialect to be understood, writing in something close to English will make it harder to remember to restrict their vocabulary. At worst, it could become a game of guess the verb, which would be painful (as opposed to a game of Guess The Verb, which I thought was fun, especially the Old Man River bit in the help).

However, unlike playing a game, looking at the excellent and witty online help doesn’t risk spoiling your fun. Since it’s all English, it’s easy to crib paragraphs of text from the examples and adapt them to your own works. Hopefully, writing the games in English will enable more people to create them without feeling that they have to be expert programmers. They’ll still have to think like a programmer, but won’t face the intimidating prospect of curly brackets.

Inform 7 itself isn’t just the compiler: it’s a complete suite of tools for writing, testing and releasing interactive fiction, the IF equivalent of an Integrated Development Environment. It’s rather nice (although not yet available for anything other than Windows and Mac OS, because of the difficulty of getting the graphical stuff going on a variety of platforms).

Vim 7

I use the Vim editor, which is the old Unix vi with all the features you want from a modern programmer’s editor bolted on. New in Vim 7 there’s a spelling checker, “Intellisense™” style context-sensitive completion of names in code, and tabbed windows (no software is complete without tabbed windows these days).

The completion stuff is particularly useful, as it now pops up a menu of possible completions which you can select from with the cursor keys, and appears to be trying harder to find completions from nearby files in the background as you’re typing (I’ve not quite worked out what it’s doing yet, it’s reaching the stage where it’s just magic). Completion isn’t just for programmers, of course: when I’m typing an email, if I find myself using the same, long, word more than once, typing the initial letters and then letting Vim complete it is a boon.

I’ve finally got around to writing the Greasemonkey script which I’ve long been threatening.

What it does

The script remembers which comments you’ve seen on LJ (or Dreamwidth) and helps you navigate to new comments. That’s right, I’m finally dragging LiveJournal kicking and screaming into the 1980s.

If you’re on an entry page, pressing “n” skips you to the next new comment, and “p” skips to the previous one. If the style has an “Expand” link, moving to an unexpanded comment with these keys will also expand the thread. If the style has a permanent link or a reply link for each comment in that comment’s header or footer, the script inserts another link next to it, labelled “NEW”. That link shows you that the comment is new at a glance. Clicking the “NEW” link selects the comment so that pressing “n” will go to the next comment from there. On some styles, the currently selected comment will be outlined with a dotted line.

On a journal or friends page, the script will also add the number of new comments to the link text, so that, say, “15 comments” becomes “15 comments (10 new)”, and enable the “n” and “p” keys to move between entries which have new comments, and the “Enter” key to view the selected entry. This only works if you’re looking at a journal which adds “nc=N” to entry links to say there are N comments on an entry (LJ can do this as a trick to confuse your browser’s history function into thinking you’ve not visited that entry whenever there are new comments). If you want to turn this on for your journal then ensure you’re logged in, visit this page, check the box which says “Add &nc=xx to comment URLs” and hit the “Save” button.

How it works

You don’t need to understand this section to use the script. If you don’t care about programming, skip to the next part.

<lj-cut text=”Gory details”> LJ makes it a total pig to do this sort of thing: there’s so little uniformity in journal styles that getting a script like this to work for all of them is impossible. It’s fair enough that LJ allows people to customise their journal’s appearance, but there aren’t even standardised CSS class names for stuff. Not that I’m bitter. So, what the script does is look for anchor tags of the form <a name="tNNNN"> or elements with an id attribute of ljcmtNNNN or tNNNN. NNNN is the comment number, which seems to be unique for each comment on a given user’s journal. It then looks for the permanent link to that comment, which is usually to be found in the header of the comment (or footer, in my current style), and adds a “New” link after that. So, new comments are marked with a link to the next new comment.

The upshot of all this is that if you’re reading a journal with a style which doesn’t use either anchor tags or elements with the given id for all comments, the script won’t work correctly. If the style doesn’t provide each comment with a permanent link in the comment’s header, the comment won’t be marked with a “New” link. Such is life. Please don’t ask me for special case changes to make it work with LJ’s many horribly customised journals. Pick a sensible style of your own and learn to use “style=mine” instead. There’s even another Greasemonkey userscript which will help. On the other hand, if there’s a large class of the standard styles for which it doesn’t work, tell me and I’ll have a look at it.
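
The matching rule described above is simple enough to sketch. The real script is JavaScript running against the page; the C below is only an illustration of the rule, and parse_comment_marker is a name invented for it. It accepts a name or id of the form tNNNN or ljcmtNNNN and extracts the comment number.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* If marker looks like "tNNNN" or "ljcmtNNNN", store NNNN in *id and
 * return 1. Otherwise leave *id alone and return 0. */
int parse_comment_marker(const char *marker, unsigned long *id)
{
    const char *digits;
    const char *p;

    if (strncmp(marker, "ljcmt", 5) == 0)
        digits = marker + 5;
    else if (marker[0] == 't')
        digits = marker + 1;
    else
        return 0;

    /* There must be at least one digit, and nothing but digits. */
    if (*digits == '\0')
        return 0;
    for (p = digits; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return 0;

    *id = strtoul(digits, NULL, 10);
    return 1;
}
```

Anything on the page which fails this test (an id of “title”, say) is simply ignored, which is why a style that marks its comments some other way defeats the script.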

Using it

If you want to use it, you will need:

  • Firefox, the web browser, version 1.5 or later.
  • Greasemonkey, the extension which lets people write little bits of JavaScript to run on certain pages.
  • LJ New Comments, which is what I’ve imaginatively entitled my script. If the userscripts site is down again, you can find a copy on my site.
  • Your flask of weak lemon drink.

After you’ve installed all of the above, visit an entry on LJ and marvel at the “NEW” links on all the new comments (which will be all of them at this point, as the script wasn’t around previously to remember which ones you’d seen before). See above for operating instructions.

Privacy

Note that the script stores a Firefox preference key for each journal entry you visit, listing the IDs of the comments it finds there. The script doesn’t let the database grow without limit: when the script has seen 500 entries, it starts to drop the history for the entries you’ve not visited recently.

Clearing the browser’s history doesn’t affect the script’s list of visited entries. Thus your visits to polybdsmfurries will be recorded for posterity, even if you clear the browser’s history. You can wipe the entire history by using the “Manage User Scripts” entry on the Tools menu to delete the script and its associated preferences (you can re-install it afterwards, but you must clear out the preferences for it to delete the history).

The script does not record the contents of any entry or comment. The script does not transmit any information to LJ or any other website, it merely acts on what it sees when you request journal entries.

Your questions

I’ve given this entry as the homepage for the script on Userscripts.org. That means this entry is intended to serve as a repository for questions about the script, so if you’ve got a question, comment here. I prefer this to commenting on my other entries or to emailing me, unless you already know me. Ta.

To keep up to date with new releases of my Greasemonkey scripts, track the tag “greasemonkey” on my journal. This link should enable you to subscribe to that tag and get notified when I post a new entry about Greasemonkey scripts.

Revision history

2006-01-02, version 0.1: First version.

2006-01-03, version 0.2: Added the “p” key. Used javascript to move between comments so doing so does not pollute the browser’s history. Coped with the id=ljcmtNNNN way of marking comments. Made “n” and “p” keys work even in the absence of permalinks on each comment.

2006-01-04, version 0.3: Apparently you can have id=tNNNN, too.

2006-01-04, version 0.4: Broke 0.3, fixed it again. I hope.

2006-01-19, version 0.5: Updated to cope with LJ’s new URL formats. Changed how comments are stored internally so that the database does not grow without limit: the script now remembers comments for the last 500 entries you visited, and forgets the entries you’ve visited least. Also added “New” marker based on reply link as well as thread link, for styles which don’t have a thread link for every comment.

2006-01-19, version 0.6: Convert dashes I find in URLs to underscores internally, to preserve access to history from older versions of the script before LJ’s URL change.

2006-02-09, version 0.7: Work around the fact that Firefox leaks memory like a sieve. Never display negative number of new comments. Change licence to MIT as GPL is overkill for this script.

2006-02-09, version 0.8: There was a bug in the workaround code I got off the Greasemonkey mailing list. Fixed that.

2006-06-04, version 0.9: Enabled the “n” and “p” keys on the friends/journal view. Added the box around the current comment.

2007-02-20, version 1.0, baby: Try harder to draw a box around the current new comment. Applied legolas’s fix for pressing CTRL at the same time as the N or P keys (see comments).

2008-03-31, version 1.1: Make it work faster on entries with lots of comments. Altered behaviour of “NEW” link so it now selects the comment you’re clicking on, as that makes more sense.

2008-09-24, version 1.2: Support Russian keyboards thanks to mumi_0, make threads expand.

2009-01-27, version 1.3: Support for independentminds journals.

2009-05-04, version 1.4: Support for Dreamwidth.

2009-09-22, version 1.5: Amend support for Dreamwidth.

2010-08-09, version 1.6: Made syndicated journals work.

Some while ago, Mark Pilgrim did a post on cool software tools he couldn’t do without. In the absence of any pending rants about religion, here’s mine. Probably only of interest to fellow geeks, so cut for length.
<lj-cut>
Mail:

I’m still using Pine, that staple from university days. As it’s terminal based, I can use it to read my mail by logging in to my machine from wherever I happen to be. It supports multiple incoming folders for mailing lists and the like, and multiple roles. It’ll invoke an external editor, so I can use Vim to write email. The address book is nice. It keeps mail in flat text files, which, despite being a somewhat broken format, is easily understood by grep and the like. What more do you want? I hear good things about Mutt and also Apple’s own Mail.app, but nothing which compels me to change.

I run Exim as a mail transport agent, and use its nice filtering language to handle sorting stuff into folders, ditching HTML mail sent to my Usenet posting address, and that kind of thing. Fetchmail gets the mail to Exim. Exim calls dccproc, which checks for bulkiness, and rbfilter, which checks for blacklisted senders.

My pobox.com forwarding address has been around since 1998 and so gets a tonne of spam, but since I used their spam filtering options to block China, Korea, and Brazil, and also turned on their cunning “looks like a consumer broadband machine” test (which looks for bytes from the IP address in the machine’s hostname, as that’s a common naming convention for broadband addresses), spam is a solved problem for me.
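
Here’s a rough sketch of how a test like that might work. I don’t know how pobox.com actually implement theirs, so treat every detail as an assumption: the function names are made up, it only handles IPv4, it only looks for the address bytes written in decimal, and the threshold of two matching bytes is a guess.

```c
#include <ctype.h>
#include <stdlib.h>

/* Count how many of the four bytes of an IPv4 address appear as whole
 * decimal numbers in the hostname. Each byte is counted at most once. */
int octets_in_hostname(const unsigned char ip[4], const char *hostname)
{
    int seen[4] = { 0, 0, 0, 0 };
    const char *p = hostname;
    int count = 0;
    int i;

    while (*p != '\0') {
        if (isdigit((unsigned char)*p)) {
            /* Read the whole run of digits as one number. */
            char *end;
            unsigned long n = strtoul(p, &end, 10);

            for (i = 0; i < 4; i++) {
                if (!seen[i] && n == ip[i]) {
                    seen[i] = 1;
                    count++;
                    break;
                }
            }
            p = end;
        } else {
            p++;
        }
    }
    return count;
}

/* Guess that a host is a consumer broadband machine if at least two
 * bytes of its address turn up in its name (threshold is an assumption). */
int looks_like_broadband(const unsigned char ip[4], const char *hostname)
{
    return octets_in_hostname(ip, hostname) >= 2;
}
```

With the address 203.0.113.57, a name like host-203-0-113-57.dsl.example.net matches all four bytes, whereas mx57.example.com matches only one and would be let through.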

News:

I was using trn, but gave that up after failing to compile it for OS X. slrn is a worthy replacement, with colour highlighting and a useful scoring language. As well as using that to killfile people, I can increase the score of posters or threads which interest me and sort by score in the thread view. slrn shares trn’s handy habit of doing the right thing when you just keep hitting space, which is useful when eating and reading news at the same time.

I use Leafnode to fetch news from a variety of servers (NTL groups from their server, news.individual.net for everything else). A tip for Mac users: Leafnode creates directories full of lots and lots of small files (one per article, in fact). HFS+, the native OS X filesystem, is dog slow at accessing these. Make a UFS disk image and put your news spool on there.

Editing:

I use Vim, which combines the usability of the old Unix vi with the startup time of Emacs. It does all the usual good stuff like syntax highlighting every language known to man (including quoted text in mail messages, which is nice), indenting automatically and all that jazz. A killer feature is the function which will complete words from occurrences in the same file, or from a tags file (a list of all the names defined in a program). Helpful for not getting variable names wrong and also in rants where you find yourself writing “evangelical” a lot. The interface to cscope is also very useful when writing C code (and more importantly, trying to understand other people’s C code).

Browsing:

Since I started using OS X, I’ve been happily using Safari as my web browser. When writing long comments here on LiveJournal, I occasionally miss the text entry box editing facility of Mozex, since I could then edit the comments with Vim, but since no-one’s ported Mozex to the Mac yet, I’ve not switched to Mozilla or Firefox. You Windows users should so switch, of course, because Firefox is nicer and a lot more secure than IE.

Uploading:

I maintain my websites with sitecopy, which replaces that Perl or Python script which everyone seems to have written at least once to FTP stuff to their provider’s web space. sitecopy works with both NTL’s and Gradwell’s servers and can do useful stuff like uploading based on hash values rather than modification times.

I post to LiveJournal using Xjournal, a pretty and featureful client for OS X. If I want to post from the command line, I use Charm.

Scripting:

I prefer Python to Perl for scripting tasks. As Yoda says, you will know Python is better than Perl when your code you try to read six months from now.

Mudding:

I use MudWalker when I have the OS X GUI available, mainly because I’ve made it talk using the Mac’s speech synthesis stuff. I’d also recommend Crystal.

Coding:

From within Vim, I make heavy use of ctags and cscope for browsing around code and jumping to declarations and references to a symbol. You can do it with grep, but it’s not as nice (and a lot slower on big projects).

I’ve also used Smatch to write customised static checkers for C code. Smatch is a modified version of GCC which outputs an intermediate language which is readily processed by your scripting language of choice. If you’ve ever found yourself writing Perl or Python code to parse C directly, you probably should have used this instead. There are some scripts which come with it which can do useful things like attach state to particular code paths as your script parses the code, and allow you to describe what happens to that state when the paths merge (so you could check that all paths free anything they’ve allocated, say).

ladysisyphus writes that Jesus has laser beams. As does Aslan, which makes sense if you think about it.

It turns out that Macs have speech synthesis built in. It’s not bad, and it’s easily accessible to programmers. So I’ve spent an entertaining evening making my MUD client talk. That way, if the window is hidden, I still find out when someone interesting logs in. I’ve ended up using MudWalker, a free, open source MUD client for Mac OS X. It’s scriptable in Lua, and helpfully provides a speak function to Lua scripts. The thing prospective programmers will want to know is that your regular expression match groups (the things Perl would call $1 and so on) are arg[n] to the Lua scripts you can use to write triggers. For console use, I’d still recommend Crystal as a good MUD client, but it turns out to be a bugger trying to get that to talk (Crystal is supposedly scriptable in Lua, but my attempts killed it).

Also been looking at Twisted, Python’s marvelous asynchronous mouse organ networked application framework thingy. It seems that as well as being very clever, it’s actually reasonably well documented these days. The Perspective Broker and Avatar stuff seems to be a good fit for games where the players can write code which is not trusted by the system, since the choice of which methods allow remote access imposes some sort of capability based access control. If I ever wrote a MUD in Python, something I’ve been threatening for some years now, Twisted would be the way forward (indeed, it was originally created to provide multiplayer interactive fiction in the form of Twisted Reality, another addition to the fine Internet tradition of hugely ambitious, but largely unfinished, MUD servers).

It’d probably be easier just to do this in Java. Python’s restricted execution stuff is not really there, so if you wanted to allow players to program (which I think is essential for holding people’s interest once they’ve finished the game proper) you’d probably end up running untrusted code in another process and using PB to talk back to the server. Still, it’s a nice dream. I saw that the author of MudWalker has got a prototype MUD written in E, the capability-based security language, which might well be worth a look too.

<lj-cut>
I publish an Atom feed of my comments on other people’s public posts. My comments on locked entries are not published. The program which produces the feed periodically checks that entries are still public, and treats entries which are new to it as private for a few hours in case they were made public accidentally. The feed is marked with various runes to ward off indexing by Google or other feed searching sites.

Still, if you don’t want the feed to include my comments on your public postings, you can opt out by filling in this poll. You’ll need to be logged in as the journal you want to opt out. If you want to undo the opt-out, you can fill out the poll again and uncheck the box.

The opt-out is automatically checked once per day, so please allow that long for changes to take effect.

[ LJ Poll 1257467 ]