computers

I’ve finally got around to writing the Greasemonkey script which I’ve long been threatening.

What it does

The script remembers which comments you’ve seen on LJ (or Dreamwidth) and helps you navigate to new comments. That’s right, I’m finally dragging LiveJournal kicking and screaming into the 1980s.

If you’re on an entry page, pressing “n” skips you to the next new comment, and “p” skips to the previous one. If the style has an “Expand” link, moving to an unexpanded comment with these keys will also expand the thread. If the style has a permanent link or a reply link for each comment in that comment’s header or footer, the script inserts another link next to it, labelled “NEW”. That link shows you that the comment is new at a glance. Clicking the “NEW” link selects the comment so that pressing “n” will go to the next comment from there. On some styles, the currently selected comment will be outlined with a dotted line.
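The "n" and "p" behaviour boils down to a search through the comments in page order. The real script is JavaScript running inside Greasemonkey; what follows is only a rough Python sketch of the idea, and the function name `next_new` and its arguments are made up for illustration.

```python
def next_new(comment_ids, new_ids, current=None, step=1):
    """Return the next (step=1) or previous (step=-1) new comment, or None.

    comment_ids: comment IDs in the order they appear on the page.
    new_ids: set of IDs which have not been seen before.
    current: the selected comment's ID, or None if nothing is selected.
    """
    if current in comment_ids:
        # Start looking just past the selected comment.
        i = comment_ids.index(current) + step
    else:
        # Nothing selected: start from the top for "n", the bottom for "p".
        i = 0 if step > 0 else len(comment_ids) - 1
    while 0 <= i < len(comment_ids):
        if comment_ids[i] in new_ids:
            return comment_ids[i]
        i += step
    return None


page = [101, 102, 103, 104, 105]   # hypothetical comment IDs
new = {102, 105}
first = next_new(page, new)                          # "n" with nothing selected
after = next_new(page, new, current=102)             # "n" from comment 102
before = next_new(page, new, current=105, step=-1)   # "p" from comment 105
```

Clicking a "NEW" link corresponds to setting `current` to that comment, so the next "n" carries on from there.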

On a journal or friends page, the script will also add the number of new comments to the link text, so that, say, “15 comments” becomes “15 comments (10 new)”, and enable the “n” and “p” keys to move between entries which have new comments, and the “Enter” key to view the selected entry. This only works if you’re looking at a journal which adds “nc=N” to entry links to say there are N comments on an entry (LJ can do this as a trick to confuse your browser’s history function into thinking you’ve not visited that entry whenever there are new comments). If you want to turn this on for your journal then ensure you’re logged in, visit this page, check the box which says “Add &nc=xx to comment URLs” and hit the “Save” button.
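The arithmetic behind "(10 new)" is simple enough to sketch. This is Python rather than the script's own JavaScript, the URL is a hypothetical example, and `seen_count` stands for however many comments the script remembers for that entry.

```python
from urllib.parse import urlparse, parse_qs


def comment_count(entry_url):
    """Return N from an 'nc=N' query parameter, or None if there isn't one."""
    values = parse_qs(urlparse(entry_url).query).get("nc")
    if not values or not values[0].isdigit():
        return None
    return int(values[0])


def link_text(original_text, entry_url, seen_count):
    """Append '(N new)' to the link text when the URL says how many comments exist."""
    total = comment_count(entry_url)
    if total is None:
        return original_text          # no nc=N, so nothing to go on
    new = max(0, total - seen_count)  # never show a negative number
    return f"{original_text} ({new} new)"


url = "http://example.livejournal.com/12345.html?nc=15"  # hypothetical entry link
label = link_text("15 comments", url, seen_count=5)
```

The `max(0, ...)` matters because comments can be deleted after you have seen them, which would otherwise give a negative count.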

How it works

You don’t need to understand this section to use the script. If you don’t care about programming, skip to the next part.

<lj-cut text=”Gory details”> LJ makes it a total pig to do this sort of thing: there’s so little uniformity in journal styles that getting a script like this to work for all of them is impossible. It’s fair enough that LJ allows people to customise their journal’s appearance, but there aren’t even standardised CSS class names for stuff. Not that I’m bitter. So, what the script does is look for anchor tags of the form <a name="tNNNN"> or elements with an id attribute of ljcmtNNNN or tNNNN. NNNN is the comment number, which seems to be unique for each comment on a given user’s journal. It then looks for the permanent link to that comment, which is usually to be found in the header of the comment (or footer, in my current style), and adds a “New” link after that. So, new comments are marked with a link to the next new comment.
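For the curious, here is roughly what that search looks like. The real script walks the browser's DOM in JavaScript; this Python sketch uses the standard library's HTML parser instead, and the class and function names are invented for the example.

```python
import re
from html.parser import HTMLParser

# Matches "tNNNN" and "ljcmtNNNN", capturing the comment number.
COMMENT_ID = re.compile(r"^(?:ljcmt|t)(\d+)$")


class CommentFinder(HTMLParser):
    """Collects comment numbers from <a name="tNNNN"> and id="ljcmtNNNN" / id="tNNNN"."""

    def __init__(self):
        super().__init__()
        self.comment_ids = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        candidates = [attrs.get("id")]
        if tag == "a":
            candidates.append(attrs.get("name"))
        for value in candidates:
            match = COMMENT_ID.match(value or "")
            if match:
                number = int(match.group(1))
                if number not in self.comment_ids:
                    self.comment_ids.append(number)


def find_comment_ids(html):
    finder = CommentFinder()
    finder.feed(html)
    return finder.comment_ids


sample = (
    '<a name="t1001"></a><div id="ljcmt1002">hello</div>'
    '<span id="t1003"></span><a name="top"></a><div id="header"></div>'
)
found = find_comment_ids(sample)
```

Anything which merely starts with "t", like the `top` anchor in the sample, is ignored because the pattern insists on digits all the way to the end.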

The upshot of all this is that if you’re reading a journal with a style which doesn’t use either anchor tags or elements with the given id for all comments, the script won’t work correctly. If the style doesn’t provide each comment with a permanent link in the comment’s header, the comment won’t be marked with a “New” link. Such is life. Please don’t ask me for special case changes to make it work with LJ’s many horribly customised journals. Pick a sensible style of your own and learn to use “style=mine” instead. There’s even another Greasemonkey userscript which will help. On the other hand, if there’s a large class of the standard styles for which it doesn’t work, tell me and I’ll have a look at it.

Using it

If you want to use it, you will need:

  • Firefox, the web browser, version 1.5 or later.
  • Greasemonkey, the extension which lets people write little bits of Javascript to run on certain pages.
  • LJ New Comments, which is what I’ve imaginatively entitled my script. If the userscripts site is down again, you can find a copy on my site.
  • Your flask of weak lemon drink.

After you’ve installed all of the above, visit an entry on LJ and marvel at the “NEW” links on all the new comments (which will be all of them at this point, as the script wasn’t around previously to remember which ones you’d seen before). See above for operating instructions.

Privacy

Note that the script stores a Firefox preference key for each journal entry you visit, listing the IDs of the comments it finds there. The script doesn’t let the database grow without limit: when the script has seen 500 entries, it starts to drop the history for the entries you’ve not visited recently.
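The 500-entry limit behaves like a least-recently-used cache. A minimal Python sketch of that idea follows; it keeps everything in memory rather than in Firefox preferences, and the class name and entry keys are made up for illustration.

```python
from collections import OrderedDict


class SeenComments:
    """Remembers comment IDs per entry, forgetting entries not visited recently."""

    def __init__(self, max_entries=500):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # entry key -> set of comment IDs seen

    def visit(self, entry_key, comment_ids_on_page):
        """Record a visit and return the IDs which had not been seen before."""
        on_page = set(comment_ids_on_page)
        seen = self.entries.pop(entry_key, set())
        new = on_page - seen
        # Re-inserting puts this entry at the "most recently visited" end.
        self.entries[entry_key] = seen | on_page
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # drop the least recently visited
        return new


store = SeenComments(max_entries=2)          # tiny limit to show the eviction
first_visit = store.visit("exampleuser/1", [1, 2])
second_visit = store.visit("exampleuser/1", [1, 2, 3])
store.visit("otheruser/7", [9])
store.visit("exampleuser/1", [1, 2, 3])      # refreshes this entry
store.visit("thirduser/4", [5])              # pushes out otheruser/7
```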

Clearing the browser’s history doesn’t affect the script’s list of visited entries. Thus your visits to polybdsmfurries will be recorded for posterity, even if you clear the browser’s history. You can wipe the entire history by using the “Manage User Scripts” entry on the Tools menu to delete the script and its associated preferences (you can re-install it afterwards, but you must clear out the preferences for it to delete the history).

The script does not record the contents of any entry or comment. The script does not transmit any information to LJ or any other website; it merely acts on what it sees when you request journal entries.

Your questions

I’ve given this entry as the homepage for the script on Userscripts.org. That means this entry is intended to serve as a repository for questions about the script, so if you’ve got a question, comment here. I prefer this to commenting on my other entries or to emailing me, unless you already know me. Ta.

To keep up to date with new releases of my greasemonkey scripts, track the tag “greasemonkey” on my journal. This link should enable you to subscribe to that tag and get notified when I post a new entry about greasemonkey scripts.

Revision history

2006-01-02, version 0.1: First version.

2006-01-03, version 0.2: Added the “p” key. Used javascript to move between comments so doing so does not pollute the browser’s history. Coped with the id=ljcmtNNNN way of marking comments. Made “n” and “p” keys work even in the absence of permalinks on each comment.

2006-01-04, version 0.3: Apparently you can have id=tNNNN, too.

2006-01-04, version 0.4: Broke 0.3, fixed it again. I hope.

2006-01-19, version 0.5: Updated to cope with LJ’s new URL formats. Changed how comments are stored internally so that the database does not grow without limit: the script now remembers comments for the last 500 entries you visited, and forgets the entries you’ve visited least. Also added “New” marker based on reply link as well as thread link, for styles which don’t have a thread link for every comment.

2006-01-19, version 0.6: Convert dashes I find in URLs to underscores internally, to preserve access to history from older versions of the script before LJ’s URL change.

2006-02-09, version 0.7: Work around the fact that Firefox leaks memory like a sieve. Never display negative number of new comments. Change licence to MIT as GPL is overkill for this script.

2006-02-09, version 0.8: There was a bug in the workaround code I got off the Greasemonkey mailing list. Fixed that.

2006-06-04, version 0.9: Enabled the “n” and “p” keys on the friends/journal view. Added the box around the current comment.

2007-02-20, version 1.0, baby: Try harder to draw a box around the current new comment. Applied legolas’s fix for pressing CTRL at the same time as the N or P keys (see comments).

2008-03-31, version 1.1: Make it work faster on entries with lots of comments. Altered behaviour of “NEW” link so it now selects the comment you’re clicking on, as that makes more sense.

2008-09-24, version 1.2: Support Russian keyboards thanks to mumi_0, make threads expand.

2009-01-27, version 1.3: Support for independentminds journals.

2009-05-04, version 1.4: Support for Dreamwidth.

2009-09-22, version 1.5: Amend support for Dreamwidth.

2010-08-09, version 1.6: Made syndicated journals work.

Finally got Mozex going on Firefox with Mac OS X. This means I can edit my comments on LiveJournal with Vim rather than messing about with LJ’s comment posting box and the less powerful editing facilities from my browser. I can also use Danny O’Brien’s marvellous Google linkification script. Which is nice. It’d be even nicer if Firefox’s process creation API worked properly on Mac OS X, though.

As a result of all this mucking about, I’ve not had time to respond to comments on the God Hates Hair entry. I’ll get around to it sooner or later, though.

Greasemonkey is an extension for the Firefox browser which lets you write little programs to change how websites appear. For example, ilishin has created a script which lets you expand collapsed LJ comment threads in place (that is, on the same page, rather than on a new one). It only seems to work with the standard comment layout at the moment, but I hope the author will fix that soon (if not, it doesn’t look so hard that I couldn’t do it myself).

I noticed that the later versions of Greasemonkey support a key/value database which persists when you shut down and restart your browser. This means that it’s probably possible to write something which remembers how many comments there are for an entry and will highlight items (on your Friends list, say) which have new comments. It might even be possible to highlight the new comments themselves, although it’s not clear how good the database is, so you’d want to avoid overloading it, I suppose. I was vaguely aware of Greasemonkey, but I don’t think I’d realised just how much it can do. Greasemonkey may be the thing which makes me switch from Safari to Firefox (it’s just a shame nobody has sorted out Mozex for the Mac, as that’d certainly clinch it for me, too: I miss being able to edit LJ comments in a proper text editor).

Think I’d better dance now.

Chiark is a Unix box on which a large number of Cambridge geeks have accounts (I’m not one of them, as it happens, but I know some of them by name and a few of them by sight). It runs some local newsgroups, which are only accessible to people with accounts. They’ve recently added a journals newsgroup, to which some people are publishing their LJs (it’s a one way street at the moment, by the sounds of it: entries and comments go from LJ to the newsgroup, but not vice-versa). This has caused some excitement on my friends and friends-of-friends lists. Of particular note are atriec’s posting on what LJs are for, emperor’s own views (I’m not sure why Chiark is “cabal” there, but it’s the same thing being discussed), and mobbsy’s comparison of LJ and newsgroups. There are a couple of coupled problems here: LJ’s interface is not useful for having discussions (as opposed to simply pontificating) and some people don’t actually want to have discussions anyway.

LJ’s limitations do annoy me. As I said to livredor recently, I’m here for the people, not the interface. Compared to sites like Google or Flickr, LJ hasn’t done very well at making its stuff accessible by computer programs which might do useful things with it, such as re-presenting it in a way which is easier to read, remembering what I’ve already seen and alerting me to new stuff, and so on. OK, so there’s RSS, but that’s no good for comments. OpenID is a step in the right direction, but largely solves the opposite problem, namely letting non-LJers put their stuff here. The client protocol is, again, designed to let people put stuff on LJ, not to take it out. LJ explicitly says that they don’t like screen scraping (that is, programs which extract information from the LJ pages which are designed to be read by humans) as lots of programs doing this will request lots of pages very rapidly and put more strain on their servers than they’d like.

LJ is getting better as a discussion forum, but the pace of change is slow. Tags are useful, OpenID is pretty cool, but on the whole LJ’s developers also seem to spend a lot of time on making it look pretty (a worthy goal, since newsgroups are pretty ugly by comparison, but probably not worth all that much time from the developers, who could just provide the users with the tools to do it themselves). That’s probably down to their target audience, I suppose: a few refreshes of the random journal link show that LJ is largely populated by teenage girls (and by Russians, for some reason). See also the large number of people saying “actually, we want more user icons, not this OpenID thing” on the OpenID announcement.

There’s also the question of what a LiveJournal is for. livredor’s posting on manners on LJ made the point that nobody is very sure what the etiquette is for making comments on other people’s postings. Having been brought up on newsgroups, I assume that anything I can see and which has comments enabled is fair game, although in deference to the fact that I’m entering someone’s personal space, I’ll usually introduce myself before diving in. But I suppose I could still end up horribly offending someone. It’s possible that most LJ users don’t want to have long discussions on their journals, in which case LJ would be wasting their time by making that easier, and I should just find somewhere better suited to that, which supports OpenID.

What would be the ideal, for me? The distribution system of Usenet (the network of servers which provides access to the public newsgroups) means that you can’t really recall postings once you’ve made them, and also makes it hard to make the equivalent of friends-only postings (you could do it, but it’d be hard to conceal the fact that you’d at least made a posting that someone else couldn’t see). So, I don’t object to LiveJournal’s centralisation in itself, because it helps me keep control (and now OpenID means I can entrust non-LJ people with my friends-only stuff, if I want). On the other hand, the interface sucks when you want to follow a discussion.

I’d like to see more machine readable stuff (especially comments) and a better API for clients to use to pull out comments and so on. I suppose I’d really like to see LJ run an NNTP (newsgroup) server which wouldn’t distribute stuff, but which would allow the same restricted amount of HTML that LJ itself does. A journal would be a group, an article would start a new thread, and the comments would be followups. Stuff that you weren’t meant to see just wouldn’t show up in the group, because you’d need to log in to the server to see it. I like this idea, although I can’t really see LJ implementing it. Maybe we should start a meme to campaign for it? We could call ourselves the Campaign for Real News.

Click for a bigger version

If you’re an English geek of a certain age, one Christmas you probably received a computer game which wasn’t like any games which had gone before. It came in an A5 box, big enough to hold the manual, the novella, the poster with all the ships on it, the leaflet listing all the keys you needed to fly your spaceship, and, I dunno, some other stuff, probably. Elite pitted you and your trusty Cobra Mk III against an open-ended universe, where you could make a living by trade, piracy, bounty-hunting or, as your reputation grew, by carrying out missions for people. It depicted the universe in glorious wire-frame 3D. As Francis Spufford explains in Backroom Boys, it was groundbreaking and absorbing. That particular Christmas, I played it so much that, as I slept, visions of Pythons danced in my head.

Oolite is a free, open source Elite clone in Objective C for Mac OS X. It’s faithful to the original, but there are some improvements (the targeting box which shows the distance, ship type and legal status in the picture above, for example), and some nice touches (notice the skull and crossbones on the Mamba?) You also feel more a part of the world than in the original, as you come across other ships engaged in combat, or a pirate caught by the long arm of the law. The nostalgia! Buy a Mac and play it.

Some while ago, Mark Pilgrim did a post on cool software tools he couldn’t do without. In the absence of any pending rants about religion, here’s mine. Probably only of interest to fellow geeks, so cut for length.
<lj-cut>
Mail:

I’m still using Pine, that staple from university days. As it’s terminal based, I can use it to read my mail by logging in to my machine from wherever I happen to be. It supports multiple incoming folders for mailing lists and the like, and multiple roles. It’ll invoke an external editor, so I can use Vim to write email. The address book is nice. It keeps mail in flat text files, which, despite being a somewhat broken format, is easily understood by grep and the like. What more do you want? I hear good things about Mutt and also Apple’s own Mail.app, but nothing which compels me to change.

I run Exim as a mail transport agent, and use its nice filtering language to handle sorting stuff into folders, ditching HTML mail sent to my Usenet posting address, and that kind of thing. Fetchmail gets the mail to Exim. Exim calls dccproc, which checks for bulkiness, and rblfilter, which checks for blacklisted senders.

My pobox.com forwarding address has been around since 1998 and so gets a tonne of spam, but since I used their spam filtering options to block China, Korea, and Brazil, and also turned on their cunning “looks like a consumer broadband machine” test (which looks for bytes from the IP address in the machine’s hostname, as that’s a common naming convention for broadband addresses), spam is a solved problem for me.
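I don't know how Pobox actually implements the broadband test, but the idea of spotting bytes of the IP address in the hostname can be sketched in a few lines of Python. The threshold of two matching bytes is my own guess, and the addresses and hostnames are reserved documentation examples rather than real machines.

```python
import re


def looks_like_broadband(ip, hostname, min_octets=2):
    """Guess whether a hostname was generated from the machine's IP address.

    Counts how many bytes of the dotted-quad IP turn up as numbers in the
    hostname. min_octets is an assumed threshold, not Pobox's real rule.
    """
    octets = ip.split(".")
    # Normalise "082" to "82" so zero-padded hostnames still match.
    numbers = {str(int(n)) for n in re.findall(r"\d+", hostname)}
    hits = sum(1 for octet in octets if octet in numbers)
    return hits >= min_octets


dsl = looks_like_broadband("203.0.113.45", "host-203-0-113-45.dsl.example.net")
server = looks_like_broadband("192.0.2.25", "mx1.example.org")
```

A proper mail server usually has a hostname someone chose by hand, so it tends not to contain the address, which is what makes the heuristic useful.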

News:

I was using trn, but gave that up after failing to compile it for OS X. slrn is a worthy replacement, with colour highlighting and a useful scoring language. As well as using that to killfile people, I can increase the score of posters or threads which interest me and sort by score in the thread view. slrn shares trn’s handy habit of doing the right thing when you just keep hitting space, which is handy for eating and reading news at the same time.

I use Leafnode to fetch news from a variety of servers (NTL groups from their server, news.individual.net for everything else). A tip for Mac users: Leafnode creates directories full of lots and lots of small files (one per article, in fact). HFS+, the native OS X filesystem, is dog slow at accessing these. Make a UFS disk image and put your news spool on there.

Editing:

I use Vim, which combines the usability of the old Unix vi with the startup time of Emacs. It does all the usual good stuff like syntax highlighting every language known to man (including quoted text in mail messages, which is nice), indenting automatically and all that jazz. A killer feature is the function which will complete words from occurrences in the same file, or from a tags file (a list of all the names defined in a program). Helpful for not getting variable names wrong and also in rants where you find yourself writing “evangelical” a lot. The interface to cscope is also very useful when writing C code (and more importantly, trying to understand other people’s C code).

Browsing:

Since I started using OS X, I’ve been happily using Safari as my web browser. When writing long comments here on LiveJournal, I occasionally miss the text entry box editing facility of Mozex, since I could then edit the comments with Vim, but since no-one’s ported Mozex to the Mac yet, I’ve not switched to Mozilla or Firefox. You Windows users should so switch, of course, because Firefox is nicer and a lot more secure than IE.

Uploading:

I maintain my websites with sitecopy, which replaces that Perl or Python script which everyone seems to have written at least once to FTP stuff to their provider’s web space. sitecopy works with both NTL’s and Gradwell’s servers and can do useful stuff like uploading based on hash values rather than modification times.

I post to LiveJournal using Xjournal, a pretty and featureful client for OS X. If I want to post from the command line, I use Charm.

Scripting:

I prefer Python to Perl for scripting tasks. As Yoda says, you will know Python is better than Perl when your code you try to read six months from now.

Mudding:

I use Mudwalker when I have the OS X GUI available, mainly because I’ve made it talk using the Mac’s speech synthesis stuff. I’d also recommend Crystal.

Coding:

From within Vim, I make heavy use of ctags and cscope for browsing around code and jumping to declarations and references to a symbol. You can do it with grep, but it’s not as nice (and a lot slower on big projects).

I’ve also used Smatch to write customised static checkers for C code. Smatch is a modified version of GCC which outputs an intermediate language which is readily processed by your scripting language of choice. If you’ve ever found yourself writing Perl or Python code to parse C directly, you probably should have used this instead. There are some scripts which come with it which can do useful things like attach state to particular code paths as your script parses the code, and allow you to describe what happens to that state when the paths merge (so you could check that all paths free anything they’ve allocated, say).

ladysisyphus writes that Jesus has laser beams. As does Aslan, which makes sense if you think about it.

It turns out that Macs have speech synthesis built in. It’s not bad, and it’s easily accessible to programmers. So I’ve spent an entertaining evening making my MUD client talk. That way, if the window is hidden, I still find out when someone interesting logs in. I’ve ended up using MudWalker, a free, open source MUD client for Mac OS X. It’s scriptable in Lua, and helpfully provides a speak function to Lua scripts. The thing prospective programmers will want to know is that your regular expression match groups (the things Perl would call $1 and so on) are arg[n] to the Lua scripts you can use to write triggers. For console use, I’d still recommend Crystal as a good MUD client, but it turns out to be a bugger trying to get that to talk (Crystal is supposedly scriptable in Lua, but my attempts killed it).
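MudWalker's triggers are written in Lua, so the following is only a Python analogue of the pattern: match a line with a regular expression, hand the match groups to an action as `arg`, and get back something to say. The log-in message format is invented, and I've assumed `arg[1]` is the first group, as Lua's 1-based tables suggest.

```python
import re

# Each trigger pairs a pattern with an action which receives the match groups.
# The "has logged in" wording is a hypothetical MUD message, not a real one.
TRIGGERS = [
    (re.compile(r"^(\w+) has logged in\.$"), lambda arg: f"{arg[1]} has arrived"),
    (re.compile(r"^(\w+) tells you: (.*)$"), lambda arg: f"Message from {arg[1]}"),
]


def phrases_to_speak(line):
    """Return the phrases which the triggers would pass to a speak function."""
    phrases = []
    for pattern, action in TRIGGERS:
        match = pattern.match(line)
        if match:
            # arg[0] is the whole match, arg[1] onwards are the groups.
            arg = (match.group(0),) + match.groups()
            phrases.append(action(arg))
    return phrases


spoken = phrases_to_speak("Gandalf has logged in.")
```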

Also been looking at Twisted, Python’s marvelous asynchronous mouse organ networked application framework thingy. It seems that as well as being very clever, it’s actually reasonably well documented these days. The Perspective Broker and Avatar stuff seems to be a good fit for games where the players can write code which is not trusted by the system, since the choice of which methods allow remote access imposes some sort of capability based access control. If I ever wrote a MUD in Python, something I’ve been threatening for some years now, Twisted would be the way forward (indeed, it was originally created to provide multiplayer interactive fiction in the form of Twisted Reality, another addition to the fine Internet tradition of hugely ambitious, but largely unfinished, MUD servers).

It’d probably be easier just to do this in Java. Python’s restricted execution stuff is not really there, so if you wanted to allow players to program (which I think is essential for holding people’s interest once they’ve finished the game proper) you’d probably end up running untrusted code in another process and using PB to talk back to the server. Still, it’s a nice dream. I saw that the author of MudWalker has got a prototype MUD written in E, the capability-based security language, which might well be worth a look too.

I much prefer old, depressive-to-rival-Leonard-Cohen Counting Crows to new, happy Counting Crows.

It’s quieter now all those weddings and barbecues have subsided for a bit. Me and she had a lovely evening with Gareth and Emma the other night. Gareth is the Scourge of uk.religion.christian, putting nutters to flight with his rapier-like logic. Or something.

I remember mentioning Leonard Richardson’s Guess the Verb interactive fiction game, Munchkin, and opportunitygrrl (whose interests include geology, interplanetary video feeds, Mars, and Christina Aguilera).

Speaking of interactive fiction, I remember I promised terriem some links to IF works recommended by S. terriem also pointed me at Kingdom of Loathing, which I’ve not tried yet. I played through Slouching Towards Bedlam and enjoyed it, although I did have to resort to the help a couple of times. Still, the story’s the thing in this one, not the puzzles so much.

S also recommended 9:05 (which is short and funny in a twisted sort of way), and Spider and Web (which I’ve played a little way, and which is apparently longer). Get them from Home of the Underdogs in their IF section. To play them, you’ll need an interpreter which runs the files. On the Mac, I got Frotz from Fink for the .z5 files and MaxTADs for the TADS ones. This page lists IF interpreters for Windows. There’s a selection of Beginner’s Guides to help with the conventions of the medium.

There is a confusing multitude of spam filters out there. I once wrote an article listing all the ways of filtering spam I could think of. If you’re confused by all this, here’s what I do, along with ways of doing the same thing on both Unix and Windows systems.

<lj-cut> My first line of defence is a bunch of blacklists. These don’t work on the From address of the spam, which is usually forged, but rather on the IP address of the machine sending the email. There are a multitude of blacklists available, too. They differ in their listing criteria from narrow listings of machines which have sent spam, to broad listings of entire networks, intended to help you boycott ISPs which support spam. Getting legitimate email is more important to me than filtering all the spam, so I choose narrowly focussed blacklists. I use:

  • The Spamhaus Blocklist, a manually edited list of the worst corners of the Internet. These days, spammers tend to host their websites in these places and exploit other people’s machines to actually send their spam. Which is why I also use…
  • The Spamhaus Exploits Blocklist, an automatically compiled list of machines which have been taken over by spammers, probably without their owners’ knowledge. Windows users with cable modems, usually.
  • The Open Relay Database, another list of machines which are exploitable in a different way (mostly not a way which is used by spammers these days, but it occasionally catches something).
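Blacklists like these are normally queried over DNS: you reverse the bytes of the sending machine's IP address, stick the list's zone on the end, and look the name up. If the name resolves, the address is listed. Here is a minimal Python sketch of that convention. I'm assuming `sbl.spamhaus.org` as the zone for the first list above, and a real filter should also inspect the returned address, since some lists answer with special codes to report errors.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 bytes, then the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the blacklist has an entry for this IP address.

    resolve can be swapped out, which keeps the function testable offline.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # no such name, so the address is not listed
    return True


# 192.0.2.99 is a reserved documentation address, used here purely as an example.
name = dnsbl_query_name("192.0.2.99", "sbl.spamhaus.org")
```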

If you want to filter your email using these blacklists, and you’re on Windows, you could try Spampal. It is completely free and very stable. It will work for you if you collect your mail using something like Thunderbird or Outlook Express (but don’t use OE unless you want to become one of the aforementioned exploited Windows owners). It works by sitting between your mail server and your mail program and marking suspect mail as it goes by. You then configure a filtering rule in your mail program to move the suspect mail into a separate folder. If you pare down the blacklists Spampal uses to just those listed above, it shouldn’t slow your mail downloads too much.

If you’re on Unix and you run your own mail server, receiving mail directly from the Internet, that server will probably have support for using these blacklists. If you pull mail from elsewhere, using fetchmail, say, so that your mail server doesn’t see the IP address of the machine which originated the mail, there’s a little Perl script called rblfilter which will help. It doesn’t seem to be maintained anymore, so I’ve put a copy here. You’ll need to work out how to tie it into your email system and edit the script according to the instructions in the comments.

The next line of defence is the Distributed Checksum Clearinghouse. The DCC works by sharing information about how many other copies of a particular email are floating around the Internet. If there are a lot of copies, it’s either something like a mailing list, or it’s spam. To use the DCC, you tell it where you expect to get legitimate bulk email from. Everything else you get which is bulk is therefore spam. The DCC is designed for Unix, so the web pages and Google will tell you how to get it set up there. There is a plugin for Spampal which will also let Windows people use the DCC. It’s beta software, that is, released to the public for testing, so it may contain some bugs: I’ve no idea how stable it is (despite getting a credit on that page, I didn’t actually write it).
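To make the logic concrete, here is a toy version in Python. It is nothing like the real DCC, which uses fuzzy checksums and a network of servers; this one counts exact copies in memory. The class name, the threshold and the addresses are all invented for the example.

```python
import hashlib
from collections import Counter


class ToyClearinghouse:
    """Counts how many times each message body has been reported."""

    def __init__(self):
        self.counts = Counter()

    def report(self, body):
        """Record one more copy of this body and return the running total."""
        # Crude normalisation: ignore case and differences in whitespace.
        normalised = " ".join(body.split()).lower()
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        self.counts[digest] += 1
        return self.counts[digest]


def classify(sender, body, clearinghouse, expected_bulk_senders, bulk_threshold=100):
    """Bulk mail from a sender you did not expect bulk mail from is probably spam."""
    copies = clearinghouse.report(body)
    if copies < bulk_threshold:
        return "not bulk"
    if sender in expected_bulk_senders:
        return "expected bulk"
    return "probably spam"


house = ToyClearinghouse()
for _ in range(5):
    house.report("Buy   pills now")  # other people reporting the same message
lists_i_read = {"list@example.org"}  # where I expect legitimate bulk mail from
verdict = classify("stranger@example.com", "buy pills NOW", house, lists_i_read, bulk_threshold=5)
```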

If someone else manages your email for you, and you read it via a web interface, for example, then you should have a look at the spam filtering options you have available. I’ve just noticed that Pobox.com, who provide a forwarding address for me, now let people configure their service to reject mail based on those blacklists.

Fight the pink menace!

A couple of students in Another Place are in trouble for “hacking”. The newspapers aren’t particularly specific about what they did, but it sounds like they installed a packet sniffer and listened in on traffic across their network.

Ethernet networks have everyone hanging off the same piece of wire. If you’re on an Ethernet network, your network card has a unique address. As the traffic for everyone on that piece of wire flows by, your computer picks up traffic addressed to it. It doesn’t listen to other people’s traffic because you usually don’t care about it. However, by running your network card in what is delightfully known as promiscuous mode, you can see other people’s traffic. Programs which do this and present the results to you are called packet sniffers. Ethereal is a popular free packet sniffer. Packet sniffers have legitimate uses, like diagnosing network problems or writing and debugging software which uses the network (I installed Ethereal the last time I was having problems with DNS lookups, for example). The remedies for undesired sniffing are encryption and restructuring the network so everyone’s packets don’t share the same piece of wire.
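The difference between a normal network card and one in promiscuous mode can be shown with a toy simulation. This Python sketch models only the shared wire and the address check described above (no broadcasts, no switches), and the class names and addresses are made up.

```python
class NetworkCard:
    """A card which keeps frames addressed to it, or everything if promiscuous."""

    def __init__(self, address, promiscuous=False):
        self.address = address
        self.promiscuous = promiscuous
        self.received = []

    def on_frame(self, destination, payload):
        if self.promiscuous or destination == self.address:
            self.received.append((destination, payload))


class SharedWire:
    """Every frame sent on the wire passes every attached card."""

    def __init__(self):
        self.cards = []

    def attach(self, card):
        self.cards.append(card)

    def send(self, destination, payload):
        for card in self.cards:
            card.on_frame(destination, payload)


wire = SharedWire()
alice = NetworkCard("00:00:5e:00:53:01")
bob = NetworkCard("00:00:5e:00:53:02")
sniffer = NetworkCard("00:00:5e:00:53:03", promiscuous=True)
for card in (alice, bob, sniffer):
    wire.attach(card)

wire.send(alice.address, "PASS hunter2")  # an unencrypted password for alice
```

After that `send`, alice and the sniffer have both recorded the frame while bob has nothing, which is exactly why the plain text password is the problem rather than the sniffer.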

The Oxford students seem to have been disciplined for drawing attention to what they did, but none of what they found is news. A college network probably has everyone hanging off the same wire. There are encrypted versions of telnet, HTTP, IMAP and POP3 but not many people use them. There are a lot of clever people with time on their hands. You work it out.

People who know this have done some sort of risk calculation and come up with a solution that they’re happy with, which balances convenience against privacy. For example, I only permit encrypted logins to my machines and don’t send my password itself when fetching email (although the mail itself comes across the wire as plain text). Now you know what’s possible, you can do that calculation too.