On the ethics of feeds

Whenever I post a comment on LiveJournal, I get an email containing its text. I’ve written a Python program which turns these emails into an Atom feed, so that people can stalk me more easily by subscribing to it. I get into some interesting discussions on other people’s LJs, so I thought such a feed might be useful.

The program checks that the comment is on a public posting and doesn’t publish it if it isn’t (you can do this by submitting an HTTP HEAD request for the entry in question and seeing whether LJ redirects you to login or sends you a 4xx response, both of which I take to mean “don’t publish”). Edited to add: the program also periodically re-checks for posts changing their privacy settings (there’s a cache with an exponential backoff from a couple of hours to a month to avoid annoying LJ: the backoff is restarted if the entry’s privacy changes).
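
A minimal sketch of that check and of the re-check schedule, in Python. This is not the actual program: the function names, the doubling factor and the exact two-hour and thirty-day bounds are assumptions made for illustration. It also treats anything other than a plain 2xx response as “don’t publish”, which is slightly stricter than the rule described above.

```python
import http.client
from urllib.parse import urlsplit

# Assumed bounds for the re-check interval, in seconds.
MIN_DELAY = 2 * 60 * 60          # "a couple of hours"
MAX_DELAY = 30 * 24 * 60 * 60    # "a month"


def head_status(url, timeout=10):
    """Send a HEAD request and return (status, Location header).

    http.client does not follow redirects, so a redirect to the login
    page shows up here as a 3xx status rather than being followed.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def looks_public(status):
    """Only a 2xx counts as public; redirects and 4xx mean "don't publish"."""
    return 200 <= status < 300


def next_delay(previous_delay, privacy_changed):
    """Exponential backoff for re-checks, restarted when privacy changes."""
    if previous_delay is None or privacy_changed:
        return MIN_DELAY
    # Doubling is an assumed growth factor, capped at MAX_DELAY.
    return min(previous_delay * 2, MAX_DELAY)
```

For a given entry URL, `looks_public(head_status(url)[0])` decides whether the comment goes into the feed, and `next_delay` says how long to wait before asking LJ again.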

I’m not sure whether to do further checks before publishing the comment. On the one hand, all I’m doing is publishing my own words as they appear in someone’s public posting. On the other hand, sometimes people are quite surprised to find that people read stuff they’ve made public, and I don’t want to annoy my friends. Since I mostly comment on your journals, what do you think?

[ LJ Poll 1242038 ]

6 Comments on "On the ethics of feeds"


  1. (I originally voted for the Google check, but now that I come to write down my justification I think I’ve changed my mind so I’ve changed my vote too.)

    Opt-out is surely next to useless: the bad situation you’re trying to protect against is where someone writes something public and didn’t really intend it to be public. It seems overoptimistic to expect that they’ll have given you advance warning that that might happen.

    The Google check is probably overoptimistic too for a very similar reason. It’s (maybe) better than having an opt-out because (1) it’s more likely that someone concerned about privacy of their public posts will think of blocking Google than that they’ll think of asking you to refrain from publishing your responses, and (2) in any case where the check fails to be cautious enough you have the counterargument “well, you haven’t lost much because anyone could run across this with Google anyway”. But this isn’t really such a great counterargument because they can block Google after the fact, and maybe excising comments from your feed isn’t so easy.

    (Note: I have no idea whether LJ actually makes it possible to tell robots not to index your stuff, nor at what granularity you can do it if so. For the sake of argument I’m assuming that it can be done, with a granularity of a single user.)

    Clearly legally speaking you do indeed own your words. I have a lot of sympathy with that option, and just a little with one you didn’t list: “No, don’t do this; the failure mode where someone carelessly posts something publicly and your comment happens to give it away to everyone is just too awful even though it’s rare.”

    1. “(Note: I have no idea whether LJ actually makes it possible to tell robots not to index your stuff, nor at what granularity you can do it if so. For the sake of argument I’m assuming that it can be done, with a granularity of a single user.)”

      Yes, it can, although it took me a while to find where the option has moved to since I last looked. It’s under “Viewing Options”.

    2. “I have no idea whether LJ actually makes it possible to tell robots not to index your stuff, nor at what granularity you can do it if so. For the sake of argument I’m assuming that it can be done, with a granularity of a single user”

      That’s exactly what LJ does.

      I forgot to mention that the script caches the privacy of the entry for a limited time before it re-checks, and then starts backing off to longer times if nothing has changed.

      I expect that if someone makes something public by mistake, they will change it within a few hours of it being posted. So one thing the script should do is wait a few hours before publishing anything it sees from an entry which is new to it. This is a similar idea to LJ’s handling of notifications: you can get an email whenever someone posts a new entry, but the email doesn’t contain the text of the entry, to allow the “oh shit that should have been private… click-click” process to occur (you’re racing your stalker as you do this, but there’s not much to be done about that, it’s just giving you a head start).

      I’ll tell robots (like Google) not to index the feeds directory on my site. There’s no loss there as the other feeds are from my blog, which is indexed, and it’ll prevent further duplication of search results. I’ll also use the Feed Access Control standard to advise aggregators not to pass on the “my comments” feed (for instance, in search results from blog search engines). I hope that doing this won’t prevent web-based aggregators like Bloglines from aggregating the site for reading, but will prevent it from appearing in search results, as this article suggests.

  2. There are lots of people on LJ with babies who post about baby stuff, so from the title of this I thought it was going to be a breast-versus-bottle discussion…

  3. I think those are reasonable checks and balances to operate – if it’s public, then publish!

    (You do know what happens to the souls of people who publish, though, don’t you…?)
