About Me

Web person at the Imperial War Museum, just completed PhD about digital sustainability in museums (the original motivation for this blog was as my research diary). Posting occasionally, and usually museum tech stuff but prone to stray. I welcome comments if you want to take anything further. These are my opinions and should not be attributed to my employer or anyone else (unless they thought of them too). Twitter: @jottevanger

Saturday, August 15, 2009

TwitsTwotsBitsBotsDeliciousDosAndDoNotsIfsAndButs

So I've got into using bit.ly for my short links, particularly on Twitter. I'm sure I don't need to explain why, but aside from these serving the obvious need for brevity within a tweet (but never here...), I appreciate the stats, which appear in real time at minute-scale granularity and so are in some ways clearly superior to what you get from Google Analytics. Here's an example: http://bit.ly/info/3NuA3P . I'm going to talk more about the stats later on but before that a brief digression about Delicious, which I think will repeat some of what I saw in a post by Tony Hirst recently, but it's been brewing so gotta get it out.
What's wong wiv bit.ly and Delicious
The problem with bit.ly is that the things I want to tweet I typically also want to bookmark using Delicious, for whilst bit.ly keeps hold of your tasty links it's not got tagging (and why not, I wonder? That would make it a much more useful and social service). It got to the point where I was wondering why Delicious wasn't offering an integrated short URL service, since right now if you want the full benefits of online/social bookmarking and short URLs neither bit.ly nor Delicious cuts it. Or should I say, right then, since about a week ago (and within a week of my tweeting my bemusement that Delicious wasn't doing this), it did. Bookmark with Delicious now and you get the option to share your link, which produces a short URL. Cool, and yet.... it's not good enough for me. You can only do it by letting Delicious e-mail or tweet the links for you, at which point you see the short URL. But you can't simply view the short code immediately so that you can cut-n-paste at will. Delicious should create one for every single bookmark, with the option of custom links. To do what bit.ly does and tempt me away from it, it must also create unique URLs for each person's version of a link, so that each can be tracked individually (together with the shared one for aggregate data), and it must offer decent stats.
So, that's why Delicious isn't up to snuff yet for me to jump ship from bit.ly, even if that would mean just one operation for tweeting and bookmarking my fave URLs. Hmm, come to think of it perhaps bit.ly could offer OPML output or some simple export or integration with Delicious so you could just synchronise periodically? That might keep me using both services happily. I should say, I'm perfectly aware that there are alternatives to Delicious and that some of them offer better integration with Twitter, but that's my chosen poison and with social stuff the size of the network is vital to its gravity; ain't no bookmarking service with more gravity than Delicious.
Who follows?
But what about those stats about link followers that bit.ly offers? Let's dig into them. What do they really tell us?
I started to get suspicious that so many of the followers of links I tweeted were from the US, and often at a time when normal people would be a-bed across the Atlantic. The real-time stats showed that they were also very quick off the mark, and whilst the streaming nature of Twitter means that you expect responses to be quick or not at all, sometimes the click-throughs seemed to come even before I tweeted (via Spaz, the Air client I normally use). Super-quick, US-based (which only a few of my followers are), and very steady numbers for most tweeted links; were these clicks from real people at all?
Short answer
No, lots of them weren't; a steady residue of link follows came from bots of one sort or another.
Long answer: an experiment
I did a couple of experiments to test this. First I made a page on my own web space just fo' the bots, made a bit.ly link to it, and tweeted it asking humans NOT to click the link. This being a highly scientific experiment, I should here state that an explicit assumption was that my followers deem themselves to be human (though I know this was violated occasionally, including by myself. Doh!). I hoped that the stats from this web page would give me the answer as to how many click-throughs shown in the bit.ly stats were via browsers and how many were bots. Well, yes and no. I forgot how lame the stats on that web-space are. No breakdown by day nor details of users by page, only for the site. Nevertheless I get minimal visits to those pages and a massive peak on the day of that tweet, so I can probably tell enough. All the same I thought I'd better try Google Analytics too, so having set that up for my site I repeated my tweet. Then I thought, perhaps I should have used a new link? So I created a custom name for my URL and tweeted once again.
Some numbers
Of 17 link follows reported by bit.ly (http://bit.ly/info/3wQZ51), 15 were "direct", which would include bots but also most other applications, e-mail clients etc. One was from bit.ly itself (that was me, oops) and one from tweetdeck (Mike, that you?). 10 were from the UK and 7 from the US, and pretty much all of them happened within seconds or minutes of my tweets.
My original "for bots only" tweet on the 8th yielded 6 of the follows, plus that accidental click from me. According to the PlusNet web stats package I had 51 hits and 22 visits that day, which were almost exclusively to that page (with some other pollution from yours truly, no doubt, as I clicked round the site setting stuff up). I guess that means that once they'd found the page via the bit.ly link some of the followers came back a few times. Now that's definitely not human.
Once I had my Google Analytics bit sorted out, on the 11th, I sent a second tweet. This resulted I think in one visit, by bit.ly's stats. This tweet used the original bit.ly short URL (http://bit.ly/3wQZ51), so presumably the bots figured it wasn't worth going there again. Looking back at the tweet, actually, I think I left the "http://" off so perhaps that's the real answer.
Anyway finally I did the same thing again but using a new custom link (http://bit.ly/bottest), which to the bots would appear to be a new link (all except bit.ly's own bots, perhaps?). This produced another 7 follows, and the next day there were two more when I wasn't watching. Google Analytics reported one visit to the target page, from a Firefox/Windows user in south London, so I presume that one of those 10 follows was via a browser. According to PlusNet, there were 28 hits and 15 visits on the 11th (1 visit is more normal).
So how many of the visits were bots? Well, putting GA together with bit.ly's stats I'd say only 1 out of 10 follows on the 11th/12th was not a bot, though it's possible that others were human users that just didn't fire the GA code for one reason or another.
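For anyone who wants to redo the sums, here's a minimal Python sketch of that estimate. The numbers are the ones reported above; the function name is made up for illustration, and it assumes every human click fires the GA code, which (as noted) may not hold, so read the result as an upper bound on the bot share.

```python
def bot_share(link_follows, browser_visits):
    """Fraction of short-link follows with no matching browser visit.

    Assumption: every human click fires the analytics script. Where that
    fails, some "bots" are really humans, so this is an upper bound.
    """
    if link_follows <= 0:
        raise ValueError("link_follows must be positive")
    non_browser = max(link_follows - browser_visits, 0)
    return non_browser / link_follows


# 11th/12th: 1 follow from the repeated tweet, then 7 + 2 via the custom link
follows = 1 + 7 + 2
ga_visits = 1  # the single Firefox/Windows visit that GA recorded

print(f"{bot_share(follows, ga_visits):.0%} of follows look like bots")
```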
Overall, in August so far 13 hits in the web logs are attributed to the bitlybot user agent, 4 to the Tweetmemebot, 2 to twitturls.com's bot, 4 to Spaz, which I know as a user makes requests to something or other (bitly, or the target URL perhaps) to get some page info. A bunch of other bots and non-browser UAs are in there too but I can't say if they're related to the tweets.
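A tally like that could be pulled from a raw access log with something along these lines. This is only a sketch: it assumes the log is in the common "combined" format with the user agent as the last quoted field (I haven't checked what PlusNet actually provides), the sample lines and the /bot-test.html path are invented, and the substrings are simply the agent names mentioned above.

```python
import re
from collections import Counter

# Substrings based on the user agents named in the post (matched
# case-insensitively). Everything else is lumped together as "other".
KNOWN_AGENTS = {
    "bitlybot": "bit.ly bot",
    "tweetmemebot": "Tweetmeme bot",
    "twitturls": "twitturls.com bot",
    "spaz": "Spaz client",
}

# Assumption: combined log format, user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def classify(line):
    """Return a label for the user agent found at the end of a log line."""
    match = UA_PATTERN.search(line)
    if not match:
        return "unparsed"
    user_agent = match.group(1).lower()
    for needle, label in KNOWN_AGENTS.items():
        if needle in user_agent:
            return label
    return "other"


def tally(lines):
    """Count log lines per user-agent label."""
    return Counter(classify(line) for line in lines)


# Invented sample lines, for illustration only
sample = [
    '192.0.2.1 - - [11/Aug/2009:10:00:01 +0100] "GET /bot-test.html HTTP/1.1" 200 512 "-" "bitlybot"',
    '192.0.2.2 - - [11/Aug/2009:10:00:03 +0100] "GET /bot-test.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; TweetmemeBot)"',
    '192.0.2.1 - - [11/Aug/2009:10:00:09 +0100] "GET /bot-test.html HTTP/1.1" 200 512 "-" "bitlybot"',
    '192.0.2.3 - - [11/Aug/2009:10:02:40 +0100] "GET /bot-test.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows; U) Firefox/3.5"',
]

for label, hits in tally(sample).most_common():
    print(f"{hits:3d}  {label}")
```

Substring matching is crude and will miss bots that pose as browsers, so counts like these are a floor for bot traffic rather than a full picture.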
Conclusions
I don't think I can squeeze much more from this paltry sample and the crappy and contradictory web log stats, but clearly nearly all of the visits via Twitter/bit.ly were, as I hoped, not from humans and most likely came from bit.ly's own bot and those of Tweetmeme and twitturls.com. From this evidence, if bit.ly reports that I get half a dozen "clicks" on a short URL I've tweeted then I can assume they're probably bots. More than that and they're probably at least partly human. Whether this applies to other twitterers I can't say, but you can do your own experiments. I'd like to repeat this using a site with a better stats package as bait, and perhaps using a few different twitterers to throw out the link, to see whether there's any relationship between numbers of followers and numbers of bots. Quite likely not, but who knows.
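As a rule of thumb, that amounts to knocking a fixed bot residue off whatever bit.ly reports. A tiny Python sketch, in which the function name is hypothetical and the baseline of 6 is just my "half a dozen" from this one small experiment, not a measured constant:

```python
def likely_human_clicks(reported_clicks, bot_baseline=6):
    """Rough count of clicks left after subtracting a steady bot residue.

    bot_baseline=6 is an assumption taken from one small experiment;
    other accounts may well see a different residue.
    """
    if reported_clicks < 0:
        raise ValueError("reported_clicks cannot be negative")
    return max(reported_clicks - bot_baseline, 0)


for clicks in (4, 6, 20):
    print(clicks, "reported ->", likely_human_clicks(clicks), "possibly human")
```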
Is this any use? I dunno, but I'm a little better informed about the impact of my tweeted URLs now.

[[edit: ironically, looking for the custom link to this post that I made at the weekend, I found their user forums where there's more discussion of the bot problem e.g. http://feedback.bit.ly/pages/5239-suggestions/suggestions/126917-show-me-if-hits-are-bots-human-or-rss-readers-etc-]]
