Over the last few weeks, pointed questions have been raised about the failure of the word “Wikileaks” to appear in Twitter’s trending topics at a time when discussion of the Wikileaks revelations and associated controversies has dominated political discourse around the world. (I wrote three essays on the subject, the first largely dismissive of claims of shenanigans, the second and third much less so.)
On Wednesday Twitter finally released an official statement on the controversy.
In that statement, they answer the question of whether they’ve blocked Wikileaks from the lists with an “absolutely not,” then go on to provide an overview of what trending is and how it works.
What they have to say will be familiar to those who followed my back-and-forth with company representative Josh Elman on my blog last week, but here are the bullet points:
- “Twitter Trends are automatically generated by an algorithm that attempts to identify topics that are being talked about more right now than they were previously.”
- “Put another way, Twitter favors novelty over popularity.”
- “Sometimes a topic doesn’t break into the Trends list because its popularity isn’t as widespread as people believe.”
- “And, sometimes, popular terms don’t make the Trends list because the velocity of conversation isn’t increasing quickly enough, relative to the baseline level of conversation happening on an average day.”
This is all reasonable and plausible, as far as it goes. It’s the simplest explanation, for instance, for why “Lenon” trended on the anniversary of John Lennon’s murder, but “Lennon” didn’t. (More on that in a moment.)
What it doesn’t explain, though — what it doesn’t even begin to try to explain — is the data point that’s single-handedly responsible for more than ten percent of this blog’s total views ever: the weird way that the word “Sundays” trended last weekend.
“Sundays” wasn’t a novel term. Compared to “Wikileaks” it wasn’t a high-traffic term. There’s no indication that it had a particularly high velocity of adoption. And its baseline on an average day was actually quite high. At first glance, it doesn’t appear to fit any of the criteria Twitter lays out.
But there’s a hint there — the word “widespread.” I paraphrased it as “high-traffic” above, but let’s look at something Josh Elman said to me in our exchange the other day:
“Trends isn’t just about volume of a term but also the diversity of people and tweets.”
“Diversity” is a term Josh used several times in that exchange, and it provides a big clue to the “Sundays” trend. Commenter MrTiggr has hypothesized that Twitter conceptualizes its users as existing within “clusters,” groups of people linked by shared connections on the service. The popularity of a term across clusters, he suggests, is likely to be much more significant to Twitter’s algorithms than its raw volume.
Both “diversity of people” and “diversity of tweets” metrics serve as ways of keeping people from manipulating the trending topics list. It’s harder to get strangers to tweet on a topic than people you know, and it’s harder to get people to tweet new content than to retweet something. So these rules make sense from that perspective.
But each of them also serves to boost certain kinds of trends at the expense of others.
Let’s look at “Sundays” again. It trended — according to this argument — because lots of people with no connection to each other tweeted lots of different things about it. But if you think about it, that’s just because “Sundays” isn’t a topic at all. A person who tweets “I love lazy Sundays” and one who tweets “are you coming to Sundays [sic] meeting” and one who tweets that a particular store “will be open Sundays in December” aren’t tweeting about a shared experience, or a shared interest, or a shared joke. They’re just using a common word.
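To make the point concrete, here is a toy sketch of what a “diversity of people and tweets” measure might look like. It is purely illustrative: Twitter hasn’t published its algorithm, and the function name `diversity_score`, the cluster labels, the sample retweet text, and the formula itself are all assumptions invented for this example.

```python
def diversity_score(tweets):
    """Toy diversity measure for a single term (an invented formula, not Twitter's).

    `tweets` is a list of (cluster, text) pairs, where `cluster` is a
    hypothetical label for the group of connected users the author belongs to.
    Returns a value between 0.0 and 1.0.
    """
    if not tweets:
        return 0.0
    total = len(tweets)
    distinct_clusters = len({cluster for cluster, _ in tweets})
    distinct_texts = len({text for _, text in tweets})
    # Share of distinct clusters multiplied by share of distinct wordings.
    return (distinct_clusters / total) * (distinct_texts / total)


# Three strangers using a common word in unrelated ways.
sundays = [
    ("cluster_a", "I love lazy Sundays"),
    ("cluster_b", "are you coming to Sundays meeting"),
    ("cluster_c", "will be open Sundays in December"),
]

# Three friends in one cluster retweeting the same (made-up) message.
retweets = [
    ("cluster_a", "RT: check out this link"),
    ("cluster_a", "RT: check out this link"),
    ("cluster_a", "RT: check out this link"),
]

sundays_score = diversity_score(sundays)
retweets_score = diversity_score(retweets)
print(sundays_score, retweets_score)
```

Under these made-up rules, the unrelated “Sundays” tweets get the highest possible score precisely because they have nothing to do with each other, while the coordinated retweets score far lower.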
And when you combine this bias in favor of diversity of people and tweets with the algorithm’s bias in favor of novelty, you get the Lenon/Lennon anomaly I mentioned above: Because a fair number of people talk about John Lennon on Twitter on any ordinary day, a bump in traffic for that name won’t register much. But since the “Lenon” misspelling is uncommon, when “Lennon” traffic rises — bringing “Lenon” traffic with it — “Lenon” will register as a novel topic and attract the attention of the Trending Topics gremlins.
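The arithmetic behind that anomaly can be sketched in a few lines. This is a guess at the general shape of a “novelty over popularity” score, not Twitter’s actual formula; the function `surge_ratio`, its `smoothing` parameter, and every count below are invented for illustration.

```python
def surge_ratio(today, baseline, smoothing=1.0):
    """How many times busier a term is today than on an average day.

    Hypothetical scoring rule: `smoothing` is an assumed constant that
    keeps brand-new terms (baseline of zero) from dividing by zero.
    """
    return (today + smoothing) / (baseline + smoothing)


# Invented tweet counts, chosen only to show the effect.
counts = {
    "Lennon": {"baseline": 20_000, "today": 2_000_000},
    "Lenon": {"baseline": 50, "today": 20_000},
}

ratios = {
    term: surge_ratio(c["today"], c["baseline"]) for term, c in counts.items()
}

# Rank terms by how sharply they rose, not by how many tweets they got.
ranked = sorted(ratios, key=ratios.get, reverse=True)
print(ranked)
```

With these numbers the correct spelling has a hundred times the traffic of the misspelling, yet the misspelling ranks first, because its rise is measured against a baseline of almost nothing.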
Two other factors combined to make “Lenon” trendable, I’m guessing. First, there’s the fact that people who are less interested in John Lennon (and so less likely to tweet about him on an ordinary day) are less likely to know much about him (and so more likely to misspell his name when they tweet about him on a special day). Second, there’s the fact that the “Lenon” misspelling is one that Spanish-speakers are more likely to stumble into, which means that a higher proportion of “Lenon” than “Lennon” tweets are going to come from outside the anglophone Twittersphere.
So Lenon trended and Lennon didn’t, for reasons that are perfectly understandable. But it’s important to stop at this point and note that even though the reasons are understandable, they still make absolutely no sense.
When “Lenon” trended — and “Lennon” didn’t — the weakness of Twitter’s trending algorithm was revealed. Millions upon millions of people were tweeting Lennon’s name that day, and the vast majority of them were spelling it right, but because some tens of thousands of them had been interested in John Lennon the day before and the day before that — because John Lennon is a subject that people are actually interested in and care about — his name didn’t trend. Because “Lenon” is a meaningless term that nobody was using intentionally, it did.
The more I think about all this, and the more data I look at, the more convinced I become that Twitter has a weird valley in the middle of its algorithm. If a term is completely novel, it’s got a good shot at trending. If it rises in popularity really quickly, it’s got a good shot at trending. But if it’s a little less novel and rises a little more slowly, then it won’t trend, even if the volume and diversity of the traffic are pretty high. The only way to get over that barrier, as with “Sundays,” is with a trend that isn’t really a trend at all.
Well, I don’t know. That traffic really was insanely high. Just absurdly high. And as I’ve noted before, the term “oil spill” trended for weeks on end earlier this year under quite similar circumstances. It’s very, very strange. But if I had to bet, I’d bet that the failure of Wikileaks to trend is not a result of a specific targeting of that particular term. Probably.
Having said that, though, I want to say something else.
This isn’t just a matter of “well, it’s the algorithm.” And it isn’t a matter of the people who run Twitter being idiots, either. When I said above that the algorithm makes no sense, I meant that it makes no sense from the perspective of the interests of the Twitter user, not that it makes no sense from the perspective of Twitter itself.
I think it’s safe to say that Twitter pretty much likes the algorithm the way it is. They’re tweaking it all the time, of course, and they’d be silly not to, but right now it’s presumably producing about the results they’d like it to produce. And what are those results? Lots of memes, for starters — lots of hashtags like #slapyourself and #haveuever and #ifsantawasblack. Also lots of quirky little flash-in-the-pan media stories. And celebrity deaths. Celebrity deaths are a perfect fit for the current algorithm.
What do these trends have in common, other than that they’re not “trends” as the word is most often used? Well, they’re light. They’re casual. They’re here and they’re gone.
It’s safe to say, I think, that Twitter didn’t want Wikileaks to trend. There are many different sensible ways to approach the construction of a trending topics algorithm, and the vast majority of them would have pushed “Wikileaks” to the top of the charts. That didn’t happen, and it’s no accident that it didn’t.
But I don’t think, at the end of the day, that it’s all that likely that Wikileaks was targeted specifically. I think it’s just more likely that Twitter isn’t interested in having any topic like Wikileaks — an ongoing discussion of a major social or political issue, going through peaks and lulls and times of broader and narrower resonance — make the list.
Twitter’s trending topics aren’t intended to measure what people are interested in. They aren’t intended to measure what people are passionate about. They aren’t intended to measure what people are committed to. They aren’t even intended to measure what people are fascinated by.
They’re intended to measure “Ooh! Shiny!”
Update | There’s a lot I left out of this essay, as it was unwieldy enough as it stood, but this is worth mentioning, I think — sponsored trending topics are a revenue stream for Twitter, which means the company has a financial interest in drawing eyeballs to the trending topics list. It’s also worth mentioning that Twitter introduced sponsored trending topics in June of this year.