Friday, May 10, 2013

Cutting Through the Noise on Twitter

So you've been on Twitter for a while now.  Over time you've steadily followed more and more people, and now you've reached a point where you're starting to get overwhelmed by the endless barrage of tweets.  What do you do?  It's time for a clean-up.

Slowly the people you follow change, and your own interests change, so once in a while it doesn't hurt to go through and re-evaluate whether you still want to follow someone.  Keeping your timeline manageable and full of relevant content will make your Twitter experience more enjoyable and more valuable.  Clearing out the clutter also leaves room to follow new people whose interests are more aligned with your own.

One of the more subtle ways Twitter can become unmanageable is by following people who tweet a lot of things you're not interested in.  We're all guilty of it.  Sometimes our posts are irrelevant and aren't particularly interesting to others, but that's OK.  That's what I like about Twitter: you get to see the real person and get a better understanding of their personality.  It does, however, become a problem when people start tweeting 50 or so times a day and fill your timeline with things you don't care about.  I consider 10 tweets a day to be okay, but depending on the quality of the content I don't mind if people post more.

The following is a quick and easy way to find out if someone is cluttering up your timeline.  Using the Linux command line, or Cygwin if Windows is your thing, you can count how many tweets each person makes and find out who the culprits are.

Open up Twitter and scroll all the way to the bottom until you can't load any more tweets.  Select all the text by pressing Control-A.  You should see something like the image below.  Then press Control-C to copy all the text.  Paste the text into a text editor and save it as a file, for example Tweets.txt.

[Image: Twitter timeline. Caption: Copying Twitter Information]

Running the following command on the file you just created will find all the instances of Twitter users in the timeline.  This basically means any time an at symbol appears followed by one or more alphanumeric characters or underscores.  The results are then sorted by name, counted, and then sorted by frequency.

grep -oE '@[A-Za-z0-9_]+' Tweets.txt | sort | uniq -c | sort -n

[Image: Linux command line. Caption: Looking for Excessive Tweeters]

As the data I captured covers a period of about 2 days, you can see the majority of people tweet 10 times or fewer a day (a count of about 20 or less).  I'm not too worried about most of the ones over that limit as they're accounts that I find valuable.  You can however see that a lot of the tweets in my timeline are from 4 news services, and if I can get rid of any of these it will make a big impact on the number of tweets I get.
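The per-day check described above can be sketched as a small shell function.  This is only a rough sketch: the name `noisy_accounts` and its three arguments are made up for illustration, it assumes you know roughly how many days your capture covers, and it counts every appearance of a handle in the pasted text (mentions included), not just tweets that person wrote.

```shell
# Hypothetical helper: list handles that appear more than LIMIT times a day.
# usage: noisy_accounts FILE DAYS LIMIT
noisy_accounts() {
    # Find handles, then count how often each one appears.
    grep -oE '@[A-Za-z0-9_]+' "$1" | sort | uniq -c |
        awk -v days="$2" -v limit="$3" '
            # $1 is the count from uniq -c, $2 is the handle
            $1 / days > limit { printf "%s %.1f\n", $2, $1 / days }'
}
```

For my 2 days of data and a limit of 10 a day, the call would be something like `noisy_accounts Tweets.txt 2 10`, which prints each offending handle with its average per day.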

That should find all the user handles in your timeline, but you still need to find out about retweets, and that can be done with the next command.  It looks for instances of the phrase "Retweeted by" with any text after it and does the same counting and sorting as the last command did.

grep -o 'Retweeted by.*' Tweets.txt | sort | uniq -c | sort -n

[Image: Linux command line. Caption: Looking for Excessive Retweeters]

Because retweets are listed by name and not the user handle, the results will look a bit different.  Once again you can see that the news services are retweeting a lot, and although I find their information valuable, there's quite a bit of duplication between the news stories they report, particularly "7News Brisbane" and "Nine News Brisbane".  So I'm starting there.  I prefer 7 News so I'm dropping 9 News.  I plan to do this slowly over time and not rush into it, cutting only a couple of accounts at a time and seeing how it goes.  There may also be problems dealing with Unicode names, but I'll leave that as an exercise for the reader to sort out.
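If you would rather see bare names with the heaviest retweeters at the top, one possible variation on the retweet count is sketched below.  It assumes the pasted text marks retweets with the literal phrase "Retweeted by", as it did in my capture, and the function name `top_retweeters` is just something I picked for illustration.

```shell
# Hypothetical helper: count retweets per display name, busiest first.
# usage: top_retweeters FILE [N]   (N defaults to 10)
top_retweeters() {
    grep -o 'Retweeted by.*' "$1" |
        sed 's/^Retweeted by *//' |   # drop the prefix, keep the name
        sort | uniq -c | sort -rn |   # count each name, largest count first
        head -n "${2:-10}"
}
```

Running `top_retweeters Tweets.txt 5` would show the five names that retweet into your timeline most often.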

Ultimately, what I really want is something like this: Twitter volume controls.

[Image: Twitter mockup. Caption: Twitter Volume Mockup]

I'd like to be able to go into the list of people I follow and individually change how much of their feed I see.  I might only be interested in their tweets and not their retweets, or vice versa.  Alternatively, I'd also like to be able to reduce the number of tweets I see.  An easy way to do this is to just randomly let through a certain percentage of tweets, but a heuristic method based on my Twitter habits would be best.  I don't, however, want Twitter to decide for me.  I want full control to tweak my timeline as I see fit.
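To illustrate the simple random approach, here is a rough sketch that lets through roughly a given percentage of its input lines.  This is not a Twitter feature, just the idea in miniature: the `volume` function name and the one-tweet-per-line input are assumptions made purely for illustration.

```shell
# Hypothetical volume control: pass roughly PCT percent of input lines.
# usage: some_source | volume PCT
volume() {
    awk -v pct="$1" '
        BEGIN { srand() }        # seed from the clock so runs differ
        rand() * 100 < pct       # rand() is in [0, 1); matching lines are printed
    '
}
```

With a file holding one tweet per line, `volume 30 < tweets_one_per_line.txt` would show about 30% of them.  A volume of 100 lets everything through and 0 mutes the account completely.  Because the choice is random it has no idea which tweets I'd actually care about, which is why a heuristic would be better.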

This would also benefit Twitter.  They'd be able to get more fine-grained feedback on what users think about the quality of a particular user's tweets.  It would also improve the value of my timeline to me and I'd be more inclined to use their services.
