twitter_scalpal code
[Image: Sample output showing the least active users first]
The tool is a simple Python script that takes the Twitter screen name of a target user as input and then retrieves who they follow and who follows them. For each user found it displays three things:

- How the users are related to each other:
  - If the target user and the found user follow each other, 'linked to' is displayed.
  - If the target user only follows the found user, 'following' is displayed.
  - If the target user is only followed by the found user, 'followed by' is displayed.
- The user's screen name.
- The time that the user last updated their status:
  - The time of the last update in local time.
  - If the user is protected, 'protected' is displayed.
  - If the user hasn't tweeted, 'no tweets' is displayed.
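The relationship labels above boil down to set membership checks in both directions. A minimal sketch (the IDs and the `relationship` helper below are illustrative, not part of the script):

```python
# Hypothetical follower/friend ID sets for illustration.
followers = {101, 102}  # IDs that follow the target user
friends = {102, 103}    # IDs the target user follows

def relationship(user_id):
    # Both directions -> 'linked to'; otherwise a one-way label.
    if user_id in followers and user_id in friends:
        return "linked to"
    if user_id in followers:
        return "followed by"
    return "following"

print(relationship(102))  # linked to
print(relationship(101))  # followed by
print(relationship(103))  # following
```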
The script uses parts of the Twitter API that don't require authentication. This means you can analyse any user who isn't protected, but you will be unable to retrieve data on found users who are protected. As this is for my own use I'm not too worried about implementing authentication, since I only follow one protected user; it shouldn't be too hard to adapt the script to use it. The code is a little rough around the edges and may fail under certain circumstances, but generally it does the job.
To make the results easier to analyse, they are sorted by the time that each user last updated their status, with the least active accounts at the top.
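This sort works because each output tuple starts with a '%Y-%m-%d %H:%M:%S' date string, and those strings happen to sort lexicographically in chronological order. A small sketch with made-up data:

```python
# Made-up example tuples of (date_string, relationship, screen_name).
output_list = [
    ("2012-03-01 09:15:00", "following   ", "bob"),
    ("2010-06-20 18:02:11", "followed by ", "alice"),
    ("no tweets", "linked to   ", "carol"),
]
# Tuples compare element by element, so this sorts by the date string.
# 'no tweets' and 'protected' start with a letter, which sorts after
# any digit-leading date, pushing those entries to the bottom.
output_list.sort()
print([entry[2] for entry in output_list])  # ['alice', 'bob', 'carol']
```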
As the Twitter API is rate limited, the script may take a while to complete. A delay of 24 seconds is added after each request to prevent being blacklisted. Since contacts are looked up 20 at a time, as a rough guess the script should take about 1.2 seconds per follower and followee.
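To sketch that estimate (the function name and contact counts below are illustrative): one request each for the follower and friend lists, plus one lookup request per batch of up to 20 contacts, with a 24 second delay after every request.

```python
TIME_DELAY = 24  # seconds between API requests, as in the script

def estimated_runtime(num_contacts, time_delay=TIME_DELAY):
    # One request each for the follower and friend ID lists, plus
    # one lookup request per batch of up to 20 contacts.
    lookup_requests = (num_contacts + 19) // 20  # ceiling division
    return (2 + lookup_requests) * time_delay

print(estimated_runtime(500))  # 648 seconds, i.e. about 11 minutes
```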
[Image: The output showing the most active users, protected users, and users without tweets]
[Image: The script acquiring data]
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# twitter_scalpal.py
#
# given a Twitter user name, display their
# followers and who they follow along with
# how they are related and the last time
# their status was updated

import sys
import argparse
import os
import shutil
import urllib2
import json
import time
import email.utils
import datetime

# define constants
time_delay = 24  # time between requests to the twitter API
followerURL = "https://api.twitter.com" + \
    "/1/followers/ids.json?cursor=-1&screen_name="
friendURL = "https://api.twitter.com/1/friends/ids.json?cursor=-1&screen_name="
lookupURL = "https://api.twitter.com/1/users/lookup.json?user_id="
entityURL = "&include_entities=true"

# Create command line parser
parser = argparse.ArgumentParser(description='Analyse twitter users')
parser.add_argument('user', type=str, help='twitter user to analyse')
args = parser.parse_args()
userName = args.user

# delete previous data for user and create a location for new data
shutil.rmtree(args.user, ignore_errors=True)
try:
    os.mkdir(args.user)
except OSError:
    print "System error"
    sys.exit(1)

# Get followers
print ""
print "getting followers"
try:
    followers_response = urllib2.urlopen(followerURL + userName).read()
except urllib2.HTTPError, e:
    print "HTTP Error " + str(e.code) + ". Check if user exists"
    sys.exit(1)
except urllib2.URLError, e:
    print "URL Error " + str(e.args)
    sys.exit(1)
time.sleep(time_delay)

# Parse follower response and print follower count
followdat = json.loads(followers_response)
followers = followdat['ids']
print str(len(followers)) + " found"
print ""

# write followers to file
try:
    with open("./" + args.user + '/followers.json', 'wb') as f:
        f.write(followers_response)
except IOError:
    print "Couldn't write followers to file"
    sys.exit(1)

# Get friends
print "getting friends"
try:
    friends_response = urllib2.urlopen(friendURL + userName).read()
except urllib2.HTTPError, e:
    print "HTTP Error " + str(e.code) + ". Check if user exists"
    sys.exit(1)
except urllib2.URLError, e:
    print "URL Error " + str(e.args)
    sys.exit(1)
time.sleep(time_delay)

# Parse friend response and print friend count
frienddat = json.loads(friends_response)
friends = frienddat['ids']
print str(len(friends)) + " found"
print ""

# write friends to file
try:
    with open("./" + args.user + '/friends.json', 'wb') as f:
        f.write(friends_response)
except IOError:
    print "Couldn't write friends to file"
    sys.exit(1)

# calculate number of unique contacts
contacts = list(set(followers + friends))
num_of_contacts = len(contacts)
print "calculating unique contacts"
print str(num_of_contacts) + " found"
print ""

all_contacts = []

# calculate the number of lookup requests to make (20 users at a time);
# this is ceiling division in Python 2
num_of_contact_requests = (num_of_contacts - 1 + 20) / 20

# lookup user information
for i in range(num_of_contact_requests):
    # assemble a URL of up to 20 contacts to lookup in a single request
    namerequest = str(contacts[i * 20])  # first contact to lookup
    for j in range(1, 20):  # add remaining contacts
        index = i * 20 + j
        if index < num_of_contacts:  # prevent out of bounds access
            namerequest = namerequest + "," + str(contacts[index])
    idurl = lookupURL + namerequest + entityURL

    # lookup contact information
    print "getting contact details " + str(i + 1)
    try:
        contacts_response = urllib2.urlopen(idurl).read()
    except urllib2.HTTPError, e:
        print "HTTP Error " + str(e.code) + ". Check if user exists"
        sys.exit(1)
    except urllib2.URLError, e:
        print "URL Error " + str(e.args)
        sys.exit(1)
    time.sleep(time_delay)

    # parse contacts and add them to a list of all contacts
    contactdat = json.loads(contacts_response)
    for single_contact in contactdat:
        all_contacts.append(single_contact)

# collect and format data for display
output_list = []
for single_contact in all_contacts:
    # Assemble a string for the last status date.
    # Can be 'no tweets' or 'protected'
    is_contact_protected = single_contact['protected']
    status_count = single_contact['statuses_count']
    if (not is_contact_protected) and (status_count > 0):
        # convert the UTC time to local time
        created_at = single_contact['status']['created_at']
        parsed_date = email.utils.parsedate_tz(created_at)
        date_string = datetime.datetime.fromtimestamp(
            email.utils.mktime_tz(parsed_date)).strftime(
            '%Y-%m-%d %H:%M:%S')
    elif is_contact_protected:
        date_string = "protected"
    else:
        date_string = "no tweets"

    # Create a space padded string of the contact's screen name
    contact_name = single_contact['screen_name']
    contact_name = contact_name + " " * (17 - len(contact_name))

    # Create a string that describes the contact's relationship
    if single_contact['id'] in followers and single_contact['id'] in friends:
        relationship_string = "linked to   "
    elif single_contact['id'] in followers:
        relationship_string = "followed by "
    else:
        relationship_string = "following   "

    # put the strings in a tuple and add it to a list
    contact_details = (date_string, relationship_string, contact_name)
    output_list.append(contact_details)

# sort the output list. List sorted by first
# element of the tuple, the date string
output_list.sort()

# print the results
print ""
for contact_strings in output_list:
    print contact_strings[1] + contact_strings[2] + contact_strings[0]

# write contacts to file
try:
    with open("./" + args.user + '/contacts.json', 'wb') as f:
        f.write(json.dumps(all_contacts))
except IOError:
    print "Couldn't write contacts to file"
    sys.exit(1)
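The UTC-to-local conversion in the script leans on email.utils, which happens to cope with Twitter's 'created_at' format even though it isn't strictly RFC 2822. A standalone sketch (the sample date below is made up for illustration):

```python
import email.utils
import datetime

created_at = "Wed Aug 27 13:08:45 +0000 2008"      # sample 'created_at' value
parsed_date = email.utils.parsedate_tz(created_at)  # 10-tuple, last item is tz offset
timestamp = email.utils.mktime_tz(parsed_date)      # POSIX timestamp (UTC-based)
date_string = datetime.datetime.fromtimestamp(timestamp).strftime(
    '%Y-%m-%d %H:%M:%S')
print(date_string)  # local-time equivalent of 2008-08-27 13:08:45 UTC
```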
When I use this with python.exe the terminal opens for a moment and closes almost immediately (too quick for me to read the info) and nothing else happens, so it is not working for me. I'll try and check this out...
I'll assume you're using Windows from the python.exe bit. I've had this trouble before too. I think if you run the program from a command line window that is already open you should be fine.
I think, however, you may run into trouble with this, as Twitter has since changed how you can connect to them. Their API now requires you to authenticate, and I didn't include this feature in the code. I was just having a bit of fun, and I think adding it would have been a lot of effort for not much reward.