Let's look at the first problem: discovering potential community members. If I start with a set of seed members and find that all of them have a relationship with a particular individual, there is a relatively high chance that this individual is also part of the community. However, if only one of the seeds has a relationship with the individual, it is less likely (although not impossible, depending on the nature of the network) that they are a member of the community.
We'll start by looking at the Twitter 'friends' of our seeds. As Twitter relationships are unidirectional, these are individuals that the seeds consider to be their peers, or whose messages they value. Below is a function that gets all the friends of a particular screen_name passed in as 'user', which we'll run for all the seeds. The 'options' argument carries the OAuth components, pretty much as I explained in this post. The function returns a list of pairs, each representing one relationship: [user, friend].
function getUserFriends(user, options)
{
  // Get the ids of the user's friends. Only the first page of results
  // (up to 5,000 ids) is read; cursoring is not handled here.
  var idsUrl = "https://api.twitter.com/1.1/friends/ids.json?"+
    "screen_name="+user+"&stringify_ids=true";
  var idsResponse = UrlFetchApp.fetch(idsUrl,options).getContentText();
  var ids = JSON.parse(idsResponse).ids;
  // Get the detailed data about the user's friends, 90 ids per request
  var data = [];
  for(var j = 0; j<ids.length; j+=90)
  {
    // construct the url; slice() stops at the end of the array, so the
    // last batch is shorter instead of being padded with "undefined"
    var lookupUrl = "https://api.twitter.com/1.1/users/lookup.json?"+
      "user_id="+ids.slice(j, j+90).join(",");
    // query the API
    var lookupResponse = UrlFetchApp.fetch(lookupUrl,options).getContentText();
    var users = JSON.parse(lookupResponse);
    // store each result as a [user, friend] pair
    for(var i = 0; i<users.length; i++)
    { data.push([user, users[i].screen_name]); }
  }
  return data;
}
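The batching step is the part of this function that is easiest to get wrong, so here is a minimal sketch of it in isolation. The helper names chunkIds and buildLookupUrls are hypothetical (they are not part of the Twitter API or Apps Script), and the ids are made up; the sketch only shows one way to split a list of ids into lookup URLs with no leading comma and no out-of-range entries.

```javascript
// Hypothetical helper: split an array of ids into batches of at most `size`.
function chunkIds(ids, size) {
  var batches = [];
  for (var start = 0; start < ids.length; start += size) {
    // slice() stops at the end of the array, so the last batch is just shorter
    batches.push(ids.slice(start, start + size));
  }
  return batches;
}

// Hypothetical helper: build one users/lookup URL per batch of ids.
function buildLookupUrls(ids, size) {
  var base = "https://api.twitter.com/1.1/users/lookup.json?user_id=";
  return chunkIds(ids, size).map(function (batch) {
    return base + batch.join(",");
  });
}

// Five made-up ids in batches of two give three URLs; the last holds one id.
var urls = buildLookupUrls(["11", "22", "33", "44", "55"], 2);
```

Because it makes no network calls, the URL construction can be checked without touching UrlFetchApp.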
The diagram on the left shows the network of relationships between the seeds and their friends. In the right-hand diagram, we assign each friend a rank according to the number of incoming connections, and we start to pay less attention to the seeds. The ranked friends are candidates for being the next round's seeds.
// use the new connections we've found to update the candidate lists
for(var i in relationshipList)
{
var currentCandidate = relationshipList[i][1];
// see if friend is in the done list
var doneListIndex = BinarySearch2D(doneList, 0, currentCandidate);
//if so, increment its rank
if(doneListIndex < doneList.length) {doneList[doneListIndex][1]++; }
else
{
// see if friend is in the candidate list
var candidateListIndex = BinarySearch2D(candidateList, 0, currentCandidate);
// if so increment there
if(candidateListIndex < candidateList.length){candidateList[candidateListIndex][1]++;}
else
{
//otherwise, add it to the candidate list
candidateList.push([currentCandidate, 1]);
candidateList.sort(function(a,b){
if(a[1] < b[1]){return 1;}
else if(a[1] > b[1]){return -1;}
else{return 0}});
}
}
}
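For a single round, the ranking idea can also be sketched without any list searching, by counting incoming connections in a plain lookup object. This is an alternative illustration and not the code used in this post: the function name rankFriends and the sample screen names are made up, and it ignores the done list entirely.

```javascript
// Hypothetical helper: count how many [user, friend] pairs point at each friend
// and return [friend, rank] pairs sorted by rank, highest first.
function rankFriends(relationshipList) {
  // Object.create(null) avoids clashes with inherited keys such as "constructor"
  var counts = Object.create(null);
  for (var i = 0; i < relationshipList.length; i++) {
    var friend = relationshipList[i][1];
    counts[friend] = (counts[friend] || 0) + 1;
  }
  var ranked = [];
  for (var name in counts) {
    ranked.push([name, counts[name]]);
  }
  ranked.sort(function (a, b) {
    if (b[1] !== a[1]) { return b[1] - a[1]; }            // higher rank first
    return a[0] < b[0] ? -1 : (a[0] > b[0] ? 1 : 0);      // then by name
  });
  return ranked;
}

// Three made-up seeds: all follow dave, two follow erin, one follows frank.
var ranked = rankFriends([
  ["alice", "dave"], ["bob", "dave"], ["carol", "dave"],
  ["alice", "erin"], ["bob", "erin"],
  ["carol", "frank"]
]);
// ranked is [["dave", 3], ["erin", 2], ["frank", 1]]
```

In this toy data dave is followed by every seed, so he would be the strongest candidate for the next round.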
In the next round, we replace the seeds with their highest-ranked friends. We poll the new seeds for their friends, and in the left-hand diagram we see links branching both to 'new' friends and to the remaining candidates from the last round. Incoming arrows increase the ranking of the expanding batch of candidates.
// sort candidate list by rank
candidateList.sort(function(a,b){return b[1]-a[1]});
// get rank of highest candidate (0 if there are no candidates left)
var maxrank = candidateList.length ? candidateList[0][1] : 0;
// get the friends of all the highest ranking candidates
var relationshipList = [];
while(candidateList.length && candidateList[0][1] == maxrank)
{
var currentCandidate = candidateList.shift();
var currentFriends = getUserFriends(currentCandidate[0], options);
relationshipList = relationshipList.concat(currentFriends);
doneList.push(currentCandidate);
}
After the second round, we promote a new set of candidates to 'seeds' and run the process again. We continue in this manner, expanding outward from the initial seeds and focusing on the individuals most connected to the community, until we reach a desired number of connections, candidates, or cumulative seeds.
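To see the rounds fit together, here is a rough, self-contained sketch of the whole loop. It makes several assumptions that are not in the code above: the function expandCommunity is hypothetical, the friend lookup is passed in as a function so a small in-memory graph can stand in for the Twitter API, ranks are kept in lookup objects instead of sorted lists, and a simple round limit stands in for the stopping conditions described above.

```javascript
// Hypothetical driver. fetchFriends(name) must return an array of screen names.
function expandCommunity(seeds, fetchFriends, maxRounds) {
  var done = Object.create(null);        // screen name -> rank, already polled
  var candidates = Object.create(null);  // screen name -> rank, not yet polled
  var batch = seeds.slice();
  var i, name;
  for (i = 0; i < batch.length; i++) { done[batch[i]] = 0; }

  for (var round = 0; round < maxRounds; round++) {
    if (round > 0) {
      // Promote every candidate that shares the highest rank to the next batch.
      var maxRank = 0;
      for (name in candidates) {
        if (candidates[name] > maxRank) { maxRank = candidates[name]; }
      }
      batch = [];
      for (name in candidates) {
        if (candidates[name] === maxRank) { batch.push(name); }
      }
      for (i = 0; i < batch.length; i++) {
        done[batch[i]] = candidates[batch[i]];
        delete candidates[batch[i]];
      }
    }
    if (batch.length === 0) { break; }   // nobody left to poll
    // Poll the current batch and credit each friend with an incoming link.
    for (i = 0; i < batch.length; i++) {
      var friends = fetchFriends(batch[i]);
      for (var j = 0; j < friends.length; j++) {
        var friend = friends[j];
        if (friend in done) { done[friend]++; }
        else { candidates[friend] = (candidates[friend] || 0) + 1; }
      }
    }
  }
  return { done: done, candidates: candidates };
}

// Made-up graph standing in for the API: who follows whom.
var graph = { a: ["c", "d"], b: ["c", "e"], c: ["a", "f"], d: ["f"] };
var result = expandCommunity(["a", "b"], function (name) {
  return graph[name] || [];
}, 2);
```

With this toy graph, c is followed by both seeds, so it is the only candidate promoted in the second round; d, e and f remain candidates with rank 1. In Apps Script the stand-in lookup would be replaced by a wrapper around getUserFriends that returns just the friend names.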
Here's a diagram of what a network structure looks like after a few iterations. You can clearly see which elements have become seeds and which haven't. I'll have to see if I can scale this a bit so that the effect is less pronounced over the community.
Some other interesting Twitter API projects:
Twitter App for Gmail - uses Google Apps Script, so it's a good example for our environment
Creating Twitter Lists from Hashtag Users with Apps Script