On KDD Cup-2012!



Well, the problem stated in Track 1 seems really interesting and data-rich (although Track 2 is more pertinent to my Ph.D. research). I'm not yet sure what I will eventually focus on, but this time around I am sorta hell-bent on giving it a shot.
The social network presented in the challenge is a Twitter-like network (Tencent Weibo, one of the largest micro-blogging websites in China) that is 'in-degree heavy', which hints at the presence of super-users with a large fan following. I have included plots of the 'IN'- and 'OUT'-degree distributions, which amply illustrate this common-user/super-user dichotomy.
In case someone is interested in the exact stats, the graph has 2421058 vertices (~2.4 million) and 50655143 edges (~50 million). Here is an R snippet that will help you plot these.
PS: "user_sns.txt" is the edge list you can download with the data.

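library(igraph)
# Read the edge list as a directed graph (one edge per line)
g <- read.graph("user_sns.txt", format="edgelist", directed=TRUE)
# Print the vertex and edge counts
summary(g)
# Plot the in- and out-degree distributions on log-log axes
# (zero frequencies are dropped with a warning on log axes)
plot(degree.distribution(g, mode="in"), log="xy")
plot(degree.distribution(g, mode="out"), log="xy")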
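If you want to see what the in- and out-degree counts actually mean without loading the full 50-million-edge file, here is a rough base-R sketch on a tiny made-up edge list. The `edges` data frame and its `from`/`to` column names are my own assumptions for illustration (each row read as follower -> followee); they are not part of the challenge data.

```r
# Toy edge list (hypothetical data): each row is follower -> followee.
edges <- data.frame(
  from = c(1, 2, 3, 4, 2),
  to   = c(5, 5, 5, 5, 1)
)
n <- max(edges$from, edges$to)  # number of vertices, assuming ids 1..n

# Out-degree: how many users each vertex follows.
out_deg <- tabulate(edges$from, nbins = n)
# In-degree: how many followers each vertex has.
in_deg <- tabulate(edges$to, nbins = n)

# Empirical degree distributions: fraction of vertices with each degree.
in_dist <- table(in_deg) / n
out_dist <- table(out_deg) / n

# The vertex with the most followers plays the "super-user" in this toy graph.
super_user <- which.max(in_deg)
```

Here one vertex collects almost all the incoming edges while the out-degrees stay small and even, which is the same lopsidedness, in miniature, that the log-log plots are meant to show.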