“Mathematics for trend detection in Social Media”


In this talk I will discuss the problem of finding top k nodes with the largest degrees in very large on-line networks. If the adjacency list of the network is known (which is not the case, for example,in the Twitter follower graph), a deterministic algorithm to find the top k list of nodes with the largest degrees requires an average complexity of O(n), where n is the number of nodes in the network. Even this modest complexity can be very high for large complex networks such as We or Twitter. Instead, I will introduce randomized Monte Carlo methods, and show theoretically and by numerical experiments that these algorithms find good quality top lists of nodes with high probability and with computational savings of orders of magnitude. I will also quickly discuss a current research on prediction of trends in social media based on the network data, for example, the structure and dynamics of retweet graphs.