Tuesday seminar

A random graph model for citation networks

Alessandro Garavaglia, TUE

Abstract:

Bibliometric indicators are used to evaluate the performance and productivity of scientists and research groups. Such indicators (e.g. the h-index) are all based on citation counts, without considering a paper as part of a large network. Viewing papers as part of a network would allow network metrics to be used to evaluate their scientific impact.

In this talk, we model citation networks as directed networks, where nodes represent papers and edges represent citations from one paper to another. Using data from the extensive Web of Science database (more than 12,000 journals and 1 billion cited references), we identify some characteristics of citation patterns, such as the presence of a power-law degree distribution.
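As a small illustration of the directed-network view, the sketch below computes an empirical in-degree (citation count) distribution from a list of (citing, cited) pairs. The function name and the toy edge list are hypothetical and are not taken from the Web of Science data.

```python
from collections import Counter


def in_degree_distribution(num_papers, citations):
    """Return {k: fraction of papers that received exactly k citations}.

    `citations` is a list of (citing, cited) pairs, with papers
    labelled 0 .. num_papers - 1 (an assumed labelling for this sketch).
    """
    # Count incoming edges per cited paper.
    in_degree = Counter(cited for _citing, cited in citations)
    # Papers that were never cited have in-degree 0.
    counts = Counter(in_degree.get(paper, 0) for paper in range(num_papers))
    return {k: counts[k] / num_papers for k in sorted(counts)}


# Hypothetical toy network: 5 papers, 5 citations.
toy_citations = [(1, 0), (2, 0), (3, 0), (3, 1), (4, 1)]
toy_distribution = in_degree_distribution(5, toy_citations)
```

On real data, a power-law degree distribution would appear as an approximately straight line when this distribution is plotted on log-log axes.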

We define a random graph model that can replicate such patterns, using continuous-time branching processes. We describe the results for the tree case, i.e., when each paper is allowed to cite only one existing paper. In this model, we assume that the attractiveness of a paper depends on its past number of citations, its age, and a fitness, which represents external factors that can influence citations (how well known the author is, the field, the journal, etc.). In particular, we aim to characterize when a power-law degree sequence occurs in the presence of these three factors.
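To make the three ingredients concrete, here is a rough simulation sketch of the tree case. It is not the model from the talk: it uses discrete time steps rather than a continuous-time branching process, and the multiplicative weight, the exponential aging function, the uniform fitness law and the parameter `aging_scale` are all assumptions chosen only for illustration.

```python
import math
import random


def simulate_citation_tree(num_papers, aging_scale=50.0, seed=0):
    """Grow a citation tree: each new paper cites exactly one older paper.

    Assumed attractiveness of an existing paper p when paper `new` arrives:
        fitness[p] * (citations[p] + 1) * exp(-(new - p) / aging_scale)
    This functional form is a hypothetical choice, not the talk's model.
    """
    rng = random.Random(seed)
    fitness = [rng.uniform(0.5, 1.5)]  # assumed fitness distribution
    citations = [0]                    # citations received by each paper
    cited_paper = []                   # cited_paper[i] is cited by paper i + 1

    for new in range(1, num_papers):
        weights = [
            fitness[p] * (citations[p] + 1) * math.exp(-(new - p) / aging_scale)
            for p in range(new)
        ]
        # Pick one existing paper with probability proportional to its weight.
        target = rng.choices(range(new), weights=weights, k=1)[0]
        citations[target] += 1
        cited_paper.append(target)
        # Register the new paper with a fresh fitness and no citations.
        fitness.append(rng.uniform(0.5, 1.5))
        citations.append(0)

    return citations, cited_paper


citations, cited_paper = simulate_citation_tree(200)
```

Varying the aging function and the fitness distribution in such a sketch gives an informal feel for how the three factors interact; whether a power law actually emerges is the question the analysis addresses.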

Finally, we will discuss some ongoing work on how to extend the results for the tree case to multigraphs.