Project description
An important challenge in many data science problems is how to learn low-dimensional representations of network data, that is, data defined as a set of nodes and the edges between them, possibly with attributes on the nodes or edges. Recently, deep generative models have been proposed as a method for learning such representations [1,2]. These models are useful not only for compactly encoding massive graphs but also for tasks such as prediction. Moreover, some of them can also serve as network formation processes, that is, as mechanisms for synthesizing artificial networks with the same (or better) properties as real ones.
However, the applicability of these methods to real-world problems is currently limited for several reasons. In particular, current methods do not deal satisfactorily with node relabeling of the network data or with other types of symmetry. In addition, many of these methods do not scale to very large networks with millions of nodes.
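The node relabeling issue can be made concrete with a toy example. The sketch below is illustrative only: the path graph, the permutation `perm`, and all helper names are assumptions introduced for this example and are not taken from [1,2]. It shows that a naive representation (the adjacency matrix read row by row) changes when nodes are renamed, whereas a sorted degree sequence is invariant and a single sum-aggregation step over neighbours is equivariant, meaning that its output is permuted in the same way as the nodes.

```python
# Illustrative sketch: the graph, the permutation, and the helper names are
# assumptions made for this example, not part of any cited method.

def adjacency_matrix(n, edges):
    """Dense adjacency matrix of an undirected graph with nodes 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = 1
        a[v][u] = 1
    return a

def relabel(edges, perm):
    """Rename node u to perm[u] in every edge."""
    return [(perm[u], perm[v]) for u, v in edges]

def flatten(a):
    """Naive graph representation: the adjacency matrix read row by row."""
    return [x for row in a for x in row]

def degree_sequence(a):
    """Permutation-invariant summary: the sorted node degrees."""
    return sorted(sum(row) for row in a)

def aggregate(a, features):
    """One sum-aggregation step: each node sums its neighbours' features."""
    n = len(a)
    return [sum(a[i][j] * features[j] for j in range(n)) for i in range(n)]

n = 4
edges = [(0, 1), (1, 2), (2, 3)]   # path graph 0-1-2-3
perm = [2, 0, 3, 1]                # hypothetical relabeling: node u becomes perm[u]

a = adjacency_matrix(n, edges)
b = adjacency_matrix(n, relabel(edges, perm))

# Node features travel with their node under the relabeling.
features = [1.0, 2.0, 3.0, 4.0]
features_b = [0.0] * n
for u in range(n):
    features_b[perm[u]] = features[u]

h = aggregate(a, features)
h_b = aggregate(b, features_b)

print(flatten(a) == flatten(b))                      # False: naive encoding depends on labels
print(degree_sequence(a) == degree_sequence(b))      # True: invariant summary
print(all(h_b[perm[u]] == h[u] for u in range(n)))   # True: aggregation is equivariant
```

A model that consumes the flattened matrix directly would have to learn the same graph once per labeling, which is one reason symmetry-aware architectures are of interest here.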
This PhD will focus on the development and analysis of methods for learning these network embeddings and on their application to user browsing behavior and web graph data. In particular, it will address the following:
– Characterization of typical real-world problems where network embeddings are needed, for example web domain networks, either at a global scale or at the scale of an individual user's browsing network.
– Development and analysis of a method that improves on the state of the art in learning embeddings for the particular social media or web domain under consideration.
– Exploration of novel applications of the method to problems such as influencing the natural formation of a network within a reinforcement learning or optimal control framework.
References:
[1] W. L. Hamilton, R. Ying, and J. Leskovec. Representation learning on graphs: Methods and applications. IEEE Data Engineering Bulletin, 2017.
[2] T. Lei, W. Jin, R. Barzilay, and T. Jaakkola. Deriving neural architectures from sequence and graph kernels. ICML, 2017.