I am mainly interested in graph representation learning, large-scale network science and network analytics. Previously I have worked on computational social science and combinatorial game theory. I studied Economic Policy at Central European University and Applied Economics at Corvinus University of Budapest.
PhD in Data Science, University of Edinburgh, 2017-present.
MSc by Research in Data Science, University of Edinburgh, 2016-2017.
MA in Economic Policy, Central European University, 2014-2016.
Graduated with Distinction; won the Stanislav Vidovic MA Thesis Award.
BSc in Applied Economics, Corvinus University of Budapest, 2011-2014.
Teaching assistant, Corvinus University of Budapest, 2012-2014.
I've tutored and/or marked the following courses: Programming for Mathematical Economics; Probability Theory; Calculus; International Economics; Macroeconomics.
Modern graph embedding procedures can efficiently extract features of nodes from graphs with millions of nodes. The features are later used as inputs for downstream predictive tasks. In this paper we propose GEMSEC, a graph embedding algorithm which learns a clustering of the nodes simultaneously with computing their features. The procedure places nodes in an abstract feature space where the vertex features minimize the negative log-likelihood of preserving sampled vertex neighborhoods, while the nodes are clustered into a fixed number of groups in this space. GEMSEC is a general extension of earlier work in the domain: it augments the core optimization problem of sequence-based graph embedding procedures and is agnostic of the neighborhood sampling strategy. We show that GEMSEC extracts high-quality clusters on real-world social networks and is competitive with other community detection algorithms. We demonstrate that the clustering constraint has a positive effect on representation quality, and that our procedure learns to embed and cluster graphs jointly in a robust and scalable manner.
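To make the idea of a joint objective concrete, here is a minimal sketch of a loss that adds a clustering penalty to a neighborhood-preserving term. It is an illustration only, not the GEMSEC implementation: the full softmax over all nodes, the Euclidean distance to the nearest cluster center, the weight `gamma`, and every name in the code are assumptions made for this example.

```python
import numpy as np


def joint_loss(emb, ctx, pairs, centers, gamma):
    """Neighborhood negative log-likelihood plus a clustering penalty.

    emb:     (N, d) node embeddings
    ctx:     (N, d) context embeddings (hypothetical second weight matrix)
    pairs:   (P, 2) integer array of (source, neighbor) pairs from sampled sequences
    centers: (K, d) cluster centers, K fixed in advance
    gamma:   assumed weight of the clustering term
    """
    # Log-softmax over all nodes for every source node in the sampled pairs.
    logits = emb[pairs[:, 0]] @ ctx.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(pairs)), pairs[:, 1]].sum()

    # Assumed clustering cost: distance from each node to its nearest center.
    dists = np.linalg.norm(emb[:, None, :] - centers[None, :, :], axis=2)
    cluster_cost = dists.min(axis=1).sum()

    return nll + gamma * cluster_cost
```

Because the clustering term only depends on the embeddings and the centers, the `pairs` array can come from any neighborhood sampling strategy, which is how the sketch reflects the "agnostic of the neighborhood sampling strategy" claim above.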
A graph embedding is a representation of the vertices of a graph in a low-dimensional space, which approximately preserves properties such as distances between nodes. Vertex sequence-based embedding procedures use features extracted from linear sequences of vertices to create embeddings using a neural network. In this paper, we propose diffusion graphs as a method to rapidly generate vertex sequences for network embedding. The method is computationally more efficient than previous methods because sequence generation is simpler, and it produces more accurate results. In experiments, we found that its performance relative to other methods improves with increasing edge density in the graph. In a community detection task, clustering nodes in the embedding space produces better results than other sequence-based embedding methods.
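The abstract does not spell out how a diffusion graph becomes a vertex sequence, so the following is only a rough sketch under stated assumptions: a small tree is grown outward from a start node by repeatedly attaching a random unvisited neighbor of an already visited node, and the sequence is read off by walking the tree so that each edge is crossed once in each direction. The function names, the adjacency-list input, and the `size` parameter are hypothetical.

```python
import random


def diffusion_tree(adj, start, size, rng):
    """Grow a tree of up to `size` nodes by spreading outward from `start`."""
    tree = {start: []}  # maps each reached node to its children in the tree
    while len(tree) < size:
        # Reached nodes that still have at least one unreached neighbor.
        frontier = [u for u in tree if any(v not in tree for v in adj[u])]
        if not frontier:
            break  # the connected component is exhausted
        u = rng.choice(frontier)
        v = rng.choice([w for w in adj[u] if w not in tree])
        tree[u].append(v)
        tree[v] = []
    return tree


def euler_sequence(tree, root):
    """Walk the tree so that every edge is traversed once in each direction."""
    seq = [root]
    for child in tree[root]:
        seq.extend(euler_sequence(tree, child))
        seq.append(root)
    return seq
```

A usage example on a toy graph: `adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}`, then `euler_sequence(diffusion_tree(adj, 0, 4, random.Random(0)), 0)` returns a sequence of seven vertices that starts and ends at node 0. Sequences like this would then be fed to the same kind of neural model that consumes random-walk sequences.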