Data Management for Emerging Problems in Large Networks

Arijit Khan

Graphs are widely used in many application domains, including social networks, knowledge graphs, biological networks, software collaboration, geo‐spatial road networks, and interactive gaming. One major challenge for graph querying and mining is that non‐professional users are not familiar with the complex schema and information descriptions, so it becomes hard for them to formulate a query (e.g., a SPARQL query or an exact subgraph pattern) that existing systems can process properly. As an example, Freebase, which powers Google’s knowledge graph, alone has over 22 million entities and 350 million relationships in about 5428 domains. Before users can query anything meaningful over this data, they are often overwhelmed by the daunting task of digesting and understanding it. Without knowing the exact structure of the data and the semantics of the entity labels and their relationships, can we still query them and obtain the relevant results? In this talk, I shall give an overview of our user‐friendly, embedding‐based, scalable techniques for querying big graphs, including heterogeneous networks. I shall conclude by discussing our newest progress on emerging problems in uncertain graphs, graph mining, and machine learning on graphs.

Arijit Khan is an associate professor in the Department of Computer Science, Aalborg University, Denmark. He earned his PhD from the Department of Computer Science, University of California, Santa Barbara, USA, and did a post-doc in the Systems group at ETH Zurich, Switzerland. He was previously an assistant professor in the School of Computer Science and Engineering, Nanyang Technological University, Singapore. Arijit received the prestigious IBM PhD Fellowship in 2012-13. He has published more than 60 papers in premier database and data mining conferences and journals, including ACM SIGMOD, VLDB, IEEE TKDE, IEEE ICDE, SIAM SDM, USENIX ATC, EDBT, The Web Conference (WWW), ACM WSDM, and ACM CIKM. Arijit co-presented tutorials on emerging graph queries and big graph systems at IEEE ICDE 2012 and at VLDB (2017, 2015, and 2014). He has served on the program committees of ACM KDD, ACM SIGMOD, VLDB, IEEE ICDE, IEEE ICDM, EDBT, and ACM CIKM, and on the senior program committee of WWW. Arijit served as co-chair of the Big-O(Q) workshop co-located with VLDB 2015 and wrote a book on uncertain graphs in Morgan & Claypool’s Synthesis Lectures on Data Management. He has contributed invited chapters and articles on big graph querying and mining to the ACM SIGMOD blog, the Springer Handbook of Big Data Technologies, and the Springer Encyclopedia of Big Data Technologies. He has been invited to give tutorials and talks in 10 countries, including at the National Institute of Informatics (NII) Shonan Meeting on “Graph Database Systems: Bridging Theory, Practice, and Engineering” (Japan, 2018); the Asia Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data (APWeb-WAIM 2017); the International Conference on Management of Data (COMAD 2016); and the Dagstuhl Seminars on graph algorithms and systems (2014 and 2019) at Schloss Dagstuhl - Leibniz Center for Informatics, Germany.
Dr. Khan is an associate editor of IEEE TKDE (2019-present), proceedings chair of EDBT 2020, and co-chair of the IEEE ICDE TKDE poster track (2023).