SciLake at VLDB 2025 PhD Workshop

By Stefania Amodeo

Daan de Graaf from Eindhoven University of Technology recently presented his doctoral research at the VLDB 2025 PhD Workshop, held as part of the 51st International Conference on Very Large Data Bases in London, United Kingdom, on September 1–5, 2025. His work, conducted within the SciLake project, addresses the challenge of integrating graph algorithms into database systems.

VLDB is one of the premier international forums for database researchers, developers, and users. Being selected to present at the PhD Workshop is a significant accomplishment that recognizes both the quality and potential impact of Daan's research. This achievement highlights the innovative work being conducted within the SciLake project and demonstrates how our research advances the state of the art in managing and analyzing large-scale scientific knowledge graphs.

GraphAlg: A New Language for Graph Algorithms

Daan's presentation focused on GraphAlg, a new language that makes it possible to run graph algorithms directly inside database systems. This work has important implications for the SciLake project and the broader scientific community. As graph databases become more popular for complex data analysis, current tools often lack flexibility, speed, and user-friendliness. GraphAlg solves these problems by building on well-established mathematical principles from linear algebra.

The language is designed to be easily analyzed and optimized, and it can be converted into a format that databases already understand (relational algebra). This combination makes GraphAlg both powerful and practical for real-world use.
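To make the link between linear algebra and relational algebra concrete, here is a minimal sketch in Python. It is not GraphAlg syntax and not the translation AvantGraph performs. It only illustrates the general idea that a sparse matrix-vector product, a basic building block of many graph algorithms, can be written as a join followed by a grouped sum. The function name, the tuple layouts, and the three-node example graph are assumptions made for illustration.

```python
from collections import defaultdict


def matvec_relational(matrix_rows, vector_rows):
    """Compute y = M x where M and x are stored as relations.

    matrix_rows: iterable of (row, col, value) tuples, nonzero entries only
    vector_rows: iterable of (index, value) tuples
    """
    # Index the vector relation by its key, as a hash join would build its table.
    vec = dict(vector_rows)
    # JOIN on M.col = x.index, then GROUP BY M.row with SUM(M.value * x.value).
    result = defaultdict(float)
    for row, col, value in matrix_rows:
        if col in vec:
            result[row] += value * vec[col]
    return dict(result)


# Hypothetical 3-node graph with edges 0->1, 0->2, 1->2,
# stored as matrix entries (destination, source, 1.0).
edges = [(1, 0, 1.0), (2, 0, 1.0), (2, 1, 1.0)]
frontier = [(0, 1.0)]  # start the traversal at node 0

one_hop = matvec_relational(edges, frontier)         # nodes reachable in one step
two_hop = matvec_relational(edges, one_hop.items())  # nodes reachable in two steps
```

Each multiplication advances the traversal by one hop, so repeated products give a breadth-first exploration expressed purely with operations a relational engine already supports.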

Why This Matters for SciLake

GraphAlg is being developed specifically in the context of the SciLake project. As part of this project, the graph query engine AvantGraph will host the OpenAIRE Graph, a large scientific knowledge graph containing hundreds of millions of publications. The OpenAIRE Graph currently integrates the BIP! Ranker tool to enrich publication data with research impact indicators based on the citation graph, using algorithms typically derived from PageRank or simple citation counts.
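For readers unfamiliar with these indicators, the sketch below shows plain citation counts next to a basic PageRank computed by power iteration on a tiny citation graph. It is written in Python for illustration only. It is not the BIP! Ranker implementation, and the damping factor, iteration count, handling of papers without outgoing citations, and the four example papers are all assumptions.

```python
from collections import Counter


def pagerank(citations, damping=0.85, iterations=50):
    """Basic PageRank by power iteration.

    citations: list of (citing_paper, cited_paper) pairs.
    Returns a dict mapping each paper to its score; scores sum to 1.
    """
    nodes = sorted({paper for edge in citations for paper in edge})
    out_degree = {paper: 0 for paper in nodes}
    for citing, _ in citations:
        out_degree[citing] += 1

    n = len(nodes)
    rank = {paper: 1.0 / n for paper in nodes}
    for _ in range(iterations):
        # Score held by papers that cite nothing is spread evenly over all papers.
        dangling = sum(rank[p] for p in nodes if out_degree[p] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {paper: base for paper in nodes}
        # Each paper passes an equal share of its score to every paper it cites.
        for citing, cited in citations:
            new_rank[cited] += damping * rank[citing] / out_degree[citing]
        rank = new_rank
    return rank


# Hypothetical citation graph: B, C and D cite A; C also cites B.
citations = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "A")]

citation_counts = Counter(cited for _, cited in citations)
ranks = pagerank(citations)
top_paper = max(ranks, key=ranks.get)
```

Citation counts only look at direct citations, while PageRank also weighs who is doing the citing, which is why it needs an iterative computation over the whole graph.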

With GraphAlg, these indicators can be computed directly within AvantGraph, replacing a complex pipeline that runs on a large cluster with a simpler, more efficient query that embeds the algorithm. This means:

  • Significantly improved performance and reduced infrastructure requirements
  • Greater flexibility for project partners to experiment with custom algorithms

This work directly supports SciLake's mission to provide advanced analytics capabilities for the scientific community.

Key Achievements and Future Directions

During his PhD research, supervised by Dr. N. Yakovets, Daan has accomplished several important milestones:

  • Created the language structure and rules for GraphAlg, building on MATLANG (a mathematical framework for working with matrices)
  • Built a compiler that translates GraphAlg into executable code
  • Integrated GraphAlg into AvantGraph, a state-of-the-art graph query engine

In the future, Daan will work on making GraphAlg faster and more efficient, and on adding support for other database systems.

About the Presentation

The workshop paper, titled "Algorithm Support in a Graph Database, Done Right," was presented as part of the VLDB 2025 PhD Workshop program. The full paper is available at: https://www.vldb.org/2025/Workshops/VLDB-Workshops-2025/PhD/PhD25_5.pdf

We congratulate Daan on this achievement and look forward to following the continued development of GraphAlg as it enhances our capabilities for scientific knowledge graph analysis.