AffilGood: Enhancing Scientific Attribution
In scientific research, accurate attribution is paramount. However, linking scientific works to research organisations has long been a challenge, primarily because openly available annotated data describing institutional affiliations is scarce. The problem is particularly pronounced for complex and multilingual affiliation strings.
Our partners at SIRIS Academic have developed AffilGood, an innovative framework designed to enhance the identification and linking of institutions from raw, multilingual strings, significantly improving affiliation metadata in Scientific Knowledge Graphs.
What is AffilGood?
AffilGood is a multifaceted framework addressing the complexities of institution name disambiguation in scientific literature. It consists of two primary components:
- A robust collection of datasets for extracting information from raw affiliation strings
- An entity linking module that connects organisations mentioned in affiliations or research projects to ROR (Research Organization Registry) identifiers
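To make the entity linking step more concrete, here is a minimal sketch of the general idea: an organisation mention is compared against the known names of each registry entry, and the best match above a threshold is returned. This is not AffilGood's actual method or API. The registry contents, the identifiers (placeholders, not real ROR IDs), the function names and the threshold are all assumptions made for illustration, and plain string similarity stands in for whatever retrieval and ranking a production system would use.

```python
from difflib import SequenceMatcher

# Hypothetical mini-registry. Organisation names are fictional and the
# identifiers are placeholders, NOT real ROR IDs.
REGISTRY = {
    "placeholder-id-001": ["Example University", "Universidad de Ejemplo"],
    "placeholder-id-002": ["Example Institute of Technology", "EIT"],
}


def similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_organisation(mention, registry=REGISTRY, threshold=0.8):
    """Return (identifier, score) for the best-matching registry entry.

    The identifier is None when no known name reaches the threshold
    (an assumed cut-off, chosen only for this example).
    """
    best_id, best_score = None, 0.0
    for org_id, names in registry.items():
        for name in names:
            score = similarity(mention, name)
            if score > best_score:
                best_id, best_score = org_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


if __name__ == "__main__":
    for mention in ["Universidad de Ejemplo", "Acme Widgets Ltd"]:
        print(mention, "->", link_organisation(mention))
```

Storing several name variants per entry is what lets a Spanish-language mention resolve to the same identifier as its English counterpart. Returning None below the threshold keeps unmatched mentions unlinked instead of forcing a wrong link.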
Tackling Complex Challenges
AffilGood excels in handling various challenging scenarios such as:
- Processing noisy or incomplete input data
- Managing affiliations in languages other than English or with mixed languages
- Navigating complex affiliations involving diverse institution types (e.g., companies, universities, hospitals, research centres) at different hierarchical levels
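As an illustration of why noisy and multilingual input is hard, the sketch below shows one common preprocessing step: normalising a raw affiliation string before any matching takes place. It is a generic technique, not a description of AffilGood's internals, and the function name `normalize_affiliation` is hypothetical.

```python
import unicodedata


def normalize_affiliation(raw):
    """Normalise a raw affiliation string for comparison purposes.

    Steps: decompose Unicode characters, drop combining marks
    (diacritics), case-fold, and collapse runs of whitespace.
    """
    decomposed = unicodedata.normalize("NFKD", raw)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # split() with no arguments also removes tabs and newlines.
    return " ".join(without_marks.casefold().split())


if __name__ == "__main__":
    print(normalize_affiliation("  Universität   Zürich,\tSchweiz "))
```

Normalisation of this kind only removes surface variation. It does not translate names or resolve hierarchy (for example, a department within a hospital within a university), which is why a dedicated extraction and linking pipeline is still needed.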
Recent Developments
Our partners Nicolau Duran-Silva and Pablo Accuosto recently presented AffilGood at the Fourth Workshop on Scholarly Document Processing (SDP 2024). The accompanying paper, "AffilGood: Building reliable institution name disambiguation tools to improve scientific literature analysis", describes the framework's capabilities and its potential impact on scientific literature analysis.
For those interested in the technical details, the full paper is available in the workshop proceedings:
Duran-Silva, N., Accuosto, P., Przybyła, P. and Saggion, H., 2024, August. AffilGood: Building reliable institution name disambiguation tools to improve scientific literature analysis. In Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024) (pp. 135-144). URL: https://aclanthology.org/2024.sdp-1.13.pdf
SciLake Integration
We're excited to announce the integration of AffilGood into SciLake's "Knowledge Graph creation assistant" as an institution disambiguation pipeline. With this integration, we expect to identify and link institutional affiliations across the research landscape more accurately.
Better affiliation metadata in scientific knowledge graphs directly supports more reliable scientific literature analysis. As we refine this technology, we aim to streamline institution name disambiguation, ultimately enabling more precise and efficient attribution of scientific works.