Transportation
This pilot will create a domain-specific knowledge graph from related internal or external data sources. On top of that knowledge space, information retrieval scenarios inspired by transport research will be implemented by tuning individual SciLake services or combinations of them. More specifically, scenarios will be built around (but not limited to) the following use case directions:
- Discovery of proofs of concept (PoCs) and their correspondence to CCAM transport use cases (e.g., open source GitLab projects not necessarily connected with a publication, the ETSI technology evaluations and interoperability events and PoCs database),
- Discovery of associations among technical specifications, published standards, draft standards as well as their appearance in research work (e.g., in order to discover unsettled issues for further research on next generation emergency systems for road transport or system-user control transitions in automated driving systems),
- Assessment of the significance (related to impact and/or reproducibility) of research objects appearing in multiple domains to reveal background research (e.g., discover works lying at the intersection of ITS and AI using a very popular AI framework or safety and security co-engineering for vehicles),
- Annotating heterogeneous datasets (sensor data, road traffic data, road users’ mobility data, infrastructure IoT data, CAV behavioral data, CAV user acceptance survey data), per content/topic (e.g., the level of detail: provision of raw traffic and driving data vs. object-level data vs. aggregated data) or per the evaluation objective using categories meaningful for transport mobility data, and
- Reproducibility/reusability reports for publications and/or experiments in the transport research sector (those that propose original optimization algorithms or benchmark datasets). The relevant works will be automatically identified, and the reports will include information about how easy they are to reproduce/reuse (according to data FAIRness, code openness, etc.) and the extent to which the work has already been reproduced by meta-analysis studies.
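The association-discovery directions above can be illustrated with a minimal sketch that stores the knowledge graph as plain (subject, predicate, object) triples. All identifiers, predicate names, and sample records below are hypothetical and are not taken from SciLake services or any real data source; the sketch only shows how standards cited in publications but lacking a linked PoC could surface as candidates for further work.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); every identifier is illustrative.
TRIPLES = [
    ("pub:1", "cites_standard", "std:A"),
    ("pub:2", "cites_standard", "std:A"),
    ("pub:2", "cites_standard", "std:B"),
    ("poc:x", "implements", "std:A"),
    ("poc:x", "addresses_use_case", "uc:emergency-call"),
    ("pub:1", "addresses_use_case", "uc:emergency-call"),
]


def build_index(triples):
    """Index subjects by (predicate, object) for simple reverse lookups."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[(predicate, obj)].add(subject)
    return index


def subjects(index, predicate, obj):
    """Return the sorted subjects linked to obj through predicate."""
    return sorted(index.get((predicate, obj), set()))


def standards_without_poc(triples):
    """Standards cited by publications that no PoC is recorded as implementing."""
    cited = {o for _, p, o in triples if p == "cites_standard"}
    implemented = {o for _, p, o in triples if p == "implements"}
    return sorted(cited - implemented)


index = build_index(TRIPLES)
print(subjects(index, "cites_standard", "std:A"))
print(subjects(index, "addresses_use_case", "uc:emergency-call"))
print(standards_without_poc(TRIPLES))
```

A production knowledge graph would more likely use an RDF or property-graph store; the triple list here is only a stand-in to keep the example self-contained.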
Background
Intelligent transportation systems (ITS) research is an important part of smart cities’ planning. It provides data-driven innovative services relating to different modes of transport and traffic management and enables users and societies to be better informed and make safer, more coordinated, and 'smarter' use of transport networks. This field also includes the development and evaluation of collaborative, connected autonomous vehicles as a viable means of transportation (for passengers or freight). Within the transport research domain, different modes (i.e., road, rail, water, air, cross-modal) and different types of mobility (i.e., freight, passenger) are covered, and SciLake could support and provide an overall assessment of scientific research production against innovation uptake.
Domain-specific data/metadata
The field combines information processing and engineering studies and technologies with human sciences studies, while the research data produced span multiple domains (e.g., algorithms and their software proofs of concept (SW PoCs), methods for technological impact, user acceptance, inputs to draft standards and regulations for newly introduced technologies like IoT and autonomous driving systems). Because the field studies and proposes systems to be applied in real-world conditions, it makes extensive use of international standards’ databases and open source implementations, sometimes with a locality indication (e.g., city or national level). In addition, several domain-specific knowledge bases (i.e., TRID, ERTRAC, ERTICO, ECTRI, IRU, WEGEMT, EASA, ERRAC) could be exploited in order to enhance scientific quality, identify remarkable knowledge and share research activities across national boundaries.
Current needs & challenges
One of the main needs and challenges in ITS research is linking studies found in publication data with case study-specific PoCs and discovering potential associations of the identified case studies with existing publicly available datasets as well as technical specification documents and guidelines coming from international standards’ or manufacturers’ organizations (e.g., EU-based ETSI, IETF, CLEPA; US-based SAE, UNECE; Japan-based JSAE, JAMA). Another challenge lies in linking interdisciplinary research objects to ensure that research in ITS uses the latest outputs produced by other fields (e.g., linking AI with ITS, or electronic systems cybersecurity with ITS). Although specific efforts in this direction are made by EU projects (e.g., the TOPOS Observatory and ARCADE CAD Knowledge Base), independent organizations (e.g., the C-ITS deployment group[2]) and several standardization bodies (e.g., ITU FG-AI4AD), the aspects raised above are still not tackled in an integrated way, and the information retrieval offered is mostly focused on data from textual/published reports. In addition, as ITS research results are used by users of different profiles, it would be useful if information retrieval could also be tuned according to the users’ professional profile (e.g., returning more high-level data like technical reports, guidelines and standards to a smart city authority/national ministry employee, and more low-level technical data like SW libraries and scientific publications to students and post-doc researchers). Moreover, transport researchers are interested in identifying which publications to rely on in order to produce high-quality research outcomes. Therefore, text mining and knowledge extraction services could help transport researchers to comprehend large collections of unstructured text more easily by presenting them in a structured format for identifying meaningful patterns and new insights.
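The idea of tuning retrieval to a user's professional profile can be sketched as a simple re-ranking step. This is a minimal illustration under stated assumptions: the profile names, resource types, weights, and sample results are all hypothetical, and the relevance scores are assumed to come from some upstream search service.

```python
# Hypothetical per-profile weights for each resource type (illustrative values only).
PROFILE_WEIGHTS = {
    "authority": {
        "standard": 1.0, "guideline": 1.0, "technical_report": 0.9,
        "publication": 0.4, "software": 0.2,
    },
    "researcher": {
        "standard": 0.4, "guideline": 0.4, "technical_report": 0.6,
        "publication": 1.0, "software": 1.0,
    },
}

DEFAULT_WEIGHT = 0.5  # assumed fallback for resource types missing from a profile


def rerank(results, profile):
    """Order results by upstream relevance scaled by the profile's type weight."""
    weights = PROFILE_WEIGHTS[profile]
    return sorted(
        results,
        key=lambda r: r["relevance"] * weights.get(r["type"], DEFAULT_WEIGHT),
        reverse=True,
    )


# Hypothetical search results with relevance scores from an upstream service.
results = [
    {"id": "doc-standard", "type": "standard", "relevance": 0.6},
    {"id": "doc-library", "type": "software", "relevance": 0.8},
    {"id": "doc-paper", "type": "publication", "relevance": 0.7},
]

for profile in ("authority", "researcher"):
    print(profile, [r["id"] for r in rerank(results, profile)])
```

With these assumed weights, the same result set is ordered standards-first for an authority user and software-first for a researcher; in practice the weights would need to be derived from user studies or feedback rather than fixed by hand.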
Expected outcomes
- Creation of a domain-specific KG integrating technical and scientific reports with datasets, PoCs and available standards.
- Offering AI-based services for smart browsing of the integrated KG according to impact and reproducibility, serving users of different profiles (researchers, developers, regulators).
- Identification of the areas of largest interest in the research community, and gaps in data and knowledge.
- Identification of datasets or PoCs that are rarely used but cover a specific case study not covered elsewhere.
- Reducing the time spent in R&D cycles on background research and SW setup/toolchain design for a specific case study or applied research sub-domain.
- Enhancing the credibility of publications and/or experiments.
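As a rough illustration of how rarely used resources that uniquely cover a case study might be flagged, the following sketch filters a list of records by usage count and case-study coverage. The record layout, the identifiers, and the `max_uses` threshold are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical records: (resource id, case study covered, number of recorded uses).
RECORDS = [
    ("ds:a", "platooning", 12),
    ("ds:b", "platooning", 1),
    ("ds:c", "emergency-call", 1),
    ("poc:d", "control-transition", 30),
]


def rare_unique_resources(records, max_uses=1):
    """Return resources used at most max_uses times whose case study has no other resource."""
    coverage = Counter(case for _, case, _ in records)
    return sorted(
        rid
        for rid, case, uses in records
        if uses <= max_uses and coverage[case] == 1
    )


print(rare_unique_resources(RECORDS))
```

Here `ds:b` is rarely used but its case study is also covered by `ds:a`, and `poc:d` is the sole resource for its case study but is widely used, so only `ds:c` meets both conditions.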