Project Overview

Project Number
M10530
Total Grant Value
$24,355
Program Area
MRIWA Contribution
$24,355
Project Theme
Data Driven Decisions
Project Period
2023 - 2024

This project is part of the Co-operative Education for Enterprise Development (CEED) program with academic supervisors Prof. Melinda Hodkiewicz and Dr Caitlin Woods.

The Challenge

The challenge is to retrieve research data and insights from large technical reports using generative artificial intelligence (AI). Reliance on keyword search and expert interpretation leaves parts of the content of MRIWA’s reports untapped. The lack of a standardised report format complicates their use by large language models and impedes content retrieval.

Key Findings

GraphRAG responses on technical reports rich in domain-specific terms cannot currently be relied on by technical professionals.

However, retrieval can be improved by using a domain-curated schema. A knowledge graph schema that is too specific may constrain the large language models and reduce recall, while one that is too generic may overlook important distinctions.

Striking the right balance appears to be a critical consideration in knowledge graph schema design for GraphRAG.

These results suggest that a domain-specific schema can assist GraphRAG with entity extraction on technical texts, leading to better response quality.
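The schema trade-off described above can be illustrated with a toy sketch. It is not the project's pipeline and does not call any GraphRAG library: the mentions, type labels, schema dictionaries and function names below are all hypothetical, chosen only to show how a schema that is too narrow drops entities (lower recall) while one that is too generic keeps everything but merges distinct types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str
    fine_type: str  # hypothetical fine-grained label for an extracted entity


# Hypothetical entity mentions, as if extracted from a minerals report.
MENTIONS = [
    Mention("gold", "Mineral"),
    Mention("pyrite", "Mineral"),
    Mention("flotation", "Process"),
    Mention("cyanide leaching", "Process"),
    Mention("ball mill", "Equipment"),
]

# Each toy schema maps a fine-grained label to the node type stored in the
# graph. Labels missing from the schema are dropped (assumed behaviour).
GENERIC = {"Mineral": "Entity", "Process": "Entity", "Equipment": "Entity"}
BALANCED = {"Mineral": "Mineral", "Process": "Process", "Equipment": "Equipment"}
TOO_SPECIFIC = {"Mineral": "Mineral"}


def build_nodes(mentions, schema):
    """Keep only mentions whose label the schema allows, with the mapped type."""
    return [(m.text, schema[m.fine_type]) for m in mentions if m.fine_type in schema]


def summarise(mentions, schema):
    """Return (share of mentions kept, number of distinct node types)."""
    nodes = build_nodes(mentions, schema)
    recall = len(nodes) / len(mentions)
    distinct_types = len({node_type for _, node_type in nodes})
    return recall, distinct_types


if __name__ == "__main__":
    for name, schema in [("generic", GENERIC), ("balanced", BALANCED), ("too specific", TOO_SPECIFIC)]:
        recall, distinct_types = summarise(MENTIONS, schema)
        print(f"{name}: recall={recall:.2f}, distinct types={distinct_types}")
```

In this toy setting the generic schema keeps every mention but stores a single node type, the narrow schema keeps only the two minerals, and the balanced schema keeps all mentions with three distinct types.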

Benefit to WA

The value embedded in technical reports can be unlocked and integrated with current information to foster new insights for the benefit of the WA minerals industry.

Further research into minerals-related reports is being supported by releasing the subject matter expert curated questions and answers, assessment criteria and examples, code and reports, so that others can improve on the baseline provided and harness the efforts of the global community.

Financial Assistance DOI

DOI: 10.71342/574784944778

Research Output DOI

DOI: https://doi.org/10.4230/TGDK.3.2.3

Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke. GraphRAG on Technical Documents – Impact of Knowledge Graph Schema. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 3:1-3:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025) https://doi.org/10.4230/TGDK.3.2.3

Technical Report

Retrieval Augmented Generation on Large Technical Reports

Page was last reviewed 27 May 2026
