Case Study - Making Internal Knowledge Accessible: Nuclear Decommissioning
- Pivigo

- Mar 17

Purpose: Decommissioning nuclear facilities represents one of the most risk-sensitive activities in the energy sector, governed by thousands of fragmented PDF reports, regulatory filings, and technical scans. Traditional document retrieval systems often fail to recognise the critical, non-linear connections between these pieces of information, leading to costly research delays and safety risks. EPRI previously attempted to solve this issue using a standard vector-embedding-based RAG (Retrieval-Augmented Generation) system. However, this method struggled with the dense, cross-referential nature of the corpus, failing to identify how specific regulations in one document impacted technical protocols in another.
To bridge this gap, we were tasked with building a scalable, searchable Knowledge Graph (KG). By mapping the semantic relationships between entities, the tool transforms a static library into a dynamic single source of truth, allowing decommissioning teams to query complex regulatory relationships that traditional search engines miss.

Approach: Given a rapid five-week project timeframe, we developed an accelerated end-to-end pipeline in Python. This included automated schema generation, a dual-track extraction engine, and a custom GraphRAG retriever module integrated with Neo4j. Rather than a months-long expert consultation for ontology development, we used LLMs to generate a surrogate graph schema, providing an immediate foundation for development. We then built custom tools to allow this schema to be merged and refined as domain understanding grew.
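The schema-merging idea described above can be sketched in a few lines. This is a minimal illustration, not the EPRI codebase: the schema shape (node labels plus relationship types) and the example labels are assumptions for the sketch. The LLM-generated surrogate schema and a later expert refinement are unioned and deduplicated, so the graph model can grow without restarting.

```python
# Minimal sketch of schema merging (illustrative names, not the EPRI code).
# Each schema lists node labels and relationship types; merging unions
# the two sets and deduplicates, so expert refinements extend the
# LLM-generated surrogate schema rather than replacing it.

def merge_schemas(base: dict, refinement: dict) -> dict:
    """Merge an LLM-generated surrogate schema with expert refinements."""
    return {
        "node_labels": sorted(
            set(base.get("node_labels", []))
            | set(refinement.get("node_labels", []))
        ),
        "relationship_types": sorted(
            set(base.get("relationship_types", []))
            | set(refinement.get("relationship_types", []))
        ),
    }

# Hypothetical example: an LLM baseline merged with an expert pass.
llm_schema = {"node_labels": ["Facility", "Regulation"],
              "relationship_types": ["GOVERNED_BY"]}
expert_schema = {"node_labels": ["Regulation", "Protocol"],
                 "relationship_types": ["GOVERNED_BY", "IMPACTS"]}

print(merge_schemas(llm_schema, expert_schema))
```

In practice the merge logic also needs to resolve near-duplicate labels (e.g. "Reg" vs "Regulation"), which is where the custom refinement tooling earns its keep.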
To extract nodes and relationships, we ran two parallel strategies:
LLM-Based Extraction: Used for rapid prototyping and high-level relationship mapping.
Encoder-Based Extraction: Developed a workflow for deterministic, reliable entity recognition, providing a roadmap for EPRI’s long-term production needs.
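The deterministic track can be illustrated with a dependency-free sketch. The production workflow used spaCy; here a simple gazetteer of regex patterns stands in for the trained pipeline, and the labels and patterns are illustrative assumptions, not the project's actual entity set.

```python
import re

# Gazetteer-style deterministic entity extraction (illustrative sketch;
# the production track used spaCy). Same input always yields the same
# entities, which is the property that matters for long-term reliability.
GAZETTEER = {
    "REGULATION": [r"10 CFR \d+(?:\.\d+)?", r"NRC Bulletin \d{2}-\d{2}"],
    "FACILITY": [r"Unit \d+", r"reactor vessel"],
}

def extract_entities(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_span) pairs; deterministic by construction."""
    hits = []
    for label, patterns in GAZETTEER.items():
        for pat in patterns:
            for m in re.finditer(pat, text, flags=re.IGNORECASE):
                hits.append((label, m.group(0)))
    return sorted(hits)

sample = "Decommissioning of Unit 2 is governed by 10 CFR 50.82."
print(extract_entities(sample))
```

A trained NER model generalises beyond fixed patterns, but this captures why the encoder-based track complements LLM extraction: its outputs are reproducible and auditable.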
We also built a custom GraphRAG retriever module leveraging Cypher queries, vector searches, and LLM prompts to answer complex questions. The primary constraint was the five-week timeline, which we overcame by automating schema generation and building a modular schema-merging tool. This enabled the project to transition from an AI-generated baseline to a refined, expert-validated system without restarting. The technical stack for this solution included Python, Neo4j, bespoke on-premises LLMs, and spaCy.
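The hybrid retrieval pattern can be sketched as two independent retrieval legs whose results would feed an LLM prompt. Everything here is illustrative: the Cypher template, the entity names, and the toy two-dimensional embeddings are assumptions, and the real module runs its query against a live Neo4j instance rather than printing a string.

```python
from math import sqrt

def build_cypher(max_hops: int = 2) -> str:
    """Parameterised traversal from a seed entity to related regulations
    (illustrative template; the real schema differs)."""
    return (
        f"MATCH (e {{name: $name}})-[*1..{max_hops}]-(r:Regulation) "
        "RETURN DISTINCT r.name AS regulation"
    )

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def vector_search(query_vec, chunks, top_k=2):
    """Rank pre-embedded text chunks by cosine similarity."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]),
                    reverse=True)
    return [c["text"] for c in ranked[:top_k]]

# Toy chunks with hypothetical 2-d embeddings (real embeddings are
# high-dimensional vectors from an embedding model).
chunks = [
    {"text": "License termination requirements", "vec": [0.9, 0.1]},
    {"text": "Waste transport rules", "vec": [0.2, 0.8]},
]

print(build_cypher())
print(vector_search([1.0, 0.0], chunks, top_k=1))
```

In the full module, the Cypher results (structured graph facts) and the vector-search results (supporting text passages) are both injected into the LLM prompt, which is what lets the retriever answer relational questions a pure vector RAG misses.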
Outcome: We delivered a functional prototype that converts unstructured nuclear decommissioning documents into structured, queryable knowledge. The system was tested against 24 expert-provided questions, where the GraphRAG retriever achieved 80% accuracy (30% fully correct and 50% partially correct) in its first iteration. The system demonstrated a high degree of grounding, with hallucinations limited to approximately 20% – a significant improvement over standard RAG for this domain.
Our framework provides a scalable foundation for energy utilities to reduce document search time and de-risk the decommissioning lifecycle. The modular nature of the code allows for ready integration into existing internal workflows, offering specific value to key stakeholders, such as:
Nuclear Engineers. Gain a tool that recognises connections across decades of reports.
R&D Teams. Receive a baseline implementation to build on, including de-duplication tools and extraction pipelines to continue the digital transformation of nuclear data.