Advancing Life Cycle Assessment of Sustainable Green Hydrogen Production Using Domain-Specific Fine-Tuning by Large Language Models Augmentation
Abstract
Assessing the sustainable development of green hydrogen and its potential environmental impacts using Life Cycle Assessment (LCA) is crucial. Challenges in LCA, such as missing environmental data, are often addressed with machine learning (ML) methods such as artificial neural networks. However, to find a suitable ML solution, researchers need to read extensive literature or consult experts. This research demonstrates how customised large language models (LLMs), supplied with domain-specific papers, can help researchers overcome these challenges. The work starts small by consolidating papers focused on the LCA of proton exchange membrane water electrolysis, which produces green hydrogen, and on ML applications in LCA. These papers are uploaded to OpenAI and indexed with LlamaIndex, enabling future queries. Using the LangChain framework, researchers query the customised model (GPT-3.5-turbo) and receive tailored responses. The results demonstrate that customised LLMs can assist researchers by suggesting suitable ML solutions to address data inaccuracies and gaps. The ability to quickly query an LLM and receive an integrated response across relevant sources is an improvement over manually retrieving and reading individual papers. This shows that leveraging fine-tuned LLMs can help researchers conduct LCAs more efficiently and effectively.