Enhancing Product Lifecycle Efficiency: Harnessing Natural Language Processing for Materials Insight and Optimization

Jun 24, 2025

New Publication: Leveraging NLP to Accelerate Sustainable Innovation in Materials Science
We’re excited to announce a new open-access conference paper developed within the framework of the ANALYST Project, now published in Springer Nature’s Lecture Notes in Mechanical Engineering, as part of the proceedings from the European Symposium on Artificial Intelligence in Manufacturing (ESAIM 2024).


📘 Title: Enhancing Product Lifecycle Efficiency: Harnessing Natural Language Processing for Materials Insight and Optimization

🖋️ Authors: Inés Pérez Couñago, Lara Suárez Casabiell, Andrea Gregores-Coto, Christian Eike Precker, Santiago Muiños-Landin (AIMEN Technology Centre)

📅 Published: 22 March 2025

📚 Part of the book: Advances in Artificial Intelligence in Manufacturing II

🔗 DOI: https://doi.org/10.1007/978-3-031-86489-6_24


Research Overview

In today’s rapidly evolving industrial landscape, data plays a critical role in designing safer and more sustainable materials. However, accessing and interpreting this data, especially across the complex lifecycle of materials, remains a major challenge.


This study addresses that gap by exploring how Natural Language Processing (NLP) and Large Language Models (LLMs) can be used to streamline information retrieval across the chemical, environmental, health, social, and economic dimensions of materials, focusing specifically on the polyvinyl chloride (PVC) value chain.


The work presents an end-to-end pipeline that automates:

  • Data extraction from scientific literature, databases, and online content

  • Topic modeling using Latent Dirichlet Allocation (LDA) to organize and filter datasets

  • Retrieval and response generation with vector-based similarity search and LLMs (Llama2)

  • Performance evaluation using a mix of statistical and semantic scoring models
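To make the retrieval step more concrete, here is a minimal sketch of vector-based similarity search. It is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the example chunks are invented for illustration, and `retrieve` is a hypothetical helper name. In the actual pipeline, the top-ranked chunks would be passed to the LLM as context for answer generation.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (hypothetical helper)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Invented example chunks, used only to exercise the functions.
chunks = [
    "PVC is produced by polymerization of vinyl chloride monomer",
    "Recycling of PVC window profiles reduces landfill waste",
    "The symposium was held in autumn",
]
top = retrieve("PVC recycling waste", chunks, k=1)
print(top)
```

A real vector store replaces the linear scan in `retrieve` with an index over dense embeddings, but the ranking principle is the same.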


Tools and Technical Highlights

The research combines several open-source tools, including:

  • LangChain for PDF parsing and embedding workflows

  • Chroma vector store for managing document embeddings locally

  • XL-Instructor and Llama2 for semantic search and answer generation

  • BLEU, METEOR, BERTScore, Prometheus and others for evaluating output accuracy
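As an illustration of how a statistical overlap metric scores a generated answer against a reference, the sketch below implements a simplified BLEU-style score: clipped n-gram precision up to bigrams, combined by geometric mean and multiplied by a brevity penalty. It is a teaching simplification, not the standard BLEU configuration and not necessarily the one used in the paper; `simple_bleu` and the example sentences are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams also found in the reference, with clipped counts."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total

def simple_bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of 1..max_n precisions times a brevity penalty."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand:
        return 0.0
    precisions = [clipped_precision(cand, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Penalize candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * geo_mean

score = simple_bleu("PVC is a polymer", "PVC is a plastic")
print(round(score, 3))
```

Overlap metrics like this reward shared wording only, which is one reason to pair them with semantic scorers such as BERTScore.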


The case study on PVC demonstrated that using LDA to filter out irrelevant or noisy data improved both the accuracy and the speed of information retrieval, cutting processing time by over 94% in some scenarios.
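The filtering idea can be sketched as follows, assuming a topic model has already assigned each document a probability distribution over topics. The distributions, document ids, threshold, and the `filter_by_topic` helper are all hypothetical; the sketch shows only the selection step, not LDA training itself. Documents that place too little probability mass on the topics of interest are dropped before embedding and retrieval, so later stages have fewer chunks to process.

```python
def filter_by_topic(doc_topics, relevant_topics, threshold=0.5):
    """Keep ids of documents whose probability mass on the relevant topics reaches the threshold."""
    kept = []
    for doc_id, dist in doc_topics.items():
        mass = sum(dist[t] for t in relevant_topics)
        if mass >= threshold:
            kept.append(doc_id)
    return kept

# Hypothetical per-document topic distributions (each row sums to 1).
doc_topics = {
    "doc_a": [0.80, 0.15, 0.05],
    "doc_b": [0.10, 0.20, 0.70],
    "doc_c": [0.45, 0.40, 0.15],
}

# Assume topics 0 and 1 were judged relevant to the material of interest.
kept = filter_by_topic(doc_topics, relevant_topics=[0, 1])
print(kept)
```

The threshold trades recall for speed: a higher value discards more documents and shrinks the search space further, at the risk of dropping useful sources.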


Implications and Future Outlook

This work contributes to the broader goals of the ANALYST project: enabling Safe and Sustainable by Design (SSbD) through digital innovation. By improving access to structured, relevant lifecycle data, this NLP-driven approach can support:


  • Smarter material design

  • Faster sustainability assessments

  • Better decision-making for researchers, regulators, and industry


The authors also highlight future directions, including the integration of neural topic modeling, enhanced evaluation metrics, and applications beyond the PVC sector.


👉 Read the full article (Open Access): https://zenodo.org/records/15321832


This publication is part of the ANALYST Project, funded by the European Union’s Horizon Europe Research and Innovation Programme under Grant Agreement No. 101138548.
