AI-Powered Product Metadata Enrichment Through a Hybrid Approach Combining Semantic Web and Machine Learning

Praveen Kumar Kanumarlapudi

doi:10.55124/jbid.v2i2.250

Articles

Published: 2025-06-29

Individual

Journal of Business Intelligence and Data Analytics

ISSN 2998-3541

Keywords

Metadata Enrichment, Machine Learning, Random Forest Regression, E-Commerce, Image Quality, Tag Suggestion, Semantic Markup, Hybrid Approach, Data Integration

Abstract

In the rapidly evolving world of e-commerce, metadata enrichment has become essential for improving the discoverability, structure, and value of product information. This study explores methods for enriching product metadata by combining semantic tagging with machine learning. As online product catalogs grow in size and complexity, often containing irregular patterns and incomplete data, the need for structured, context-aware tags becomes more pressing. Traditional tagging systems face challenges such as sparse data, ambiguous labeling, and a lack of standardization, all of which degrade search performance and recommendation accuracy.

To address these limitations, this paper presents a hybrid approach that combines structured semantic markup (e.g., schema.org, RDFa, JSON-LD), user-generated content, and several machine learning regression models (Random Forest, XGBoost, AdaBoost, Gradient Boosting, and Decision Tree regressors) to predict appropriate additional tags for product descriptions. The models were trained and tested on a dataset of 20 product entries, each evaluated on factors such as image quality, description length, and existing tag reliability.

Statistical and correlation analyses revealed a strong positive relationship between the richness of visual and textual product content and the success of tag enrichment. Among the evaluated models, Random Forest Regression generalized best, achieving an R² score of 0.9227 on the test set. It outperformed XGBoost (0.5527), Gradient Boosting (0.8324), AdaBoost (0.8999), and Decision Tree (0.7534), the latter two of which showed signs of overfitting. This result highlights the importance of choosing models that maintain their performance on unseen data.

Visualization techniques, including scatterplot matrices and heatmaps, further supported these findings by illustrating the strong influence of image quality and description length on tag prediction outcomes. The study also examined the role of ontology association (e.g., AGROVOC) in improving semantic alignment and user personalization. By integrating user-generated metadata with expert-curated vocabularies, the research outlines a balanced approach to improving metadata coherence, discoverability, and adaptive personalization in dynamic e-commerce environments.
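The abstract compares models by their R² score on held-out data and treats a drop from training to test performance as a sign of overfitting. The minimal sketch below shows how such a score is computed, using the standard definition R² = 1 - SS_res / SS_tot. The function names `r2_score` and `generalization_gap` and all numeric values are hypothetical illustrations; none of them are taken from the study's data or code.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_true = fmean(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


def generalization_gap(train_r2, test_r2):
    """Difference between training and test R^2; a large gap suggests overfitting."""
    return train_r2 - test_r2


if __name__ == "__main__":
    # Hypothetical tag-count targets and predictions (not from the paper).
    y_test = [2.0, 4.0, 6.0, 8.0]
    y_pred = [2.5, 3.5, 6.5, 7.5]
    print(f"test R^2 = {r2_score(y_test, y_pred):.2f}")  # prints: test R^2 = 0.95
```

A score of 1 means the predictions match the targets exactly, while a score of 0 means the model does no better than predicting the mean. With only 20 product entries, a test split contains very few points, so a single R² value of this kind should be read with caution.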

How to Cite

Kanumarlapudi, P. K. (2025). AI-Powered Product Metadata Enrichment Through a Hybrid Approach Combining Semantic Web and Machine Learning. Journal of Business Intelligence and Data Analytics, 2(2), 1–17. https://doi.org/10.55124/jbid.v2i2.250

Issue

Vol. 2 No. 2 (2025): Journal of Business Intelligence and Data Analytics
