An iterative guided active learning-based approach for real-time sampling in smart manufacturing systems

  • Abdelrahman Farrag
  • Nieqing Cao
  • Mohammed Khalil Ghali
  • Daehan Won
  • Yu Jin
  • State University of New York Binghamton University
  • Xi'an Jiaotong-Liverpool University

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Smart manufacturing systems increasingly rely on data-driven decision-making, where timely and accurate sampling of process data is critical for predictive modeling and quality control. Conventional sampling approaches often demand large labeled datasets, incur high computational costs, and fail to capture representative instances. This paper introduces BURGAL (Batch-Uncertainty-Representativeness Guided Active Learning), a novel framework that iteratively selects both informative and representative data instances in batch mode. BURGAL reduces the amount of labeled data required while maintaining predictive accuracy and computational efficiency by integrating uncertainty-based sampling with a representativeness criterion. The framework is validated on two distinct datasets: (1) a high-volume Surface Mount Technology (SMT) dataset collected from a real-time production line, and (2) the SECOM semiconductor dataset, characterized by high dimensionality and severe class imbalance. In the SMT case study, BURGAL achieved real-time sampling by selecting 50% of a 3-million-instance dataset within 247 seconds, while preserving the population distribution (Cohen’s) and enabling predictive models with mean absolute errors ranging from 22% at a 10% sampling ratio to 10% at a 50% sampling ratio for identifying defective printed circuit board (PCB) regions. In the SECOM study, BURGAL was adapted as a feature selection strategy, integrated with a generative adversarial network (GAN) for minority-class augmentation and a Random Forest classifier. This pipeline achieved an average precision of 0.91 and a recall of 0.83 under 10-fold Leave-One-Out Cross-Validation, outperforming existing baseline methods in both predictive performance and sampling efficiency.
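
The abstract describes batch selection that combines an uncertainty score with a representativeness criterion but does not give the formulas. The following is a minimal sketch of that general idea, not the BURGAL algorithm itself. Everything in it is an assumption made for illustration: the function names, the use of binary entropy as the uncertainty measure, closeness to the pool centroid as the representativeness measure, and the weighting parameter `alpha`.

```python
import math


def entropy(p):
    """Binary entropy (in bits) of a predicted positive-class probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def representativeness(pool):
    """Score each point by closeness to the pool centroid, scaled to [0, 1].

    Assumed proxy: the point nearest the centroid scores highest and the
    farthest point scores 0.
    """
    dim = len(pool[0])
    centroid = [sum(x[j] for x in pool) / len(pool) for j in range(dim)]
    dists = [math.dist(x, centroid) for x in pool]
    d_max = max(dists) or 1.0  # avoid division by zero if all points coincide
    return [1.0 - d / d_max for d in dists]


def select_batch(pool, probs, batch_size, alpha=0.5):
    """Return indices of the batch_size unlabeled points with the best score.

    score = alpha * uncertainty + (1 - alpha) * representativeness
    (hypothetical weighting; alpha=1 is pure uncertainty sampling).
    """
    unc = [entropy(p) for p in probs]
    rep = representativeness(pool)
    scores = [alpha * u + (1 - alpha) * r for u, r in zip(unc, rep)]
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Toy pool of 2-D feature vectors with model-predicted probabilities.
pool = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0], [0.5, 0.5]]
probs = [0.5, 0.9, 0.5, 0.5]
batch = select_batch(pool, probs, batch_size=2)
```

In an iterative setting, the selected batch would be labeled, the model retrained, `probs` recomputed on the remaining pool, and the selection repeated until the labeling budget is spent. On the toy pool, the outlier at (10, 10) is maximally uncertain but is passed over because its representativeness score is 0, which is the kind of trade-off a combined criterion is meant to capture.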

Keywords

  • Active learning
  • Feature selection
  • Machine learning
  • Sampling
  • Smart manufacturing
