Online Reinforcement Learning for Real-Time Monitoring and Control in Production Systems

B. Doskenov, O. Okuyelu
Pacific Seafood, United States

Keywords: online reinforcement learning, real-time monitoring, production systems, dynamic scheduling, predictive maintenance, process optimization, Industry 4.0

Summary:

Modern production systems are increasingly complex and variable, and they require adaptive methods for real-time monitoring and control. Traditional methods, such as linear models and static optimization, often fall short of the dynamic, high-dimensional demands of industrial environments. Online reinforcement learning (RL) offers an alternative: the system learns continuously and improves its decisions through real-time interaction with its environment. This review surveys current advances in online RL, focusing on its applications in predictive maintenance, dynamic scheduling, and process optimization. Key methodologies, including deep RL, policy-based approaches, and hybrid frameworks, are examined for their ability to improve scalability, adaptability, and efficiency in Industry 4.0 ecosystems. Despite this promise, challenges such as computational demands, algorithmic stability, and limited real-world validation remain significant barriers to widespread adoption. The lack of standardized benchmarks further hinders the evaluation and comparison of RL solutions across industrial contexts. To address these gaps, this review identifies critical research directions: the development of more efficient algorithms, the integration of domain knowledge to improve stability, and the deployment of multi-agent RL systems for distributed manufacturing networks. By synthesizing recent advances and identifying unresolved challenges, this paper underscores the potential of online RL to create intelligent, autonomous, and resilient production systems.
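
As a minimal illustration of the online learning loop described above, the sketch below applies tabular Q-learning to a toy maintenance decision. It is not a method taken from the reviewed literature. The machine model, wear levels, rewards, failure probability, and all names (`step`, `q_update`, `run_online`) are assumptions made only for this example. The agent updates its value estimates after every single transition, which is the sense in which the learning is online.

```python
import random

N_WEAR = 3            # hypothetical wear levels: 0 (new), 1 (worn), 2 (critical)
RUN, MAINTAIN = 0, 1  # hypothetical action set


def step(wear, action, rng):
    """Toy machine model (illustrative assumption): returns (reward, next_wear)."""
    if action == MAINTAIN:
        return -1.0, 0                       # pay a maintenance cost, reset wear
    if wear == N_WEAR - 1 and rng.random() < 0.8:
        return -10.0, 0                      # breakdown when running at critical wear
    # Otherwise produce one unit; wear grows by one level with probability 0.5.
    next_wear = wear + 1 if rng.random() < 0.5 else wear
    return 1.0, min(next_wear, N_WEAR - 1)


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One online Q-learning update from a single observed transition."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def run_online(steps=5000, epsilon=0.1, seed=0):
    """Interact with the toy machine and learn from each transition as it arrives."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_WEAR)]  # q[wear][action]
    wear = 0
    for _ in range(steps):
        if rng.random() < epsilon:           # explore with probability epsilon
            action = rng.randrange(2)
        else:                                # otherwise act greedily on current estimates
            action = max((RUN, MAINTAIN), key=lambda a: q[wear][a])
        reward, next_wear = step(wear, action, rng)
        q_update(q, wear, action, reward, next_wear)
        wear = next_wear
    return q


if __name__ == "__main__":
    q = run_online()
    for wear, values in enumerate(q):
        best = "maintain" if values[MAINTAIN] > values[RUN] else "run"
        print(f"wear={wear}: preferred action = {best}")
```

In this toy setting the learned values favor maintenance at the critical wear level, because running there risks a breakdown penalty far larger than the maintenance cost. A real production system would replace the lookup table with a function approximator and the toy model with live sensor data, which is where the computational and stability challenges noted above arise.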