Intelligent Automatic Configuration Parameter Tuning for Big Data Processing Platforms Using Machine Learning, Deep Learning, and Reinforcement Learning

Author Details

Naga Charan Nandigama

Journal Details

Published: 12 December 2023 | Article Type: Research Article

Abstract

The exponential growth of big data processing calls for efficient, intelligent parameter tuning on distributed computing platforms such as Apache Hadoop and Apache Spark. Manual configuration tuning is time-consuming and inefficient, while existing auto-tuning methods introduce unacceptable overhead (20-30% of job execution time). This paper presents an intelligent online parameter tuning framework that combines Singular Value Decomposition (SVD) with collaborative filtering, deep neural networks (CNN-based feature extraction), stochastic gradient descent optimization, and reinforcement learning to automatically optimize critical Hadoop/Spark configuration parameters. The framework has three primary components: (1) a configuration repository generator based on genetic algorithms and evolutionary computation, (2) a machine learning-based recommendation engine that implements SVD-based collaborative filtering augmented with deep learning, and (3) an online adaptive learning module that uses reinforcement learning to adapt to dynamic cluster conditions. Experimental evaluation on a 4-node Hadoop 3.3.0 cluster shows that the approach improves performance by 24.2% over default configurations while keeping the mean percentage error (MPE) relative to theoretically optimal configurations at 14.32%. The framework reduces the time needed to produce a parameter recommendation by 88.3% (from 180 seconds to 21 seconds), improves average memory utilization by 13%, and scales across diverse workloads (WordCount and Sort) with dataset sizes ranging from 1 GB to 16 GB.

Keywords: Big Data, Parameter Tuning, Collaborative Filtering, Singular Value Decomposition, Machine Learning, Deep Learning, Reinforcement Learning, Hadoop, Apache Spark, Distributed Computing, Online Learning.
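
To make the recommendation idea in the abstract concrete, the following is a minimal sketch of collaborative filtering over a workload-by-configuration matrix. It is not the paper's implementation: it uses a Funk-style latent factor model with bias terms, trained by stochastic gradient descent, which the recommender literature often calls "SVD" although it is not a true singular value decomposition. The workload rows, configuration columns, speedup values, and all function names are hypothetical and chosen only for illustration; none of the numbers are results from the paper.

```python
import math
import random

# Hypothetical data: rows are workloads, columns are candidate configurations.
# Each tuple is (workload, config, speedup over the default configuration).
# These values are invented for illustration only.
N_WORKLOADS, N_CONFIGS = 4, 5
OBSERVED = [
    (0, 0, 1.00), (0, 1, 1.10), (0, 2, 1.20), (0, 4, 0.95),
    (1, 0, 1.00), (1, 1, 1.15), (1, 3, 1.05),
    (2, 0, 1.00), (2, 2, 1.25), (2, 3, 1.10), (2, 4, 0.90),
    (3, 0, 1.00), (3, 1, 1.12), (3, 2, 1.22), (3, 4, 0.92),
]


def fit_latent_factors(observed, n_rows, n_cols, k=2, lr=0.05, reg=0.02,
                       epochs=500, seed=0):
    """Fit score ~ mean + row_bias + col_bias + P[i].Q[j] on observed cells by SGD."""
    rng = random.Random(seed)
    mean = sum(s for _, _, s in observed) / len(observed)
    row_bias = [0.0] * n_rows
    col_bias = [0.0] * n_cols
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_cols)]
    samples = list(observed)  # copy so the caller's list is not shuffled
    for _ in range(epochs):
        rng.shuffle(samples)
        for i, j, s in samples:
            pred = (mean + row_bias[i] + col_bias[j]
                    + sum(P[i][f] * Q[j][f] for f in range(k)))
            err = s - pred
            # Gradient steps with L2 regularization on biases and factors.
            row_bias[i] += lr * (err - reg * row_bias[i])
            col_bias[j] += lr * (err - reg * col_bias[j])
            for f in range(k):
                p, q = P[i][f], Q[j][f]
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return {"mean": mean, "row_bias": row_bias, "col_bias": col_bias,
            "P": P, "Q": Q}


def predict(model, i, j):
    """Predicted speedup of configuration j on workload i."""
    dot = sum(p * q for p, q in zip(model["P"][i], model["Q"][j]))
    return model["mean"] + model["row_bias"][i] + model["col_bias"][j] + dot


def rmse(model, observed):
    """Root mean squared error of the model on the observed cells."""
    sq = [(s - predict(model, i, j)) ** 2 for i, j, s in observed]
    return math.sqrt(sum(sq) / len(sq))


def recommend(model, observed, workload, n_cols):
    """Return the not-yet-tried configuration with the highest predicted speedup."""
    tried = {j for i, j, _ in observed if i == workload}
    candidates = [j for j in range(n_cols) if j not in tried]
    return max(candidates, key=lambda j: predict(model, workload, j))


model = fit_latent_factors(OBSERVED, N_WORKLOADS, N_CONFIGS)
best = recommend(model, OBSERVED, workload=1, n_cols=N_CONFIGS)
print("training RMSE:", round(rmse(model, OBSERVED), 4))
print("recommended untried configuration for workload 1:", best)
```

The sketch ranks only configurations that a workload has not yet run, which mirrors how a recommender fills in missing cells of a sparse matrix. A production tuner would likely also need real measurements, feature extraction, and online updates, which this example does not attempt.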

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Copyright © The author(s) retain the copyright of this article.

How to Cite

Citation:

Naga Charan Nandigama. (2023-12-12). "Intelligent Automatic Configuration Parameter Tuning for Big Data Processing Platforms Using Machine Learning, Deep Learning, and Reinforcement Learning." *Volume 6*, 2, 9-19