“An Empirical Comparison of LSTM, SARIMAX, and Random Forest Models for Daily Web Traffic Time-Series Forecasting”
Chandana T H 4JN22IS046
Information Science and Engineering, JNN College of Engineering
(VTU Affiliation)
Chinmayi R M 4JN22IS051
Information Science and Engineering, JNN College of Engineering
(VTU Affiliation)
Deepthi G R 4JN22IS058
Information Science and Engineering, JNN College of Engineering
(VTU Affiliation)
Jeevitha A P 4JN23IS409, Prathima L, Asst. Professor
Information Science and Engineering, JNN College of Engineering
(VTU Affiliation)
Abstract: Accurate forecasting of website traffic is vital for maintaining reliable services, using server resources effectively, and delivering a seamless user experience. Web traffic exhibits seasonal variation, long-term trends, and sudden spikes, which make traditional predictive tools ineffective. This paper combines time-series analysis with machine learning and deep learning methods to predict the number of daily site visits from historical data. Three models are compared: Random Forest (RF), Seasonal ARIMA with exogenous factors (SARIMAX), and Long Short-Term Memory (LSTM). Prediction quality is improved through key preprocessing steps, including handling missing values, engineering relevant features, and normalizing the data. Each model produces a 30-day forecast and is assessed using MAE, MAPE, RMSE, accuracy, and precision. A web-based dashboard visualizes, compares, and exports the results to support practical decision-making. The results indicate that the LSTM and Random Forest predictions are closer to the observed traffic than those of SARIMAX, while SARIMAX provides consistent and explainable predictions.
Keywords: Web traffic forecasting; time-series analysis; machine learning; deep learning; LSTM networks; SARIMAX; Random Forest; model evaluation.
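The error metrics named in the abstract (MAE, MAPE, RMSE) can be illustrated with a minimal sketch. It uses the standard textbook definitions and is not taken from the implementation used in this work. The function name `forecast_errors` and the sample visit counts are hypothetical, and skipping zero-traffic days when computing MAPE is an assumption made here to avoid division by zero.

```python
import math

def forecast_errors(actual, predicted):
    """Return MAE, MAPE (in percent), and RMSE for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    # Mean absolute error: average magnitude of the daily errors.
    mae = sum(abs(e) for e in errors) / len(errors)
    # Root mean squared error: penalizes large misses (e.g. traffic spikes) more.
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # MAPE is undefined when an actual value is zero, so those days are skipped
    # (an assumption of this sketch, not a choice stated in the paper).
    pct = [abs(e) / abs(a) for e, a in zip(errors, actual) if a != 0]
    mape = 100.0 * sum(pct) / len(pct) if pct else float("nan")
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse}

# Hypothetical daily visit counts, for illustration only.
actual_visits = [100, 200, 400]
predicted_visits = [110, 180, 400]
scores = forecast_errors(actual_visits, predicted_visits)
print(scores)
```

Applying the same function to each model's 30-day forecast would give directly comparable scores, since all three metrics are computed from the same per-day errors.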