Flight Fare Prediction Using Machine Learning
Authors: Kudithi Upendra¹ (MCA student), Dr. G. Sharmila Sujatha² (Asst. Professor)
¹,² Department of Information Technology & Computer Applications, Andhra University College of Engineering, Visakhapatnam, AP.
Corresponding Author: Kudithi Upendra
(email-id: upendrakudithi@gmail.com)
ABSTRACT:
The rapid growth of the aviation industry has led to highly dynamic and unpredictable flight pricing strategies, making it difficult for travelers to identify cost-effective options. This project develops a flight fare prediction system that uses machine learning techniques to forecast airline ticket prices from historical and real-time data. The primary objective is to help travelers and airline agencies make informed decisions by predicting fares accurately. The dataset contains the following features: airline name, flight number, source city, departure time, number of stops, arrival time, destination city, travel class, flight duration, days left until departure, and actual price, which serves as the prediction target. During preprocessing, OneHotEncoder was applied to encode the categorical variables, and StandardScaler was used to standardize the numerical features to zero mean and unit variance. Linear Regression was then employed as the core predictive algorithm because it is simple and well suited to modeling a continuous target. The model was trained and validated using a train-test split, and its performance was evaluated using Mean Squared Error (MSE) and the R-squared (R²) score. The results showed that the model predicted flight fares with reasonable accuracy, highlighting the potential of linear models when supported by proper feature engineering. In conclusion, this study demonstrates that a machine learning pipeline combining preprocessing and linear regression can be an efficient solution for forecasting flight ticket prices, potentially improving decision-making for both service providers and customers.
Keywords: Flight Fare Prediction, Linear Regression, OneHotEncoder, StandardScaler, Machine Learning, Categorical Variables, Feature Engineering.
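The pipeline summarized in the abstract (one-hot encoding, standardization, linear regression, a train-test split, and evaluation with MSE and R²) can be sketched as follows. This is a minimal illustration using scikit-learn, not the study's actual code: the data are synthetic, and the column names, category values, price formula, 80/20 split ratio, and random seeds are all assumptions chosen only to make the example self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (assumption): the real dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "airline": rng.choice(["A", "B", "C"], size=n),
    "travel_class": rng.choice(["Economy", "Business"], size=n),
    "stops": rng.choice(["zero", "one", "two_or_more"], size=n),
    "duration": rng.uniform(1.0, 12.0, size=n),
    "days_left": rng.integers(1, 50, size=n),
})
# Hypothetical price formula, used only to give the model a signal to learn.
df["price"] = (
    3000.0
    + 20000.0 * (df["travel_class"] == "Business")
    + 400.0 * df["duration"]
    - 50.0 * df["days_left"]
    + rng.normal(0.0, 500.0, size=n)
)

categorical = ["airline", "travel_class", "stops"]
numerical = ["duration", "days_left"]

# One-hot encode categorical columns; standardize numerical columns.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numerical),
])

# Chain preprocessing and the regressor so both are fit on training data only.
model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", LinearRegression()),
])

# Hold out 20% of the rows for evaluation (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    df[categorical + numerical], df["price"], test_size=0.2, random_state=42
)

model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE: {mse:.2f}")
print(f"R-squared: {r2:.4f}")
```

Wrapping the encoder and scaler in a Pipeline means their statistics are learned from the training split alone, which avoids leaking information from the test split into the evaluation.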