Lufthansa increases on-time flights by wind forecasting with Google Cloud ML
Senior Machine Learning Specialist Engineer, Google
The magnitude and direction of wind significantly impact airport operations, and Lufthansa Group Airlines are no exception. A particularly troublesome wind is the BISE: a cold, dry wind that blows from the northeast to the southwest in Switzerland, through the Swiss Plateau. Its effects on flight schedules can be severe, such as forcing planes to change runways, which can create a chain reaction of flight delays and possible cancellations. At Zurich Airport in particular, BISE can reduce capacity by up to 30%, leading to further flight delays and cancellations, and to millions in lost revenue for Lufthansa (as well as dissatisfaction among its passengers).
Being able to predict this kind of wind well in advance lets the Network Operations Control team schedule flight operations optimally across runways and timeslots, minimizing disruptions to the schedule. However, wind speed and direction are incredibly difficult to model and thus to predict, which is why Lufthansa reached out to Google Cloud.
Machine learning (ML) can help airports and airlines to better anticipate and manage these types of disruptive weather events. In this blog post, we’ll explore an experiment Lufthansa did together with Google Cloud and its Vertex AI Forecast service, accurately predicting BISE hours in advance, with more than 40% relative improvement in accuracy over internal heuristics, all within days instead of the months it often takes to do ML projects of this magnitude and performance.
“Being impressed with Google’s technology and prowess in the field of AI and machine learning, we were certain that by working together with their experts, to combine their technology with our domain expertise, we would achieve the best results possible,” said Christian Most, Senior Director, Digital Operations Optimization at Lufthansa Group.
Collecting and preparing the dataset
The goal of Lufthansa and Google Cloud’s project was to forecast the BISE wind for Zurich’s Kloten Airport using deep learning-based ML approaches, then to see if the prediction surpasses internal heuristics-driven solutions and gauge the ease of use and practicality of the deep learning approach in production.
Since deep learning-based techniques require large datasets, the project relies on MeteoSwiss simulation data, a dataset consisting of multiple meteorological sensor measurements collected from several weather stations across Switzerland over the past five years. From this dataset, we obtained data on factors like wind direction, speed, pressure, temperature, humidity and more, at a 10-minute resolution, along with some information about the location of the weather stations, such as altitude. These factors, which we hypothesized to be predictive of the BISE, ended up carrying valuable signals, as we would see later.
The collected data was then put through an extensive cleaning and feature engineering process using Vertex AI Workbench, in order to prepare the final dataset for training. The cleaning phase included steps to drop features or rows that contained too many missing values or failed statistical tests (for entropy, for example). Since wind direction is a circular feature (between 0 and 360 degrees), this column was replaced with two features: its sine and cosine embeddings. The dataset was then flattened so that each row contained all the relevant features and sensor measurements from all the weather stations at a particular 10-minute interval.
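The circular encoding and the flattening step can be illustrated with a minimal sketch. It assumes long-format input rows with hypothetical field names (`timestamp`, `station`, `wind_dir_deg`, `wind_speed`) and hypothetical station names; the actual MeteoSwiss schema and the project's code are not shown in this post.

```python
import math

def encode_direction(deg):
    """Map a direction in degrees to (sin, cos) so that 359 and 1 end up close together."""
    rad = math.radians(deg % 360.0)
    return math.sin(rad), math.cos(rad)

def flatten_by_timestamp(rows):
    """Turn one-row-per-station records into one wide record per timestamp.

    Column names are prefixed with the station name (a hypothetical convention).
    """
    wide = {}
    for row in rows:
        record = wide.setdefault(row["timestamp"], {})
        station = row["station"]
        for name, value in row.items():
            if name in ("timestamp", "station"):
                continue
            if name == "wind_dir_deg":
                sin_v, cos_v = encode_direction(value)
                record[f"{station}_wind_dir_sin"] = sin_v
                record[f"{station}_wind_dir_cos"] = cos_v
            else:
                record[f"{station}_{name}"] = value
    return wide

# Hypothetical measurements from two stations at the same 10-minute interval.
rows = [
    {"timestamp": "2020-01-01T00:00", "station": "station_a", "wind_dir_deg": 45.0, "wind_speed": 6.2},
    {"timestamp": "2020-01-01T00:00", "station": "station_b", "wind_dir_deg": 359.0, "wind_speed": 4.8},
]
wide = flatten_by_timestamp(rows)
```

With the raw angle, 359 and 1 degrees look 358 units apart; in the (sine, cosine) plane they are neighbors, which is the property the encoding is meant to provide.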
Since the target variable (i.e., BISE) was not directly available, we engineered a proxy target variable for BISE called “tailwind speed around runway,” which above a certain threshold indicates the presence of BISE along the runway.
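One plausible way to derive such a proxy is to project the wind vector onto the runway direction. The sketch below assumes the meteorological convention that wind direction is the direction the wind blows from; the runway heading and the threshold are hypothetical parameters, since the post does not state the values used.

```python
import math

def tailwind_speed(wind_speed, wind_from_deg, runway_heading_deg):
    """Wind component along the direction of travel; positive means tailwind.

    wind_from_deg: direction the wind blows FROM (meteorological convention).
    runway_heading_deg: direction in which aircraft move along the runway.
    """
    angle = math.radians(wind_from_deg - runway_heading_deg)
    headwind = wind_speed * math.cos(angle)
    return -headwind

def bise_proxy_label(wind_speed, wind_from_deg, runway_heading_deg, threshold):
    """True when the tailwind component exceeds the (hypothetical) threshold."""
    return tailwind_speed(wind_speed, wind_from_deg, runway_heading_deg) > threshold

# Wind from the south pushing an aircraft that is heading north: pure tailwind.
example = tailwind_speed(10.0, 180.0, 0.0)
```

A wind blowing straight from behind yields the full wind speed as tailwind, a pure crosswind yields roughly zero, and a headwind yields a negative value.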
Forecasting wind in the Cloud
Once the dataset was ready, Lufthansa and Google Cloud evaluated several options before deciding to experiment with and tune Vertex AI Forecast, Google’s AutoML-powered forecasting service, in order to achieve optimum results. Vertex AI Forecast is capable of the required feature engineering, neural architecture search, and hyperparameter tuning, and it managed to score in the top 2.5% in the M5 Forecasting Competition on Kaggle, in a completely automated fashion. These qualities made it an excellent choice for Lufthansa, reducing the manual overhead of creating, deploying, and maintaining top-performing deep learning models.
The raw data files were loaded from Cloud Storage and preprocessed on Vertex AI Workbench. Then, a training pipeline was initiated on Vertex AI Pipelines, which performed the following steps in sequence:
The .csv data file was loaded from Cloud Storage into a Vertex AI managed dataset.
A Vertex AI forecasting training job was initiated with the dataset, and the resulting model was registered in the Vertex AI Model Registry.
Upon completion, the model was evaluated on the test set, and its predictions, along with the input features and ground truth of the test set, were stored in a user-defined table in BigQuery. Several test metrics were also available on the service and model dashboards.
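The steps above can be sketched with the Vertex AI SDK for Python (`google-cloud-aiplatform`). This calls the SDK directly rather than through Vertex AI Pipelines, and every column name, display name, budget, and objective below is an assumption made for illustration, not the project's actual configuration. The 2-hour horizon at 10-minute granularity translates to 12 forecast steps.

```python
def forecasting_run_args(dataset, bq_destination_uri):
    """Keyword arguments for AutoMLForecastingTrainingJob.run (all values are assumptions)."""
    return dict(
        dataset=dataset,
        target_column="tailwind_speed",            # hypothetical proxy target column
        time_column="timestamp",
        time_series_identifier_column="series_id",
        unavailable_at_forecast_columns=["tailwind_speed"],
        available_at_forecast_columns=["timestamp"],
        forecast_horizon=12,                       # 12 steps x 10 minutes = 2 hours
        data_granularity_unit="minute",
        data_granularity_count=10,
        weight_column="weight",                    # per-row weights for the rare BISE rows
        budget_milli_node_hours=1000,
        model_display_name="bise-forecast",
        export_evaluated_data_items=True,          # write test-set predictions to BigQuery
        export_evaluated_data_items_bigquery_destination_uri=bq_destination_uri,
    )

def train(project, location, gcs_csv_uri, bq_destination_uri):
    """Create the managed dataset, run the training job, and return the registered model."""
    from google.cloud import aiplatform  # needs google-cloud-aiplatform and GCP credentials

    aiplatform.init(project=project, location=location)
    dataset = aiplatform.TimeSeriesDataset.create(
        display_name="bise-weather",
        gcs_source=[gcs_csv_uri],
    )
    job = aiplatform.AutoMLForecastingTrainingJob(
        display_name="bise-forecast-job",
        optimization_objective="minimize-rmse",
    )
    return job.run(**forecasting_run_args(dataset, bq_destination_uri))
```

Calling `train(...)` requires a Google Cloud project and incurs training cost; `forecasting_run_args` can be inspected on its own to review the configuration first.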
One of the biggest challenges was the severe imbalance in the dataset, as measurements with BISE were few and far between. To account for this, instances where BISE occurred, as well as the occurrences temporally close to them, were upweighted using weights calculated with methods including Inverse of Square Root of Number of Samples (ISNS), Effective Number of Samples (ENS), and Gaussian reweighting. The formulas for the methods are given below. These weights were supplied as separate columns in the dataset, and were then used iteratively by the service as the “weight” column.
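As a hedged illustration, the commonly used forms of the first two schemes are w_c = 1 / sqrt(n_c) for ISNS and w_c = (1 - β) / (1 - β^n_c) for ENS, where n_c is the number of samples in class c and β is a hyperparameter close to 1. The Gaussian variant below is an assumption about what "Gaussian reweighting" could look like here: a weight that decays with the temporal distance to the nearest BISE event. The exact variants and parameter values used in the project may differ.

```python
import math

def isns_weights(class_counts):
    """Inverse of Square Root of Number of Samples: w_c = 1 / sqrt(n_c)."""
    return {c: 1.0 / math.sqrt(n) for c, n in class_counts.items()}

def ens_weights(class_counts, beta=0.999):
    """Effective Number of Samples: w_c = (1 - beta) / (1 - beta ** n_c)."""
    return {c: (1.0 - beta) / (1.0 - beta ** n) for c, n in class_counts.items()}

def gaussian_weights(labels, sigma_steps=3.0, base=1.0, peak=10.0):
    """Assumed scheme: weight decays from `peak` to `base` with distance to the nearest event.

    labels: sequence of 0/1 flags per time step (1 = BISE present).
    sigma_steps: width of the Gaussian in time steps (hypothetical value).
    """
    event_idx = [i for i, y in enumerate(labels) if y]
    weights = []
    for i in range(len(labels)):
        if not event_idx:
            weights.append(base)
            continue
        d = min(abs(i - j) for j in event_idx)
        weights.append(base + (peak - base) * math.exp(-(d * d) / (2.0 * sigma_steps ** 2)))
    return weights

# Hypothetical class counts: BISE rows are rare.
counts = {"no_bise": 10000, "bise": 100}
isns = isns_weights(counts)
ens = ens_weights(counts)
time_weights = gaussian_weights([0, 0, 0, 1, 0, 0, 0], sigma_steps=1.0)
```

In each scheme the rare BISE rows receive the larger weight, and the Gaussian variant also raises the weight of rows shortly before and after an event.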
Results and next steps
Fig 1. Recall for a 2-hour horizon
Fig 2. F1 score for a 2-hour horizon
In the above figures, the x-axis represents the forecast horizon and the y-axis shows the respective metric (recall or F1 score). Across multiple experiments, Vertex AI Forecast (red bar) achieved higher recall and precision, outperforming Lufthansa’s internal baseline heuristics, with the performance gap widening steadily as the forecast horizon extends further into the future. At the two-hour mark, our custom-configured Vertex AI Forecast model improved by 40% relative to the internal heuristics and by 1700% relative to the random-guess baseline. In other experiments, at a six-hour forecast horizon, the performance gap widened even more, with Vertex AI Forecast in the lead. Since forecasting BISE a few hours in advance helps Lufthansa prevent flight delays, this was a great solution for them.
“We are very excited to be able to not only do accurate long-term forecasts for the BISE, but also that Vertex AI Forecasting makes training and deploying such models much easier and faster, allowing us to innovate rapidly to serve our customers and stakeholders in the best possible manner,” said Oliver Rueegg, Product Owner, Swiss International Airlines.
Lufthansa plans to explore productionizing this solution by integrating it into their Operations Decision Support Suite, which is used by the network controllers in the Operations Control Center in Kloten, as well as to work closely with Google’s specialists to integrate both Vertex AI Forecast and Google’s other AI/ML offerings for their use cases.
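For readers less familiar with these metrics, the following sketch shows how recall, F1 score, and a relative improvement are computed. The counts and scores are made-up illustrative values, not results from the project.

```python
def precision(tp, fp):
    """Share of predicted BISE intervals that were actually BISE."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Share of actual BISE intervals that the model caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2.0 * p * r / (p + r) if (p + r) else 0.0

def relative_improvement(model_score, baseline_score):
    """(model - baseline) / baseline; 0.4 corresponds to a 40% relative improvement."""
    return (model_score - baseline_score) / baseline_score

# Illustrative numbers only: 8 true positives, 2 false positives, 2 false negatives.
example_f1 = f1_score(tp=8, fp=2, fn=2)
# A hypothetical model score of 0.70 against a hypothetical baseline of 0.50.
example_gain = relative_improvement(0.70, 0.50)
```

Under this definition, a 1700% relative improvement means the model's score is 18 times the baseline's score.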