
When is alder pollen the highest?
Alder is a common tree here in Estonia. It pollinates in late winter or early spring, causing allergic reactions in many people. Let's see if we can predict alder pollen levels from weather data.
The pollen levels are monitored by the Allergy and Asthma Federation. On their webpage, they have data from the past 6 years.
Below you can see what the alder pollen levels have been over those years. The peak is usually around the 90th day of the year, which corresponds to the end of March. The levels also vary a lot: in 2016 and 2021 they reached the thousands, whereas in other years they stayed in the hundreds. The overall mean of the dataset is 56 and the standard deviation is 179, so the distribution is heavily skewed by a few extreme days.
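To make the "90th day of the year" concrete, here is a small sketch that converts a day-of-year number into a calendar date. The helper name `day_of_year_to_date` is my own and is not taken from the original analysis.

```python
from datetime import date, timedelta


def day_of_year_to_date(year: int, day_of_year: int) -> date:
    """Convert a 1-based day of the year into a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


# The two high-pollen years mentioned above; 2016 is a leap year,
# so its 90th day falls one calendar day earlier than in 2021.
for year in (2016, 2021):
    print(year, day_of_year_to_date(year, 90).isoformat())
```

In both years, day 90 lands in the last two days of March.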
The model was built using XGBoost. Hyperparameter tuning was done using GridSearchCV.
Below is a chart that shows the actual versus predicted pollen levels for the test dataset. As you can see, the predictions are not great. The mean absolute error is 51, which is very high given that the dataset mean is 56.
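As a reminder of what that number means, here is a small sketch of the mean absolute error calculation. The `actual` and `predicted` lists are made-up illustration values, not the real test data; only the 51 and 56 come from the results above.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up example values (not the real pollen data).
actual = [0, 10, 250, 40]
predicted = [5, 30, 120, 35]
mae = mean_absolute_error(actual, predicted)
print("example MAE:", mae)

# Putting the reported error in context: MAE relative to the dataset mean.
reported_mae = 51
dataset_mean = 56
print(f"MAE is {reported_mae / dataset_mean:.0%} of the mean")
```

An error that is about as large as the average value itself means the model is, on a typical day, off by roughly the whole quantity it is trying to predict.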
And here are the top 10 most important features of the model, based on SHAP values.