Weather Forecasting using Python
In Data Science, weather forecasting is an application of Time Series Forecasting where we use time-series data and algorithms to make forecasts for a given time. If you want to learn how to forecast the weather using your Data Science skills, this article is for you. In this article, I will take you through the task of weather forecasting using Python.
Weather Forecasting
Weather forecasting is the task of forecasting weather conditions for a given location and time. With the use of weather data and algorithms, it is possible to predict weather conditions for the next n number of days.
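To make the idea of forecasting "the next n days" concrete, here is a minimal sketch of a seasonal-naive baseline: it simply repeats the last full season of observations. This is not the method used later in this article (that is Prophet); the function name, the toy numbers, and the season length of 4 are all hypothetical and chosen only for illustration.

```python
def seasonal_naive_forecast(history, horizon, season_length=365):
    """Forecast by repeating the last full season of observations."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    # Cycle through the last season until the horizon is filled
    return [last_season[i % season_length] for i in range(horizon)]


# Hypothetical toy series with a season of 4 steps instead of 365 days
history = [10.0, 20.0, 30.0, 20.0, 11.0, 21.0, 31.0, 21.0]
forecast = seasonal_naive_forecast(history, horizon=6, season_length=4)
print(forecast)
```

A baseline like this is useful mainly as a yardstick: a more elaborate model should be able to beat it on held-out data.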
To forecast weather using Python, we need a dataset containing historical weather data for a particular location. I found a dataset on Kaggle containing daily weather data for New Delhi, which we can use for this task. You can download the dataset from the link below.
Dataset download: https://www.kaggle.com/datasets/sumanthvrao/daily-climate-time-series-data
In the section below, you will learn how we can analyze and forecast the weather using Python.
Analyzing Weather Data using Python
Now let’s start this task by importing the necessary Python libraries and the dataset we need:
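import cmdstanpy
# Install CmdStan, the Stan backend that Prophet uses for model fitting
cmdstanpy.install_cmdstan()
# Windows only: compiler=True also installs the C++ toolchain that CmdStan needs
cmdstanpy.install_cmdstan(compiler=True)
import kaggle
# Download the dataset with the Kaggle CLI (needs a configured Kaggle API token);
# the leading "!" runs a shell command from a Jupyter notebook cell
!kaggle datasets download -d sumanthvrao/daily-climate-time-series-data
# Unzip the dataset into the current directory
from zipfile import ZipFile
dataset = "./daily-climate-time-series-data.zip"
with ZipFile(dataset, "r") as archive:
    archive.extractall()
print("The dataset is extracted")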
daily-climate-time-series-data.zip: Skipping, found more recently modified local copy (use --force to force download)
The dataset is extracted
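Before reading the CSV, it can help to confirm what the archive actually contains. The sketch below extracts only the `.csv` members and returns their names. The helper name `extract_csv_files` is hypothetical, and the demo builds a throwaway archive with placeholder contents so it runs without the real download.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_csv_files(archive_path, target_dir):
    """Extract only the .csv members of a zip archive and return their names."""
    with zipfile.ZipFile(archive_path, "r") as archive:
        csv_names = [name for name in archive.namelist() if name.endswith(".csv")]
        if not csv_names:
            raise ValueError(f"no CSV files found in {archive_path}")
        archive.extractall(path=target_dir, members=csv_names)
    return sorted(csv_names)


# Demo on a throwaway archive; the file contents here are placeholders
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("DailyDelhiClimateTrain.csv", "date,meantemp\n2013-01-01,10.0\n")
        archive.writestr("notes.txt", "not a CSV")
    extracted = extract_csv_files(demo_zip, Path(tmp) / "out")
    extracted_exists = (Path(tmp) / "out" / "DailyDelhiClimateTrain.csv").exists()

print(extracted)
```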
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
data = pd.read_csv("./DailyDelhiClimateTrain.csv")
print(data.head())
date meantemp humidity wind_speed meanpressure
0 2013-01-01 10.000000 84.500000 0.000000 1015.666667
1 2013-01-02 7.400000 92.000000 2.980000 1017.800000
2 2013-01-03 7.166667 87.000000 4.633333 1018.666667
3 2013-01-04 8.666667 71.333333 1.233333 1017.166667
4 2013-01-05 6.000000 86.833333 3.700000 1016.500000
Let’s have a look at the descriptive statistics of this data before moving forward:
data.describe()
statistic | meantemp | humidity | wind_speed | meanpressure |
---|---|---|---|---|
count | 1462.000000 | 1462.000000 | 1462.000000 | 1462.000000 |
mean | 25.495521 | 60.771702 | 6.802209 | 1011.104548 |
std | 7.348103 | 16.769652 | 4.561602 | 180.231668 |
min | 6.000000 | 13.428571 | 0.000000 | -3.041667 |
25% | 18.857143 | 50.375000 | 3.475000 | 1001.580357 |
50% | 27.714286 | 62.625000 | 6.221667 | 1008.563492 |
75% | 31.305804 | 72.218750 | 9.238235 | 1014.944901 |
max | 38.714286 | 100.000000 | 42.220000 | 7679.333333 |
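One thing stands out in this table: meanpressure has a minimum of about -3 and a maximum of about 7679, while its quartiles all sit between roughly 1001 and 1015. Those extremes are most likely recording errors. Here is a minimal sketch for flagging such rows, assuming plausible surface pressure lies between 900 and 1100 hPa. These bounds and the helper name are my assumptions, not part of the dataset's documentation.

```python
import pandas as pd


def flag_implausible_pressure(df, low=900.0, high=1100.0):
    """Return a boolean Series that is True where meanpressure is outside [low, high]."""
    return ~df["meanpressure"].between(low, high)


# Three values taken from the outputs above: the first row, the minimum, the maximum
sample = pd.DataFrame({"meanpressure": [1015.666667, -3.041667, 7679.333333]})
suspect = flag_implausible_pressure(sample)
print(sample[suspect])
```

The temperature forecast later in this article does not use meanpressure, so these rows do not affect it, but they would matter if pressure were used as a feature.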
Now let’s have a look at the information about all the columns in the dataset:
# Checking information of data type
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1462 entries, 0 to 1461
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 date 1462 non-null object
1 meantemp 1462 non-null float64
2 humidity 1462 non-null float64
3 wind_speed 1462 non-null float64
4 meanpressure 1462 non-null float64
dtypes: float64(4), object(1)
memory usage: 57.2+ KB
The date column in this dataset is stored as object (string) rather than as a datetime data type. We will convert it when required. Let’s have a look at the mean temperature in Delhi over the years:
figure = px.line(data, x="date", y="meantemp",
title = "Mean Temperature in Delhi Over the Years")
figure.show()

Now let’s have a look at the humidity in Delhi over the years:
figure = px.line(data, x ="date", y = "humidity",
title = "Humidity in Delhi Over the Years")
figure.show()

Now let’s have a look at the wind speed in Delhi over the years:
figure = px.line(data, x="date", y="wind_speed",
title = "Wind Speed in Delhi Over the Years")
figure.show()
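Seasonal patterns that are visible in a line plot can also be checked numerically. The sketch below builds a table of monthly means with one column per year. The helper name is hypothetical, and the toy frame uses four wind_speed values from the first rows printed earlier plus one made-up 2014 row so that the table has two columns.

```python
import pandas as pd


def monthly_mean_by_year(df, column="wind_speed"):
    """Table of monthly means: one row per month, one column per year."""
    dates = pd.to_datetime(df["date"])
    keyed = df.assign(year=dates.dt.year, month=dates.dt.month)
    return keyed.pivot_table(index="month", columns="year", values=column, aggfunc="mean")


# Rows 2 to 5 of the dataset as printed earlier, plus one hypothetical 2014 row
toy = pd.DataFrame({
    "date": ["2013-01-02", "2013-01-03", "2013-01-04", "2013-01-05", "2014-01-02"],
    "wind_speed": [2.98, 4.633333, 1.233333, 3.7, 5.0],
})
table = monthly_mean_by_year(toy)
print(table)
```

Applied to the full dataset, a table like this makes it easy to compare the same month across years.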

Until 2015, the wind speed was higher during the monsoon (August and September) and the retreating monsoon (December and January). After 2015, the plot shows no such anomalies in wind speed during the monsoon. Now let’s have a look at the relationship between temperature and humidity:
# trendline="ols" fits a least-squares line and requires the statsmodels package
figure = px.scatter(data_frame=data, x="humidity",
                    y="meantemp", size="meantemp",
                    trendline="ols",
                    title="Relationship Between Temperature and Humidity")
figure.show()

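There’s a negative correlation between temperature and humidity in Delhi. It means that days with higher temperatures tend to have lower humidity, and days with lower temperatures tend to have higher humidity. Note that this describes an association in the data, not a causal effect.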
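The strength of this relationship can be expressed as a single number, the Pearson correlation coefficient. Here is a minimal sketch using pandas. The toy values are hypothetical and chosen only so that the two columns move in opposite directions; on the real data you would pass the full DataFrame instead.

```python
import pandas as pd


def pearson_r(df, x="humidity", y="meantemp"):
    """Pearson correlation coefficient between two columns of a DataFrame."""
    return df[x].corr(df[y])


# Hypothetical toy values: humidity falls while temperature rises
toy = pd.DataFrame({
    "humidity": [90.0, 80.0, 70.0, 60.0, 50.0],
    "meantemp": [8.0, 15.0, 21.0, 30.0, 34.0],
})
r = pearson_r(toy)
print(round(r, 3))
```

A value close to -1 indicates a strong negative linear relationship, and a value near 0 indicates none.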
Analyzing Temperature Change
Now let’s analyze the temperature change in Delhi over the years. For this task, I will first convert the data type of the date column into datetime. Then I will add two new columns in the dataset for year and month values.
Here’s how we can change the data type and extract year and month data from the date column:
data["date"] = pd.to_datetime(data["date"], format="%Y-%m-%d")
data["year"] = data["date"].dt.year
data["month"] = data["date"].dt.month
data.head()
index | date | meantemp | humidity | wind_speed | meanpressure | year | month |
---|---|---|---|---|---|---|---|
0 | 2013-01-01 | 10.000000 | 84.500000 | 0.000000 | 1015.666667 | 2013 | 1 |
1 | 2013-01-02 | 7.400000 | 92.000000 | 2.980000 | 1017.800000 | 2013 | 1 |
2 | 2013-01-03 | 7.166667 | 87.000000 | 4.633333 | 1018.666667 | 2013 | 1 |
3 | 2013-01-04 | 8.666667 | 71.333333 | 1.233333 | 1017.166667 | 2013 | 1 |
4 | 2013-01-05 | 6.000000 | 86.833333 | 3.700000 | 1016.500000 | 2013 | 1 |
Now let’s have a look at the temperature change in Delhi over the years:
plt.style.use('fivethirtyeight')
plt.figure(figsize=(15, 10))
plt.title("Temperature Change in Delhi Over the Years")
sns.lineplot(data = data, x = "month", y = "meantemp", hue = "year");

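Note that this dataset contains only a single day of 2017 (2017-01-01), so that year cannot be compared with the others. Across 2013 to 2016, the plot suggests a gradual rise in the average temperature of Delhi from year to year.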
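A year-over-year trend is easier to judge from numbers than from overlapping lines. The sketch below computes the mean temperature per calendar year and skips years with too few rows. This matters here because the data shown above ends on 2017-01-01, so 2017 has just one row. The helper name and the `min_days` threshold are my assumptions, and the toy frame is hypothetical.

```python
import pandas as pd


def yearly_mean_temperature(df, min_days=300):
    """Mean of meantemp per calendar year, skipping years with fewer than min_days rows."""
    dates = pd.to_datetime(df["date"])
    summary = df.groupby(dates.dt.year)["meantemp"].agg(["mean", "count"])
    return summary.loc[summary["count"] >= min_days, "mean"]


# Hypothetical toy frame: two days in 2016 and one in 2017
toy = pd.DataFrame({
    "date": ["2016-12-30", "2016-12-31", "2017-01-01"],
    "meantemp": [14.0, 16.0, 10.0],
})
result = yearly_mean_temperature(toy, min_days=2)
print(result)
```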
Forecasting Weather using Python
Now let’s move to the task of weather forecasting. I will be using Prophet, the open-source forecasting library released by Facebook, for this task. It is a widely used tool for time series forecasting. If you have never used it before, you can install it on your system by using the command mentioned below in your command prompt or terminal:
pip install prophet
Prophet expects the time column to be named “ds” and the value to forecast to be named “y”. So let’s convert the data into this format:
forcast_data = data.rename(columns = {"date": "ds","meantemp": "y" })
forcast_data
index | ds | y | humidity | wind_speed | meanpressure | year | month |
---|---|---|---|---|---|---|---|
0 | 2013-01-01 | 10.000000 | 84.500000 | 0.000000 | 1015.666667 | 2013 | 1 |
1 | 2013-01-02 | 7.400000 | 92.000000 | 2.980000 | 1017.800000 | 2013 | 1 |
2 | 2013-01-03 | 7.166667 | 87.000000 | 4.633333 | 1018.666667 | 2013 | 1 |
3 | 2013-01-04 | 8.666667 | 71.333333 | 1.233333 | 1017.166667 | 2013 | 1 |
4 | 2013-01-05 | 6.000000 | 86.833333 | 3.700000 | 1016.500000 | 2013 | 1 |
... | ... | ... | ... | ... | ... | ... | ... |
1457 | 2016-12-28 | 17.217391 | 68.043478 | 3.547826 | 1015.565217 | 2016 | 12 |
1458 | 2016-12-29 | 15.238095 | 87.857143 | 6.000000 | 1016.904762 | 2016 | 12 |
1459 | 2016-12-30 | 14.095238 | 89.666667 | 6.266667 | 1017.904762 | 2016 | 12 |
1460 | 2016-12-31 | 15.052632 | 87.000000 | 7.325000 | 1016.100000 | 2016 | 12 |
1461 | 2017-01-01 | 10.000000 | 100.000000 | 0.000000 | 1016.000000 | 2017 | 1 |
1462 rows × 7 columns
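The renamed frame still carries humidity, wind_speed, and the other columns. Prophet reads only ds and y unless extra columns are registered as regressors, so it can be tidier to pass just those two. Here is a minimal sketch of such a preparation step. The helper name `to_prophet_frame` is hypothetical, and the sample reuses the first two rows printed earlier, deliberately out of order.

```python
import pandas as pd


def to_prophet_frame(df, date_col="date", value_col="meantemp"):
    """Return a two-column frame named ds/y, sorted by date, with ds as datetime."""
    out = df[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])
    return out.sort_values("ds").reset_index(drop=True)


# First two rows of the dataset as printed earlier, deliberately out of order
sample = pd.DataFrame({
    "date": ["2013-01-02", "2013-01-01"],
    "meantemp": [7.4, 10.0],
    "humidity": [92.0, 84.5],
})
frame = to_prophet_frame(sample)
print(frame)
```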
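Here is how we can use the Prophet model for weather forecasting using Python:
from prophet import Prophet
from prophet.plot import plot_plotly, plot_components_plotly
# Fit the model on the historical data (Prophet reads the ds and y columns)
model = Prophet()
model.fit(forcast_data)
# Extend the date range 365 days past the last observation
future = model.make_future_dataframe(periods=365)
# predictions contains yhat plus its uncertainty bounds yhat_lower and yhat_upper
predictions = model.predict(future)
plot_plotly(model, predictions)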

So this is how you can analyze and forecast the weather using Python.
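A forecast is easier to trust once it has been scored against observations the model never saw. The sketch below computes the mean absolute error between the yhat column of a Prophet-style prediction frame and held-out values of y, matched on ds. The helper name and all numbers are hypothetical; in practice the first frame would come from model.predict and the second from held-out data.

```python
import pandas as pd


def forecast_mae(predicted, actual):
    """Mean absolute error between yhat and y on the dates both frames share."""
    merged = predicted[["ds", "yhat"]].merge(actual[["ds", "y"]], on="ds", how="inner")
    if merged.empty:
        raise ValueError("predicted and actual share no dates")
    return (merged["yhat"] - merged["y"]).abs().mean()


# Hypothetical numbers standing in for model output and held-out observations
demo_predicted = pd.DataFrame({
    "ds": pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-03"]),
    "yhat": [12.0, 13.0, 14.0],
})
demo_actual = pd.DataFrame({
    "ds": pd.to_datetime(["2017-01-02", "2017-01-03"]),
    "y": [15.0, 13.0],
})
mae = forecast_mae(demo_predicted, demo_actual)
print(mae)
```

Because the merge is an inner join, dates that appear in only one of the two frames are ignored rather than counted as errors.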
Summary
Weather forecasting is the task of forecasting weather conditions for a given location and time. With the use of weather data and algorithms, it is possible to predict weather conditions for the next n days. I hope you liked this article on Weather Analysis and Forecasting using Python. Feel free to ask questions in the comments section below.