Category: Descriptive and predictive analytics using machine learning algorithms
-
Forecasting with ARIMA
Rainfall from 2016 to 2025 is forecast with lower and upper bounds at the 80% and 95% confidence levels. Figure 2.14 shows the graphical representation of the forecast (Table 2.5).

Year   Point forecast   Low 80     High 80    Low 95     High 95
2016   1103.422         969.6137   1237.231   898.7797   1308.065
2017   1113.926         979.6177   1248.235   908.5191   1319.333
…
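As a rough check on how the interval columns relate to each other, the sketch below assumes normally distributed forecast errors, so that each interval is the point forecast plus or minus a normal quantile times a standard error. It backs the standard error out of the 80% interval for 2016 in Table 2.5 and rebuilds the 95% interval from it. The normality assumption and the helper name `interval` are illustrative and are not stated in the source.

```python
from statistics import NormalDist

# 2016 row of Table 2.5
point, low80, high80 = 1103.422, 969.6137, 1237.231

def interval(point, se, level):
    """Symmetric interval point +/- z * se, assuming normal forecast errors."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return point - z * se, point + z * se

# Standard error implied by the width of the 80% interval
z80 = NormalDist().inv_cdf(0.90)
se = (high80 - low80) / (2 * z80)

# Rebuild the 95% interval from the same standard error
low95, high95 = interval(point, se, 0.95)
print(low95, high95)
```

Under this assumption the rebuilt bounds agree closely with the Low 95 and High 95 columns for 2016, which suggests that the two confidence levels differ only in the quantile applied to a common standard error.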
-
Measuring goodness of fit
AIC, AICc and BIC are information criteria used to select the best-fitting ARIMA model. Each rewards a higher log-likelihood and penalizes the number of estimated parameters, so lower values indicate a better model. The rainfall data are best fitted by the (0, 1, 1) × (1, 0, 0) model, whose AIC, AICc and BIC values are smaller than those of the other ARIMA models (Table 2.4). ARIMA model…
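The standard definitions of the three criteria can be sketched as follows. The log-likelihoods, parameter counts and number of observations below are hypothetical placeholders and are not taken from Table 2.4; software packages may also differ in which parameters they count in k.

```python
import math

def information_criteria(log_lik, k, n):
    """Return (AIC, AICc, BIC) for a model with log-likelihood log_lik,
    k estimated parameters and n observations."""
    aic = 2 * k - 2 * log_lik
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * log_lik
    return aic, aicc, bic

# Hypothetical candidates: name -> (log-likelihood, number of parameters)
candidates = {"model A": (-700.0, 3), "model B": (-698.5, 6)}
n = 115  # hypothetical number of observations

scores = {name: information_criteria(ll, k, n)
          for name, (ll, k) in candidates.items()}

# Pick the model with the smallest AIC
best = min(scores, key=lambda name: scores[name][0])
print(best, scores[best])
```

In this made-up comparison, model B has the higher log-likelihood, but its three extra parameters cost more than the gain in fit, so model A has the lower value on all three criteria.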
-
Autocorrelation and partial autocorrelation functions
Figure 2.12 shows the autocorrelation and partial autocorrelation of rainfall data. These plots display the correlation of the series with its own lagged values, together with the significance bounds. In the figure, the correlations lie within the bounds of the 95% confidence interval. Partial autocorrelation is the relationship between the observed data which has applied time series and…
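A minimal sketch of the sample autocorrelation and the usual approximate 95% bound of 1.96/sqrt(n) is given below. The input series is hypothetical and only exercises the functions; the function names are illustrative, and the partial autocorrelation is not covered.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

def significance_bound(n, z=1.96):
    """Approximate 95% bound for the autocorrelation of white noise."""
    return z / math.sqrt(n)

# Hypothetical short series, for illustration only
values = [1050.0, 1120.0, 980.0, 1210.0, 1005.0, 1150.0, 990.0, 1180.0]
r = acf(values, max_lag=3)
bound = significance_bound(len(values))

# Lags (other than 0) whose autocorrelation falls outside the bound
significant = [k for k, rk in enumerate(r) if k > 0 and abs(rk) > bound]
print(r, bound, significant)
```

Lags whose autocorrelation stays inside the bound are conventionally treated as not significantly different from zero.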
-
Predictive analytics
Prediction of rainfall data was done with the help of time series analysis. ARIMA is one of the models used for time series analysis. Figure 2.10 shows the normal flow of time series. Figure 2.11 explains the decomposition of additive and multiplicative time series of rainfall data. To make the data stationary, the components such as trend, seasonality,…
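One common way to remove a trend when making a series stationary is first differencing, which is also what a differencing order of 1 in an ARIMA model denotes. The sketch below shows that technique only; the excerpt above is truncated before it states which steps were applied to the rainfall data, and the series and function name here are hypothetical.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical series with a steady upward trend
trend_series = [100.0, 103.0, 106.0, 109.0, 112.0]
print(difference(trend_series))  # [3.0, 3.0, 3.0, 3.0]
```

After differencing, the linear trend is gone and only the constant step between consecutive values remains.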
-
Nature of data: skewness and kurtosis
Figure 2.7 shows the skewness and kurtosis of the dataset. The skewness of the rainfall data is 0.01999941. The value is positive but very close to zero, so the distribution is only slightly positively skewed and nearly symmetric. The kurtosis value is 2.763914. This is less than 3, the kurtosis of a normal distribution, which means the distribution has lighter tails and fewer extreme outliers than a normal distribution. Table 2.3 explains the overall summary of data with parameters like minimum, maximum,…
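The two shape measures can be sketched with the moment-based definitions below, where kurtosis is the non-excess form for which a normal distribution gives 3. The source does not say which estimator produced its values, and statistical packages differ in their bias corrections, so this is an illustration of the idea rather than a reproduction of the reported figures. The input data are hypothetical.

```python
def skewness_kurtosis(data):
    """Moment-based skewness (m3 / m2**1.5) and kurtosis (m4 / m2**2)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Hypothetical symmetric sample
skew, kurt = skewness_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
print(skew, kurt)
```

For this symmetric sample the skewness is zero and the kurtosis is below 3, the same qualitative pattern as the rainfall data: no pronounced asymmetry and tails lighter than a normal distribution.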
-
Results and discussion
Descriptive analytics: Figure 2.6 shows the overall rainfall level of India from 1901 to 2015. Maximum and minimum rainfall values are highlighted. Maximum rainfall was 1480.3 mm, which occurred in 1917, and minimum rainfall was 920.8 mm, which occurred in 2002.
-
Predictive analytics
Various models can be used for prediction. A time series is a sequence of data points ordered in time. A time series can be expressed as

$$Y_t = f(t), \tag{2.1}$$

where $Y_t$ is the value of the variable under study at time $t$. The components of time series analysis are Trend: It refers to increasing or…
-
Model planning
Descriptive analytics: Descriptive analytics is an analysis based on descriptive statistical methods such as mean, median, mode, standard deviation, variability, skewness and kurtosis. Together these measure the central tendency, dispersion and shape of the data.
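The central-tendency and dispersion measures listed above can be sketched with Python's standard `statistics` module. The sample values are hypothetical and are not taken from the rainfall dataset.

```python
import statistics

# Hypothetical annual values, for illustration only
sample = [1010.0, 1100.0, 1100.0, 1180.0, 1250.0]

summary = {
    "mean": statistics.mean(sample),
    "median": statistics.median(sample),
    "mode": statistics.mode(sample),
    "stdev": statistics.stdev(sample),  # sample standard deviation (n - 1)
    "range": max(sample) - min(sample),
}
print(summary)
```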
-
Data cleansing
The dataset contains null values in some of the rows. The isnull() function is used to identify the null values, and each missing value is filled with the mean of its row. Filling with the mean is one of several methods for handling missing values.
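The row-wise mean fill can be sketched in plain Python as follows. This sketch does not reproduce the isnull() call named in the text; it marks missing values with `None` so that it runs without third-party packages. The function name and the row layout are hypothetical.

```python
def fill_with_row_mean(rows):
    """Replace None entries in each row with the mean of that row's
    non-missing values. Rows that are entirely missing are left unchanged."""
    filled = []
    for row in rows:
        present = [v for v in row if v is not None]
        if not present:
            filled.append(list(row))
            continue
        row_mean = sum(present) / len(present)
        filled.append([row_mean if v is None else v for v in row])
    return filled

# Hypothetical rows, with None marking a missing value
rows = [[10.0, None, 30.0], [5.0, 7.0, None]]
print(fill_with_row_mean(rows))  # [[10.0, 20.0, 30.0], [5.0, 7.0, 6.0]]
```

Filling with the row mean keeps each row's average unchanged, but it reduces the variability within the row, which is worth keeping in mind when many values are missing.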