Descriptive analytics: Descriptive analytics is analysis based on descriptive statistics such as the mean, median, mode, standard deviation, variance, skewness and kurtosis. Together these measure the central tendency, dispersion and shape of the data.

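  • Mean and median: The mean is the arithmetic average of a data series (the sum of the values divided by their count). The median is the middle value of the series once it is sorted; for an even number of values it is the average of the two middle values.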
  • Standard deviation and variability: Both are measured relative to the mean of the series. The variance is the average of the squared differences between each data item and the mean. The standard deviation is the square root of the variance, so it is expressed in the same units as the data. Variability describes how widely the data are scattered across the series.
  • IQR: The interquartile range (IQR) is the difference between the 75th percentile and the 25th percentile of the data series: IQR = Q3 − Q1, where Q3 is the 75th percentile and Q1 is the 25th percentile. It gives the spread of the middle 50% of the data.
  • Skewness and kurtosis: Skewness measures the asymmetry of the data distribution. There are two types of skewness. In positive skewness the right tail is longer, and the mode is typically less than the median and mean. In negative skewness the left tail is longer, and the mode is typically greater than the median and mean. A common rule of thumb is:
    • If −0.5 ≤ skewness ≤ 0.5, then the data are approximately symmetrical.
    • If −1 ≤ skewness < −0.5, then the data are negatively skewed.
    • If 0.5 < skewness ≤ 1, then the data are positively skewed.
  • Kurtosis measures how heavy the tails of a distribution are, which indicates how prone the data are to outliers. If the kurtosis is high, the tails are heavy and the data tend to contain many outliers, which should be taken into account in further processing. If the kurtosis is low, the tails are light and the data contain few outliers. Based on its value, kurtosis is classified into the following three types:
    1. Mesokurtic: The kurtosis is the same as that of the normal distribution, which is 3.
    2. Leptokurtic: If kurtosis > 3, the tails are heavier than in the mesokurtic case, which means the data contain many outliers.
    3. Platykurtic: If kurtosis < 3, the tails are lighter (shorter) than those of the normal distribution and the data contain few outliers (Kumar and Choudhry 2010).
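
The measures listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation. The function names (`central_moment`, `describe`, `skew_label`, `kurtosis_label`), the tolerance `tol`, and the sample data are hypothetical. It assumes population (divide-by-n) moments, a non-constant series, and the "inclusive" quartile method; other conventions give slightly different numbers.

```python
import statistics


def central_moment(values, order):
    """Population central moment: the mean of (x - mean) ** order."""
    mean = statistics.fmean(values)
    return sum((x - mean) ** order for x in values) / len(values)


def describe(values):
    """Return the descriptive measures from the list above as a dict.

    Assumes the series is not constant (variance > 0).
    """
    # quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    variance = central_moment(values, 2)
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": variance,
        "std_dev": variance ** 0.5,
        "iqr": q3 - q1,
        # Standardised third and fourth moments. Kurtosis here is NOT
        # "excess" kurtosis, so a normal distribution scores 3.
        "skewness": central_moment(values, 3) / variance ** 1.5,
        "kurtosis": central_moment(values, 4) / variance ** 2,
    }


def skew_label(skewness):
    """Apply the rule of thumb from the text.

    Assumption: values beyond +/-1, which the text does not cover,
    are treated as skewed in the same direction.
    """
    if -0.5 <= skewness <= 0.5:
        return "approximately symmetrical"
    return "positively skewed" if skewness > 0 else "negatively skewed"


def kurtosis_label(kurtosis, tol=0.1):
    """Classify kurtosis against the normal value of 3.

    tol is a hypothetical tolerance for calling a sample mesokurtic.
    """
    if abs(kurtosis - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if kurtosis > 3 else "platykurtic"


if __name__ == "__main__":
    sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative data
    summary = describe(sample)
    for name, value in summary.items():
        print(f"{name}: {value}")
    print(skew_label(summary["skewness"]))
    print(kurtosis_label(summary["kurtosis"]))
```

For the sample series the mean is 5, the standard deviation is 2, the IQR is 1.5, the skewness is about 0.66 (positively skewed) and the kurtosis is about 2.78 (platykurtic). Some libraries report excess kurtosis (kurtosis minus 3) by default, in which case the threshold for the three types is 0 rather than 3.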
