Statistics is a branch of mathematics concerned with applying mathematical techniques to data. Since early times, statistics has provided efficient problem-solving methodologies. The major operations of statistics are collecting and reviewing data, analysing and interpreting them, and presenting the results in summarized form (Kumar and Choudhry 2010). Some of the statistical methods useful in machine learning are given in Table 2.1.
Description | Formula | Explanation |
---|---|---|
Arithmetic mean | $\bar{x} = \frac{1}{n}\sum x$ | Average value of the samples; $n$ is the number of samples |
Standard deviation | $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ | Spread of the values around the mean; $n$ is the number of samples, $x$ is each value and $\bar{x}$ is the mean |
Coefficient of variation | $cv = \frac{s}{\bar{x}} \times 100$ | Measure of relative variability; $s$ is the standard deviation and $\bar{x}$ is the mean |
Coefficient of skewness | $sk = \frac{\bar{x} - mo}{s}$ | Uses the mean $\bar{x}$ and the mode $mo$; $s$ is the standard deviation |
Correlation | $r = \frac{1}{n - 1}\sum \frac{(x - \bar{x})(y - \bar{y})}{s_x s_y}$ | Measure of association between the variables $x$ and $y$; $s_x$ and $s_y$ are their standard deviations |
Regression | $y = b_0 + b_1 x_1$ | Relationship between the dependent variable $y$ and the independent variable $x_1$. Other formulas are available for the regression line, slope, etc. |

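
The measures in Table 2.1 can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names and the sample data are hypothetical, and the least-squares expressions used for $b_0$ and $b_1$ are an assumption, since the table gives only the form of the regression line.

```python
from statistics import mean, mode, stdev  # stdev uses the n - 1 denominator


def coefficient_of_variation(xs):
    """cv = s / mean * 100 (relative variability, in percent)."""
    return stdev(xs) / mean(xs) * 100


def mode_skewness(xs):
    """sk = (mean - mode) / s."""
    return (mean(xs) - mode(xs)) / stdev(xs)


def correlation(xs, ys):
    """r = 1 / (n - 1) * sum((x - mean_x) * (y - mean_y) / (s_x * s_y))."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / ((n - 1) * s_x * s_y)


def fit_line(xs, ys):
    """Return (b0, b1) for y = b0 + b1 * x.

    Assumption: ordinary least squares, which the table does not spell out.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up sample data: ys is exactly xs - 1, so the relationship is perfectly linear.
xs = [2, 4, 4, 5, 7, 8]
ys = [1, 3, 3, 4, 6, 7]

print("mean:", mean(xs))
print("standard deviation:", stdev(xs))
print("coefficient of variation:", coefficient_of_variation(xs))
print("skewness:", mode_skewness(xs))
print("correlation:", correlation(xs, ys))
print("regression (b0, b1):", fit_line(xs, ys))
```

Because the sample `ys` are an exact linear function of `xs`, the correlation comes out at 1 and the fitted line recovers the intercept and slope used to build the data, which makes the sketch easy to check by hand.
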
Machine learning is an emerging area that grew out of artificial intelligence. It refers to the process of training a machine to perform a task from data, without explicitly programming it. Different kinds of techniques are used to handle different types of problems:
- Supervised learning: In this kind of learning, the expected output of the problem is known in advance. The training data are labelled with the help of an expert, and predictions for new data are made on the basis of these labelled examples. The main tasks in supervised learning are classification and regression. Examples: SVM, discriminant analysis, naive Bayes, K-nearest neighbour (KNN), linear regression, logistic regression, neural networks, decision trees.
- Unsupervised learning: In this kind of learning, the output of the problem is not known in advance. The model is trained on unlabelled data without the supervision of an expert. The main task in unsupervised learning is clustering. Examples: K-means, K-medoids, fuzzy c-means, hidden Markov models, Gaussian mixture models, hierarchical clustering.
- Reinforcement learning: This kind of learning differs from the two methods above: an agent learns by interacting with an environment and receiving rewards or penalties for its actions. Many disciplines, such as operations research, game theory and control theory, use reinforcement learning (Daume 2012). Examples: autonomous vehicles, games.
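
The contrast between the first two kinds of learning can be made concrete with a small pure-Python sketch. It pairs a K-nearest neighbour classifier (supervised: it needs labels) with K-means (unsupervised: it sees only the points). The function names, the toy data and the choice of starting centroids are all hypothetical, and in practice a library implementation would normally be used instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Supervised: majority vote among the k labelled points nearest to query."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def nearest_centroid(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def k_means(points, initial_centroids, iterations=10):
    """Unsupervised: group unlabelled points around centroids (Lloyd's algorithm)."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster
            else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
    assignments = [nearest_centroid(p, centroids) for p in points]
    return centroids, assignments


# Toy data (made up): two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
labels = ["low", "low", "low", "high", "high", "high"]  # used by KNN only

# Supervised: the labels guide the prediction for a new point.
print(knn_predict(points, labels, (2.0, 2.0)))

# Unsupervised: the same points without labels; the groups are discovered.
centroids, assignments = k_means(points, [points[0], points[3]])
print(centroids, assignments)
```

KNN returns one of the label names it was given, whereas K-means returns only cluster indices: without labels, the algorithm can discover that two groups exist but cannot say what they mean.
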