What is Machine Learning?
Machine Learning is a scientific discipline that addresses the following question: ‘How can we program systems to automatically learn and to improve with experience?’ Learning in this context is not learning by heart but recognizing complex patterns and making intelligent decisions based on data. The difficulty lies in the fact that the set of all possible decisions given all possible inputs is too complex to describe explicitly. To tackle this problem, the field of Machine Learning develops algorithms that discover knowledge from specific data and experience, based on sound statistical and computational principles.
The field of Machine Learning integrates many distinct approaches, such as probability theory, logic, combinatorial optimization, search, statistics, reinforcement learning and control theory. The resulting methods form the basis of many applications, ranging from vision to language processing, forecasting, pattern recognition, games, data mining, expert systems and robotics.
The history of Machine Learning
The history of the field of Machine Learning is a fascinating story. In 1946 ENIAC, the first general-purpose electronic computer, was unveiled. At that time the word ‘computer’ meant a human being who performed numerical computations on paper, and ENIAC was called a numerical computing machine. This machine was manually operated, i.e. a human would make connections between parts of the machine to perform computations. The idea at that time was that human thinking and learning could be rendered logically in such a machine.
In 1950 Alan Turing proposed a test to measure a machine’s performance. The Turing test is based on the idea that we can only conclude that a machine can actually learn if we communicate with it and cannot distinguish it from a human. Although no system has yet passed the Turing test, many interesting systems have been developed.
Around 1952 Arthur Samuel (IBM) wrote the first game-playing program, for checkers, to achieve sufficient skill to challenge a world champion. Samuel’s machine learning programs worked remarkably well and were helpful in improving the performance of checkers players. Another milestone was the ELIZA system, developed in the mid-1960s by Joseph Weizenbaum. ELIZA simulated a psychotherapist by using tricks like string substitution and canned responses based on keywords. When the original ELIZA first appeared, some people actually mistook her for a human.
The illusion of intelligence works best, however, if you limit your conversation to talking about yourself and your life. Although the overall performance of ELIZA was disappointing, it was a nice proof of concept. Many other systems were developed later. Particularly important was the work of Ted Shortliffe’s group at Stanford on MYCIN. They demonstrated the power of rule-based systems for knowledge representation and inference in the domain of medical diagnosis and therapy. This system is often called the first expert system.
Alongside this work on rule-based systems, other approaches to Machine Learning emerged. In 1957 Frank Rosenblatt invented the Perceptron at the Cornell Aeronautical Laboratory. The Perceptron is a very simple linear classifier, but it was shown that combining a large number of them in a network yields a powerful model.
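To make the idea of a simple linear classifier concrete, the following is a minimal sketch of a perceptron trained with the classic error-correction rule. The training data (the logical AND function), the learning rate, the epoch limit and all function names are assumptions chosen for illustration; they do not come from Rosenblatt's original work.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum of the inputs exceeds zero, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Fit weights and bias with the perceptron error-correction rule.

    samples: list of (inputs, target) pairs with targets 0 or 1.
    epochs and lr are illustrative defaults, not tuned values.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                # Move the decision boundary towards the misclassified point.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:
            break  # every sample is classified correctly
    return weights, bias


# Logical AND is linearly separable, so a single perceptron can learn it.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_DATA)
and_predictions = [predict(weights, bias, x) for x, _ in AND_DATA]
print(and_predictions)
```

Because the decision rule is a single weighted sum compared against a threshold, the model can only separate classes with a straight line (or a hyperplane in higher dimensions).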
Neural network research went through many years of stagnation after Marvin Minsky and Seymour Papert showed in 1969 that a single-layer perceptron cannot solve problems that are not linearly separable, such as the XOR problem. However, later modifications, such as multi-layer networks trained with backpropagation, solve XOR and many more difficult problems.
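The XOR limitation can be illustrated with a short sketch. The first part searches a small grid of weights and finds no single threshold unit that reproduces XOR; this is an illustration, not a proof. The second part shows that a network with one hidden layer does compute XOR. The hidden-layer weights are hand-chosen for this example (an OR unit and a NAND unit feeding an AND unit) and are an assumption, not a learned solution or the specific modification the text refers to.

```python
from itertools import product


def unit(weights, bias, inputs):
    """A single threshold unit: outputs 1 if the weighted sum exceeds zero."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# Part 1: try every (w1, w2, bias) on a coarse grid from -4.0 to 4.0.
# No combination classifies all four XOR cases correctly.
grid = [i * 0.5 for i in range(-8, 9)]
single_unit_solves_xor = any(
    all(unit((w1, w2), b, x) == y for x, y in XOR_DATA)
    for w1, w2, b in product(grid, repeat=3)
)


# Part 2: a two-layer network with hand-chosen (illustrative) weights.
def xor_network(x1, x2):
    h_or = unit((1, 1), -0.5, (x1, x2))      # fires if at least one input is 1
    h_nand = unit((-1, -1), 1.5, (x1, x2))   # fires unless both inputs are 1
    return unit((1, 1), -1.5, (h_or, h_nand))  # AND of the two hidden units


xor_outputs = [xor_network(a, b) for (a, b), _ in XOR_DATA]
print(single_unit_solves_xor, xor_outputs)
```

The hidden layer transforms the inputs into a representation in which the two classes become linearly separable, which is why stacking simple units overcomes the limitation of a single one.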
In the early 1990s Machine Learning became very popular again due to the intersection of Computer Science and Statistics. This synergy resulted in a new way of thinking in AI: the probabilistic approach. In this approach, uncertainty in the parameters is incorporated in the models. The field shifted to a more data-driven approach, in contrast to the more knowledge-driven expert systems developed earlier. Many of the current success stories of Machine Learning are the result of the ideas developed at that time.
Statistical AI is a centerpiece of Big Data analysis: as a result of the exponential growth in the amount of data that is available for scientific research, the sciences are on the brink of huge changes. That applies to all disciplines. In biology, for example, there will shortly be around 1 Exabyte of genomics data (10 to the power of 18 bytes) in the world. In 2024 the next generation of radio telescopes will produce in excess of 1 Exabyte per day. To deal with this data deluge, a new scientific discipline is taking shape. Big Data Science aims to develop new methods to store such volumes of data, and to quickly find, analyze and validate complex patterns in Big Data.
The study of Machine Learning has grown from the efforts of a handful of computer engineers exploring whether computers could learn to play games and mimic the human brain, and a field of statistics that largely ignored computational considerations, to a broad discipline that has produced fundamental statistical-computational theories of learning processes.
Many of the new learning algorithms, such as support vector machines and Bayesian networks, are routinely used in commercial systems. We envisage that in the upcoming years Machine Learning will play a major role in the discovery of knowledge from the wealth of data that is currently available in a diverse range of application areas.