Machine Learning: what it is and why it matters
"Machine learning" is perhaps best described as an operational, rather than cognitive, process: a programme learns from the experience it gains in performing a task or class of tasks and, by logging that experience, improves its future performance of that task as the experience accrues.
Machine learning has evolved from the study of pattern recognition and learning theory in artificial intelligence, enabling computers to learn without being explicitly programmed. The process of machine learning enables construction of algorithms that can learn from, and then make accurate predictions about, data. Such algorithms operate by building a model from example inputs in order to create rules. This enables the computer to make data-driven predictions or decisions expressed as outputs, which are extrapolated from the patterns learned from the example data.
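As a minimal sketch of what "building a model from example inputs" can look like, the Python snippet below fits a straight line to a handful of made-up (input, output) pairs by ordinary least squares and then predicts the output for an input it has not seen. The data, the function names fit_line and predict, and the choice of a straight-line model are all illustrative assumptions rather than anything prescribed above.

```python
def fit_line(examples):
    """Fit y = slope * x + intercept to (x, y) pairs by ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned rule to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical example data: each pair is (input, desired output).
examples = [(1, 3), (2, 5), (3, 7), (4, 9)]
model = fit_line(examples)      # the "rule" learned from the examples
print(predict(model, 10))       # a prediction for an unseen input
```

The rule (here a slope and an intercept) is never written by the programmer; it is derived from the examples, which is the sense in which the programme "learns".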
Types of problems and tasks
Machine learning tasks are typically classified into a number of broadly construed categories, grouped according to the nature of the learning 'signal' or 'feedback' available to a learning system. These are:
Supervised learning: the computer is presented with example inputs and their desired outputs, given by a 'teacher', in order to establish a rule that matches a predictable output to a given input. Supervised learning can be the quickest way for the machine to begin yielding consistently accurate results but, being entirely teacher-led in the early stages, it is particularly susceptible to inductive bias.
Unsupervised learning: the learning algorithm processes the example input data to find trends. This differs from supervised learning, during which the data is labelled when input. This process of discovering hidden patterns in data can be an end in itself, or a means to an end. Data mining is an example of this.
Semi-supervised learning: the programme receives an incomplete training signal, namely an example data set in which some (often many) of the target outputs are missing, and must learn from the labelled and unlabelled examples together.
Reinforcement learning: a computer programme interacts with an environment, often a real-world one, in order to perform a certain task (such as driving a car), without a teacher explicitly telling it how to perform that task correctly; instead it learns from the consequences of its own actions.
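The difference between the first two categories can be sketched in a few lines of Python. In the supervised function a "teacher" supplies a label with every value and the programme derives a boundary between the two classes; in the unsupervised function the same values arrive with no labels and the programme has to discover the two groups for itself (a one-dimensional version of k-means clustering with two clusters). The data, labels and function names are hypothetical, the clustering function assumes at least two distinct values, and semi-supervised and reinforcement learning are not covered by this sketch.

```python
def learn_threshold(labelled):
    """Supervised: use the teacher's labels to place a boundary midway
    between the average 'small' value and the average 'large' value."""
    small = [x for x, label in labelled if label == "small"]
    large = [x for x, label in labelled if label == "large"]
    return (sum(small) / len(small) + sum(large) / len(large)) / 2


def two_means(values, rounds=10):
    """Unsupervised: group unlabelled values around two centres."""
    centres = [min(values), max(values)]  # start from the two extremes
    for _ in range(rounds):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            clusters[nearest].append(v)
        # Move each centre to the average of the values assigned to it.
        centres = [sum(c) / len(c) for c in clusters]
    return centres


# Hypothetical measurements, with and without a teacher's labels.
labelled = [(1.0, "small"), (1.5, "small"), (0.5, "small"),
            (5.0, "large"), (5.5, "large"), (4.5, "large")]
unlabelled = [x for x, _ in labelled]

threshold = learn_threshold(labelled)   # rule learned from labelled examples
centres = two_means(unlabelled)         # structure found without labels
print(threshold, centres)
```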
Inductive bias is perceived to be a potentially major problem with machine learning, and different models have been developed in an attempt to minimise and solve such problems. A machine learns by creating rules based on the data to which the teacher chooses to expose it, and processes that data according to the rules with which it was programmed. However, while the machine can apply these processes with more consistent accuracy each time it does so (as knowledge is cumulatively banked as 'experience') it remains a challenge to 'teach' the machine to develop genuinely new skills. The next stage is to enable machines to develop these new skills autonomously through a combination of supervised and unsupervised learning.
The iterative aspect of machine learning is important because models are able to adapt autonomously as they are exposed to new data: this is the "learning" aspect of their function. They learn from equivalent previous calculations (experiences) to produce reliable, repeatable decisions and results.
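To illustrate the iterative aspect, the sketch below updates a single-number model each time a new example arrives and compares its error after a few examples with its error after many. The hidden rule (output equals twice the input), the learning rate and the function names are assumptions chosen to keep the example small; real systems use far richer models, but the principle of adjusting the model a little with each new experience is the same.

```python
def online_update(weight, x, y, learning_rate=0.1):
    """One learning step: nudge the weight to reduce the error on (x, y)."""
    error = y - weight * x
    return weight + learning_rate * error * x


def train(stream, weight=0.0):
    """Feed examples to the model one at a time, updating after each."""
    for x, y in stream:
        weight = online_update(weight, x, y)
    return weight


# Hypothetical data stream following the hidden rule y = 2 * x.
inputs = [1.0, 2.0, 3.0]
few = [(x, 2 * x) for x in inputs]   # 3 examples
many = few * 10                      # 30 examples

error_after_few = abs(2.0 - train(few))
error_after_many = abs(2.0 - train(many))
print(error_after_few, error_after_many)
```

The model that has seen more examples ends up closer to the hidden rule, which is what is meant by performance improving with accrued experience.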
Automated model building is key to successful machine learning. Thought leaders comment that an intelligent machine set up to learn and create models can produce thousands of models in a week; in comparison, a human programmer can produce only one or two during that time.
While the theories that underpin many machine learning algorithms are nothing new, developments in the fields of unsupervised and reinforcement learning have unlocked new capabilities to teach a machine to automatically apply complex mathematical calculations to big data repeatedly and more quickly. Factors such as the growing volume and variety of data and the ease, accuracy and cheapness of computational processing and data storage mean that machine learning is a tool of growing importance for businesses.
Why machine learning matters for business
Machine learning has many benefits for businesses that are ready to wake up to its potential. Machine learning models are already employed in a range of computing tasks where development of algorithms is results- and experience-driven, such as spam filtering, optical character recognition (OCR), search engines and computer vision. Machine learning is sometimes conflated with data mining, as exploratory data analysis overlaps with some tasks achievable through unsupervised learning.
Within the field of data analytics, machine learning is increasingly used to devise complex predictive models and algorithms. These models allow analysts to produce reliable, repeatable decisions and results, which become more accurate with repeated experience as the models retain knowledge acquired from historical trends in data sets.
Many online activities in business and everyday life rely on algorithms powered by machine learning, including:
'interest-based' or 'behavioural targeting' web adverts;
deriving pricing models;
search engine results;
pattern and image recognition;
email spam filtering;
prediction of failures in equipment, such as hard drives;
credit scoring and next-best offers; and
network intrusion detection.
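To make one of the applications listed above concrete, the sketch below shows a toy email spam filter. It counts how often each word appears in a few hand-written example messages labelled "spam" or "ham" (legitimate mail) and then labels a new message using a naive Bayes calculation with add-one smoothing. The messages, labels and function names are invented for illustration, the sketch assumes at least one training message of each kind, and production filters are considerably more sophisticated.

```python
import math
from collections import Counter


def train_filter(messages):
    """Count how often each word appears in spam and in legitimate mail."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    message_counts = Counter()
    for text, label in messages:
        message_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, message_counts


def classify(model, text):
    """Return the label with the higher naive Bayes log-probability."""
    word_counts, message_counts = model
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(message_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(message_counts[label] / total_messages)
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical, hand-written training messages.
training = [
    ("win a free prize now", "spam"),
    ("free offer claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]
model = train_filter(training)
print(classify(model, "claim your free prize"))
print(classify(model, "project meeting on monday"))
```

Adding further labelled messages to the training list changes the word counts and therefore the filter's decisions, without anyone rewriting the classification rule by hand.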
Beyond theoretical problem-solving and pattern recognition, the swifter and more accurate results achieved through machine learning are paying practical business dividends in many real-world applications.
Supervised learning has already gained popularity against more traditional statistical modelling in building pricing models, and the ability to apply this to iterative predictions for future variables more quickly and accurately will save costs and deliver more cutting-edge models. One notable analyst in the insurance industry has published tests showing improvement in loss ratio performance of at least 1% for pricing models derived through machine learning.
An application of machine learning that most people will have seen in their personal lives would be in interest- or behaviour-based targeting in advertising or viewing preferences. Learning algorithms are used by media companies which suggest programmes or films that a subscriber may be interested in, and by advertisers who track information on internet shoppers' browsing from one website to the next and record whether or not a purchase is made. These algorithms harvest and aggregate data and, in so doing, they extrapolate trends and then target specific adverts at particular users.
A fascinating yet very simple example of the power of reinforcement learning is presented in the automotive industry: a leading car manufacturer in the US has collected data on nearly 800 million miles of driving. This data is fed into an "autopilot" assisted driving programme that enables the car to steer, manoeuvre, and navigate through traffic with little input from a human driver. Current projections indicate that the age of the entirely autonomous car is only months away.
Changing times for intelligent machines
For those businesses that are grasping the technology, the machine learning revolution has already begun, bringing swift change in the analysis of high volumes of data and delivering ever more accurate results, ever more quickly. With the ever-growing value and volume of big data, unlocking this capability has never been more timely; thus data gathering and smart learning will be both the brains and the value behind new applications. As demand for this technology grows, we may see a shift in the machine learning paradigm, with industry growth in platforms offering off-the-shelf or tailored machine learning models-as-a-service. Such platforms would enable businesses of all sizes to benefit from machine learning and to build, develop and grow applications more rapidly.
Where difficulties arise, a root cause may lie in societal attitudes and in the ability of companion technologies to keep pace with smart machines. Machine learning enables online advertising and television viewing suggestions to be directed at individuals with an accuracy that would have been the stuff of advertisers' dreams a decade ago. However, as prediction of consumers' preferences becomes more accurate, the public's concern for its privacy, and its fear of being "spied on" while browsing, also grow.
Likewise, where the march of smart technology is not matched by that of the industries to which it is applied, the leveraging of data may be stymied, with potentially disastrous results. Only weeks ago, the first known fatal crash of a self-drive car delivered a severe blow to public and industry confidence in the safety of autonomous cars. While the ability to drive and manoeuvre a car taught by banked data has been achieved, the machine has yet to learn how to function safely in relation to the myriad hazards presented by the real-world environment.