Introduction To Machine Learning - Definition, History, Prospects & Future

Machine learning is a field of study within artificial intelligence that relies on mathematical and statistical approaches to give computers the ability to "learn" from data, that is, to improve their performance at solving tasks without being explicitly programmed for each one. More broadly, it concerns the design, analysis, optimization, development and implementation of such methods.
Machine learning generally has two phases. The first consists of estimating a model from the available data, called observations, which are finite in number, during the system design phase. Estimating the model amounts to solving a practical task, such as translating speech, estimating a probability density, recognizing the presence of a cat in a photograph, or helping to drive an autonomous vehicle. This so-called "learning" or "training" phase is generally carried out before the model is put to practical use. The second phase corresponds to production: once the model has been determined, new data can be submitted to it in order to obtain the result corresponding to the desired task. In practice, some systems continue to learn once in production, provided they have a way of obtaining feedback on the quality of the results produced.
Learning is qualified in different ways according to the information available during the learning phase. If the data are labeled (i.e. the desired response is known for each example), this is supervised learning. We speak of classification if the labels are discrete, or of regression if they are continuous. If the model is learned incrementally, based on a reward the program receives for each action it undertakes, this is called reinforcement learning. In the most general case, with no labels at all, we seek to determine the underlying structure of the data (which can be a probability density); this is unsupervised learning. Machine learning can be applied to different types of data, such as graphs, trees, curves, or more simply feature vectors, which can be continuous or discrete.

Machine Learning History

Since antiquity, the idea of thinking machines has preoccupied people. This idea underlies what would become artificial intelligence, as well as one of its sub-branches: machine learning.
The concretization of this idea is mainly due to Alan Turing (British mathematician and cryptologist) and to his concept of the "universal machine" in 1936, which underlies today's computers. He went on to lay further foundations for machine learning with his 1950 article "Computing Machinery and Intelligence", in which he developed, among other things, the Turing test.
In 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts published an article describing the functioning of neurons by representing them with electrical circuits. This representation would become the theoretical basis of neural networks.
Arthur Samuel, a pioneering American computer scientist in artificial intelligence, was the first to use the term machine learning, in 1959, following the creation of his program for IBM in 1952. The program played checkers and improved as it played. Eventually, it managed to beat one of the best checkers players in the United States.
A major advance in machine intelligence was the success of IBM's computer Deep Blue, which in 1997 became the first to defeat the reigning world chess champion, Garry Kasparov. The Deep Blue project would inspire many others in artificial intelligence, in particular another big challenge: IBM Watson, a computer whose goal was to win at the game show Jeopardy!. This goal was reached in 2011, when Watson won Jeopardy! by answering questions using natural language processing.
In the following years, widely publicized machine learning applications followed one another at a much faster pace than before.
In 2012, a neural network developed by Google managed to recognize human faces as well as cats in YouTube videos.
In 2014, 64 years after Alan Turing's prediction, the chatbot Eugene Goostman became the first to pass the Turing test, convincing 33% of the human judges, after five minutes of conversation, that it was not a computer but a 13-year-old Ukrainian boy.
In 2015, another milestone was reached when Google's computer "AlphaGo" won against one of the best players of the game of Go, a board game regarded as one of the hardest in the world.
In 2016, an artificial intelligence system based on machine learning, called LipNet, managed to read lips with a high success rate.

Principles 

The algorithms used allow, to a certain extent, a computer-controlled (possibly robotic) or computer-assisted system to adapt its analyses and response behaviors based on the analysis of empirical data coming from a database or from sensors.
The difficulty lies in the fact that the set of all possible behaviors, taking into account all possible inputs, quickly becomes too complex to describe explicitly (this is known as a combinatorial explosion). Programs are therefore entrusted with the task of fitting a model that simplifies this complexity and of using it operationally. Ideally, the learning will be unsupervised, i.e. the desired answers for the training data are not known in advance.
Depending on their level of development, these programs may include capabilities for probabilistic data processing, analysis of data from sensors, recognition (of speech, shapes, handwriting, etc.), data mining, or theoretical computer science.

Applications 

Machine learning is used to equip computers or machines with: perception of their environment (vision, recognition of objects such as faces, diagrams, natural languages and handwritten characters); search engines; assistance with diagnosis, particularly medical diagnosis, bioinformatics and chemoinformatics; brain-machine interfaces; credit-card fraud detection, cybersecurity and financial analysis, including stock-market analysis; classification of DNA sequences; game playing; software engineering; website adaptation; robot locomotion; predictive analysis in legal and judicial matters; and more.
Examples:
  • a machine learning system can allow a robot that is able to move its limbs, but initially knows nothing about the coordination of movements required for walking, to learn to walk. The robot starts by performing random movements and then, by selecting and favoring the movements that allow it to move forward, gradually develops an increasingly efficient gait;
  • recognizing handwritten characters is a complex task because two instances of the same character are never exactly identical. A machine learning system can be designed that learns to recognize characters by observing "examples", that is, known characters (see the sketch below).
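To make the character-recognition example concrete, here is a minimal sketch using scikit-learn's built-in digits dataset; the dataset and the support-vector classifier are illustrative assumptions, not the specific system described above.

```python
# A minimal sketch of learning handwritten digits from labeled examples,
# assuming scikit-learn is available; dataset and classifier are illustrative.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# 8x8 grayscale images of handwritten digits, with their known labels
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# Training ("learning" phase) on the labeled examples
clf = SVC(kernel="rbf", gamma=0.001)
clf.fit(X_train, y_train)

# Prediction ("production" phase) on characters the model has never seen
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```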

Types of learning 

Learning algorithms can be categorized according to the learning mode they use:
Supervised learning
If the classes are predetermined and examples are available, the system learns to assign each item to a class according to a classification model; this is called supervised learning (or discriminant analysis). An expert (or oracle) must label the examples beforehand. The process takes place in two phases. During the first phase (offline, called learning), a model is determined from the labeled data. The second phase (online, called testing) consists in predicting the label of a new data point given the previously learned model. Sometimes it is preferable to associate a data point not with a single class, but with a probability of belonging to each of the predetermined classes (this is called probabilistic supervised learning).
e.g.: Linear discriminant analysis and support-vector machines (SVMs) are typical examples. Another example: based on common points detected with the symptoms of other, known patients (the examples), the system can categorize new patients, on the basis of their medical tests, by their estimated risk (probability) of developing a given disease (see the sketch below).
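As an illustration of supervised (and probabilistic) classification with one of the methods named above, here is a minimal sketch using linear discriminant analysis; the toy measurements and labels are purely illustrative assumptions.

```python
# A minimal sketch of supervised, probabilistic classification with
# linear discriminant analysis; the data are made up for illustration.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Labeled examples: two hypothetical medical measurements per patient,
# with a known diagnosis (0 = healthy, 1 = at risk) provided by an expert
X = np.array([[1.0, 2.1], [1.2, 1.9], [0.9, 2.3],   # class 0
              [3.1, 4.0], [2.9, 4.2], [3.3, 3.8]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

model = LinearDiscriminantAnalysis()
model.fit(X, y)                          # offline "learning" phase

new_patient = [[2.0, 3.0]]
print(model.predict(new_patient))        # hard label (classification)
print(model.predict_proba(new_patient))  # probability of belonging to each class
```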
Unsupervised learning
When the system or the operator has only examples but no labels, and the number of classes and their nature have not been predetermined, we speak of unsupervised learning, or clustering. No expert is required. The algorithm must discover by itself the more or less hidden structure of the data. Data partitioning (clustering) is a typical unsupervised learning task.
Here the system must, in the description space (the set of attributes describing the data), group the data according to their available attributes, so as to classify them into homogeneous groups of examples. Similarity is generally computed using a distance function between pairs of examples. It is then up to the operator to associate or deduce a meaning for each group, and for the patterns formed by the appearance of groups, or groups of groups, in their "space". Various mathematical tools and software can help with this. One also speaks of regression analysis of the data (fitting a model by a least-squares procedure or some other optimization of a cost function). If the approach is probabilistic (that is, each example, instead of being assigned to a single class, is characterized by a set of probabilities of belonging to each class), one speaks of "soft clustering" (as opposed to "hard clustering").
This method is often a source of serendipity .
e.g.: For an epidemiologist who wants to find explanatory hypotheses within a fairly large group of liver-cancer victims, the computer could distinguish different groups, which the epidemiologist would then try to associate with various explanatory factors: geographical origin, genetics, consumption habits or practices, exposure to various potentially or actually toxic agents (heavy metals, toxins such as aflatoxin, etc.). A clustering sketch is given below.
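The sketch below groups unlabeled points with k-means, assuming scikit-learn; the number of groups and the toy data are assumptions, since in real unsupervised settings neither labels nor, often, the number of classes are known.

```python
# A minimal sketch of unsupervised clustering with k-means.
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled examples, described only by their attributes (feature vectors)
X = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
              [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]])

# The algorithm must discover the structure by itself: here, 2 groups,
# formed by minimizing distances between examples and group centers
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # group assigned to each example
print(kmeans.cluster_centers_)  # it is up to the operator to interpret the groups
```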
Semi-supervised learning
Whether carried out probabilistically or not, semi-supervised learning aims to reveal the underlying distribution of the examples in their description space. It is used when labels are missing for some of the data: the model must make use of unlabeled examples, which can nevertheless provide useful information.
e.g.: In medicine, it can serve as an aid to diagnosis or to choosing the least expensive diagnostic tests. A small sketch is given below.
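A minimal semi-supervised sketch, assuming scikit-learn's label-propagation implementation; the toy data and the convention of marking missing labels with -1 are illustrative.

```python
# A minimal sketch of semi-supervised learning with label propagation:
# unlabeled examples (marked -1) still inform the model.
import numpy as np
from sklearn.semi_supervised import LabelPropagation

X = np.array([[1.0], [1.2], [0.9], [8.0], [8.2], [7.9]])
# Only one example per class is labeled; -1 means "label missing"
y = np.array([0, -1, -1, 1, -1, -1])

model = LabelPropagation()
model.fit(X, y)
print(model.transduction_)  # labels inferred for the unlabeled examples
```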
Partially supervised learning
Whether probabilistic or not, this is the case when the labeling of the data is partial: a model may state that a data point does not belong to class A, but possibly to class B or C (A, B and C being, for example, three diseases considered in a differential diagnosis).
Reinforcement learning
The algorithm learns a behavior from observations. Its actions on the environment produce a return value (a reward), which guides the learning algorithm.
e.g.: The Q-learning algorithm is a classic example (see the sketch below).
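Below is a minimal tabular Q-learning sketch on a hypothetical five-state corridor in which the agent is rewarded only for reaching the right-most state; the environment and the hyperparameters are assumptions made for illustration.

```python
# A minimal tabular Q-learning sketch on a toy 5-state corridor.
import random
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = np.zeros((n_states, n_actions))   # estimated value of each (state, action) pair
random.seed(0)

def step(state, action):
    """Toy environment: move left or right; reward 1 only on reaching the last state."""
    nxt = max(0, min(n_states - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == n_states - 1 else 0.0), nxt == n_states - 1

for episode in range(200):
    state = 0
    for _ in range(100):              # cap the episode length
        # epsilon-greedy choice (random when exploring or when estimates are tied)
        if random.random() < epsilon or Q[state, 0] == Q[state, 1]:
            action = random.randrange(n_actions)
        else:
            action = int(np.argmax(Q[state]))
        nxt, reward, done = step(state, action)
        # Q-learning update: the reward returned by the environment guides learning
        Q[state, action] += alpha * (reward + gamma * np.max(Q[nxt]) - Q[state, action])
        state = nxt
        if done:
            break

print(Q)  # after training, "right" has the higher value in every non-terminal state
```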
Transfer learning
Transfer learning can be seen as the ability of a system to recognize and apply knowledge and skills learned on previous tasks to new tasks or domains that share similarities. The question is: how to identify the similarities between the target task(s) and the source task(s), and then how to transfer the knowledge from the source task(s) to the target task(s)? A small sketch is given below.
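One simple way to illustrate this idea is transfer by feature reuse: a representation learned on an abundant source domain is reused on a related target task that has very few labeled examples. The sketch below, with made-up data and scikit-learn components, only illustrates that principle; in deep learning, the analogous practice is to reuse the layers of a pretrained network.

```python
# A minimal sketch of transfer by feature reuse; all data are made up.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Source domain: plenty of data in which the first attribute carries most variance
X_source = rng.normal(size=(1000, 20))
X_source[:, 0] *= 5.0
features = PCA(n_components=2).fit(X_source)   # representation learned on the source

# Target task: only ten labeled examples, too few to learn a representation from scratch
X_target = rng.normal(size=(10, 20))
X_target[:5, 0] += 3.0                          # the shared attribute separates the classes
y_target = np.array([1] * 5 + [0] * 5)

# Transfer: reuse the representation learned on the source task for the target task
clf = LogisticRegression().fit(features.transform(X_target), y_target)
print(clf.predict(features.transform(X_target)))
```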

Algorithms used 

Commonly used algorithms include linear models such as logistic regression and linear discriminant analysis, support-vector machines, decision trees and random forests, k-nearest neighbors, boosting, and neural networks (including deep learning).
These methods are often combined to obtain various learning variants. The choice of algorithm depends strongly on the task to be solved (classification, estimation of values, etc.).
Machine learning is thus used across a wide spectrum of applications, for example computer vision, medical diagnosis, fraud detection, or robot locomotion.

Relevance and effectiveness factors 

The quality of learning and analysis depends on how well the need has been specified upstream and, a priori, on the skill of the operator in preparing the analysis. It also depends on the complexity of the model (specific or general-purpose), on its suitability, and on its adaptation to the subject to be treated. Ultimately, the quality of the work will also depend on how the results are presented to the end user (a relevant result could be hidden in an overly complex diagram, or poorly highlighted by an inappropriate graphic representation).
Before that, the quality of the work will depend on initial constraining factors linked to the database:
  1. Number of examples (the fewer there are, the more difficult the analysis, but the more there are, the higher the need for computer memory and the longer the analysis);
  2. Number and quality of the attributes describing these examples. The distance between two numerical "examples" (price, size, weight, light intensity, noise intensity, etc.) is easy to establish; the distance between two categorical attributes (color, beauty, usefulness, etc.) is more delicate (see the sketch after this list);
  3. Percentage of data completed and missing;
  4. "Noise": the number and "location" of doubtful values (potential errors, outliers, etc.), or of values that naturally do not conform to the general distribution pattern of the "examples" in their distribution space, will affect the quality of the analysis.
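The sketch below illustrates the point made in item 2: distances between numerical attributes are straightforward, while categorical attributes first need an encoding (here a one-hot encoding and a Hamming-style count of mismatches); all attribute values are made up for illustration.

```python
# Distances for numerical vs. categorical attributes (illustrative values).
import numpy as np

# Numerical examples (e.g. price, weight): a Euclidean distance is natural
a = np.array([10.0, 2.5])
b = np.array([12.0, 2.0])
print(np.linalg.norm(a - b))          # straightforward numeric distance

# Categorical examples (e.g. color): one common workaround is to one-hot
# encode them and count mismatches (a Hamming-style distance)
colors = ["red", "green", "blue"]
def one_hot(value, categories):
    return [1 if value == c else 0 for c in categories]

x = one_hot("red", colors)
y = one_hot("blue", colors)
print(sum(xi != yi for xi, yi in zip(x, y)) / 2)  # 1 if different, 0 if identical
```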

Steps in a machine learning project

Machine learning is not just a set of algorithms; it follows a succession of steps (a minimal end-to-end sketch is given after the list below).
  1. Data acquisition: the data that feed the algorithm must be collected, and this is an important step. The success of the project depends on collecting relevant and sufficient data.
  2. Data preparation and cleaning: the collected data must be edited before use. Indeed, some attributes are useless, others must be modified in order to be understood by the algorithm, and some elements are unusable because their data are incomplete. Several techniques, such as data visualization, data transformation and normalization, are then used.
  3. Creation of the model.
  4. Evaluation: once the machine learning algorithm has been trained on a first set of data, it is evaluated on a second set of data in order to verify that the model does not overfit.
  5. Deployment: the model is deployed in production to make predictions, and can potentially use the new input data to be re-trained and improved.
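Here is a minimal end-to-end sketch of these five steps, assuming scikit-learn and its built-in Iris dataset; real projects would of course involve far more work in the acquisition and cleaning steps.

```python
# A minimal end-to-end sketch of the five project steps listed above.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import joblib

# 1. Data acquisition (here, a toy dataset shipped with the library)
X, y = load_iris(return_X_y=True)

# 2. Preparation: hold out a test set and normalize the attributes
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 3. Model creation / training
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# 4. Evaluation on data unseen during training, to detect overfitting
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 5. Deployment: persist the model so a production service can load it
joblib.dump(model, "model.joblib")
```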

Technical controversies 

Machine learning requires large amounts of data to function properly. It can be difficult to control the integrity of data sets, especially in the case of data generated by social networks.
The quality of the "decisions" made by a machine learning algorithm depends not only on the quality (homogeneity, reliability, etc.) of the data used for training, but above all on their quantity. Thus, for a set of social data collected without particular attention to the representation of minorities, machine learning is statistically unfair to them: the capacity to make "good" decisions depends on the amount of data, which will be proportionately smaller for minorities.
By its mathematical construction, machine learning currently does not distinguish cause from correlation, and it is unable to go beyond the framework imposed by its data; it therefore has no capacity for extrapolation. For example, if an algorithm is taught to return the number it is given (the identity function) by training it only with the numbers 1 to 5, it will be unable to answer 6 correctly. A small sketch of this limit is given below.
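A small sketch of this limit, using a decision tree (an illustrative choice, since such models predict only within the range of values seen during training):

```python
# A model trained on the identity function for inputs 1..5 cannot return 6 for input 6.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 2, 3, 4, 5])          # target = input (identity function)

model = DecisionTreeRegressor().fit(X, y)
print(model.predict([[3]]))  # ~3.0, inside the training range
print(model.predict([[6]]))  # ~5.0, not 6: the model cannot extrapolate
```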
Using machine learning algorithms therefore requires being aware, at the time of use, of the data framework that was used for learning. It is therefore presumptuous to attribute excessive virtues to machine learning algorithms.

Application to the autonomous car 

The autonomous car seems feasible thanks to machine learning and to the huge amounts of data generated by an increasingly connected vehicle fleet. Unlike conventional algorithms (which follow a set of predetermined rules), machine learning learns its own rules.
The main innovators in the field insist that progress comes from the automation of processes. This has the drawback that the machine learning process becomes privatized and obscure: privatized because machine learning algorithms constitute gigantic economic opportunities, and obscure because their understanding takes second place to their optimization. This development has the potential to damage public confidence in machine learning, and above all the long-term potential of very promising techniques.
The autonomous car offers a test framework for confronting machine learning with society. Indeed, it is not only the algorithm that is trained on road traffic and its rules; the reverse is also true. The principle of responsibility is called into question by machine learning, because the algorithm is no longer written but learns and develops a kind of digital intuition. The creators of the algorithms are no longer able to understand the "decisions" they make, by the very mathematical construction of the machine learning algorithm.
In the case of machine learning and autonomous cars, the question of liability in the event of an accident arises. Society must provide an answer to this question, with different possible approaches. In the United States, there is a tendency to judge a technology by the quality of the result it produces, whereas in Europe the precautionary principle is applied, and a new technology tends to be judged against the previous ones, by assessing the differences from what is already known. Risk-assessment processes are under way in both Europe and the United States.
The question of responsibility is all the more complicated because the priority for designers lies in designing an optimal algorithm, not in understanding it. The interpretability of algorithms is necessary to understand their decisions, especially when those decisions have a profound impact on individuals' lives. The notion of interpretability, that is, the ability to understand why and how an algorithm acts, is itself subject to interpretation.
The issue of data accessibility is controversial: in the case of autonomous cars, some defend public access to data, which would allow better training of algorithms and would avoid concentrating this "digital gold" in the hands of a handful of individuals; others campaign for the privatization of data in the name of the free market, not overlooking the fact that good data constitute a competitive and therefore economic advantage.

Prospects

In the years 2000-2010, machine learning was still an emerging but versatile technology, theoretically capable of accelerating the pace of automation and of self-learning itself. Combined with the emergence of new ways of producing, storing and circulating energy, as well as with ubiquitous computing, it could disrupt technologies and society (as the steam engine and electricity, then oil and information technology, did during the previous industrial revolutions). Machine learning could generate unexpected innovations and capabilities, but with the risk, according to some observers, of humans losing control over many tasks that they will no longer be able to understand and that will be carried out routinely by computer and robotic entities. This suggests specific, complex impacts, still impossible to assess, on employment, work and, more broadly, on the economy and inequalities.
According to the journal Science at the end of 2017:

"The effects on employment are more complex than the simple question of replacement and substitution highlighted by some. Although the economic effects of machine learning are relatively limited today, and we are not facing an imminent 'end of work' as is sometimes proclaimed, the implications for the economy and the workforce are far-reaching."

It is tempting to draw inspiration from living beings, without copying them naively, in order to design machines capable of learning. The notions of percept and concept as physical neural phenomena were popularized in the French-speaking world by Jean-Pierre Changeux. Machine learning remains primarily a sub-field of computer science, but it is operationally closely linked to the cognitive sciences, neuroscience, biology and psychology, and could, at the crossroads of these fields and of nanotechnology, biotechnology, information technology and cognitive science, lead to artificial intelligence systems with a broader base. Public lectures have notably been given at the Collège de France, one by Stanislas Dehaene focusing on the Bayesian aspects of neuroscience, and the other by Yann LeCun on the theoretical and practical aspects of deep learning.

We are just beginning!😊
