2 Examples Of Machine Learning Show How Machine Learning Works For Very Real Applications

Colin de la Higuera writes that machine learning represents a break between programming - that is to say, detailing the sequence of operations to perform in order to accomplish a given task - and training - that is to say, endowing the machine with the capacity to learn and to obtain the desired behavior by example.


The drivers of this break are the continual increase in the volume of available data and in the speed of computers. As the volume of data increases, it becomes harder for a programmer to consider all possible cases and to design a correct program. But it also means that there are many examples to show to a machine capable of learning, which makes machine learning more attractive every day. In practice, this break spans several decades and takes the form of a series of application successes which are becoming more and more frequent and more and more noticed.

The following two examples illustrate this break and show how machine learning works for very real applications.

1. The Categorization Of Texts

One of the basic tasks of information management is to classify documents written in natural language by examining their content. For example, an educated reader can easily separate political news from other news. On the other hand, this same reader will have great difficulty if asked to formulate a general criterion for recognizing documents whose subject is political. It is therefore difficult to program a computer to accomplish this separation automatically.

It was hoped in the 1970s to solve these kinds of problems by accumulating the knowledge of human experts. A reader can often explain why they think the subject of a particular document is political, for example by pointing out that the document contains the names of known politicians. Each of these explanations suggests a rule that can be programmed by a computer scientist and executed by the computer. Unfortunately, we always find documents that escape the rules we already know. In addition, human experts constantly propose new rules which sometimes contradict those they stated a few hours earlier. To manage these contradictions, the idea arose of associating weights with the rules. For example, a rule carries a positive weight if the document contains the words "economic situation" and a negative weight if the document contains the words "police investigation". If the sum of all these weights exceeds a certain threshold, the document is classified among those dealing with politics. The problem then is to define the weights associated with each rule by hand. This quickly becomes tricky because the number of rules inevitably grows.

Machine learning replaces the explanations of human experts with the volume of data. The first step therefore consists of collecting data, for example a few hundred thousand documents that have been labeled as political or non-political. Then we give a weight to each word of the dictionary. At the start, all these weights are initialized to zero. Learning consists of launching a program which adjusts all the weights by repeating, for example, the following operations until all the learning documents are categorized without error:

Choose a document at random from the learning set.
Calculate the sum of the weights corresponding to the words it contains.
If the sum is positive while the document is not political, subtract a small amount from the weights associated with all the words present in the document (leaving the other weights unchanged).
If the sum is negative while the document is political, add a small amount to the weights associated with all the words present in the document (leaving the other weights unchanged).
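The loop above can be sketched in a few lines of Python. The tiny corpus, its labels, and the step size are invented for illustration; a real system would use hundreds of thousands of documents:

```python
# tiny invented corpus: (words of the document, is it political?)
corpus = [
    ("minister announces economic reform".split(), True),
    ("parliament votes the new budget".split(), True),
    ("local team wins football final".split(), False),
    ("police investigation into burglary".split(), False),
]

weights = {}  # one weight per word; a missing word counts as weight 0
step = 1.0    # the "small amount" added or subtracted

converged = False
while not converged:
    converged = True
    for words, political in corpus:
        score = sum(weights.get(w, 0.0) for w in words)
        if score <= 0 and political:
            # sum should be positive: raise the weights of these words
            for w in words:
                weights[w] = weights.get(w, 0.0) + step
            converged = False
        elif score > 0 and not political:
            # sum should be negative: lower the weights of these words
            for w in words:
                weights[w] = weights.get(w, 0.0) - step
            converged = False

# every training document is now categorized correctly
assert all((sum(weights.get(w, 0.0) for w in ws) > 0) == pol
           for ws, pol in corpus)
```

Because no word appears here in both a political and a non-political document, a separating set of weights exists, so the loop is guaranteed to terminate.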

This simple algorithm is called the Perceptron algorithm and was invented by Frank Rosenblatt in 1957: each time a document is misclassified, we adjust all the weights concerned so as to move their sum in the desired direction. If there is a set of weights which correctly separates all the learning documents, we can prove that this algorithm will find it.

Figure 1 above summarizes the system: we have defined a machine which takes a newspaper article and produces a score which measures its political character. What this machine does depends on a large number of parameters - the weights - which a learning algorithm adjusts so as to obtain the desired behavior on a large number of examples.

We can obviously add many refinements. In addition to the weights associated with each word in the dictionary, weights can be associated with all the sequences of two or three words which appear often enough in the learning documents. If despite this there is no set of weights which correctly categorizes all the learning documents, it suffices to progressively reduce the quantity that we add to or subtract from the weights, which allows them to stabilize near the values which minimize the number of errors.
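As a sketch of the first refinement, here is one way to enumerate the word sequences that could receive their own weights. The function name and the length limit are illustrative; a real system would also keep only the sequences that appear often enough in the learning documents:

```python
def features(words, max_n=3):
    """Single words plus all sequences of up to max_n consecutive words."""
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            feats.append(" ".join(words[i:i + n]))
    return feats

# the weighted sum is then taken over these features instead of single words
print(features("the economic situation".split(), max_n=2))
# ['the', 'economic', 'situation', 'the economic', 'economic situation']
```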

There are also important subtleties. For example, it is likely that a sequence of thirty words appearing in one learning document does not appear in any other. If we associate weights with such sequences, it is easy to categorize all the learning documents correctly, but that amounts to recognizing the question after learning all the answers by heart. Such a system is undesirable since it is unlikely to categorize a new document properly. This is why we often work with several distinct sets of documents: a training set to determine the weights, a validation set to assess how the system works on new documents and to compare several variants, and often a test set reserved for measuring the final performance.
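A minimal sketch of such a split, with a placeholder corpus standing in for real labeled documents (the sizes and proportions are arbitrary):

```python
import random

documents = [f"doc{i}" for i in range(1000)]  # placeholder labeled corpus
random.seed(0)                                # reproducible shuffle
random.shuffle(documents)

train = documents[:800]       # used only to adjust the weights
validation = documents[800:]  # used only to evaluate and compare variants

# the two sets must not share any document, otherwise the evaluation
# would reward learning the answers by heart
assert not set(train) & set(validation)
```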

These learning techniques are simple but widely used and very well understood. It is with these kinds of techniques that search engines score their responses. It is also with these kinds of techniques that social networks select which of the posts published by our friends and relatives to show us. It is often with these kinds of techniques that merchant sites recommend products.

2. Visual Recognition Of Objects

The automatic recognition of objects in digital photographs is obviously an important task.

For example, imagine a program that signals the presence of a flower in a photograph and works for all flower species, in all lighting conditions, whether the flower is whole or partially hidden by another object, and so on. It is very difficult to write such a program by hand.

Figure 2 below, however, shows what a modern object recognition system that works by learning can do. Basically, it is also a machine whose operation depends on a large number of weights calculated by a learning algorithm. But instead of having a few tens of thousands of weights, as in a text categorization system, this machine uses a few tens of millions of weights organized in a clever way which respects the nature of images and which is called a convolutional network.

Figure 2 - Location and identification of objects (Oquab et al., CVPR 2015). This particular system is trained in two stages. During the first stage, the training images contain a single object located in the center of the image and whose category is known. During the second stage, the training images can contain several objects in various positions. Each image is labeled with the list of objects present but without specifying their positions.

In a computer's memory, a digitized photographic image is a grid whose boxes are called pixels. Each pixel represents a point of the image with three numbers representing the intensities of the three channels associated with the elementary colors red, green, and blue. One can also imagine "images" containing an arbitrary number of channels. For example, recognized objects can be represented by an "image" whose number of channels is equal to the number of categories of objects: the intensity for channel C of a pixel P then indicates whether an object of the type corresponding to channel C has been recognized at the position corresponding to pixel P. Figure 3 below shows six of the channels calculated by a convolutional network for the image displayed on the left.
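This representation is easy to sketch with NumPy arrays. The categories and the detection score below are invented for illustration:

```python
import numpy as np

# an RGB photograph: height x width x 3 channels (red, green, blue)
photo = np.zeros((4, 6, 3), dtype=np.uint8)
photo[1, 2] = (255, 0, 0)        # a pure red pixel at row 1, column 2

# hypothetical recognizer output: one channel per object category
categories = ["flower", "cat", "dog"]
detections = np.zeros((4, 6, len(categories)))
# "a flower was recognized at the position of pixel (1, 2)"
detections[1, 2, categories.index("flower")] = 0.9

assert photo.shape == (4, 6, 3)
assert detections.shape == (4, 6, 3)
```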

Figure 3 - Output layers of a convolutional network (Oquab et al., CVPR 2014)


The calculation performed by a convolutional network is in fact a succession of operations which each transform an input image into a result image. Each of these operations depends on weights which will be determined during training of the network, so that applying this sequence of operations to a photographic image (three channels) produces an "image" representing the recognized objects.

Each of these operations has an important property. Imagine a small machine which takes as input a small grid of 3 × 3 neighboring pixels in a given image and calculates one pixel of a result image. As shown in Figure 4 below, applying this machine at every position of the initial image produces a result image having roughly the same number of pixels. And since the same weights are used at every position, every pixel of the result image is computed from the corresponding pixels of the input image by exactly the same calculation. This is very clever because the appearance of an object in an image does not depend on its position. It is this property that gives the convolutional network its power and that has revolutionized computer vision in the last three years.
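A minimal NumPy sketch of this property (zero padding at the borders is an assumption made here to keep the result image the same size): because the same 3 × 3 weights are applied at every position, translating the input simply translates the output.

```python
import numpy as np

def conv2d(image, kernel):
    """Apply the same 3x3 weights at every position (zero padding)."""
    h, w = image.shape
    padded = np.pad(image, 1)
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

rng = np.random.default_rng(0)
kernel = rng.standard_normal((3, 3))

image = np.zeros((10, 10))
image[2:5, 2:5] = 1.0                 # a small "object"
shifted = np.roll(image, 3, axis=1)   # the same object, moved 3 pixels right

# the response to the object moves with it
assert np.allclose(np.roll(conv2d(image, kernel), 3, axis=1),
                   conv2d(shifted, kernel))
```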

Figure 4 - A convolution

The next step is to determine the weights that define the successive convolutions using a few million images representing known objects. The basic algorithm is very similar to that described in the previous section because it consists of repeating the following operations:

Choose a few images at random from the learning images. Calculate the output images of the convolutional network.

If these output images do not describe the objects present in the learning images well enough, slightly adjust the weights of the convolutions so as to make the output images more in line with our wishes.
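This adjustment loop can be illustrated on a deliberately tiny case: a "network" reduced to a single 3 × 3 convolution, trained so that its outputs match those of a known "teacher" filter. Everything here (the teacher filter, the image sizes, the step size) is invented for illustration, and the weight adjustment is written out by hand for this one linear operation; real systems have millions of weights and compute these adjustments by backpropagation.

```python
import numpy as np

def conv2d(image, kernel):
    """Apply the same 3x3 weights at every position (zero padding)."""
    h, w = image.shape
    padded = np.pad(image, 1)
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

rng = np.random.default_rng(0)

# desired outputs come from a known "teacher" filter, so we can
# check that learning recovers it
teacher = np.zeros((3, 3))
teacher[1, 1] = 1.0                         # the identity filter
images = [rng.standard_normal((8, 8)) for _ in range(20)]
targets = [conv2d(im, teacher) for im in images]

kernel = 0.1 * rng.standard_normal((3, 3))  # weights start near zero
lr = 0.2                                    # size of each adjustment
for step in range(500):
    im, tgt = images[step % 20], targets[step % 20]
    err = conv2d(im, kernel) - tgt          # how wrong is each output pixel?
    # adjust every weight in the direction that reduces the error
    padded = np.pad(im, 1)
    grad = np.zeros((3, 3))
    for i in range(8):
        for j in range(8):
            grad += err[i, j] * padded[i:i + 3, j:j + 3]
    kernel -= lr * grad / 64

# the learned weights end up close to the teacher's weights
assert np.abs(kernel - teacher).max() < 0.1
```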

Doing this at this scale is obviously much more difficult than running a learning algorithm for categorizing text, because the volume of calculations is considerable. Some use clusters made up of hundreds of computers. Others use fewer machines but equip them with powerful graphics processors that can be used for these calculations. Despite these difficulties, the recognition of objects (and people) in digitized photographs is now a reality and is the subject of significant investments by large technology companies.

Some find it disturbing that nothing in the structure of a convolutional network specifies how it distinguishes a cat from a dog, whatever their positions and whatever the lighting conditions. All of this is determined entirely by the learning procedure on the basis of the examples given to it. This is clearly a break between programming a digital computer and training a machine capable of learning.

We are just beginning!😊
