It might be tempting to define machine learning as the process by which computers learn from known observations in order to make predictions about unseen data. However, this definition only reflects a subset of machine learning called supervised learning, which is only possible when we have access to some labeled data that the algorithm can learn from. If this is not the case, we have to rely on unsupervised learning.
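To make the distinction concrete, here is a minimal sketch in plain Python. The function names (`fit_centroids`, `predict`, `two_means`) and the toy data are assumptions made up for illustration, not part of any library. The supervised function needs both features and labels, while the unsupervised one only ever sees the features.

```python
from statistics import mean


def fit_centroids(features, labels):
    """Supervised: learn one centroid (average value) per known label."""
    grouped = {}
    for x, y in zip(features, labels):
        grouped.setdefault(y, []).append(x)
    return {label: mean(values) for label, values in grouped.items()}


def predict(centroids, x):
    """Return the label whose learned centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def two_means(features, iterations=10):
    """Unsupervised: split unlabeled 1-D data into two groups (toy k-means, k=2)."""
    low, high = min(features), max(features)
    for _ in range(iterations):
        group_low = [x for x in features if abs(x - low) <= abs(x - high)]
        group_high = [x for x in features if abs(x - low) > abs(x - high)]
        if not group_low or not group_high:
            break  # all points fell into one group; nothing left to refine
        low, high = mean(group_low), mean(group_high)
    return low, high


# Hypothetical toy data: one numeric feature per observation.
features = [1.0, 1.2, 0.9, 3.0, 3.2, 2.9]
labels = ["small", "small", "small", "large", "large", "large"]

# With labels, the algorithm learns what "small" and "large" mean.
centroids = fit_centroids(features, labels)
print(predict(centroids, 2.8))

# Without labels, it can only discover that two groups exist.
print(two_means(features))
```

Note that the unsupervised function finds two group centers but has no way to name them: the words "small" and "large" exist only in the labeled data.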
When labeled data exists, we have some observations with a known output. From there, we can train an algorithm to predict this output from the other characteristics of the observations – that is, the features. During this training phase, the algorithm adjusts its parameters to produce the best possible predictions on the training data. These tuned parameters can then be used to make predictions about unseen observations – that is, data that does not have a label. This is how you can predict the...