Missing data is another issue data scientists have to deal with in real life. Some fields may be left empty because of an oversight, a lack of information, or simply because the information is not relevant.
While some machine learning algorithms can handle missing data, most cannot process it properly and will raise errors if your dataset contains such values. It is therefore safer to eliminate missing values before training a model. If your dataset is large and the missing data represents only a small proportion of it, you could simply drop the observations containing incomplete information. However, in most cases, it is good practice to keep all of the information and instead try to compensate for what is missing. One way of doing this is to use the mean of the known values of a field as a default value for the observations where that field is missing. In this way, the observations are retained and, in most cases, the fake data that we integrate into the model will...
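The two options described above, dropping incomplete observations and filling gaps with the column mean, can be sketched as follows. This is a minimal illustration using only the Python standard library. The toy dataset, the field names `age` and `income`, and the helper names `drop_incomplete` and `impute_mean` are assumptions made for the example, and `None` is assumed to mark a missing value.

```python
from statistics import mean

# Hypothetical toy dataset: each row is one observation; None marks a missing value.
rows = [
    {"age": 25, "income": 40000.0},
    {"age": 32, "income": None},
    {"age": None, "income": 52000.0},
    {"age": 41, "income": 61000.0},
]


def drop_incomplete(rows):
    """Option 1: keep only the observations with no missing field."""
    return [row for row in rows if all(value is not None for value in row.values())]


def impute_mean(rows):
    """Option 2: replace each missing value with the mean of the known values in its column."""
    columns = rows[0].keys()
    # Compute each column's mean from the known (non-missing) values only.
    # Assumes every column has at least one known value.
    means = {
        col: mean(row[col] for row in rows if row[col] is not None)
        for col in columns
    }
    # Build new rows so the original data is left untouched.
    return [
        {col: means[col] if row[col] is None else row[col] for col in columns}
        for row in rows
    ]


dropped = drop_incomplete(rows)  # only the fully observed rows remain
imputed = impute_mean(rows)      # all rows kept, gaps filled with column means
```

In this sketch, dropping leaves two of the four observations, while imputation keeps all four, which shows why dropping is only reasonable when incomplete rows are a small share of a large dataset.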