Machine learning pipeline
In the last section, we spent a lot of time discussing machine learning models and how they correspond to frameworks for medical decision making. But how does one actually train a machine learning model? In healthcare, machine learning usually follows a pattern of stereotyped tasks, and we refer to the collection of these tasks as a pipeline. Although no two machine learning applications use exactly the same pipeline, the concept gives us a common way to describe the machine learning process. In this section, we describe a generalized pipeline that many simple machine learning projects follow, particularly when dealing with structured data, that is, data that can be organized into rows and columns.
Loading the data
Before we can perform computations on the data, we must load it from a storage location (usually a database or a real-time data feed) into a computing workspace. Workspaces allow the user to manipulate the data and build models using popular languages including...