A machine learning project is essentially a series of steps, and data is the key to every one of them. Several everyday terms take on a specific meaning in the context of ML, so this post collects the most useful ML dataset terminology.
Sample ML Input Data.
|Instance-1||M||10,000||Eligible for Bonus|
|Instance-2||F||20,000||Not Eligible for Bonus|
In the above table, the value to be predicted (eligibility) is already known for every instance. When the target is known up front like this, the task is called supervised learning. In unsupervised learning the target is unknown, and the goal is to find hidden structure in the data.
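To make the split between inputs and target concrete, here is a minimal Python sketch of the sample table. The field names `gender` and `amount` are assumptions made for illustration, since the table has no column headers; only the values come from the sample data.

```python
# The two sample instances. Field names "gender" and "amount" are assumed,
# because the sample table does not name its columns.
labeled_instances = [
    {"id": "Instance-1", "gender": "M", "amount": 10_000,
     "target": "Eligible for Bonus"},
    {"id": "Instance-2", "gender": "F", "amount": 20_000,
     "target": "Not Eligible for Bonus"},
]


def split_features_target(rows):
    """Separate the input attributes from the value to be predicted."""
    features = [(row["gender"], row["amount"]) for row in rows]
    targets = [row["target"] for row in rows]
    return features, targets


# Supervised learning: both the inputs and the known target are available.
X, y = split_features_target(labeled_instances)

# Unsupervised learning: only the inputs exist; there is no target column.
X_unlabeled = X
```

In the supervised case a model would be fitted on `X` and `y` together. In the unsupervised case only something like `X_unlabeled` is available.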
ML Dataset Terminology.
|instance or example||A single object, observation, transaction, or record.|
|target or label||The numerical or categorical (label) attribute of interest. This is the variable to be predicted for each new instance.|
|features||The input attributes that are used to predict the target. These also may be numerical or categorical.|
|model||A mathematical object describing the relationship between the features and the target.|
|training data||The set of instances with a known target to be used to fit an ML model.|
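|prediction (called recall in some texts)||Using a fitted model to predict the target or label of a new instance. This is distinct from the recall evaluation metric.|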
|supervised machine learning||Machine learning in which, given examples for which the output value is known, the training process infers a function that relates input values to the output.|
|unsupervised machine learning||Machine-learning techniques that don’t rely on labeled examples, but rather try to find hidden structure in unlabeled data.|
|ML workflow||The stages in the ML process: data preparation, model building, evaluation, optimization, and prediction.|
|online machine learning||A form of machine learning in which predictions are made, and the model is updated, for each new example.|
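Several of the terms above (training data, model, prediction, online machine learning) can be seen together in a small sketch. The example below is a plain-Python perceptron that predicts and then updates itself on each new example. The class name `OnlinePerceptron`, the learning rate, and the tiny data stream are all invented for illustration and are not taken from any particular library.

```python
class OnlinePerceptron:
    """A tiny linear classifier that is updated one example at a time."""

    def __init__(self, n_features, learning_rate=1.0):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Use the current model to predict a 0/1 label."""
        score = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1 if score > 0 else 0

    def update(self, features, target):
        """Predict first, then adjust the weights if the prediction was wrong."""
        prediction = self.predict(features)
        error = target - prediction  # -1, 0, or +1
        if error:
            self.weights = [
                w + self.learning_rate * error * x
                for w, x in zip(self.weights, features)
            ]
            self.bias += self.learning_rate * error
        return prediction


# Made-up stream of (features, target) pairs: the label is 1 when the
# first feature is larger than the second.
stream = [([2.0, 1.0], 1), ([1.0, 3.0], 0), ([3.0, 1.0], 1), ([0.5, 2.0], 0)]

model = OnlinePerceptron(n_features=2)
# Replay the short stream a few times so the weights have a chance to settle.
for _ in range(10):
    for features, target in stream:
        model.update(features, target)

predictions = [model.predict(features) for features, _ in stream]
```

A batch learner would instead fit the model once on the whole training set. The online version never needs to hold all examples in memory, which is the main reason to choose it for streaming data.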