16 Machine Learning Data Terminology

A machine learning project is essentially a series of steps, and data is the key to every one of them. Terms that are familiar from other data work often go by different names in ML, so this post collects the most useful machine learning dataset terminology.

Sample ML Input Data.

| Instances (Examples) | M/F (Attribute-1) | Salary (Attribute-2) | Target (Label) |
|---|---|---|---|
| Instance-1 | M | 10,000 | Eligible for Bonus |
| Instance-2 | F | 20,000 | Not Eligible for Bonus |

Sample Data

In the table above, the value to be predicted (bonus eligibility) is already known for each instance. When the target is known up front, the task is called supervised learning. In unsupervised learning the target is unknown, and the goal is instead to find hidden structure in the data.
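To make the distinction concrete, here is a minimal sketch in plain Python. It holds the two rows of the sample table as dictionaries and shows what a supervised learner would receive (features plus target) versus what an unsupervised learner would receive (features only). The key names `gender`, `salary`, and `target`, and the helper function names, are assumptions chosen for illustration; they are not part of any library.

```python
# The two rows of the sample table, written as plain Python dictionaries.
# Key names are illustrative assumptions, not a standard schema.
instances = [
    {"id": "instance-1", "gender": "M", "salary": 10_000, "target": "Eligible for Bonus"},
    {"id": "instance-2", "gender": "F", "salary": 20_000, "target": "Not Eligible for Bonus"},
]

FEATURE_NAMES = ("gender", "salary")


def split_features_and_target(rows):
    """Supervised view: return the features and the known targets separately."""
    features = [{name: row[name] for name in FEATURE_NAMES} for row in rows]
    targets = [row["target"] for row in rows]
    return features, targets


def drop_target(rows):
    """Unsupervised view: return the features only, with no target attached."""
    return [{name: row[name] for name in FEATURE_NAMES} for row in rows]


X, y = split_features_and_target(instances)   # supervised: inputs and labels
X_unlabeled = drop_target(instances)          # unsupervised: inputs only

print(X)
print(y)
```

The feature rows are identical in both views. The only difference is whether a known target accompanies them.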

ML Dataset Terminology.

| Word | Definition |
|---|---|
| instance or example | A single object, observation, transaction, or record. |
| target or label | The numerical or categorical (label) attribute of interest. This is the variable to be predicted for each new instance. |
| features | The input attributes that are used to predict the target. These also may be numerical or categorical. |
| model | A mathematical object describing the relationship between the features and the target. |
| training data | The set of instances with a known target to be used to fit an ML model. |
| prediction (also called recall) | Using a model to predict a target or label. Not to be confused with the recall evaluation metric. |
| supervised machine learning | Machine learning in which, given examples for which the output value is known, the training process infers a function that relates input values to the output. |
| unsupervised machine learning | Machine-learning techniques that don’t rely on labeled examples, but rather try to find hidden structure in unlabeled data. |
| ML workflow | The stages in the ML process: data preparation, model building, evaluation, optimization, and prediction. |
| online machine learning | A form of machine learning in which predictions are made, and the model is updated, for each new example. |

Machine Learning Terminology on Data
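Several of these terms can be seen working together in a small sketch. The toy class below is a nearest-mean classifier written in plain Python: it is fit on training data, used to predict a label for a new instance, and then updated one example at a time in the style of online learning. The class name `SalaryThresholdModel`, the use of salary as the only feature, and the extra 16,000 instance are all assumptions made for illustration. Only the first two training rows come from the sample table.

```python
ELIGIBLE = "Eligible for Bonus"
NOT_ELIGIBLE = "Not Eligible for Bonus"


class SalaryThresholdModel:
    """Toy model: assigns the label whose mean training salary is closest."""

    def __init__(self):
        # Running sum and count of salaries seen for each label.
        self.sums = {ELIGIBLE: 0.0, NOT_ELIGIBLE: 0.0}
        self.counts = {ELIGIBLE: 0, NOT_ELIGIBLE: 0}

    def update(self, salary, label):
        """Online step: absorb a single labeled instance."""
        self.sums[label] += salary
        self.counts[label] += 1

    def fit(self, training_data):
        """Batch training: pass every (salary, label) pair through update."""
        for salary, label in training_data:
            self.update(salary, label)
        return self

    def _mean(self, label):
        # Assumes at least one instance of each label has been seen.
        return self.sums[label] / self.counts[label]

    @property
    def threshold(self):
        """Salary cut-off halfway between the two class means."""
        return (self._mean(ELIGIBLE) + self._mean(NOT_ELIGIBLE)) / 2

    def predict(self, salary):
        """Prediction: return the label with the nearer mean salary."""
        to_eligible = abs(salary - self._mean(ELIGIBLE))
        to_not_eligible = abs(salary - self._mean(NOT_ELIGIBLE))
        return ELIGIBLE if to_eligible <= to_not_eligible else NOT_ELIGIBLE


# Training data: the two instances from the sample table (feature, target).
training_data = [(10_000, ELIGIBLE), (20_000, NOT_ELIGIBLE)]
model = SalaryThresholdModel().fit(training_data)

before = model.predict(16_000)    # prediction for a new, unseen salary

# Online learning: a hypothetical new labeled example arrives and the
# model is updated immediately, without refitting from scratch.
model.update(16_000, ELIGIBLE)
after = model.predict(16_000)

print(before, "->", after)
```

Two training instances are far too few for a real model. The point is only to show where features, target, training data, model, prediction, and an online update each appear in code.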

Author: Srini

Experienced software developer. Skills in Development, Coding, Testing and Debugging. Good Data analytic skills (Data Warehousing and BI). Also skills in Mainframe.