Machine Learning Glossary

A glossary is a list of terms and their meanings for a specific subject. While learning machine learning, you come across many terms that are hard to remember, and a machine learning glossary makes it easy to look up the ones you have forgotten. So in this article, I will introduce you to a machine learning glossary that I have created to help you remember the meaning of difficult machine learning terms.

Machine Learning Glossary

Below is a machine learning glossary that contains many of the important machine learning terms in alphabetical order:

  1. A/B testing: A/B testing is a statistical way of comparing two techniques (A and B) to see which one performs better.
  2. Accuracy: Accuracy is the ratio of the number of correct predictions to the total number of predictions.
  3. Activation Function: An activation function is a function applied to the weighted sum of the inputs of a neuron to produce its output. It is what lets a neural network learn non-linear relationships.
  4. Agglomerative Clustering: Agglomerative clustering is a hierarchical clustering algorithm that starts with every instance in its own group, then finds the two most similar groups, merges them, and repeats the process until a single group containing all the instances remains or a stopping criterion is reached.
  5. Anomaly Detection: Anomaly detection means identifying unlikely and rare events.
  6. AUC: AUC stands for Area Under the Curve. It measures the entire area under the ROC curve and gives an aggregate measure of the performance of a classification model across all classification thresholds.
  7. Backpropagation: Backpropagation is the algorithm used to train artificial neural networks. It propagates the prediction error backwards through the network to compute the gradient of the loss with respect to each weight, and the weights are then adjusted to reduce the error.
  8. Bag of Words: A Bag of Words is a representation of text, where a text is represented as a collection of its words and their counts, irrespective of the grammar and the order of the words.
  9. Batch: Set of instances used in an iteration while training a model.
  10. Batch normalization: Batch normalization is the process of normalizing the input or output of the activation function in a hidden layer, using statistics computed over each batch.
  11. Batch size: The number of instances in a batch is the batch size of a model.
  12. BERT: BERT stands for Bidirectional Encoder Representations from Transformers. It is a pre-trained model for Natural Language Processing tasks developed by Google.
  13. Binary Classification: Binary classification is one of the classification problems in machine learning where we have to classify between two mutually exclusive classes.
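  14. Boosting: Boosting is an ensemble method that combines many weak learners, trained one after another so that each new learner focuses on the mistakes of the previous ones, into a single strong learner.
  15. Bounding box: A bounding box is a rectangle drawn around an area of interest in an image, such as an object. It is used in computer vision applications such as object detection.
  16. Bucketing: Bucketing is a data pre-processing technique used to convert a continuous feature into multiple binary features (called buckets or bins), typically based on value ranges.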
  17. Categorical data: Categorical data represents a discrete set of possible values. All the features in your dataset that can be divided into groups are categorical variables.
  18. Centroid: The center of a cluster as determined by clustering algorithms.
  19. Class: One of the values in the labels.
  20. Classification: Classification is the task of assigning an instance to one of two or more discrete classes.
  21. Class-imbalance: Class imbalance is a challenge in classification problems that arises when the classes in the labels occur with very different frequencies.
  22. Clustering: Clustering is the task of identifying similar instances based on similar features and assigning them to clusters.
  23. Collaborative Filtering: Collaborative filtering is a recommendation system method that makes predictions for one user based on the preferences of many other users. The idea behind it is to recommend products or services to a user that similar users have appreciated.
  24. Confusion Matrix: A confusion matrix is a table that summarizes the performance of a classification model by counting its correct and incorrect predictions for each class.
  25. CNN: CNN stands for Convolutional Neural Network. It consists of one or more convolutional layers, often with a subsampling (pooling) layer, followed by one or more fully connected layers as in a standard neural network.
  26. Data analysis: Data analysis is the process of inspecting and exploring data generated by a particular population to find the information needed to make decisions and draw conclusions.
  27. Data augmentation: Data augmentation means increasing the amount of training data by adding modified copies of the already existing data, such as rotated or cropped images. It helps in reducing the overfitting of a machine learning model.
  28. Decision tree: A decision tree is an algorithm that predicts the label associated with an instance by travelling from the root node of a tree to a leaf.
  29. Early stopping: Early stopping is a regularization method used to avoid overfitting by ending the training when the performance on a validation set stops improving.
  30. Ensemble: An ensemble is a combination of the predictions made by multiple models.
  31. Epoch: An epoch indicates one cycle through the complete training data.
  32. False-negative rate: False Negatives / (False Negatives + True Positives)
  33. False-positive rate: False Positives / (False Positives + True Negatives)
  34. Feature engineering: Feature engineering is the process of determining which features might be useful for training a machine learning model and then creating those features from the raw data.
  35. Fully connected layer: When each node in a hidden layer is connected to every node in the subsequent layer then it is known as a fully connected layer.
  36. Generalization: Generalization refers to the ability of a machine learning model to make correct predictions on an unseen dataset.
  37. Hidden layer: A layer between the input layer and the output layer of a neural network is known as a hidden layer.
  38. Image recognition: The process of classifying objects in an image is known as image recognition.
  39. Input layer: The input layer is the first layer of a neural network, which receives the input data.
  40. K-Means: K-Means is a clustering algorithm in machine learning that groups an unlabeled dataset into k clusters by repeatedly assigning each instance to the nearest centroid and then updating the centroids. It often converges in just a few iterations.
  41. Learning rate: The learning rate is a number that controls how much the weights of a model are adjusted at each iteration during training.
  42. Linear Regression: Linear regression is a machine learning algorithm that predicts the values of a dependent variable as a linear function of one or more independent variables.
  43. Logistic Regression: Logistic regression is arguably the simplest machine learning algorithm for classification. It extends linear regression with a logistic function to make it suitable for classification.
  44. LSTM: LSTM stands for Long Short-Term Memory. It is a type of recurrent neural network architecture used in deep learning applications on sequential data, where the model has to remember information across long gaps in the sequence.
  45. Multiclass Classification: Classification with more than two classes is known as multiclass classification.
  46. Normalization: Normalization means rescaling the values of a feature to a standard range, such as 0 to 1.
  47. One-hot encoding: One-hot encoding represents each category as a vector of length equal to the number of categories in the dataset, where the element for that category is 1 and all the other elements are 0.
  48. Overfitting: Overfitting means the machine learning model performs very well on the training data but does not generalize well to unseen data.
  49. Perceptron: A perceptron is the simplest form of an artificial neural network. It computes a weighted sum of its inputs and applies a threshold to produce an output.
  50. Pipeline: A machine learning pipeline is the sequence of steps from gathering data and preparing data to training models and exporting the models to production.
  51. ROC: ROC stands for Receiver Operating Characteristic curve. It is a graph that shows the performance of a machine learning model on a classification problem by plotting the true positive rate against the false positive rate at different classification thresholds.
  52. Test set: The set of data used to test the performance of a model.
  53. Training set: The set of data used to train the model.
  54. Transformer: Transformer is a popular neural network architecture developed at Google which can be viewed as a stack of attention layers.
  55. True positive rate: True Positives / (True Positives + False Negatives)
  56. Underfitting: Underfitting is the opposite of overfitting. It happens when the model is too simple to learn the underlying structure of the data.
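
The rate-based terms in this glossary (accuracy, true positive rate, false positive rate, and false negative rate) are all computed from the four counts of a confusion matrix. Below is a minimal Python sketch of how they relate for a binary classification problem. The function names `confusion_counts` and `classification_rates` and the sample labels are my own assumptions for illustration, not part of any library, and the sketch assumes both classes appear in the true labels so that no denominator is zero.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for binary labels."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


def classification_rates(tp, fp, tn, fn):
    """Compute the rates defined in the glossary from the four counts.

    Assumes at least one actual positive and one actual negative,
    otherwise a denominator would be zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "true_positive_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# Made-up labels, used only to illustrate the calculation
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
rates = classification_rates(tp, fp, tn, fn)
print(tp, fp, tn, fn)
print(rates)
```

Notice that the true positive rate and the false negative rate share the same denominator (all the actual positives), so they always add up to 1.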

Final Words

So these were some of the most important machine learning terms you need to know when learning machine learning. I hope you liked this machine learning glossary and that it helps you easily remember the meanings of terms you can’t remember. Feel free to drop more terms in the comments section below.

Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data.
