Optimizing AI and Machine Learning Solutions  
Your ultimate guide to building high-impact ML/AI solutions (English Edition)
Author(s): Mirza Rahim Baig
Published by BPB Publications
Formats: Available in all formats
ISBN: 9789355519818
Pages: 392

EBOOK (EPUB)

ISBN: 9789355519818
Price: INR 899.00
Description
This book approaches data science solution building through a principled framework, case studies, and extensive hands-on guidance. It teaches readers to optimize at every step of the process, from problem formulation to hyperparameter tuning for deep learning models. The book keeps readers pragmatic and guides them toward practical solutions, covering essential ML concepts such as problem formulation, data preparation, and evaluation techniques. Readers then learn to optimize models with advanced algorithms, hyperparameter tuning, and strategies against overfitting, and to optimize deep learning models for image processing, natural language processing, and specialized applications. Hands-on case studies and code examples put the theory into practice and reinforce understanding. By the end, readers will be able to create high-impact, high-value ML/AI solutions by optimizing each step of the solution-building process, the ultimate goal of every data science professional.
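To give a flavor of the book's hands-on style, below is a minimal illustrative sketch of the kind of hyper-parameter tuning workflow the table of contents lists under Chapter 6 (GridSearchCV - Grid Search with Cross-Validation). It is not an excerpt from the book: the random-forest model, the synthetic dataset, and the grid values are assumptions chosen purely for demonstration.

    # Illustrative sketch (not from the book): tuning hyper-parameters with
    # scikit-learn's GridSearchCV, the topic of Chapter 6.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV, train_test_split

    # Synthetic stand-in data; the book's exercises use real case-study datasets.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # Search a small grid of hyper-parameter values with 5-fold cross-validation.
    param_grid = {"n_estimators": [100, 200], "max_depth": [5, 10, None]}
    search = GridSearchCV(RandomForestClassifier(random_state=42),
                          param_grid, cv=5, scoring="f1")
    search.fit(X_train, y_train)

    print("Best hyper-parameters:", search.best_params_)
    print("Held-out F1 score:", search.score(X_test, y_test))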
Table of contents
  • Cover Page
  • Title Page
  • Copyright Page
  • Dedication
  • About the Author
  • About the Reviewer
  • Acknowledgement
  • Preface
  • Table of Contents
  • 1. Optimizing a Machine Learning/Artificial Intelligence Solution
    • Introduction
    • Structure
    • Objectives
    • Case study: Text deduplication for online fashion
    • Understanding machine learning
    • Machine learning styles
      • Supervised machine learning
      • Unsupervised machine learning
      • Reinforcement learning
      • Choosing the ML style
    • Challenges in ML/AI
      • Poor formulation
      • Invalid/poor assumptions
      • Data availability and hygiene
      • Representative data (lack of)
      • Model scalability
      • Infeasible consumption
      • Misalignment with business outcomes
    • ML/AI models vs. end-to-end solutions
    • CRISP-DM framework
    • Optimization at each step of solution development
      • Business understanding
      • Data understanding
      • Data preparation
      • Model building
      • Evaluation
      • Deployment
    • Conclusion
  • 2. ML Problem Formulation: Setting the Right Objective
    • Introduction
    • Structure
    • Objectives
    • Identifying a Machine Learning problem
    • Choosing the ML style
    • Problem 1: Fraud detection
      • Approach 1: Supervised classification
      • Approach 2: Unsupervised clustering
    • Problem 2: Predicting high-selling products
      • Approach 1: Supervised Regression
      • Approach 2: Supervised Classification
    • Problem 3: Question de-duplication for an e-commerce website
      • Approach 1: Supervised classification using Deep Learning
      • Approach 2: Unsupervised clustering of questions
    • Blending ML styles
      • For better accuracy
      • For dimensionality reduction
      • For better data representation
    • Choosing the right dependent variable
    • Business objective vs. ML objective
    • Conclusion
  • 3. Data Collection and Pre-processing
    • Introduction
    • Structure
    • Objectives
    • Building a machine learning solution
      • The data collection process
      • The nature of the data
      • Domain-specific aspects
      • Influence of the task at hand
    • The case study
    • The pre-processing process
      • Step 1: Gather data + basic checks
        • Exercise 3.1: Loading and exploring bank marketing dataset
        • Exercise 3.2: Fixing formats and identifying missing values
      • Step 2: Separate into train and test datasets
        • Exercise 3.3: Splitting the data into train and test sets
      • Step 3: Outlier treatment
        • Percentile threshold approach
        • Z score approach
        • Box plots/Tukey criterion
        • Exercise 3.4: Outlier detection and treatment
      • Step 4: Missing value treatment
      • Step 5: Categorical feature handling
        • Solving our case study
        • Exercise 3.5: Handling remaining categorical features
      • Step 6: Transformation of numeric features
        • Exercise 3.6: Transforming numerical variables
    • Case study steps review
    • Conclusion
  • 4. Model Evaluation and Debugging
    • Introduction
    • Structure
    • Objectives
    • Ad click prediction case study
      • Exercise 4.1: Load and prepare the data
    • Model evaluation considerations
      • Exercise 4.2: Building the competing models
    • Model evaluation metrics
      • Metrics for classification problems
        • Accuracy
        • Confusion matrix
      • Exercise 4.3: Confusion matrix for click prediction models
        • Precision, Recall, and F1 Score
        • Area Under ROC curve (AUC)
        • Area under PR curve
      • Metrics for regression
    • Model evaluation schemes
      • Exercise 4.4: Model evaluation using cross-validation
    • Model debugging
      • Overfitting and underfitting
        • Overfitting
        • Underfitting
        • Validation curves and goodness of fit
        • Exercise 4.5: Validation curves for Ad click prediction
      • Handling overfitting
        • Feature selection
        • Hyper-parameter control
        • Regularization
        • Data augmentation
        • Optimal training (early stopping)
    • Conclusion
  • 5. Imbalanced Machine Learning
    • Introduction
    • Structure
    • Objectives
    • Understanding the business problem
      • Exercise 5.1: Understanding and staging the data
    • Difficulty in model evaluation
      • Exercise 5.2: Accuracy for imbalanced classes
    • Evaluation metrics for imbalanced classes
      • F1 score
      • Precision-Recall Curve
      • Summary of model evaluation
    • Handling imbalance
      • Adjusting the cost function
      • Exercise 5.3: Adjusting class_weight
      • Data balancing
    • Undersampling
      • Exercise 5.4: Random undersampling of insurance data
      • Model evaluation with re-balancing
      • Using Pipelines for evaluating re-balanced data
      • Undersampling methods: Tomek Links
      • Exercise 5.5: Tomek Links applied to insurance data
      • Undersampling methods: edited nearest neighbour
      • Exercise 5.6: Edited nearest neighbour undersampling
    • Oversampling
      • Synthetic Minority Oversampling Technique
      • Exercise 5.7: Synthetic Data Generation with SMOTE
      • ADASYN - Adaptive Synthesis
      • Exercise 5.8: Synthetic Data Generation with ADASYN
    • Combination of under and oversampling
      • Exercise 5.9: Using a combination of over and undersampling
    • Comparing the approaches
    • Conclusion
  • 6. Hyper-parameter Tuning
    • Introduction
    • Structure
    • Objectives
    • Parameters and hyper-parameters
    • Importance of hyper-parameters
    • Hyper-parameter tuning
    • Manual looping over hyper-parameters
      • Perform your own manual hyper-parameter tuning
    • GridSearchCV - Grid Search with Cross-Validation
      • Using GridSearchCV for hyper-parameter tuning
      • GridSearchCV with multiple hyper-parameters
      • The effect of each hyper-parameter
    • Limitations of exhaustive methods (looping, GridSearchCV)
    • RandomizedSearchCV
    • Grid Search versus Randomized Search
    • Choosing a hyper-parameter search approach
    • Choosing the ‘best’ model
    • Training on the entire data and making predictions
    • The complete process
    • Conclusion
  • 7. Parameter Optimization Algorithms
    • Introduction
    • Structure
    • Objectives
    • Optimization in machine learning
    • Fundamentals of mathematical optimization
      • Objective/loss functions
      • Local and global optima
      • Convex vs. non-convex loss functions
    • Numerical optimization for machine learning
      • Regression
      • Classification
      • Regularization
    • A general solving approach
      • Gradient descent
      • Stopping condition
    • Faster gradient descent variants
      • Momentum
      • AdaGrad optimizer
      • RMSProp optimizer
      • Adam optimizer
      • Choosing the optimization approach
    • Conclusion
  • 8. Optimizing Deep Learning Models
    • Introduction
    • Structure
    • Objectives
    • The image processing case study
      • The overall modeling approach
    • Building a baseline model
    • Hyper-parameters in neural networks
      • Activation function
        • Learning rate
        • Tuning the learning rate
        • Network structure
        • Optimizer
        • Other hyper-parameters
    • Coarse plus fine tuning
    • Practical tips for tuning Neural Networks
    • Beyond hyper-parameters: other approaches to boost performance
      • Regularization
      • More data
      • Data augmentation
      • Ensembles
      • Inject Domain knowledge through features
    • Conclusion
    • Additional reads
  • 9. Optimizing Image Models
    • Introduction
    • Structure
    • Objectives
    • Applications of image processing
    • Case study: Auto image classification
    • Feature detection using convolutions
    • Traditional Convolutional Neural Networks
    • Parameter regularization
      • Spatial Dropout
        • Exercise 9.1: Convolution model with dropout
      • Batch normalization
      • Data augmentation
        • Exercise 9.2: Conv model with data augmentation
    • Popular architectures
      • VGG design approach
        • Exercise 9.3: Creating VGG16 from scratch
      • ResNet design approach
    • Guidelines for designing image models
    • Conclusion
  • 10. Optimizing Natural Language Processing Models
    • Introduction
    • Structure
    • Objectives
    • Key tasks in NLP
    • Importance of sequence processing
      • A text classification case study
    • Text pre-processing
    • Text representation
      • Count-based features
      • Text embeddings
        • Embedding layer in Keras
    • Architectures for NLP
      • Recurrent architectures
        • Long Short Term Memory
        • Gated recurrent units
        • Bi-directional recurrent architectures
        • Recurrent architectures with attention
      • Transformer architecture
    • Tuning network hyper-parameters
      • Exercise 10.1: Hyper-parameter tuning of GRU model
    • Using convolutions for NLP
      • Exercise 10.2: Using 1D convolutions and RNNs
    • Using pre-trained embeddings
      • Benefits
      • Drawbacks
      • Exercise: Text classification with pre-trained embeddings
      • Notable pre-trained embeddings
    • Conclusion
  • 11. Transfer Learning
    • Introduction
    • Structure
    • Objectives
    • Motivation for transfer learning
    • Using a SOTA image classification model
    • What is transfer learning?
      • Applications of transfer learning
        • Ready to use pre-trained models
        • Fine tuning models
        • Data representation/feature extraction
      • Benefits of transfer learning
      • Limitations of transfer learning
      • Sources for pre-trained models
        • Keras Applications
        • TensorFlow Hub
        • Hugging Face
        • Kaggle
        • GitHub
        • Word embeddings for text
    • General transfer learning workflow
    • Transfer learning for images
      • ImageNet database
      • Choosing a Transfer Learning approach
        • Fine tuning approaches
        • Fine tuning a SOTA image model
    • Transfer learning for text
      • Applications of transfer learning for text
        • Prediction using pre-trained models
        • Feature extraction
        • Fine tuning
        • Hugging Face for transfer learning
        • Sentiment analysis
        • Exercise 11.1: RoBERTa for tweet sentiment classification
        • Text generation
        • Zero-shot classification
        • Translation
      • Fine tuning a SOTA NLP model
        • Exercise 11.2: Fine tuning a BERT model
    • Conclusion
  • Index