Building Transformer Models with PyTorch 2.0  
NLP, computer vision, and speech processing with PyTorch and Hugging Face (English Edition)
Author(s): Prem Timsina
Published by BPB Publications
ISBN: 9789355517494
Pages: 310

EBOOK (EPUB)

ISBN: 9789355517494 | Price: INR 799.00
Description
This book covers the transformer architecture across applications in NLP, computer vision, speech processing, and predictive modeling with tabular data, making it a valuable resource for anyone looking to harness transformers in their machine learning projects. It provides a step-by-step guide to building transformer models from scratch and fine-tuning pre-trained open-source models, and it examines foundational architectures, including GPT, ViT, Whisper, TabTransformer, and Stable Diffusion, along with the core principles for solving a range of problems with transformers. The book also covers transfer learning, model training, and fine-tuning, and shows how to work with recent models from Hugging Face. Finally, it explores advanced topics such as model benchmarking, multimodal learning, reinforcement learning, and deploying and serving transformer models.
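For a flavor of the fine-tuning workflow described above, here is a minimal sketch of loading a pre-trained checkpoint with Hugging Face and PyTorch. It assumes the transformers library and the bert-base-uncased checkpoint, and is an illustrative example, not code from the book:

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Load a pre-trained checkpoint and attach a fresh two-class head.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

    # Tokenize a sentence and run a forward pass; fine-tuning would
    # wrap this in a training loop with labels and an optimizer.
    inputs = tokenizer("Transformers are remarkably versatile.",
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, 2)
    print(logits.argmax(dim=-1))         # predicted class index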
Table of contents
  • Cover
  • Title Page
  • Copyright Page
  • Dedication Page
  • About the Author
  • About the Reviewer
  • Acknowledgement
  • Preface
  • Table of Contents
  • 1. Transformer Architecture
    • Introduction
    • Structure
    • Objectives
    • Chronology of NLP model development
      • Recurrent neural network
      • Limitations of RNN
      • LSTM
      • Limitations of LSTM
      • Cho’s (2014) RNN encoder-decoder
      • Bahdanau’s (2014) attention mechanism
    • Transformer architecture
      • Embedding
      • Positional encoding
      • Model input
      • Encoding layer
      • Attention mechanism
        • Self-attention
        • Multiheaded attention
      • Decoder layer
    • Training process of transformer
    • Inference process of transformer
    • Types of transformers and their applications
      • Encoder-only model
      • Decoder-only model
      • Encoder-decoder model
    • Conclusion
    • Quiz
    • Answers
  • 2. Hugging Face Ecosystem
    • Introduction
    • Structure
    • Objectives
    • System resources
    • Overview of Hugging Face
      • Key components of Hugging Face
      • Tokenizers
      • Create your custom Tokenizer
        • Training
        • Inference
      • Use pre-trained tokenizer from Hugging Face
    • Datasets
      • Using Hugging Face dataset
      • Using the Hugging Face dataset on PyTorch
    • Models
      • Environment setup
      • Training
      • Inference
    • Sharing your model on Hugging Face
    • Models
    • Spaces
    • Conclusion
    • Quiz
    • Answers
  • 3. Transformer Model in PyTorch
    • Introduction
    • Structure
    • Objectives
    • System resources
    • Transformer components in PyTorch
    • Embedding
      • Example
    • Positional encoding
    • Masking
    • Encoder component of a transformer
    • Decoder component of a transformer
    • Transformer layer in PyTorch
    • Conclusion
    • Quiz
    • Answers
  • 4. Transfer Learning with PyTorch and Hugging Face
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • The need for transfer learning
    • Using transfer learning
    • Where can you get pre-trained models
    • Popular pre-trained models
      • NLP
      • Computer vision
      • Speech processing
    • Project: Develop a classifier by fine-tuning BERT-base-uncased
      • Custom dataset class
      • DataLoader
      • Inference
    • Conclusion
    • Quiz
    • Answers
  • 5. Large Language Models: BERT, GPT-3, and BART
    • Introduction
    • Structure
    • Objectives
    • Large language model
    • Key determinants of performance
      • Size of network: Number of encoder and decoder layers
      • Number of model parameters
      • Max-sequence length
      • Size of embedding dimension
      • Pre-training dataset size and types
    • Pioneering LLMs and their impact
      • BERT and its variants
        • BERT pre-training
        • BERT fine-tuning
        • BERT Variations
        • Applications
      • Generative Pre-trained Transformer
        • Pre-training of GPT
        • Applications
      • Bidirectional and Auto-Regressive Transformers
        • Pre-training
        • Application
    • Creating your own LLM
      • Clinical-Bert
    • Conclusion
    • Quiz
    • Answers
  • 6. NLP Tasks with Transformers
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • NLP tasks
    • Text classification
      • Most appropriate architecture for text classification
      • Text classification via fine-tuning a transformer
      • Handling long sequences
      • Project 1: Document chunking
      • Project 2: Hierarchical attention
    • Text generation
      • Project 3: Shakespeare-like text generation
        • Data preparation
        • Training
    • Chatbot with a transformer
      • Project 4: Clinical question answering transformer
        • Data preparation
        • Model declaration
        • Creating prompt and tokenization
    • Training with PEFT and LoRA
    • Conclusion
    • Quiz
    • Answers
  • 7. CV Model Anatomy: ViT, DETR, and DeiT
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • Image pre-processing
      • Example of image pre-processing
    • Vision transformer architecture
      • Project 1: AI eye doctor
    • Distillation transformer
      • Advantages of DeiT
        • Exercise
    • Detection transformer
      • Project 2: Object detection model
    • Conclusion
    • Quiz
    • Answers
  • 8. Computer Vision Tasks with Transformers
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • Computer vision tasks
      • Image classification
        • Exercise
      • Image segmentation
      • Project 1: Image segmentation for our diet calculator
    • Diffusion model: Unconditional image generation
      • Forward diffusion
      • Backward diffusion
      • Inference process
      • Learnable parameters
      • Project 2: DogGenDiffusion
    • Conclusion
    • Quiz
    • Answers
  • 9. Speech Processing Model Anatomy: Whisper, SpeechT5, and Wav2Vec
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • Speech processing
      • Example of speech pre-processing
    • Whisper
      • Project 1: Whisper_Nep
        • Task
        • Approach
    • Wav2Vec
      • Applications of Wav2Vec
    • SpeechT5
      • Input/Output representation
      • Cross-modal representation
      • Encoder-decoder architecture
      • Pre-training
      • Fine-tuning and applications
    • Comparing Whisper, Wav2Vec 2.0, and SpeechT5
    • Conclusion
    • Quiz
    • Answers
  • 10. Speech Tasks with Transformers
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • Speech processing tasks
      • Speech to text
      • Project 1: Custom audio transcription with ASR using Whisper
    • Text-to-speech
      • Project 2: Implementing text-to-speech
    • Audio to audio
      • Project 3: Audio quality improvement through noise reduction
    • Conclusion
    • Quiz
    • Answers
  • 11. Transformer Architecture for Tabular Data Processing
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • Tabular data representation using transformers
      • TAPAS architecture
        • Pre-training objective
        • Fine-tuning
        • Applications
        • Example
    • TabTransformer architecture
    • FT-Transformer architecture
      • Feature tokenizer
      • Concatenation of numerical and categorical features
      • Transformer
    • Conclusion
    • Quiz
    • Answers
  • 12. Transformers for Tabular Data Regression and Classification
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • Transformer for classification
      • Dataset
      • Target
      • Pre-process the data
      • Declare the configuration
      • Train and evaluate three models
      • Evaluation
      • Analysis
    • Transformer for regression
      • The dataset
      • Pre-process the data
      • Define model configuration
      • Train and evaluate
    • Conclusion
    • Quiz
    • Answers
  • 13. Multimodal Transformers, Architectures, and Applications
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • Multimodal architecture
      • ImageBind
        • Demonstration
      • CLIP
        • Pre-training objective
        • Applications and usage
    • Multimodal tasks
      • Feature extraction
      • Text-to-image
      • Image-to-text
      • Visual question answering
    • Conclusion
    • Quiz
    • Answers
  • 14. Exploring Reinforcement Learning for Transformers
    • Introduction
    • Structure
    • Objectives
    • System requirements
    • Reinforcement learning
    • Important techniques in PyTorch for RL
      • Stable Baselines3
      • Gymnasium
    • Project 1: Stock Market Trading with RL
    • Transformer for reinforcement learning
      • Decision transformer
      • Trajectory transformer
        • Input
    • Conclusion
    • Quiz
    • Answers
  • 15. Model Export, Serving, and Deployment
    • Introduction
    • Structure
    • Objectives
    • System resources
    • Model export and serialization
      • PyTorch model export and import
        • torch.save
        • torch.load
        • torch.nn.Module.load_state_dict
      • Saving multiple models
    • Exporting a model to ONNX format
    • Serving model with FastAPI
      • Benefits of FastAPI
      • Application of FastAPI for model serving
      • Project: FastAPI for semantic segmentation model serving
    • Serving PyTorch models on mobile devices
    • Deploying Hugging Face Transformers models on AWS
      • Deployment using Amazon SageMaker
      • Deployment using AWS Lambda and Amazon API Gateway
    • Conclusion
    • Quiz
    • Answers
  • 16. Transformer Model Interpretability and Experimental Visualization
    • Introduction
    • Structure
    • Objectives
    • Explainability vs. interpretability
      • Interpretability
      • Explainability
    • Tools for explainability and interpretability
    • Captum for interpreting transformer predictions
      • Model loading
      • Input preparation
        • Why a baseline tensor
      • Layer Integrated Gradients
      • Visualization
    • TensorBoard for PyTorch models
    • Conclusion
    • Quiz
    • Answers
  • 17. PyTorch Models: Best Practices
    • Introduction
    • Structure
    • Objectives
    • Best practices for building transformer models
      • Working with Hugging Face
      • General considerations with PyTorch models
    • The art of debugging in PyTorch
      • Syntax errors
      • Runtime errors
        • Shape mismatch
        • CUDA errors
        • Loss computation issues
        • Mismatched configuration
        • Memory error
        • Library/Dependency errors
      • Logical errors
      • General guidelines for debugging PyTorch ML models
    • Conclusion
    • Quiz
    • Answers
  • Index