Modern Computer Vision with PyTorch: Explore deep learning concepts and implement over 50 real-world image applications
by V Kishore Ayyadevara and Yeshwanth Reddy
Print Length: 824 pages
ISBN-10: 1839213477
ISBN-13: 9781839213472
Publisher: Packt Publishing (November 27, 2020)
Language: English
Book Description
Packed with hands-on implementations of deep learning techniques to build image processing applications using PyTorch. Each chapter is accompanied by a GitHub folder with code notebooks and questions to cement your understanding.
Deep learning for computer vision (CV) has had a considerable positive impact on several applications.
First, you will learn to implement a neural network (NN) from scratch using both NumPy and PyTorch, and then learn best practices for tuning an NN’s hyperparameters.
As we progress, you will learn about CNNs and transfer learning, with a focus on classifying images. You will also learn about the practical aspects to take care of while building an NN model.
Next, you will learn about multi-object detection and segmentation, and implement them using the R-CNN family, SSD, YOLO, U-Net, and Mask R-CNN architectures. You will then learn to use the Detectron2 framework to simplify the process of building an NN for object detection and human pose estimation. Finally, you will implement 3D object detection.
Subsequently, you will learn about autoencoders and GANs, with a strong focus on image manipulation and generation. Here, you will implement VAE, DCGAN, CGAN, Pix2Pix, CycleGAN, StyleGAN2, SRGAN, and style transfer.
You will then learn to combine NLP and CV techniques while performing OCR, image captioning, and object detection with transformers. Next, you will learn to combine RL with CV techniques to implement a self-driving car agent.
Finally, you’ll wrap up by moving an NN model to production and learning conventional CV techniques using the OpenCV library.
What you will learn
Train a neural network from scratch in NumPy and then in PyTorch
Implement 2D and 3D multi-object detection and segmentation
Generate digits, DeepFakes, and HD faces with autoencoders and advanced GANs
Manipulate images using CycleGAN, Pix2PixGAN, StyleGAN2, and SRGAN
Combine CV and NLP to perform OCR, image captioning, and object detection
Combine CV and RL to build agents that play Pong and self-drive a car
Deploy a deep learning model on an AWS server using FastAPI and Docker
Dive deep and implement over 35 NN architectures and common OpenCV utilities