18CSE484T - Deep Learning Unit 4 & 5 (4 MARKS)
Regularized autoencoder:
Undercomplete autoencoders can fail to learn anything useful if the encoder and decoder are given too much capacity
Unlike undercomplete autoencoders, regularized autoencoders use a loss function that encourages the model to have other properties besides the ability to copy its input to its output, such as:
Sparsity of the representation
Robustness to noise or to missing inputs
Smallness of the derivative of the representation
Types of regularized autoencoder (a sparse autoencoder sketch follows this list):
Sparse autoencoder
Denoising autoencoder
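A minimal PyTorch sketch of a sparse autoencoder (the framework, layer sizes, and the L1 sparsity weight are illustrative assumptions, not from the notes): the loss combines reconstruction error with a penalty on the code activations, which encourages sparsity of the representation.

```python
import torch
import torch.nn as nn

# Minimal sparse autoencoder sketch; sizes are illustrative assumptions.
class SparseAutoencoder(nn.Module):
    def __init__(self, input_dim=784, code_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, code_dim), nn.ReLU())
        self.decoder = nn.Linear(code_dim, input_dim)

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code

model = SparseAutoencoder()
mse = nn.MSELoss()
x = torch.rand(32, 784)                 # dummy batch of 784-d inputs
recon, code = model(x)
sparsity_weight = 1e-3                  # assumed hyperparameter
# reconstruction error + L1 penalty on the code activations
loss = mse(recon, x) + sparsity_weight * code.abs().mean()
loss.backward()
```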
Compare and contrast autoencoder vs CNN
Autoencoders:
Learn a compact representation of the data for reconstruction
Typically consist of fully connected layers
It is an unsupervised learning technique
Applicable to general purpose tasks
CNN:
Extracts hierarchical features from images for classification or other tasks
Consists of convolutional, pooling and fully connected layers
It is a supervised learning technique
Specialized for image related tasks
While both autoencoders and CNNs learn representations, autoencoders focus on data compression and reconstruction, whereas CNNs specialize in extracting features from images for classification and other visual tasks
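To make the architectural contrast concrete, here is an illustrative PyTorch sketch (all layer sizes are assumptions, and a 1x28x28 input is assumed for the CNN): the autoencoder uses only fully connected layers, while the CNN stacks convolution, pooling, and a fully connected classifier head.

```python
import torch.nn as nn

# Autoencoder: fully connected encoder/decoder for reconstruction.
autoencoder = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),     # encoder: compress to a 64-d code
    nn.Linear(64, 784), nn.Sigmoid(),  # decoder: reconstruct the input
)

# CNN: convolution -> pooling -> fully connected classifier.
cnn_classifier = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                   # 28x28 -> 14x14 (assumed input size)
    nn.Flatten(),
    nn.Linear(16 * 14 * 14, 10),       # 10 output classes (assumed)
)
```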
Applications of autoencoders
An autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner
Applications:
Image reconstruction
Image colorization
Image generation
Image denoising
Image compression
Anomaly detection (see the sketch after this list)
Dimensionality reduction
Sequence to sequence prediction
Recommendation system
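As an illustration of the anomaly detection application, inputs whose reconstruction error exceeds a threshold can be flagged; the threshold and the model interface below are assumptions (the model is taken to return a (reconstruction, code) pair, as in the sparse autoencoder sketch above).

```python
import torch

# Sketch of autoencoder-based anomaly detection: the autoencoder learns
# to reconstruct normal data, so unusual inputs reconstruct poorly.
def detect_anomalies(model, batch, threshold=0.05):
    with torch.no_grad():
        recon, _ = model(batch)                       # assumed (recon, code) output
        errors = ((batch - recon) ** 2).mean(dim=1)   # per-sample reconstruction MSE
    return errors > threshold                         # mask of suspected anomalies
```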
Undercomplete autoencoder
An autoencoder whose code dimension is smaller than the input dimension is called undercomplete
When the encoder and decoder are linear and the loss L is the mean squared error, an undercomplete autoencoder learns to span the same subspace as PCA
Undercomplete autoencoders do not need any regularization, as they maximize the probability of the data rather than copying the input to the output
The hidden layer has a smaller dimension compared to the input layer, and this helps in obtaining the important features from the data
The objective is to minimize the loss function by penalizing g(f(x)) for being dissimilar from x
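Written out, the objective from the statements above is (with f the encoder and g the decoder, as in these notes):

$$\min_{f,\,g}\; L\big(x,\, g(f(x))\big), \qquad L\big(x,\, g(f(x))\big) = \lVert x - g(f(x)) \rVert^{2}$$

With linear f and g and this mean squared error loss, minimizing the objective learns the same subspace as PCA.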
Dimensionality reduction by autoencoder
An undercomplete autoencoder, whose code dimension is smaller than the input dimension, can be used for dimensionality reduction
The size of the hidden layer is smaller than that of the input layer
By reducing the hidden layer size, we force the network to learn the important features of the data
Dimensionality reduction methods are based on the assumption that the dimension of the data is artificially inflated and that its intrinsic dimension is much lower
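A short sketch of dimensionality reduction with an undercomplete autoencoder (PyTorch; the 784-to-32 bottleneck is an assumed example): after training, the encoder output alone serves as the reduced representation, while the decoder is needed only to define the training loss.

```python
import torch
import torch.nn as nn

# Undercomplete autoencoder: the bottleneck forces a low-dimensional code.
encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())  # 784 -> 32 bottleneck
decoder = nn.Linear(32, 784)

x = torch.rand(100, 784)
code = encoder(x)        # 32-d codes: the reduced representation
recon = decoder(code)    # reconstruction, used for the training loss
print(code.shape)        # torch.Size([100, 32])
```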
Siamese networks:
A Siamese neural network is also called a twin neural network
It is an ANN that contains two or more identical subnetworks, meaning they have the same configuration with the same parameters and weights
Usually, we train only one of the subnetworks and use the same weights for the other subnetworks
These networks are used to find the similarity of the inputs by comparing their feature vectors
We will have two encodings F(A) and F(B), and we will compare them to determine how similar the inputs are
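A minimal Siamese network sketch (the architecture and distance choice are assumptions): one shared encoder plays the role of F, applied identically to both inputs A and B, and the two encodings are compared by Euclidean distance.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Siamese sketch: one encoder shared by both branches.
class Siamese(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32)
        )

    def forward(self, a, b):
        fa, fb = self.encoder(a), self.encoder(b)  # F(A) and F(B), same weights
        return F.pairwise_distance(fa, fb)         # small distance => similar

net = Siamese()
a, b = torch.rand(4, 784), torch.rand(4, 784)
print(net(a, b))  # one distance per input pair
```

Training typically uses a contrastive or triplet loss on this distance, pulling similar pairs together and pushing dissimilar pairs apart.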
R-CNN – Object Detection
Region-Based Convolutional Neural Network (R-CNN) uses the Selective Search algorithm to detect possible locations of an object in an image
Steps followed in R-CNN to detect objects (a sketch of the feature extraction step follows this list):
Take a pre-trained CNN
Then this model is retrained: we train the last layer of the neural network based on the number of classes that need to be detected
Next, we get the regions of interest for each image and reshape each region to match the CNN input size
After getting the regions, we train an SVM to classify the objects and the background
Finally, we train a linear regression model to generate tighter bounding boxes for each identified object in the image
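A sketch of the feature extraction step in this pipeline (region proposals from Selective Search are assumed to be given as (x1, y1, x2, y2) boxes; the model choice and helper name are illustrative): a pretrained CNN is used as a fixed feature extractor, and the resulting vectors would then be fed to the SVM classifier and the bounding-box regressor.

```python
import torch
import torchvision.models as models
import torchvision.transforms.functional as TF

# Pretrained CNN reused as a fixed feature extractor (step 1 above).
backbone = models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()  # drop the classifier head, keep features
backbone.eval()

def region_features(image, proposals):
    """Crop each proposed region, resize to the CNN input size, extract features."""
    feats = []
    with torch.no_grad():
        for (x1, y1, x2, y2) in proposals:  # boxes assumed from Selective Search
            crop = TF.resized_crop(image, y1, x1, y2 - y1, x2 - x1, [224, 224])
            feats.append(backbone(crop.unsqueeze(0)))
    return torch.cat(feats)  # one feature vector per region, fed to the SVM
```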
3D CNN – Event detection
3D-CNNs are an extension of traditional 2D-CNNs specifically designed for video and spatiotemporal data
Key features:
Temporal axis
3D filters
Integration of motion information
Feature extraction and classification
Video event detection:
3D-CNNs are used to extract spatial features from video frames
Subsequent layers predict events and their locations
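A minimal 3D-CNN sketch (PyTorch; the clip shape, channel counts, and number of event classes are assumptions): the Conv3d filter spans both the spatial and temporal axes, integrating motion information across frames before a fully connected layer predicts the event class.

```python
import torch
import torch.nn as nn

# 3D-CNN sketch for video clips shaped (channels, frames, height, width).
model = nn.Sequential(
    nn.Conv3d(3, 16, kernel_size=3, padding=1),  # 3D filter over space and time
    nn.ReLU(),
    nn.MaxPool3d(2),                             # pool over time and space
    nn.Flatten(),
    nn.Linear(16 * 8 * 16 * 16, 5),              # 5 event classes (assumed)
)

clip = torch.rand(1, 3, 16, 32, 32)  # batch of one 16-frame RGB clip
print(model(clip).shape)             # torch.Size([1, 5])
```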