

<b>UNIVERSITY OF INFORMATION TECHNOLOGY COMPUTER SCIENCE </b>

<b>PROJECT REPORT </b>

<b>CS406.O11.KHCL </b>

<b>PROJECT: Image Denoising using AutoEncoder </b>

Full Name: Nguyễn Hà Anh Vũ ID: 21520531

Ho Chi Minh City, January 10, 2024


<i><b>Acknowledgment </b></i>

First and foremost, I would like to express my deep gratitude to you, Professor, for your outstanding support and guidance throughout the Image Processing and Applications course (CS406.O11.KHCL). This has truly enriched my learning experience and made it more meaningful.

You not only imparted the course material with clarity but also shared profound knowledge and practical experiences, allowing me to gain a deeper understanding of this field.

Moreover, your facilitation of hands-on activities and application of knowledge through real-world projects has enabled me to develop practical skills and implement the learned concepts in a professional context.

Once again, I want to extend my sincere appreciation to Professor Mai Tiến Dũng for your dedication and passion in teaching. I look forward to continuing my learning journey under your guidance in the upcoming courses.

Thank you sincerely.

Ho Chi Minh City, January 10, 2024


<b>Table of Contents </b>

<i><b>A. Introduction</b></i>

<b><small>I. Overview</small></b>
<small>1. Introduction to Digital Image Processing</small>
<small>2. Differences between Image Processing and Computer Vision</small>
<small>3. The history of image processing</small>
<small>4. Abstract</small>
<small>5. Motivation</small>
<small>6. Related work</small>

<b><small>II. Problem Identification</small></b>
<small>1. Input: Image with Noise</small>
<small>2. Output: Denoised Images</small>
<small>2) Challenges in Image Denoising</small>
<small>3) Significance of Medical Image Denoising</small>
<small>4) Impact on Diagnosis and Treatment</small>
<small>1. Digital Dental Periapical X-Ray Database for Caries Screening</small>
<small>2. The mini-MIAS database of mammograms</small>

<b><small>II. Choosing Hyperparameters for Model Training</small></b>
<small>1. Choosing Epochs</small>
<small>2. Choosing Batch Size</small>
<small>3. Choosing Learning Rate</small>
<small>4. Choosing Optimizer</small>
<small>5. Choosing Loss Function</small>
<small>Autoencoder Model Architecture</small>
<small>Training Configuration</small>

<b><small>III. Metrics</small></b>
<small>1. Peak Signal-to-Noise Ratio (PSNR)</small>
<small>2. Structural Similarity Index (SSIM)</small>
<small>3. Why Choose PSNR and SSIM</small>
<small>4. Why not choose MSE</small>

<b>List of Figures </b>

Picture 1: Levels of Image Processing
Picture 2: Key Stages in Digital Image Processing
Picture 3: Differences between Image Processing & Computer Vision
Picture 4: (“Image denoising: Can plain neural networks compete with BM3D?” (CVPR), 2012)
Picture 5: Examples for Input
Picture 6: Examples for Output
Picture 7: Noisy image
Picture 8: Denoising Medical Image
Picture 9: Basic autoencoder
Picture 10: stacked denoising autoencoder
Picture 11: Methodology
Picture 12: Adding noise to data
Picture 13: "Digital Dental Periapical X-Ray Database for Caries Screening"
Picture 14: The mini-MIAS database of mammograms
Picture 15: Model Architecture
Picture 16: Why not choose MSE
Picture 17: Result of adding 2 noise types
Picture 18: Result on Dental (no augmentation)
Picture 19: Result on MIAS (no augmentation)
Picture 20: Result on Dental (augmentation)
Picture 21: Result on MIAS (augmentation)
Picture 22: Gradio.app
Picture 23: Demo


<i><b>A. Introduction </b></i>

<b>I. Overview </b>

<i><b>1. Introduction to Digital Image Processing </b></i>

Image processing is a significant field in computer science and information technology, primarily focusing on the processing, transformation, and understanding of information from images and videos. The main goal of image processing is to extract useful information from images and videos, making them easy to read, analyze, or use in various applications.

In the realm of digital image processing, the field concentrates on two major tasks:

• Improvement of pictorial information for human interpretation.

• Processing of image data for storage, transmission, and representation for autonomous machine perception.

These tasks involve applying methods and algorithms to enhance image quality, such as increasing resolution, improving contrast, noise reduction, and various other processing tasks to make image information clearer and more manageable. Concurrently, image processing also involves the analysis and extraction of feature information from images, aiding computers in understanding the content of images and videos automatically.

The continuum from image processing to computer vision can be broken up into low-, mid- and high-level processes.

<i><small>Picture 1: Levels of Image Processing </small></i>

There are several key stages in digital image processing:

<i><small>Picture 2: Key Stages in Digital Image Processing </small></i>

Among these stages, enhancement is the facet that focuses on minimizing or eliminating noise in an image.


<i><b>2. Differences between Image Processing and Computer Vision </b></i>

<b>Definition</b>

- Image Processing: A field of computer science and technology focusing on the processing, transformation, and understanding of information from images and videos.

- Computer Vision: A branch of artificial intelligence aimed at enabling computers to understand and solve visual tasks similar to humans.

<b>Output</b>

- Image Processing: Processed images (enhanced, filtered, etc.).

- Computer Vision: Information or decisions based on the recognition and understanding of visual content.

<b>Illustrative Examples</b>

- Image Processing: Improving image quality (resolution, contrast); removing noise from images (Image Denoising); …

- Computer Vision: Face recognition in images or videos; object classification in images (animals, objects); …

<i><small>Picture 3: Differences between Image Processing & Computer Vision </small></i>

<i><b>3. The history of image processing </b></i>

The history of image processing began with early efforts to understand and utilize images, and it has undergone significant advancements over the decades. Below is a summary of the development history of image processing:

<b>b) Preliminary Image Processing Era (1970s - 1980s): </b>

The advent of digital computers opened up the possibility of digital image processing and the implementation of more complex algorithms.

Methods like low-pass filtering and high-pass filtering were developed to enhance image quality.


<b>c) Breakthroughs in Computer Vision (1980s - 1990s): </b>

Computer vision began to emerge as an independent field, concentrating on image recognition and understanding.

Algorithms such as Hough Transform, Edge Detection, and Segmentations appeared, paving the way for new applications.

<b>d) Advancements in Statistical Methods and Machine Learning (2000s - Present): </b>

The prevalence of statistical methods and machine learning changed the landscape of image processing, with the emergence of deep learning models.

Convolutional Neural Networks (CNNs) demonstrated outstanding performance in various computer vision tasks, from object recognition to medical image processing.

<b>e) Widespread Applications and Future Prospects (Present - Ongoing): </b>

Image processing has become a crucial component in various fields, including healthcare, autonomous vehicles, security, and many others.

Ongoing research continues to focus on developing advanced image processing methods and implementing them in real-world applications.

<i><b>4. Abstract </b></i>

Image denoising plays a crucial role as a preprocessing step in the analysis of medical images. Over the past three decades, various algorithms have been proposed, each exhibiting different denoising performances. More recently, deep learning-based models have demonstrated remarkable success, surpassing traditional methods. However, these advanced models face limitations, particularly in terms of demanding large training sample sizes and incurring high computational costs.

In this paper, we address these challenges and propose a novel approach utilizing denoising autoencoders constructed with convolutional layers. Our method is distinctive in its ability to efficiently denoise medical images even when trained on a small sample size. We demonstrate the effectiveness of our approach by combining heterogeneous images, thereby boosting the effective sample size and subsequently enhancing denoising performance.

One notable advantage of our method is its adaptability to the complexities of medical imaging. Even with the simplest network architectures, our denoising autoencoders exhibit an exceptional ability to reconstruct images, even in scenarios where corruption levels are so high that noise and signal become indistinguishable to the human eye.

Through this research, we aim to provide a practical and effective solution for medical image denoising that overcomes the limitations associated with large training datasets and computational resource requirements commonly observed in deep learning-based models.

<i><b>5. Motivation </b></i>

In the realm of medical imaging, achieving precise diagnoses and conducting thorough analyses of medical images necessitate a high level of accuracy and detail. Unfortunately, these images are frequently marred by noise originating from diverse factors. Addressing this significant challenge has become imperative, and I am motivated to leverage the capabilities of deep learning to tackle this issue.


The inherent complexity and intricacy of medical images demand sophisticated solutions for effective denoising. By harnessing the power of autoencoders, a type of neural network, I aim to develop a robust and efficient method for image denoising in medical imaging. Autoencoders have demonstrated remarkable capabilities in learning complex patterns and representations, making them well-suited for handling the intricate nature of medical image data.

My motivation stems from the realization that a successful image denoising approach can substantially enhance the accuracy of diagnostic procedures and contribute to more reliable medical analyses. Through the utilization of autoencoders, I aspire to bring about advancements in the field, ultimately leading to improved image quality and facilitating more accurate medical assessments.

<i><b>6. Related work </b></i>

The landscape of image denoising techniques has witnessed significant advancements, with various approaches demonstrating remarkable capabilities in addressing noise-related challenges. Notably, BM3D has been regarded as state-of-the-art in image denoising, characterized by its well-engineered methodology. However, Burger et al. challenged this notion by showcasing that a simple multi-layer perceptron (MLP) can achieve denoising performance comparable to BM3D.

<i><small>Picture 4: (“Image denoising: Can plain neural networks compete with BM3D?” (CVPR), 2012)</small></i>

Another notable addition to image denoising is the introduction of denoising autoencoders. Serving as fundamental building blocks for deep networks, these autoencoders, as extended by Vincent et al., offer a novel approach to image denoising. The concept involves stacking denoising autoencoders to construct deep networks, where the output of one denoising autoencoder is fed as input to the subsequent layer.

Jain et al. proposed image denoising using convolutional neural networks (CNNs), demonstrating that even with a small sample of training images, performance on par with or superior to state-of-the-art methods based on wavelets and Markov random fields can be achieved. Additionally, Xie et al. leveraged stacked sparse autoencoders for both image denoising and inpainting, showcasing performance comparable to K-SVD.

Agostinelli et al. explored the application of adaptive multi-column deep neural networks for image denoising, constructed through a combination of stacked sparse autoencoders. This system demonstrated robustness across various noise types. The collective findings from these related works highlight the versatility and effectiveness of different autoencoder-based approaches, including MLPs, convolutional neural networks, and stacked sparse autoencoders, in the domain of image denoising.

<b>II. Problem Identification </b>

<i><b>1. Input: Image with Noise </b></i>

The primary input of the problem is a grayscale image that is affected by noise. Noise can be introduced during the image acquisition process or by other environmental factors.

<i><small>Picture 5: Examples for Input </small></i>

<i><b>2. Output: Denoised Images </b></i>

The desired output is a denoised image in which the unwanted noise has been effectively removed while the essential features and details of the image are preserved.

<i><small>Picture 6: Examples for Output </small></i>

<i><b>3. Constraints: </b></i>

<b>a) Limited Training Data:</b> Constraints on the size of the training dataset may limit the number of available images for model training. This can be a challenge, especially with medical data, where images are scarce.

<b>b) Model Depth:</b> Constraints on the number of layers in the autoencoder can be applied to control the complexity of the model and mitigate the risk of overfitting.

<b>c) Model Stability:</b> Ensuring that the model is not overly complex, to avoid overfitting and maintain stable performance on new data.

<b>d) Acceptable Training Time:</b> Limiting the training time of the model can be an important constraint, especially when computational resources are limited.

<b>e) Optimal Performance:</b> Constraints on the model's performance, particularly achieving high denoising performance on various types of noise and under different lighting conditions.

<b>f) Flexibility:</b> The model needs to be flexible and adaptable to various types of medical images and imaging conditions.

<b>g) Noise Tolerance:</b> The model should have the ability to handle and denoise images in noisy conditions without losing important information.

<b>h) Scalability:</b> The model should be scalable, so it can be applied to a large volume of diverse medical images without requiring extensive re-tuning.

<i><b>4. Requirements: </b></i>

<b>a) High Denoising Accuracy:</b> The autoencoder should demonstrate high accuracy in denoising images, effectively removing noise while preserving essential details.

<b>b) Adaptability to Various Image Types:</b> The model should be able to adapt to different types of medical images, considering variations in imaging modalities and structures.

<b>c) Efficient Handling of Limited Training Data:</b> The autoencoder should be designed to learn effectively from a limited dataset, considering potential constraints on the availability of labeled training images.

<b>d) Flexibility in Model Architecture:</b> The architecture of the autoencoder should be flexible, allowing adjustments to the number of layers and neurons to achieve optimal denoising performance.

<b>e) Robustness to Varied Lighting Conditions:</b> The autoencoder should be robust to changes in lighting conditions, providing consistent denoising performance across different levels of illumination.

<b>f) Interpretability of Results:</b> The denoising results should be interpretable and should not introduce artifacts or distortions that could mislead medical professionals during image analysis.

<b>g) Scalability:</b> The model should be scalable, to handle a growing dataset and potential advancements in imaging technology without requiring significant reconfiguration.

<b>h) Compatibility with Existing Infrastructure:</b> The integration of the autoencoder into existing medical imaging systems or workflows should be seamless and compatible with established infrastructure.

<b>i) User-Friendly Interface for Training and Evaluation:</b> If applicable, there should be a user-friendly interface for training the autoencoder and evaluating its denoising performance, making it accessible to practitioners without extensive machine learning expertise.


<i><b>B. PRELIMINARIES </b></i>

<b>I. Noisy image </b>

A " Noisy image " refers to an image that has been corrupted, containing unwanted or random components added from various sources. Noise can appear in an image due to various factors such as poor lighting conditions, inaccurate sensor equipment, or imperfect data transmission processes.

<i><small>Picture 7: Noisy image </small></i>

x̃ = x + z

where the noisy image x̃ is produced as the sum of the original image x and some noise z.

Images with noise often appear blurry and unclear, reducing the quality of the original image and posing challenges in analyzing and interpreting information within the image. The goal of the Image Denoising problem is to utilize an Autoencoder model to eliminate or minimize these noisy components, reconstructing the original image with improved quality while retaining essential information.
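As a concrete illustration of this additive model, the following minimal Python/NumPy sketch corrupts a stand-in normalized grayscale image with Gaussian noise; the image size and noise level are illustrative assumptions, not values taken from this report.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((128, 128))               # a stand-in "clean" image with values in [0, 1]
z = rng.normal(0.0, 0.1, size=x.shape)   # random noise from the acquisition process
x_noisy = np.clip(x + z, 0.0, 1.0)       # the corrupted observation x_tilde = x + z
```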

<b>II. Denoising </b>

<i><b>1) Overview: </b></i>

Image denoising, the process of removing noise or unwanted artifacts from images, plays a crucial role in various scientific fields. In particular, Medical Image Denoising is of paramount importance due to its direct impact on diagnostic accuracy and the overall reliability of medical imaging.

<i><small>Picture 8: Denoising Medical Image</small></i>

<i><b>2) Challenges in Image Denoising: </b></i>

In scientific research, especially in disciplines such as astronomy, biology, and materials science, images captured through various instruments are often contaminated with noise. This noise can obscure critical details and affect the accuracy of subsequent analyses. Addressing these challenges requires advanced denoising techniques capable of preserving essential features while eliminating unwanted distortions.


<i><b>3) Significance of Medical Image Denoising: </b></i>

In the medical field, where image quality directly influences diagnostic decisions, Medical Image Denoising is a critical step in enhancing the clarity and precision of medical imaging modalities such as X-rays, CT scans, MRIs, and ultrasound. The importance of this process extends to various medical applications, including disease detection, treatment planning, and surgical guidance.

<i><b>4) Impact on Diagnosis and Treatment: </b></i>

<i>- Enhanced Visibility:</i> Denoising ensures that medical images are free from artifacts, allowing healthcare professionals to have a clearer view of anatomical structures and abnormalities.

<i>- Improved Diagnostic Accuracy:</i> Clean and precise images contribute to accurate diagnosis, reducing the likelihood of misinterpretations or missed diagnoses.

<i>- Optimized Treatment Planning:</i> Medical Image Denoising aids in the planning of surgical procedures, radiation therapy, and other interventions by providing clinicians with high-quality images for detailed analysis.

<i><b>5) Technological Advances: </b></i>

Recent advancements in denoising techniques, particularly the application of deep learning algorithms such as convolutional neural networks (CNNs) and autoencoders, have significantly improved the efficacy of image denoising. These methods can adaptively learn complex patterns in medical images, making them valuable tools in the pursuit of high-quality diagnostic imaging.

<b>III. Autoencoder </b>

An autoencoder is a type of neural network that tries to learn an approximation to the identity function using backpropagation: given a set of unlabeled training inputs x^(1), x^(2), ..., x^(n), it uses the targets

z^(i) = x^(i)

An autoencoder first takes an input x ∈ [0, 1]^d and maps (encodes) it to a hidden representation y ∈ [0, 1]^d′ using a deterministic mapping, such as

y = s(Wx + b)

where s can be any non-linear function. The latent representation y is then mapped back (decoded) into a reconstruction z, which has the same shape as x, using a similar mapping:

z = s(W′y + b′)

The model parameters (W, W′, b, b′) are optimized to minimize the reconstruction error, which can be assessed using different loss functions such as squared error or cross-entropy. Note that the prime symbol does not denote a matrix transpose.

<i><small>Picture 9: Basic autoencoder </small></i>

Layer L1 is the input, which is encoded into the latent representation in layer L2, and the input is reconstructed at layer L3. Using fewer hidden units than inputs forces the autoencoder to learn a compressed approximation. In that regime an autoencoder often learns a low-dimensional representation very similar to Principal Component Analysis (PCA). Having more hidden units than inputs can still discover useful insights when certain sparsity constraints are imposed.
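To make this concrete, the following is a minimal sketch of such a basic autoencoder, assuming a Keras/TensorFlow setup with flattened grayscale inputs; the input and hidden sizes are illustrative choices, not the architecture used later in this report.

```python
from tensorflow.keras import layers, models

INPUT_DIM = 128 * 128   # flattened grayscale image (illustrative size)
HIDDEN_DIM = 64         # fewer hidden units than inputs forces a compressed code

inputs = layers.Input(shape=(INPUT_DIM,))
# Encoder: y = s(Wx + b), with sigmoid playing the role of the non-linearity s
y = layers.Dense(HIDDEN_DIM, activation="sigmoid")(inputs)
# Decoder: z = s(W'y + b'), reconstructing an output with the same shape as x
z = layers.Dense(INPUT_DIM, activation="sigmoid")(y)

autoencoder = models.Model(inputs, z)
# Reconstruction error can be squared error or cross-entropy, as noted above
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.summary()
```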

<b>IV. Denoising Autoencoder </b>

A denoising autoencoder is a stochastic extension of the classic autoencoder: we force the model to learn to reconstruct the input given its noisy version. A stochastic corruption process randomly sets some of the inputs to zero, forcing the denoising autoencoder to predict the missing (corrupted) values for randomly selected subsets of the input.
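As an illustration, the masking-style corruption described above can be sketched as follows; the corruption level and the array names in the usage comments are assumptions for the example, not values from this report.

```python
import numpy as np

def mask_corrupt(x, corruption_level=0.3, seed=None):
    """Randomly set a fraction of the input values to zero (masking noise)."""
    rng = np.random.default_rng(seed)
    mask = rng.random(x.shape) >= corruption_level   # keep roughly 70% of the entries
    return x * mask

# Training pairs for a denoising autoencoder: corrupted input -> clean target, e.g.
# x_corrupted = mask_corrupt(x_train, corruption_level=0.3)
# autoencoder.fit(x_corrupted, x_train, epochs=10, batch_size=32)
```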

Denoising autoencoders can be stacked to create a deep network (a stacked denoising autoencoder).


<i><small>Picture 10: stacked denoising autoencoder </small></i>

The output from the layer below is fed to the current layer, and training is done layer-wise.

<b>V. Convolutional Autoencoder </b>

Convolutional autoencoders are based on the standard autoencoder architecture with convolutional encoding and decoding layers. Compared to classic autoencoders, convolutional autoencoders are better suited for image processing, as they utilize the full capability of convolutional neural networks to exploit image structure.

In convolutional autoencoders, weights are shared among all input locations, which helps preserve local spatiality. The representation of the i-th latent feature map is given as

h^i = s(x ∗ W^i + b^i)

where the bias b^i is broadcast to the whole map, ∗ denotes 2D convolution, and s is an activation function. A single bias per latent map is used, and the reconstruction is obtained as

y = s( Σ_{i∈H} h^i ∗ W̃^i + c )

where c is a bias per input channel, H is the group of latent feature maps, and W̃ denotes the flip operation over both weight dimensions.

Backpropagation is used to compute the gradient of the error function with respect to the parameters.
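A minimal convolutional denoising autoencoder along these lines could look like the sketch below, assuming a Keras/TensorFlow setup and 128×128 grayscale inputs; the filter counts and depth are illustrative and not necessarily the configuration used in this project.

```python
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(128, 128, 1))          # grayscale image

# Encoder: convolutions share weights across spatial locations
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D((2, 2), padding="same")(x)
x = layers.Conv2D(64, (3, 3), activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D((2, 2), padding="same")(x)

# Decoder: upsample back to the input resolution
x = layers.Conv2D(64, (3, 3), activation="relu", padding="same")(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

conv_autoencoder = models.Model(inputs, decoded)
conv_autoencoder.compile(optimizer="adam", loss="mse")
```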

<i><b>C. Methodology </b></i>

<i><small>Picture 11: Methodology </small></i>

The goal of this methodology is to leverage the autoencoder architecture for image denoising. Initially, a dataset of clean images is available. To train the autoencoder model, noise is introduced into these images, creating a set of noisy images. The purpose is to enable the autoencoder to learn the mapping from noisy images to clean images, so that it can denoise images effectively.
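Under these assumptions, training reduces to fitting the autoencoder with noisy images as inputs and the corresponding clean images as targets. The sketch below reuses the convolutional autoencoder from the previous section; the placeholder arrays, epoch count, and batch size are illustrative, not the settings reported later.

```python
import numpy as np

# Placeholder data purely for illustration: in the actual pipeline these arrays come
# from the preprocessing and noise-adding steps described in the next subsections.
x_clean = np.random.rand(16, 128, 128, 1).astype("float32")
x_noisy = np.clip(x_clean + 0.1 * np.random.randn(*x_clean.shape), 0.0, 1.0).astype("float32")

# conv_autoencoder: the convolutional autoencoder sketched in the previous section.
history = conv_autoencoder.fit(
    x_noisy, x_clean,      # noisy inputs, clean targets
    epochs=50,             # illustrative values only
    batch_size=32,
    validation_split=0.1,
    shuffle=True,
)

# At inference time, denoising is a single forward pass:
x_denoised = conv_autoencoder.predict(x_noisy)
```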

<b>I. Data Preprocessing </b>

<i><b>1. Data Collection </b></i>

Collect a dataset of clean images to serve as the foundation for training the Autoencoder.

<i><b>2. Resizing Images for Model Input </b></i>

To ensure that both the clean and noisy images, as well as images from other datasets, are appropriately sized for input into the autoencoder model, we need to perform an image resizing step. This step adjusts the resolution of the images to match the input requirements of the model.
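For instance, a resizing helper might look like the sketch below, assuming OpenCV is available and a 128×128 target resolution; the resolution and function name are assumptions for illustration, not requirements stated in this section.

```python
import cv2
import numpy as np

def load_and_resize(path, size=(128, 128)):
    """Read an image as grayscale, resize it to the model's input resolution, and scale it to [0, 1]."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, size, interpolation=cv2.INTER_AREA)
    return img.astype(np.float32) / 255.0
```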

<i><b>3. Noise adding </b></i>

Apply different types and levels of noise to the clean images to generate a diverse set of noisy images. This step is crucial for training the Autoencoder to handle various noise patterns.
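A sketch of this step is shown below, assuming Gaussian plus salt-and-pepper corruption as the two noise types and illustrative noise levels; the exact types and levels used in the project are not fixed by this section.

```python
import numpy as np

def add_salt_and_pepper(x, amount=0.02, seed=None):
    """Flip a small fraction of pixels to 0 (pepper) or 1 (salt)."""
    rng = np.random.default_rng(seed)
    noisy = x.copy()
    flips = rng.random(x.shape)
    noisy[flips < amount / 2] = 0.0
    noisy[flips > 1 - amount / 2] = 1.0
    return noisy

def make_noisy_dataset(x_clean, sigma=0.1, amount=0.02, seed=0):
    """Combine Gaussian and salt-and-pepper corruption to diversify the noisy training set."""
    rng = np.random.default_rng(seed)
    gaussian = np.clip(x_clean + rng.normal(0.0, sigma, x_clean.shape), 0.0, 1.0)
    return add_salt_and_pepper(gaussian, amount=amount, seed=seed)
```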

