Deep Learning-Based Automatic Fault Detection in Semiconductor Chips

Principal Investigator: Makarand Deo, Department of Engineering

Institution: Norfolk State University

CO-PI: Sacharia Albin, Norfolk State University
CO-PI: Michel Audette, Old Dominion University

PROPOSAL TITLE: Deep Learning-Based Automatic Fault Detection in Semiconductor Chips

RESEARCH SUMMARY:

Current Outlook and Motivation: Semiconductor chip manufacturing has become increasingly complex in recent years, with the average feature size in VLSI chips shrinking to less than 10 nm. This is especially true in the DRAM industry, where each successive generation of DRAM chips is smaller and more compact, yet higher in capacity. Consequently, the probability of manufacturing process-induced defects in the chips is rising, which reduces yield and raises the cost of fault inspection. Fault inspection methods can be broadly classified into two categories: optical and e-beam. Optical inspection tools are fast, but they have resolution limits [1]. E-beam inspection systems have better resolution, but they are far too slow. To improve product quality while reducing manufacturing and labor costs, automatic defect classification (ADC) systems have been developed [2]. An ADC system scans the wafer surface, collects the coordinates of suspicious areas where defects may exist, and takes cross-sectional images using a scanning electron microscope (SEM) for physical failure analysis (PFA). However, the defect detection and classification accuracy of ADC systems is limited, and they often involve manual intervention. A typical wafer may embed 150–300 chips, making expert manual evaluation impractical. Moreover, additional challenges arise in inspecting modern 3D structures, such as FinFETs and 3D NAND, using traditional 2D imaging modalities.

Current ADC approaches apply a series of image recognition and machine learning techniques to find features for defect classification. These methods include decision trees, artificial neural networks, and support vector machines [2]–[4]. However, these approaches rely on prior knowledge of defect classes and are computationally very expensive. The commercially available ADC tool Klarity™ [5] performs automated defect analysis using spatial signature analysis. Klarity is widely used in the semiconductor industry for fault detection and yield analysis, in combination with other software tools such as JMP (https://www.jmp.com), and has also been used in several research studies for benchmarking. Recently, deep learning has gained growing attention because of its ability to automatically learn complex and high-dimensional patterns in data. In particular, convolutional neural networks (CNNs) are widely adopted because of the relatively small number of trainable parameters involved and their ability to perform complex image analysis and classification tasks. Deep learning-based defect detection and classification methodologies have been shown to yield more accurate fault detection [6]–[9]. These methods are especially suitable for real-time defect analysis (RDA) and enhanced software-based defect analysis (ESDA) of VLSI chips. However, deep learning-based fault detection methods are fairly new and are mostly limited to defect detection based on wafer maps. Circuit-level fault detection and multi-class defect classification have not been explored adequately to date.

Convolutional Neural Network (CNN): CNNs are state-of-the-art algorithms for visual recognition tasks, consisting of several stages of learnable filters that convolve the input data (image, audio, or video), followed by a multilayer perceptron (MLP) network as shown in Figure 1. Each convolution layer convolves the input image with a set of adaptive filters to produce a feature map that abstracts several features from the input data. Many layers of convolution and pooling are stacked on top of each other, with the deeper layers revealing higher-level features. The features revealed at the output of the final convolution stage are passed to the MLP network, which consists of fully connected neural layers. These layers implement the classification or recognition task using the features extracted by the convolution layers. CNNs have been widely used in image-based classification applications in medicine and infrastructure inspection, as well as in surface defect detection and analysis. Notable CNN architectures include AlexNet, GoogLeNet, ResNet, and DenseNet [10].

Figure 1. A typical CNN architecture for image classification.
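For illustration, a minimal sketch of the architecture in Figure 1 is given below in Python (PyTorch). The input size of 224 x 224 single-channel images, the channel counts, and the eight output classes are placeholder assumptions for this sketch, not the final network design.

```python
import torch
import torch.nn as nn

class SimpleDefectCNN(nn.Module):
    """Illustrative CNN: stacked convolution/pooling stages followed by
    fully connected (MLP) layers that perform the classification."""
    def __init__(self, num_classes: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2),   # learnable filters -> feature maps
            nn.ReLU(),
            nn.MaxPool2d(2),                              # pooling reduces spatial size
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # deeper layers -> higher-level features
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 28 * 28, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),                  # one output per defect class
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Example: a batch of four 224x224 single-channel images -> class scores
logits = SimpleDefectCNN()(torch.randn(4, 1, 224, 224))
print(logits.shape)  # torch.Size([4, 8])
```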

Prior Work: We designed a CNN-based classifier for breast cancer identification from histopathology images of tissue biopsies. We combined transfer learning techniques with data augmentation to improve the performance of the CNN classifier. Our network achieved a magnification-level classification accuracy as high as 96% [11] on the BreakHis breast cancer dataset, the highest accuracy reported to date on that dataset. We further designed a genetic algorithm-based hyperparameter search algorithm that automatically designs an optimal CNN architecture for visual image recognition tasks based on a given input dataset. Using this self-evolving algorithm (GA net), we were able to further improve the classification accuracy of our CNN classifier (AlexNet TL), as shown in Figure 2. The task of identifying cell-level pattern discrepancies in cancer tissue and accurately classifying fatal (malignant) vs. non-fatal (benign) or localized vs. systemic (metastatic) tumors has striking similarities with that of identifying and classifying defects in semiconductor wafers, which is the topic of this proposal. We have also used an AlexNet CNN classifier to detect geriatric falls based on a 2D scalogram of accelerometry data, which embeds spectral information akin to a Stockwell transform, for geriatric injury-mitigation applications [12].

Figure 2. Classification accuracies of genetic algorithm-based breast cancer classifier (GA net) compared to other published methods for various image magnification levels.
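The transfer-learning-with-augmentation strategy used in [11], and planned for Specific Aim 1 below, can be summarized by the sketch that follows. It adapts an ImageNet-pre-trained AlexNet from torchvision by replacing the final fully connected layer with one sized for the defect classes; the eight-class assumption and the augmentation parameters are illustrative placeholders, not the exact settings used in our prior work.

```python
import torch.nn as nn
from torchvision import models, transforms

NUM_DEFECT_CLASSES = 8  # placeholder: e.g., eight wafer-map defect labels

# Start from an AlexNet pre-trained on ImageNet (transfer learning)
net = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

# Optionally freeze the convolutional feature extractor
for p in net.features.parameters():
    p.requires_grad = False

# Replace the final fully connected layer to output the defect classes
net.classifier[6] = nn.Linear(net.classifier[6].in_features, NUM_DEFECT_CLASSES)

# Data augmentation: affine transformations applied to the training images
augment = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # replicate grayscale SEM/wafer images to 3 channels
    transforms.RandomAffine(degrees=10, translate=(0.05, 0.05), scale=(0.9, 1.1)),
    transforms.RandomHorizontalFlip(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
```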

Proposed Research: In this work, we intend to design a fast and intelligent fault detection and classification algorithm for semiconductor chips. We propose to utilize the pattern recognition capabilities of a CNN-based deep network to identify defects at both the wafer level and the circuit level. The project will be divided into two Specific Aims, as described below.

Specific Aim 1: Designing a CNN-based network for fault identification and classification. A transfer learning approach will be explored by adopting a previously trained image-classification network, such as AlexNet or GoogLeNet. The network will be pre-trained using the WM-811K Kaggle Wafer Map dataset [13], which contains 811,457 semiconductor wafer images from 46,393 lots with eight defect labels. The pre-trained network will then be appended with additional fully connected computational layers and trained with labeled SEM image data from VLSI chips and wafers. The data will be augmented by clipping the images into smaller patches and by introducing affine transformations, if needed. The performance of the classifier will be tested using additional SEM data not used during training or validation.

Specific Aim 2: Optimizing the network for best performance on circuit-level detection. We will utilize the genetic algorithm-based hyperparameter optimization process recently developed in our lab to automatically synthesize the optimal network for given training data. The network architecture properties and training hyperparameters, such as the number of convolution and pooling stages, kernel size, number of filter channels, and strides, will be encoded as a chromosome which undergoes evolution through a pre-decided number of generations. All candidate networks in each generation will be evaluated using a fitness function formulated from the validation loss, validation accuracy, and an overfitting index. The optimal network will be trained using circuit-level SEM images as well as wafer maps.
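To make the chromosome encoding and fitness evaluation of Specific Aim 2 concrete, a minimal Python sketch is given below. The gene set, value ranges, fitness weights, and the random placeholder metrics are illustrative assumptions and do not reproduce the actual GA net implementation developed in our lab.

```python
import random
from dataclasses import dataclass

@dataclass
class Chromosome:
    """One candidate network encoded as a chromosome (illustrative genes and ranges)."""
    num_stages: int    # number of convolution + pooling stages
    kernel_size: int   # convolution kernel size
    num_filters: int   # filter channels in the first stage
    stride: int        # convolution stride

def random_chromosome() -> Chromosome:
    return Chromosome(
        num_stages=random.randint(2, 6),
        kernel_size=random.choice([3, 5, 7]),
        num_filters=random.choice([16, 32, 64]),
        stride=random.choice([1, 2]),
    )

def fitness(val_loss: float, val_acc: float, train_acc: float) -> float:
    """Fitness from validation loss, validation accuracy, and an overfitting index
    (train/validation accuracy gap); the weights here are assumed, not tuned."""
    overfit_index = max(0.0, train_acc - val_acc)
    return val_acc - 0.5 * val_loss - 0.5 * overfit_index

def mutate(c: Chromosome, rate: float = 0.2) -> Chromosome:
    """Randomly re-draw each gene with a small probability."""
    donor = random_chromosome()
    for gene in ("num_stages", "kernel_size", "num_filters", "stride"):
        if random.random() < rate:
            setattr(c, gene, getattr(donor, gene))
    return c

# Evolution loop; building and training each candidate network (which would
# produce the real metrics) is omitted and replaced with random placeholders.
population = [random_chromosome() for _ in range(10)]
for generation in range(5):
    scores = [fitness(val_loss=random.uniform(0.2, 1.0),
                      val_acc=random.uniform(0.7, 0.95),
                      train_acc=0.95) for _ in population]
    ranked = [c for _, c in sorted(zip(scores, population),
                                   key=lambda pair: pair[0], reverse=True)]
    # keep the top half, refill the rest with mutated copies of the survivors
    survivors = ranked[:5]
    population = survivors + [mutate(Chromosome(**vars(s))) for s in survivors]
```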

Proposed Collaboration with Micron: Deep learning algorithms are effective only when trained with appropriate and adequate data. As such, our proposed CNN-based approach will need to be trained using actual imaging data from semiconductor chips. No circuit-level fault-inspection image data for VLSI chips is publicly available. We therefore propose a collaboration with Micron Technology, Inc., a leading manufacturer of semiconductor memory chips, to obtain optical inspection and PFA data (SEM images) from their older or discontinued chips, avoiding confidentiality concerns. The high-resolution images of wafers and circuit tracks, along with the corresponding classification outcomes (no defect, or the type of defect, such as peripheral or array defects), will be used to train and validate our deep learning networks and to test their performance. In addition, expert guidance will be sought from Micron fabrication engineers in creating a labeled defect database and in interpreting and validating the model outcomes. A support letter from Micron will be sent directly to the VMEC Grant Review Committee.

Team: We have assembled a collaborative team of experts from Norfolk State University (NSU) and Old Dominion University (ODU) for the proposed project. Dr. Makarand Deo (PI, NSU), an expert in multiscale modeling and deep learning, will lead the project and oversee the design of the deep network methodology. Dr. Sacharia Albin (Co-PI, NSU), an expert in semiconductor fabrication and microelectronics, will provide expertise in fault analysis and dataset creation. Dr. Michel Audette (Co-PI, ODU), an expert in scientific visualization and image processing, will work on image data augmentation, GPU-accelerated implementations, and model performance assessment. Dr. Audette also has access to ODU's state-of-the-art high-performance computing (HPC) facilities, which will be utilized for the computationally intensive GPU-based implementations. One full-time graduate student will be recruited at NSU to work on the CNN architecture and model training/testing. One full-time graduate student will be recruited at ODU to work on image data augmentation and HPC-accelerated implementation.

Impact Statement: The proposal has the potential to enhance profitability in the semiconductor industry, for which we have initiated a collaboration with Micron. The project will create prospects for external funding from agencies such as the Semiconductor Research Corporation and NSF. Modules on fault detection and yield analysis will be introduced in 400- and 600-level engineering classes (Semiconductor Process Technology, VLSI Design, and Artificial Neural Networks). Students will be prepared and encouraged to pursue careers in yield analysis and machine learning-based fault detection, thus helping build a skilled future workforce.

References:

[1] S. H. Huang and Y. C. Pan, Computers in Industry, 66: 1–10, 2015.
[2] P. B. Chou, et al., Mach. Vis. Appl., 9(4): 201–214, 1996.
[3] C.-F. Chang, et al., Int. J. Computer, Consumer and Control (IJ3C), 2(2): 25–26, 2013.
[4] C. T. Su, et al., IEEE Trans. Semiconductor Manufacturing, 15(2): 260–266, 2002.
[5] https://www.kla-tencor.com/products/chip-manufacturing/data-analytics. [Accessed: 10-Oct-2020].
[6] G. Tello, et al., IEEE Trans. Semicond. Manuf., 31(2): 315–322, 2018.
[7] S. Cheon, et al., IEEE Trans. Semicond. Manuf., 32(2): 163–170, 2019.
[8] T. Ishida, et al., Proc. Int. Symp. Quality Electronic Design, ISQED, 2019(March): 291–297.
[9] J. Wang, et al., IEEE Trans. Semicond. Manuf., 32(3): 310–319, 2019.
[10] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[11] P. Yamlome, et al., Proc. IEEE Eng. Med. Biol., EMBS, 2020, 2020(July): 1144–1147.
[12] H. Yhdego, M. A. Audette, et al., Proc. Model. & Sim. in Med. Symp. (SpringSim), 2019.
[13] https://www.kaggle.com/qingyi/wm811k-wafer-map. [Accessed: 10-Nov-2020].