Semiconductor Engineering sat down to discuss the issues and challenges with machine learning in semiconductor manufacturing with Kurt Ronse, director of the advanced lithography program at Imec; Yudong Hao, senior director of marketing at Onto Innovation; Romain Roux, data scientist at Mycronic; and Aki Fujimura, chief executive of D2S. What follows are excerpts of that conversation.
L-R: Yudong Hao, Romain Roux, Aki Fujimura, Kurt Ronse.
SE: Machine learning is a hot topic. This technology uses a neural network to crunch data, identify patterns, and learn which attributes of those patterns are important. We also have more advanced forms called deep learning. Is that correct?
Fujimura: Deep learning is a subset of artificial intelligence, or AI. Some people say machine learning is also a subset of AI. Others say machine learning is a different computer science or data analytics way of thinking. But it doesn’t matter either way. Deep learning is one particular kind of machine learning, and it has enhanced what machine learning can do. Is it a language? No, it is not a language. It’s an approach. Deep learning is a particular approach to software. In some ways, it’s automatic programming. Instead of a software engineer sitting down and writing code, deep learning involves an engineer choosing what kind of neural network to use and what kind of tuning to apply to it. But you also manipulate the data you give it. You train the neural network, so that the neural network that results is automatically programmed to do whatever it is that you want it to do. For example, you might want it to tell a cat from a dog, or a defect from a non-defect. You have an objective in mind that you want a deep learning neural network to achieve, and then you train it with data.
Hao: Pattern matching is part of machine learning. When you think about machine learning, you can say it’s a model. We build a predictive model that can map some inputs to some outputs. So we have new data come in, and then we can predict what the outputs would be. The applications include inspection, image processing, natural language processing and others.
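Hao’s description of a predictive model — mapping some inputs to some outputs, then predicting the output for new data — can be sketched in a few lines. The data values and the choice of a simple linear least-squares model here are invented purely for illustration; a real application would use measured process data and a far richer model:

```python
import numpy as np

# Hypothetical training data: each row maps two process inputs to one
# measured output. All values are invented for illustration.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])

# Fit a linear model y ~ X @ w by least squares -- the simplest
# instance of "build a model that maps some inputs to some outputs."
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# New data comes in; predict what the output would be.
x_new = np.array([3.0, 1.0])
prediction = float(x_new @ w)
print(prediction)
```

The same train-then-predict shape carries over when the linear fit is replaced by a neural network and the inputs are inspection images or process measurements.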
SE: At one time, didn’t we rely on physics for these applications? Some time ago, the industry generally developed equipment and determined its functionality using pure physics and physical models, right?
Ronse: Traditionally, yes, but in the last couple of years there have been more and more attempts to apply some artificial intelligence. The reason is that all these tools are generating a lot of data that, on the surface, have nothing to do with each other. Normally, you store this data, but nobody ever looks at it or tries to find the connections. More recently, we had one particular tool, for example, that had a number of down situations, one after the other. So we started a machine learning project. We tried to see if we could find correlations between the data the tool was generating and the different down situations, so that going forward we would recognize those data signatures and anticipate a down situation before it happened. A lot of this involves data that was not used in the past. Now, with powerful computers, you can analyze that data and see if you find a trend that correlates with what you are interested in, such as avoiding down situations.
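The correlation hunt Ronse describes — checking whether stored tool data lines up with down situations — can be shown in miniature. The sensor names, readings, and down-event flags below are all invented; a real project would screen hundreds of logged signals this way before building a predictive model:

```python
import numpy as np

# Invented example: periodic readings from two tool sensors, plus a 0/1
# flag for whether a down situation followed each reading.
vibration   = np.array([0.1, 0.2, 0.8, 0.9, 0.15, 0.85, 0.2, 0.95])
temperature = np.array([40.0, 41.0, 40.5, 41.2, 40.1, 40.8, 41.1, 40.3])
went_down   = np.array([0, 0, 1, 1, 0, 1, 0, 1])

# Correlate each stored signal with the down events. A strong
# correlation suggests the signal could help anticipate downtime;
# a weak one means the data stream is probably not predictive.
for name, signal in [("vibration", vibration), ("temperature", temperature)]:
    r = np.corrcoef(signal, went_down)[0, 1]
    print(f"{name}: r = {r:.2f}")
```

In this toy data the vibration signal correlates strongly with downtime while temperature does not — exactly the kind of trend one hopes to surface from data that was previously stored but never analyzed.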
Roux: System modeling can be complex, and sometimes getting an accurate model is just simply not possible. Deep learning is very efficient at finding hidden correlations within the data. If you can collect inputs and corresponding outputs of a system, you can try to use this technology to model your system and also inverse it. Deep learning is an empirical science. It is really hard to anticipate the accuracy of the model you will obtain. You have to try it to know for sure. Once you have the right data, training a neural network to mimic the physical process that transformed the inputs into the outputs is relatively fast and easy.
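Roux’s point — collect input/output pairs from a system and train a network to mimic the process that connects them — can be sketched with a tiny one-hidden-layer network trained by plain gradient descent. The target function, network size, and learning rate are assumptions for illustration; a real system would use measured data and a deep learning framework rather than hand-written numpy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "physical process": in practice these pairs would be measured
# inputs and outputs of the real system; here a toy function is used.
x = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = x ** 2

# One-hidden-layer network, trained by full-batch gradient descent on
# mean-squared error, to mimic the input-to-output transformation.
W1 = rng.normal(0, 0.5, (1, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.1

for _ in range(5000):
    h = np.tanh(x @ W1 + b1)                      # forward pass
    pred = h @ W2 + b2
    err = pred - y
    gW2 = h.T @ err / len(x); gb2 = err.mean(axis=0)   # backward pass
    gh = err @ W2.T * (1 - h ** 2)
    gW1 = x.T @ gh / len(x); gb1 = gh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float((err ** 2).mean())
print(f"final MSE: {mse:.5f}")
```

As Roux notes, the accuracy you end up with is empirical — you try it and check — but once the data is collected, the training loop itself is short.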
SE: Machine learning isn’t new. AI and machine learning have been around for years. In the 1990s, for example, IBM presented a paper on ways to find defects in chips using an inspection system and early forms of machine learning. But the system was slow and inaccurate. Why did the early attempts fall short?
Hao: There are two reasons. One is computing power. At that time, it was not enough to support a complicated machine learning system. And second, the machine learning technology itself was still in its infancy. The technology was not ready then. But over time, the semiconductor industry has seen vast improvements in computing power, which has made it far more feasible to apply machine learning and artificial intelligence in the field. There is another factor, too. Everything is getting more complicated. Devices have become 3D. The complexity has grown exponentially. Using pure physics to model everything is still possible, but it may take a year to do it. Machine learning can make it much faster. So we’ve seen vast improvements in computing power and in machine learning itself, and that can benefit the semiconductor business.
Fujimura: Deep learning is definitely enabled by computing power. Most people would say that the rise of GPU computing is what really made it happen. I remember 10 years ago, people would say, ‘How in the world are we going to use all the computing power we have now, let alone what we are going to have 10 years from now? We don’t need leading-edge anymore.’ People’s imagination was limited to PCs or gaming. Then deep learning came along as a mechanism that could put tremendous amounts of computing power to work.
Roux: In addition, several free and open source libraries are now available, like TensorFlow (Google) and PyTorch (Facebook). They came out in 2015 and 2016, respectively. So it’s still new. Some major companies have been pioneers in this domain for 5 to 10 years. We are currently moving from the academic to the industrial world. We are still in this transition.
SE: For some time, Amazon, Google, Facebook and others have been using machine learning to improve web searches as well as other apps. It’s also being used in certain parts of the semiconductor manufacturing flow. Can it be used in the photomask world?
Fujimura: There are many applications in the photomask world for deep learning. There are two big categories — big data and image processing. Big data applications may not need to be deep or even neural-network-based, but machine learning as a general category of techniques can help analyze a large quantity of data to extract trends or correlations for predictions. Image processing, meanwhile, isn’t limited to analysis or manipulation of pictures like SEM images. This can be used for automatic defect categorization. It can be broadly applied to anything that’s analyzing or manipulating pixel data. So mask or wafer simulation, and therefore OPC/ILT (optical proximity correction/inverse lithography technology), are also a part of that category. In designing mask shapes, OPC/ILT uses deep learning both to accelerate computing times and to improve the accuracy of the results. Deep learning is a statistical method based on intensive pattern matching, so you can’t replace OPC/ILT with just deep learning. But because deep learning is an excellent fast estimator, many companies have shown excellent results in improving run times, while improving the quality of the results.
Roux: We are currently working on anomaly detection for our flat panel display photomask writers. Today, our customers are systematically inspecting photomasks without any guidance. We are working on an anomaly estimation module to help them focus the inspection on the right region. Deep learning is very interesting for this type of application because deep neural networks are very good at modeling complex physical phenomena. This project is aimed at detecting any type of anomalies without having to explicitly describe what type of anomalies could occur. Basically, we are training a neural network to understand what normal behavior means so that we can monitor how normal or abnormal things went when writing a photomask. This module will help our customers to increase yield because they will avoid missing small potential defects during the inspection.
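One common way to “train a neural network to understand what normal behavior means” is reconstruction error: model only normal examples, then score new data by how poorly the model reconstructs it. The sketch below substitutes a linear (PCA-style) reconstruction for the deep network Roux describes, with invented feature vectors standing in for per-write mask data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented stand-in data: each row is a feature vector summarizing one
# photomask write. Normal writes vary along one dominant direction.
direction = np.array([1.0, 0.5, -0.3, 0.2])
normal = rng.normal(0, 1, (200, 1)) * direction + rng.normal(0, 0.05, (200, 4))

# "Train" on normal behavior only: keep the top principal component
# as a model of what a normal write looks like.
mean = normal.mean(axis=0)
_, _, Vt = np.linalg.svd(normal - mean, full_matrices=False)
basis = Vt[:1]

def anomaly_score(x):
    # Reconstruction error: distance from x to the "normal" subspace.
    centered = x - mean
    recon = centered @ basis.T @ basis
    return float(np.linalg.norm(centered - recon))

typical = rng.normal(0, 1) * direction          # fits the normal pattern
abnormal = np.array([0.0, 2.0, 2.0, -2.0])      # off the normal pattern
print(anomaly_score(typical), anomaly_score(abnormal))
```

The appeal of this framing, as Roux notes, is that no specific anomaly type has to be described in advance — anything the model of normal behavior cannot reconstruct scores high and can be flagged for inspection.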
SE: Is machine learning better or more accurate than the traditional ways of doing it in the photomask world or elsewhere? Are there any challenges here?
Fujimura: That’s a key question for whether a particular application is fit for deep learning or not. For any given effect, simulation is always more accurate than deep learning because deep learning is a statistical method. For many effects, like short-range blur modeled as a Gaussian convolution, simulation/computation is faster, particularly on a GPU. But sometimes, particularly for very sophisticated effects, the tradeoff between run-time and accuracy works in favor of deep learning. If simulating is too slow to be practically deployable, doing a deep learning-based estimation gains in accuracy because a fast estimator is more accurate than a simulator that’s too slow. Deep learning tends to be an excellent fast estimator because it is an automatic program generation method that is programmed with data.
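The run-time versus accuracy tradeoff Fujimura describes — a fast estimator trained offline on simulator samples, then deployed where the full simulation is too slow — can be illustrated with any cheap surrogate. Here a polynomial fit stands in for the deep learning estimator, and a toy smooth function stands in for the slow rigorous simulation; both substitutions are assumptions for illustration:

```python
import numpy as np

# Hypothetical "slow simulator": a toy smooth response standing in for
# a rigorous physical simulation that is too slow to deploy inline.
def slow_simulator(dose):
    return np.sin(dose) + 0.1 * dose ** 2

# Train the fast estimator offline on simulator samples. A degree-5
# polynomial stands in for the deep network; the tradeoff is the same
# in spirit: pay the modeling cost once, then evaluate cheaply.
dose = np.linspace(0.0, 3.0, 30)
coeffs = np.polyfit(dose, slow_simulator(dose), deg=5)
fast_estimator = np.poly1d(coeffs)

# At run time, the estimator is a cheap evaluation, and for a smooth
# effect its error stays small across the trained range.
test_points = np.linspace(0.1, 2.9, 100)
max_err = float(np.max(np.abs(fast_estimator(test_points)
                              - slow_simulator(test_points))))
print(f"max estimation error: {max_err:.4f}")
```

The estimator is statistical, so it never beats the simulator on accuracy — but if the simulator cannot run in the time available, the fast estimate is the more accurate answer you can actually have.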