Abstract
Rapid thermal processing (RTP) is an important step in the fabrication of semiconductor devices. Controlling the temperature uniformity of the wafer in RTP is difficult because the system is highly nonlinear and strongly spatially distributed. In this study, a transfer learning-based three-dimensional (3D) fuzzy multivariable control scheme is proposed for the temperature uniformity control of an RTP system. In contrast to the traditional expert-knowledge-based design, a two-level transfer learning framework is constructed to design the 3D fuzzy multivariable controller (3D FMC) with the help of a multi-output support vector regression (M-SVR). The 3D FMC defines a qualitative spatial fuzzy structure that is transferred to the M-SVR. In turn, the structure parameters of the M-SVR are learned from data and transferred back to set the quantitative parameters of the 3D FMC. Under this framework, the control laws (e.g. human control experience) hidden in spatio-temporal data can be extracted and formulated as multi-output 3D fuzzy rules. The proposed method thus integrates spatial fuzzy inference with transfer learning for the design of 3D fuzzy logic controllers (3D FLCs). The method is applied to the temperature uniformity control of a rapid thermal chemical vapor deposition (RTCVD) system at a set temperature of 1000 K, where the maximum non-uniformity along the wafer radius is close to 1 K.
References
Su AJ, Jeng JC, Huang HP, Yu CC, Hung SY, Chao CK (2007) Control relevant issues in semiconductor manufacturing: Overview with some new results. Control Eng Pract 15(10):1268–1279
Lee KS, Lee J, Insik Chin A, Choi J, Lee JH (2001) Control of wafer temperature uniformity in rapid thermal processing using an optimal iterative learning control technique. Ind Eng Chem Res 40:1661–1672
Dassau E, Grosman B, Lewin DR (2006) Modeling and temperature control of rapid thermal processing. Comput Chem Eng 30(4):686–697
Zhang X-X, Li H-X, Wang B, Ma SW (2017) A hierarchical intelligent methodology for spatiotemporal control of wafer temperature in rapid thermal processing. IEEE Trans Semicond Manuf 30(1):52–59
Balakrishnan KS, Edgar TF (2000) Model-based control in rapid thermal processing. Thin Solid Films 365(2):322–333
Yang DR, Lee KS, Ahn HJ, Lee JH (2003) Experimental application of a quadratic optimal iterative learning control method for control of wafer temperature uniformity in rapid thermal processing. IEEE Trans Semicond Manuf 16(1):36–44
Lin CA, Jan YK (2001) Control system design for a rapid thermal processing system. IEEE Trans Control Syst Technol 9(1):122–129
Lee B, Kim J, Ji S, Lee S (2016) Design of a decentralized controller for a glass RTP system. IEEE Trans Semicond Manuf 29(1):1–8
Christofides PD (2001) Nonlinear and robust control of partial differential equation systems: methods and applications to transport-reaction processes. Birkhäuser, Boston
Theodoropoulou A, Zafiriou E, Adomaitis RA (1999) Inverse model based real-time control for temperature uniformity of RTCVD. IEEE Trans Semicond Manuf 12(1):87–101
Baker J, Christofides PD (1999) Output feedback control of parabolic PDE systems with nonlinear spatial differential operators. Ind Eng Chem Res 38(11):4372–4380
Xiao T, Li H-X (2016) Sliding mode control design for a rapid thermal processing system. Chem Eng Sci 143:76–85
Li H-X, Zhang X-X, Li SY (2007) A three-dimensional fuzzy control methodology for a class of distributed parameter system. IEEE Trans Fuzzy Syst 15(3):470–481
Zhang X-X, Li H-X, Li SY (2008) Analytical study and stability design of three-dimensional fuzzy logic controller for spatially distributed dynamic systems. IEEE Trans Fuzzy Syst 16(6):1613–1625
Zhang X-X, Jiang Y, Li H-X, Li SY (2013) SVR Learning-based spatiotemporal fuzzy logic controller for nonlinear spatially distributed dynamic systems. IEEE Trans Neural Netw Learn Syst 24(10):1635–1647
Zhang X-X, Zhao L, Li JJ, Cao GT, Wang B (2017) Space-decomposition Based 3D Fuzzy Control Design for Nonlinear Spatially Distributed Systems with Multiple Control Sources Using Multiple Single-output SVR Learning. Appl Soft Comput 59:378–388
Kecman V (2001) Learning and soft computing: Support vector machines, neural networks and fuzzy logic models. The MIT Press, London
Pan S, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
Sarinnapakorn K, Kubat M (2007) Combining Subclassifiers in Text Categorization: A DST-based Solution and a Case Study. IEEE Trans Knowl Data Eng 19(12):1638–1651
Pan SJ, Zheng VW, Yang Q, Hu DH (2008) Transfer Learning for WiFi-Based Indoor Localization. In: Proceedings of Workshop Transfer Learning for Complex Task of the 23rd Assoc. for the Advancement of Artificial Intelligence (AAAI) Conf. Artificial Intelligence, pp. 1383–1388
Blitzer J, Dredze M, Pereira F (2007) Biographies, Bollywood, Boom-Boxes and Blenders: Domain Adaptation for Sentiment Classification. In: Proceedings of 45th Ann. Meeting of the Assoc. Computational Linguistics, pp. 432–439
Zhuo H, Yang Q, Hu DH, Li L (2008) Transferring knowledge from another domain for learning action models. In: Proceedings of 10th pacific rim int’l conf. Artificial intelligence. vol 5351, pp 1110–1115
Raykar VC, Krishnapuram B, Bi J, Dundar M, Rao RB (2008) Bayesian Multiple Instance Learning: Automatic Feature Selection and Inductive Transfer. In: Proceedings of 25th Int’l Conf. Machine Learning, pp. 808–815
Deng Z, Jiang Y, Chung FL, Ishibuchi H, Wang S (2013) Knowledge-Leverage-Based Fuzzy system and its modeling. IEEE Trans Fuzzy Syst 21(4):597–609
Deng Z, Jiang Y, Choi KS, Chung FL, Wang S (2013) Knowledge-leverage-based TSK Fuzzy System modeling. IEEE Trans Neural Netw Learn Syst 24(8):1200–1212
Yang C, Deng Z, Choi KS, Wang S (2016) Takagi–Sugeno–Kang Transfer learning fuzzy logic system for the adaptive recognition of epileptic electroencephalogram signals. IEEE Trans Fuzzy Syst 24(5):1079–1094
Zuo H, Zhang G, Pedrycz W, Behbood V, Lu J (2017) Fuzzy regression transfer learning in Takagi–Sugeno fuzzy models. IEEE Trans Fuzzy Syst 25(6):1795–1807
Behbood V, Lu J, Zhang G, Pedrycz W (2015) Multistep fuzzy bridged refinement domain adaptation algorithm and its application to bank failure prediction. IEEE Trans Fuzzy Syst 23(6):1917–1935
Shell J, Coupland S (2015) Fuzzy Transfer Learning: Methodology and application. Inf Sci 293:59–79
Pedrycz W, Russo B, Succi G (2012) Knowledge transfer in system modeling and its realization through an optimal allocation of information granularity. Appl Soft Comput 12(8):1985–1995
Adomaitis RA (1995) RTCVD Model Reduction: A Collocation on empirical eigenfunctions approach. Technical Report, Institute for Systems Research, University of Maryland
Theodoropoulou A, Adomaitis RA, Zafiriou E (1998) Model reduction for optimization of rapid thermal chemical vapor deposition systems. IEEE Trans Semicond Manuf 11(1):85–98
Wang LX (1997) A course in fuzzy systems and control. Prentice-Hall, Upper Saddle River
Chen Y, Wang JZ (2003) Support vector learning for fuzzy rule-based classification systems. IEEE Trans Fuzzy Syst 11(6):716–728
Babuska R (1998) Fuzzy modeling for control. Kluwer, Boston
Ying H (2000) Fuzzy control and modeling: analytical foundations and applications. IEEE Press, New York
Zhao W, Liu JK, Chen YY (2015) Material behavior modeling with multi-output support vector regression. Appl Math Model 39(17):5216–5229
Vapnik V (1998) Statistical learning theory. Wiley, New York
Lu X, Zou W, Huang M (2016) A novel spatiotemporal LS-SVM method for complex distributed parameter systems with applications to curing thermal process. IEEE Trans Ind Inf 12(3):1156–1165
Lu X, Yin F, Huang M (2017) Online spatiotemporal least-squares support vector machine modeling approach for time-varying distributed parameter processes. Ind Eng Chem Res 56:7314–7321
Efron B (1979) Bootstrap methods: Another look at the Jackknife. Ann Stat 7:1–26
Angelis L, Stamelos I (2000) A Simulation Tool for Efficient Analogy Based Cost Estimation. Empir Softw Eng 5:35–68
Acknowledgements
The authors acknowledge the support of the National Science Foundation of China (No. 61273182).
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1
1.1 3D fuzzy multivariable controller: operations and mathematical derivation
A 3D FMC consists of three parts: a 3D fuzzifier, a multi-output 3D fuzzy rule inference engine, and a defuzzifier. Unlike the 3D FLC in [13], the 3D fuzzy rules in the 3D FMC have multiple outputs in their consequent parts, so 3D fuzzy rule inference has to be carried out for each output, and the defuzzifier is likewise executed on each output.
(1) 3D fuzzifier
The 3D fuzzifier transforms the crisp spatial input \(\boldsymbol {x}(\bar {Z})\) into a 3D fuzzy input \(\bar A_{X}\) as in (19). It extends the traditional fuzzifier with a space dimension and comes in two types: singleton and non-singleton.
where ∗ denotes the t-norm operation.
In this study, a singleton fuzzifier is used for simplicity.
(2) Multi-output 3D fuzzy rule inference
The multi-output 3D fuzzy rule inference consists of three operations: spatial information fusion, dimension reduction, and traditional inference [13]. It handles spatial information through two functions: capturing the spatial information and then inferring in the traditional way.
Assume that the multi-output 3D fuzzy rules are designed as in (20).
Then, for each fired rule, \(L\) fuzzy relations are formulated as follows:
The 3D fuzzy input \({\bar A}_{X}\) goes through the multi-output 3D fuzzy inference engine. In (20), a Gaussian-type 3D membership function (MF) is assumed to describe each 3D fuzzy set. The 3D MF of \({{x}_{i}}(\bar Z)\) is then expressed as
where \({c_{i}^{l}}(\bar Z)=({c_{i}^{l}}({z_{1}}), \cdots , {c_{i}^{l}}({z_{p}}))'\) is the center and \({\sigma _{i}^{l}}(\bar Z)=({\sigma _{i}^{l}}({z_{1}}), \cdots , {\sigma _{i}^{l}}({z_{p}}))'\) is the width of the \(i\)th Gaussian-type 3D MF \(\bar C_{il}\) in the \(l\)th rule. The 2D MF of \({{x}_{i}}(\bar Z)\) at \(z = z_{j}\) is given as
where \(c_{ij}^{l}={c_{i}^{l}}({z_{j}})\) and \(\sigma _{ij}^{l}={\sigma _{i}^{l}}({z_{j}})\).
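The displayed formulas for the Gaussian MFs are not reproduced in this text, so the following minimal sketch assumes the common Gaussian form \(\exp(-((x - c)/\sigma)^{2})\), evaluated independently at each of the \(p\) sensing locations. The function name and the sample values are illustrative assumptions, not taken from the source.

```python
import numpy as np

def gaussian_3d_mf(x, c, sigma):
    """Membership grades of one spatial input x_i(z_1..z_p) in one 3D fuzzy set.

    x, c, sigma: arrays of shape (p,), one entry per sensing location z_j.
    Assumed form (not taken from the source): exp(-((x - c) / sigma)**2).
    """
    x, c, sigma = (np.asarray(a, dtype=float) for a in (x, c, sigma))
    return np.exp(-((x - c) / sigma) ** 2)

# Hypothetical example with p = 3 sensing locations.
x = np.array([1.0, 2.0, 3.0])         # measured input x_i(z_j)
c = np.array([1.0, 2.5, 2.0])         # centers c_ij^l
sigma = np.array([1.0, 0.5, 2.0])     # widths sigma_ij^l
grades = gaussian_3d_mf(x, c, sigma)  # one 2D membership grade per location
```

Each entry of `grades` plays the role of the 2D MF at one location \(z_{j}\); stacking them over \(j\) gives the 3D MF sampled along the space axis.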
First, the spatial information fusion operation is executed for each fuzzy relation. A spatially distributed set \(W_{l}\) is formed, and its membership grade is given as
where ∗ denotes the t-norm operation.
Then, the dimension reduction operation is carried out. For instance, if the weighted aggregation dimension reduction [14] is chosen, we obtain a 2D set \(\chi_{l}\).
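The fusion and dimension reduction steps can be sketched numerically. The sketch assumes a singleton fuzzifier and the product t-norm, so that the grade of the spatial set at \(z_{j}\) is the product of the 2D membership grades of all inputs at \(z_{j}\), and it assumes that weighted aggregation is a weighted sum over locations with user-chosen weights. Neither formula is reproduced from the source, and all names and numbers are hypothetical.

```python
import numpy as np

def spatial_fusion(grades):
    """Fuse the J inputs at every location with the product t-norm.

    grades: shape (J, p); grades[i, j] is the MF grade of input x_i at z_j.
    Returns shape (p,): the assumed grade of the spatial set W_l at each z_j.
    """
    return np.prod(np.asarray(grades, dtype=float), axis=0)

def weighted_aggregation(w_grades, weights):
    """Compress the spatial set into one scalar firing level (assumed form)."""
    return float(np.dot(weights, w_grades))

# Hypothetical rule with J = 2 inputs and p = 3 sensing locations.
grades = np.array([[1.0, 0.5, 0.2],
                   [0.5, 0.4, 1.0]])
w_l = spatial_fusion(grades)                # grade of W_l at z_1, z_2, z_3
weights = np.full(3, 1.0 / 3.0)             # equal weights a_j (assumption)
chi_l = weighted_aggregation(w_l, weights)  # scalar firing level of rule l
```

Equal weights are used only for illustration; in practice the weights would reflect how strongly each location should influence the rule.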
Next, the traditional inference operation is carried out, beginning with the implication operation. For brevity, Mamdani implication is used. For each fuzzy relation of \(\bar R_{l}\), a fuzzy output set is derived as follows.
where ∗ is a t-norm; \(\mu _{{B_{il}}}({u^{i}})\) and \(V_{il}\) are the membership grade of \(B_{il}\) and the output fuzzy set of \(\bar R_{l}\) for \(u^{i}\), respectively.
Finally, all the fired rules are combined by the inference engine, which yields a composite output fuzzy set for each fuzzy relation of \(\bar R_{l}\)
where \(N\) is the number of fired rules.
(3) Defuzzifier
The defuzzifier transforms an output fuzzy set into a crisp output. If a linear defuzzifier [36] is chosen and \(B_{il}\) is a singleton fuzzy set, we have \(L\) crisp outputs, whose nonlinear mathematical expressions are given as follows.
where \({\zeta _{l}^{i}}\) is the nonzero value in the singleton fuzzy set \(B_{il}\) of the output variable \(u^{i}\) in the \(l\)th rule.
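As a rough illustration of the defuzzification step, the sketch below assumes that the linear defuzzifier with singleton consequents reduces to an unnormalized weighted sum, \(u^{i} = \sum_{l} \zeta_{l}^{i} \chi_{l}\), where \(\chi_{l}\) is the scalar firing level of rule \(l\). This form is an assumption because the displayed expression is not reproduced here, and the names and values are hypothetical.

```python
import numpy as np

def linear_defuzzify(chi, zeta):
    """Crisp outputs from rule firing levels and singleton consequents.

    chi:  shape (N,), firing level of each of the N fired rules.
    zeta: shape (N, L), zeta[l, i] is the singleton value of output u^i in rule l.
    Returns shape (L,): assumed form u^i = sum_l zeta[l, i] * chi[l].
    """
    return np.asarray(chi, dtype=float) @ np.asarray(zeta, dtype=float)

# Hypothetical controller with N = 2 fired rules and L = 2 outputs.
chi = np.array([0.3, 0.6])
zeta = np.array([[10.0, -2.0],
                 [20.0, 4.0]])
u = linear_defuzzify(chi, zeta)  # one crisp value per output
```

Because the sum is not normalized by the total firing level, each output is linear in the consequent values, which is what lets a kernel regression with the same structure supply them.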
Appendix 2
2.1 Multi-output SVR
In this study, we introduce an M-SVR with an ε-insensitive cost function to solve the problem of regression with multiple output variables, building on a previous contribution [37]. For simplicity, the M-SVR with ε-insensitive cost function is abbreviated as M-SVR.
Let \(D = \{[{\text {x}}_{i}, {\text {y}}_{i}] \in R^{s} \times R^{m},\ i = 1, \cdots , r\}\) be a training set with \(r\) pairs \(({\text {x}}_{1}, {\text {y}}_{1}), ({\text {x}}_{2}, {\text {y}}_{2}), \cdots , ({\text {x}}_{r}, {\text {y}}_{r})\), where \({\text {x}}_{i} = [x_{i1}, \cdots , x_{is}]'\) is \(s\)-dimensional, \({\text {y}}_{i} = [y_{1}^{i}, \cdots , y_{m}^{i}]'\) is \(m\)-dimensional, and both are continuous. The goal is to find \(m\) functions \(f_{j}({\text {x}}, w_{j})\) \((j = 1, \cdots , m)\) such that every training pattern in \(D\) deviates from its target values by at most ε.
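The goal stated above is equivalent to asking that the standard ε-insensitive loss, \(\max(0, |y - f({\text {x}})| - \varepsilon)\), be zero for every pattern and every output. The following minimal sketch checks this for given predictions; the function name and the data are illustrative assumptions.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps):
    """Per-pattern, per-output loss max(0, |y - f(x)| - eps).

    y_true, y_pred: arrays of shape (r, m).
    A zero entry means the prediction lies inside the eps-tube.
    """
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, residual - eps)

# Hypothetical data with r = 2 patterns and m = 2 outputs.
y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.array([[1.05, 2.5], [2.95, 4.0]])
loss = eps_insensitive_loss(y_true, y_pred, eps=0.1)
within_tube = bool(np.all(loss == 0.0))  # False: one prediction is off by 0.5
```

Nonzero entries of `loss` are the amounts by which a pattern violates the ε-tube.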
Following the structural risk minimization principle [38] and introducing the slack variables \(\xi _{i}\) and \(\xi _{i}^{*}\), the multi-output regression problem can be transformed into the convex optimization problem in (26).
subject to
where \(C\) is a constant selected by the user.
By introducing the Lagrange multipliers \({\alpha _{k}^{j}}\), \(\alpha _{k}^{j*}\), \(\gamma _{k}\) and applying the saddle point condition, the dual optimization problem of (26) is obtained as follows.
subject to
Solving (28)-(29), we obtain the best regression hyper-surface \(f_{k}({\text {x}}, w_{k})\) \((k = 1, 2, \cdots , m)\) with optimal weight vector \(w_{k}\) and optimal bias \(b_{k}\) as given in (30).
where \({w_{k}} = \sum \limits _{j = 1}^{r} {({\alpha _{k}^{j}} - \alpha _{k}^{j*}){{\text {x}}_{j}}}\) and \({b_{k}} = \frac {1}{r} \sum \limits _{j = 1}^{r} \left ( {y_{k}^{j}} - \left \langle {w_{k}},{{\text {x}}_{j}} \right \rangle \right )\). A training pattern \({\text {x}}_{j}\) with nonzero \({\alpha _{k}^{j}} - \alpha _{k}^{j*}\) is called a support vector (SV).
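The expressions for \(w_{k}\) and \(b_{k}\) translate directly into array operations. The sketch below takes the dual coefficients as given (it does not solve the dual problem) and uses made-up values for them; the function name and data are illustrative assumptions.

```python
import numpy as np

def linear_msvr_params(X, Y, alpha, alpha_star):
    """Weights, biases, and support-vector mask from given dual coefficients.

    X: (r, s) training inputs; Y: (r, m) training targets.
    alpha, alpha_star: (m, r) Lagrange multipliers, one row per output k.
    Returns W of shape (m, s), b of shape (m,), and a boolean mask of shape (m, r).
    """
    beta = alpha - alpha_star         # alpha_k^j - alpha_k^{j*}
    W = beta @ X                      # w_k = sum_j beta[k, j] * x_j
    b = np.mean(Y - X @ W.T, axis=0)  # b_k = (1/r) sum_j (y_k^j - <w_k, x_j>)
    is_sv = beta != 0.0               # pattern j is an SV for output k
    return W, b, is_sv

# Hypothetical problem with r = 3 patterns, s = 2 inputs, m = 2 outputs.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Y = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
alpha = np.array([[0.5, 0.0, 0.0], [0.0, 1.0, 0.0]])       # made-up values
alpha_star = np.array([[0.0, 0.5, 0.0], [0.0, 0.0, 0.0]])  # made-up values
W, b, is_sv = linear_msvr_params(X, Y, alpha, alpha_star)
```

Here the third pattern has zero coefficients for both outputs, so it is not a support vector and contributes nothing to either weight vector.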
To make the M-SVR nonlinear while avoiding an explicit mapping, the kernel trick is used. The best regression hyper-surface \(f_{k}({\text {x}}, w_{k})\) \((k = 1, \cdots , m)\) can then be written as
where \(K({\text {x}}, {\text {x}}_{j})\) is a kernel function (KF) that satisfies Mercer's theorem.
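A kernelized prediction can be sketched as follows, assuming the standard kernel expansion \(f_{k}({\text {x}}) = \sum_{j} (\alpha_{k}^{j} - \alpha_{k}^{j*}) K({\text {x}}, {\text {x}}_{j}) + b_{k}\) and a Gaussian kernel \(\exp(-\Vert {\text {x}} - {\text {x}}_{j}\Vert^{2} / (2\sigma^{2}))\). Both the expansion and the kernel choice are assumptions, since the displayed equation is not reproduced in this text, and the coefficient values are made up.

```python
import numpy as np

def gaussian_kernel(X1, X2, width):
    """Gaussian kernel matrix, K[a, j] = exp(-||X1[a] - X2[j]||^2 / (2 * width^2))."""
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def msvr_predict(X_new, X_sv, beta, b, width):
    """Evaluate all m outputs at once.

    X_new: (n, s) query points; X_sv: (N, s) support vectors.
    beta: (m, N) coefficients alpha_k^j - alpha_k^{j*}; b: (m,) biases.
    Returns predictions of shape (n, m).
    """
    K = gaussian_kernel(X_new, X_sv, width)  # kernel layer, shape (n, N)
    return K @ beta.T + b                    # linear output layer

# Hypothetical model with N = 2 support vectors and m = 2 outputs.
X_sv = np.array([[0.0, 0.0], [1.0, 0.0]])
beta = np.array([[1.0, -1.0], [0.5, 0.5]])
b = np.array([0.0, 1.0])
X_new = np.array([[0.0, 0.0]])
y_hat = msvr_predict(X_new, X_sv, beta, b, width=1.0)
```

Read as a network, the query point is the input layer, the kernel evaluations against the support vectors form the hidden layer, and the \(m\) weighted sums form the output layer.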
Let \({\text {x}}_{1}, {\text {x}}_{2}, \cdots , {\text {x}}_{N}\) denote the support vectors. The solution of the M-SVR can be described by a multi-output three-layer network structure, as shown in Fig. 18.
Cite this article
Zhang, XX., Li, HX., Cheng, C. et al. Transfer learning based 3D fuzzy multivariable control for an RTP system. Appl Intell 50, 812–829 (2020). https://doi.org/10.1007/s10489-019-01557-7