Abstract
Many engineering problems require identifying feasible domains under implicit constraints. One example is finding acceptable car body styling designs based on constraints like aesthetics and functionality. Current active-learning-based methods learn feasible domains for bounded input spaces. However, we usually lack prior knowledge about how to set those input variable bounds. Bounds that are too small may fail to cover all feasible domains, while bounds that are too large waste the query budget. To avoid this problem, we introduce Active Expansion Sampling (AES), a method that identifies (possibly disconnected) feasible domains over an unbounded input space. AES progressively expands our knowledge of the input space, and uses successive exploitation and exploration stages to switch between learning the decision boundary and searching for new feasible domains. We show that AES has a misclassification loss guarantee within the explored region, independent of the number of iterations or labeled samples. Thus it can be used for real-time prediction of samples' feasibility within the explored region. We evaluate AES on three test examples and compare it with two adaptive sampling methods, the Neighborhood-Voronoi algorithm and the straddle heuristic, which operate over fixed input variable bounds.
Notes
Note that in this paper the terms “active learning” and “adaptive sampling” are interchangeable.
A point infinitely far away from previous queries has \(\bar{f}(\boldsymbol{x})\) close to 0 and the maximum \(V(\boldsymbol{x})\), and thus the highest \(p_{\epsilon}(\boldsymbol{x})\).
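As a small numerical illustration of this note, the sketch below (Python; a zero-mean Gaussian process with a unit-variance Gaussian kernel, and an assumed form of \(p_{\epsilon}\) based only on the posterior mean and variance, with illustrative values for \(\epsilon\) and the length scale) shows that as a candidate moves away from all labeled samples, \(\bar{f}(\boldsymbol{x})\) decays toward 0 and \(V(\boldsymbol{x})\) approaches the prior variance, so the assumed \(p_{\epsilon}\) attains its largest value:

```python
import numpy as np
from scipy.stats import norm

def gp_posterior(X_train, y_train, x, length_scale=1.0, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP with a unit-variance Gaussian kernel."""
    sqdist = lambda a, b: np.sum((a - b) ** 2, axis=-1)
    K = np.exp(-sqdist(X_train[:, None, :], X_train[None, :, :]) / (2 * length_scale**2))
    K += noise * np.eye(len(X_train))
    k_star = np.exp(-sqdist(X_train, x) / (2 * length_scale**2))
    mean = k_star @ np.linalg.solve(K, y_train)          # f_bar(x)
    var = 1.0 - k_star @ np.linalg.solve(K, k_star)      # V(x); prior variance is 1
    return mean, var

# A few labeled samples near the origin, labels in {-1, +1}.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0, 1.0])

eps = 0.3  # illustrative margin
for d in [1.0, 5.0, 50.0]:  # candidate moves farther from the labeled samples
    mean, var = gp_posterior(X, y, np.array([d, d]))
    # Assumed form: probability mass lying more than eps beyond the predicted sign.
    p_eps = norm.cdf(-(eps + abs(mean)) / np.sqrt(var))
    print(f"d = {d:5.1f}: f_bar = {mean:+.4f}, V = {var:.4f}, p_eps = {p_eps:.4f}")
```

With this assumed form, the far point approaches \(p_{\epsilon} \approx \Phi(-\epsilon)\), the largest value printed, consistent with the note.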
In Section 3, we assume that the queried point is the exact solution to the query strategy. However, since we approximate the exact solution using a pool-based sampling setting, the query may deviate slightly from the exact solution.
Sampling methods like random sampling or Poisson-disc sampling (Bridson 2007) can be used to generate the pool. We use random sampling here for simplicity. The specific choice of sampling method within the local pool is not central to the overall method.
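As an illustration of this note, here is a minimal sketch of generating such a local candidate pool by uniform random sampling inside a hypersphere around a center point (the helper name `sample_local_pool`, the radius, and the pool size are illustrative assumptions, not the settings used in the paper):

```python
import numpy as np

def sample_local_pool(center, radius, n_samples, seed=None):
    """Draw candidate points uniformly from the d-ball of given radius around `center`.

    Directions come from an isotropic Gaussian and are normalized to unit length;
    radii are scaled by u**(1/d) so the points are uniform in volume rather than
    clustered near the center.
    """
    rng = np.random.default_rng(seed)
    center = np.asarray(center, dtype=float)
    d = center.size
    directions = rng.normal(size=(n_samples, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.random(n_samples) ** (1.0 / d)
    return center + radii[:, None] * directions

# Example: a pool of 500 candidates within radius 1 of the current query center.
pool = sample_local_pool(center=[3.0, 3.0], radius=1.0, n_samples=500, seed=0)
```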
We can set 𝜖 and τ such that the accuracy bound is as required. Details about how to set hyperparameters are in Section 5.3.
Technically, due to sampling error introduced when generating the pool, the exploitation stage will be influenced by 𝜖 (since \(\bar{f}(\boldsymbol{x}^{*})\) is only approximately 0). However, this effect is negligible compared to 𝜖's influence on the exploration stage.
For the NV algorithm, the pool size refers to the test samples generated for the Monte Carlo simulation.
This difference arises because NV's explored region covers more area than AES's at the beginning.
References
Agarwal A (2013) Selective sampling algorithms for cost-sensitive multiclass prediction. ICML (3) 28:1220–1228
Alabdulmohsin I, Gao X, Zhang X (2015) Efficient active learning of halfspaces via query synthesis. In: Proceedings of the Twenty-Ninth AAAI conference on artificial intelligence. AAAI Press, pp 2483–2489
Angluin D (2004) Queries revisited. Theor Comput Sci 313(2):175–194
Argamon-Engelson S, Dagan I (1999) Committee-based sample selection for probabilistic classifiers. J Artif Intell Res (JAIR) 11:335–360
Awasthi P, Feldman V, Kanade V (2013) Learning using local membership queries. In: Shalev-Shwartz S, Steinwart I (eds) Proceedings of Machine Learning Research, vol 30, Princeton
Baram Y, Yaniv RE, Luz K (2004) Online choice of active learning algorithms. J Mach Learn Res 5:255–291
Basudhar A, Missoum S (2008) Adaptive explicit decision functions for probabilistic design and optimization using support vector machines. Comput Struct 86(19):1904–1917
Basudhar A, Missoum S (2010) An improved adaptive sampling scheme for the construction of explicit boundaries. Struct Multidiscip Optim 42(4):517–529
Bellman R (1957) Dynamic programming. Princeton University Press, Princeton
Bouneffouf D (2016) Exponentiated gradient exploration for active learning. Computers 5(1):1
Bridson R (2007) Fast Poisson disk sampling in arbitrary dimensions. In: ACM SIGGRAPH 2007 sketches, SIGGRAPH ’07. ACM, New York. https://doi.org/10.1145/1278780.1278807
Bryan B, Nichol RC, Genovese CR, Schneider J, Miller CJ, Wasserman L (2006) Active learning for identifying function threshold boundaries. In: Advances in neural information processing systems, pp 163–170
Campbell C, Cristianini N, Smola AJ (2000) Query learning with large margin classifiers. In: Proceedings of the seventeenth international conference on machine learning. Morgan Kaufmann Publishers Inc., pp 111–118
Cavallanti G, Cesa-Bianchi N, Gentile C (2009) Linear classification and selective sampling under low noise conditions. In: Advances in neural information processing systems, pp 249–256
Cesa-Bianchi N, Gentile C, Orabona F (2009) Robust bounds for classification via selective sampling. In: Proceedings of the 26th annual international conference on machine learning. ACM, pp 121–128
Chen W, Fuge M (2017) Beyond the known: detecting novel feasible domains over an unbounded design space. J Mech Des 139(11):111405
Chen Z, Qiu H, Gao L, Li X, Li P (2014) A local adaptive sampling method for reliability-based design optimization using kriging model. Struct Multidiscip Optim 49(3):401–416
Chen Z, Peng S, Li X, Qiu H, Xiong H, Gao L, Li P (2015) An important boundary sampling method for reliability-based design optimization using kriging model. Struct Multidiscip Optim 52(1):55–70
Chen L, Hassani H, Karbasi A (2016) Near-optimal active learning of halfspaces via query synthesis in the noisy setting. arXiv:1603.03515
Chen W, Fuge M, Chazan J (2017) Design manifolds capture the intrinsic complexity and dimension of design spaces. J Mech Des 139(5):051102. https://doi.org/10.1115/1.4036134
Cohn D, Atlas L, Ladner R (1994) Improving generalization with active learning. Mach Learn 15(2):201–221
Dagan I, Engelson SP (1995) Committee-based sampling for training probabilistic classifiers. In: Proceedings of the twelfth international conference on machine learning
Dasgupta S, Kalai AT, Monteleoni C (2009) Analysis of perceptron-based active learning. J Mach Learn Res 10:281–299
Dekel O, Gentile C, Sridharan K (2012) Selective sampling and active learning from single and multiple teachers. J Mach Learn Res 13(Sep):2655–2697
Devanathan S, Ramani K (2010) Creating polytope representations of design spaces for visual exploration using consistency techniques. J Mech Des 132(8):081011
Freund Y, Seung HS, Shamir E, Tishby N (1997) Selective sampling using the query by committee algorithm. Mach Learn 28(2):133–168
Gotovos A, Casati N, Hitz G, Krause A (2013) Active learning for level set estimation. In: Proceedings of the twenty-third international joint conference on artificial intelligence. AAAI Press, pp 1344–1350
Hoang TN, Low BKH, Jaillet P, Kankanhalli M (2014) Nonmyopic 𝜖-bayes-optimal active learning of gaussian processes. In: Xing EP, Jebara T (eds) Proceedings of the 31st international conference on machine learning, PMLR, vol 32. Proceedings of Machine Learning Research, Beijing, pp 739–747
Hoi SC, Jin R, Zhu J, Lyu MR (2009) Semisupervised svm batch mode active learning with applications to image retrieval. ACM Trans Inf Syst (TOIS) 27(3):16
Hsu WN, Lin HT (2015) Active learning by learning. In: Twenty-Ninth AAAI conference on artificial intelligence
Huang YC, Chan KY (2010) A modified efficient global optimization algorithm for maximal reliability in a probabilistic constrained space. J Mech Des 132(6):061002
Huang SJ, Jin R, Zhou ZH (2010) Active learning by querying informative and representative examples. In: Advances in neural information processing systems, pp 892–900
Jackson JC (1997) An efficient membership-query algorithm for learning dnf with respect to the uniform distribution. J Comput Syst Sci 55(3):414–440
Kandasamy K, Schneider J, Póczos B (2017) Query efficient posterior estimation in scientific experiments via bayesian active learning. Artif Intell 243:45–56
Kapoor A, Grauman K, Urtasun R, Darrell T (2010) Gaussian processes for object categorization. Int J Comput Vis 88(2):169–188
King RD, Whelan KE, Jones FM, Reiser PG, Bryant CH, Muggleton SH, Kell DB, Oliver SG (2004) Functional genomic hypothesis generation and experimentation by a robot scientist. Nature 427(6971):247–252
Krause A, Guestrin C (2007) Nonmyopic active learning of gaussian processes: an exploration-exploitation approach. In: Proceedings of the 24th international conference on machine learning. ACM, pp 449–456
Krempl G, Kottke D, Lemaire V (2015) Optimised probabilistic active learning (opal). Mach Learn 100(2–3):449–476
Larson BJ, Mattson CA (2012) Design space exploration for quantifying a system model’s feasible domain. J Mech Des 134(4):041010
Lee TH, Jung JJ (2008) A sampling technique enhancing accuracy and efficiency of metamodel-based rbdo: constraint boundary sampling. Comput Struct 86(13):1463–1476
Lewis DD, Catlett J (1994) Heterogeneous uncertainty sampling for supervised learning. In: Proceedings of the eleventh international conference on machine learning, pp 148–156
Lewis DD, Gale WA (1994) A sequential algorithm for training text classifiers. In: Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval. Springer-Verlag New York Inc., New York, pp 3–12
Ma Y, Garnett R, Schneider J (2014) Active area search via bayesian quadrature. In: Artificial intelligence and statistics, pp 595–603
Mac Aodha O, Campbell ND, Kautz J, Brostow GJ (2014) Hierarchical subquery evaluation for active learning on a graph. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 564–571
McCallum A, Nigam K et al (1998) Employing em and pool-based active learning for text classification. In: ICML, vol 98, pp 359–367
Nguyen HT, Smeulders A (2004) Active learning using pre-clustering. In: Proceedings of the twenty-first international conference on machine learning, ICML ’04. ACM, New York, p 79. https://doi.org/10.1145/1015330.1015349
Nowacki H (1980) Modelling of design decisions for cad. In: Computer aided design modelling, systems engineering, CAD-Systems. Springer, pp 177–223
Orabona F, Cesa-Bianchi N (2011) Better algorithms for selective sampling. In: Proceedings of the 28th international conference on machine learning (ICML-11), pp 433–440
Osugi T, Kim D, Scott S (2005) Balancing exploration and exploitation: a new algorithm for active machine learning. In: Fifth IEEE international conference on data mining. IEEE
Rasmussen C, Williams C (2006) Gaussian processes for machine learning. The MIT Press
Ren Y, Papalambros PY (2011) A design preference elicitation query as an optimization process. J Mech Des 133(11):111004
Schohn G, Cohn D (2000) Less is more: active learning with support vector machines. In: ICML, pp 839–846
Settles B (2010) Active learning literature survey. Computer Sciences Technical Report 1648, University of Wisconsin–Madison
Settles B, Craven M (2008) An analysis of active learning strategies for sequence labeling tasks. In: Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics, pp 1070–1079
Singh P, Van Der Herten J, Deschrijver D, Couckuyt I, Dhaene T (2017) A sequential sampling strategy for adaptive classification of computationally expensive data. Struct Multidiscip Optim 55(4):1425–1438
Tong S, Koller D (2001) Support vector machine active learning with applications to text classification. J Mach Learn Res 2:45–66
Yang X, Liu Y, Gao Y, Zhang Y, Gao Z (2015a) An active learning kriging model for hybrid reliability analysis with both random and interval variables. Struct Multidiscip Optim 51(5):1003–1016
Yang Y, Ma Z, Nie F, Chang X, Hauptmann AG (2015b) Multi-class active learning by uncertainty sampling with diversity maximization. Int J Comput Vis 113(2):113–127
Yannou B, Moreno F, Thevenot HJ, Simpson TW (2005) Faster generation of feasible design points. In: ASME 2005 international design engineering technical conferences and computers and information in engineering conference. American Society of Mechanical Engineers, pp 355–363
Zhu X, Lafferty J, Ghahramani Z (2003) Combining active learning and semi-supervised learning using gaussian fields and harmonic functions. In: ICML 2003 workshop on the continuum from labeled to unlabeled data in machine learning and data mining, vol 3
Zhuang X, Pan R (2012) A sequential sampling strategy to improve reliability-based design optimization with implicit constraint functions. J Mech Des 134(2):021002
Acknowledgments
The authors thank the anonymous reviewers whose efforts improved the manuscript. This work was funded through a University of Maryland Minta Martin Grant.
Appendices
Appendix A: Theorem proofs
A1 Proof of Theorem 3
According to (2), given an optimal query x∗, we have
where
and
Similarly,
where
Therefore for the optimal query x∗ we have
Both Theorems 1 and 2 state that \(p_{\epsilon}(\boldsymbol{x}^{*}) = \tau\), thus
When \(\tau = \Phi(-\eta\epsilon)\), we have
Plugging (A1) into (A5) and solving for the distance δ, we get
where
A2 Proof of Theorem 5
Theorem 1 states that the optimal query in the exploitation stage lies at the intersection of \(\bar{f}(\boldsymbol{x}) = 0\) and \(p_{\epsilon}(\boldsymbol{x}) = \tau\). By substituting \(\Phi(-\eta\epsilon)\) for \(\tau\), we have
According to (A3), we have \(V(\boldsymbol {x}^{*})>1-k_m^2\nu \). Combining (A1), (A4), and (A7), we get
where
A3 Proof of Theorem 6
According to (A7), the predictive variance of an optimal query \(\boldsymbol{x}_{\mathrm{exploit}}\) in the exploitation stage is
In the exploration stage, we have \(p_{\epsilon}(\boldsymbol{x}_{\mathrm{explore}}) = \tau\) at the optimal query \(\boldsymbol{x}_{\mathrm{explore}}\) (Theorem 2). By applying (4) and setting \(\tau = \Phi(-\eta\epsilon)\), we have
Appendix B: Additional experimental results
B1 Hosaki example
We use the Hosaki example as an additional 2-dimensional example to demonstrate the performance of our proposed method. Unlike the Branin example, the Hosaki example has feasible domains of different scales: its feasible set consists of two isolated regions, a large “island” and a small one (Fig. 15a). The Hosaki function is \(g(\boldsymbol{x}) = \left(1 - 8x_1 + 7x_1^2 - \frac{7}{3}x_1^3 + \frac{1}{4}x_1^4\right) x_2^2 e^{-x_2}\).
We define the label \(y = 1\) if \(\boldsymbol{x} \in \{\boldsymbol{x} \mid g(\boldsymbol{x}) \le -1,\ 0 < x_1, x_2 < 5\}\), and \(y = -1\) otherwise.
For AES, we set the initial point \(\boldsymbol{x}^{(0)} = (3, 3)\). We use a Gaussian kernel with a length scale \(l = 0.4\). The test set used to compute F1 scores is generated along a 100 × 100 grid over the region where \(x_1 \in [-3, 9]\) and \(x_2 \in [-3.5, 8.5]\). For NV and straddle, the input space bounds are shown in Table 3.
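For concreteness, the sketch below reproduces this evaluation setup: it defines the Hosaki function and the ground-truth labels from above, builds the 100 × 100 test grid, and computes an F1 score with scikit-learn. The `predict` function is a placeholder for a trained classifier's sign predictions, not the AES implementation itself:

```python
import numpy as np
from sklearn.metrics import f1_score

def hosaki(x1, x2):
    """Hosaki benchmark function g(x)."""
    return (1 - 8*x1 + 7*x1**2 - (7/3)*x1**3 + 0.25*x1**4) * x2**2 * np.exp(-x2)

def true_label(X):
    """Ground truth: y = 1 if g(x) <= -1 and 0 < x1, x2 < 5; y = -1 otherwise."""
    x1, x2 = X[:, 0], X[:, 1]
    feasible = (hosaki(x1, x2) <= -1) & (x1 > 0) & (x1 < 5) & (x2 > 0) & (x2 < 5)
    return np.where(feasible, 1, -1)

# 100 x 100 test grid over x1 in [-3, 9], x2 in [-3.5, 8.5].
g1, g2 = np.meshgrid(np.linspace(-3, 9, 100), np.linspace(-3.5, 8.5, 100))
X_test = np.column_stack([g1.ravel(), g2.ravel()])
y_true = true_label(X_test)

def predict(X):
    """Placeholder classifier; substitute the sign of the trained GP's posterior mean."""
    return true_label(X)  # with perfect predictions the F1 score is 1.0

print("F1 =", f1_score(y_true, predict(X_test), pos_label=1))
```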
Table 4 shows the final F1 scores and running time of AES, NV, and the straddle heuristic. Fig. 15 shows the F1 scores and queries under different 𝜖 and η. Fig. 16 compares the performance of AES and NV with different boundary sizes. Fig. 17 shows the performance of AES and NV under Bernoulli and Gaussian noise.
B2 Results of straddle heuristic
In this section we list experimental results related to the straddle heuristic. Specifically, Fig. 18 shows straddle’s F1 scores and queries using different sizes of input variable bounds, and the comparison with AES. Fig. 19 shows the comparison of AES and straddle under noisy labels.