Active Learning in Black-Box Settings

Authors

  • Neil Rubens, University of Electro-Communications, Tokyo, Japan
  • Vera Sheinman, Japanese Institute of Educational Measurement, Tokyo, Japan
  • Ryota Tomioka, University of Tokyo, Tokyo, Japan
  • Masashi Sugiyama, Tokyo Institute of Technology, Tokyo, Japan

DOI:

https://doi.org/10.17713/ajs.v40i1&2.204

Abstract

Active learning refers to settings in which a machine learning algorithm (the learner) is able to select the data from which it learns (selecting points and then obtaining their labels), and by doing so aims to achieve better accuracy (e.g., by avoiding redundant or unimportant training data). Active learning is particularly useful when the labeling cost is high. A common assumption is that the active learning algorithm is aware of the details of the underlying learning algorithm for which it obtains the data. However, in many practical settings, obtaining precise details of the learning algorithm may not be feasible, making the underlying algorithm in essence a black box: no knowledge of its internal workings is available, and only the inputs and the corresponding output estimates are accessible. This makes many of the traditional approaches inapplicable, or at least ineffective. Our motivation is therefore to use the only data that is accessible in black-box settings: the output estimates. We note that accuracy can improve only if the learner's output estimates change. We therefore propose an active learning criterion that utilizes the information contained in the changes of the output estimates.
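The idea can be illustrated with a minimal sketch, which is not the authors' exact method: treat the learner as a black box exposing only fit() and predict(), and score each candidate query point by how much the learner's output estimates over the pool would change if that point were added to the training set, with its unknown label approximated by the current estimate. The scikit-learn-style estimator interface, the RandomForestRegressor choice, and names such as output_change_score are illustrative assumptions.

# Illustrative sketch only: score pool points by the change they induce in the
# black-box learner's output estimates (label plugged in from the current
# prediction, since the true label is unknown before querying).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def output_change_score(learner, X_train, y_train, X_pool, candidate_idx):
    """Mean squared change in pool predictions after retraining with the candidate."""
    base_pred = learner.predict(X_pool)
    x_new = X_pool[candidate_idx:candidate_idx + 1]
    y_new = learner.predict(x_new)          # plug-in estimate of the unknown label
    refit = RandomForestRegressor(n_estimators=50, random_state=0)
    refit.fit(np.vstack([X_train, x_new]), np.append(y_train, y_new))
    return np.mean((refit.predict(X_pool) - base_pred) ** 2)

# Usage: query the pool point whose inclusion changes the output estimates the most.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(20, 3))
y_train = X_train.sum(axis=1) + 0.1 * rng.standard_normal(20)
X_pool = rng.uniform(-1, 1, size=(100, 3))

learner = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_train, y_train)
scores = [output_change_score(learner, X_train, y_train, X_pool, i)
          for i in range(len(X_pool))]
query_idx = int(np.argmax(scores))
print("Next point to label:", query_idx)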

Published

2016-02-24

How to Cite

Rubens, N., Sheinman, V., Tomioka, R., & Sugiyama, M. (2016). Active Learning in Black-Box Settings. Austrian Journal of Statistics, 40(1&2), 125–135. https://doi.org/10.17713/ajs.v40i1&2.204

Section

Articles