Charu C. Aggarwal, Chen Chen, Jiawei Han. The Inverse Classification Problem[J]. Journal of Computer Science and Technology, 2010, 25(3): 458-468.

The Inverse Classification Problem

More Information
  • Author Bio:

    Charu C. Aggarwal is a research staff member at the IBM T. J. Watson Research Center in Yorktown Heights, New York. He completed his B.S. degree from IIT Kanpur in 1993 and his Ph.D. degree from MIT in 1996. He has since worked in the field of performance analysis, databases, and data mining. He has published over 125 papers in refereed conferences and journals, and has been granted over 45 patents. Because of the commercial value of the above-mentioned patents, he has thrice been designated a Master Inventor at IBM. He is a recipient of the IBM Outstanding Innovation Award (2008) for his scientific contributions to privacy technology, and a recipient of an IBM Research Division Award (2008) for his scientific contributions to data stream research. He has served on the program and executive committees of most major database/data mining conferences. He served as an associate editor of the IEEE Transactions on Knowledge and Data Engineering Journal from 2004 to 2008. He is an action editor of the Data Mining and Knowledge Discovery Journal, an associate editor of the ACM SIGKDD Explorations, and an associate editor of the Knowledge and Information Systems Journal. He is a fellow of the IEEE, and a life-member of the ACM.

    Chen Chen is a Ph.D. candidate in the Department of Computer Science, University of Illinois at Urbana-Champaign. He received the B.E. degree in computer science and technology from the University of Science and Technology of China in 2003, and the M.S. degree in computer science from the University of Illinois at Urbana-Champaign in 2006. He has been working in the area of data mining in general, and his current research is focused on modeling, managing and analyzing large-scale graph and information network data, with applications from chemical and bio-informatics, social networks, the Web and computer systems.

    Jiawei Han received his Ph.D. degree in computer science from the University of Wisconsin in 1985. He is currently a professor at the Department of Computer Science in the University of Illinois at Urbana-Champaign. He has been working on research into data mining, data warehousing, database systems, data mining from spatiotemporal data, multimedia data, stream and RFID data, Web data, social network data, and biological data, with over 300 journal and conference publications. He has chaired or served on over 100 program committees of international conferences and workshops, including PC co-chair of the 2005 (IEEE) International Conference on Data Mining (ICDM) and Americas Coordinator of the 2006 International Conference on Very Large Data Bases (VLDB). He is also serving as the founding Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data. He is an ACM and IEEE fellow and has received the 2004 ACM SIGKDD Innovations Award and the 2005 IEEE Computer Society Technical Achievement Award. His book "Data Mining: Concepts and Techniques" (2nd ed., Morgan Kaufmann, 2006) has been widely used as a textbook worldwide.

  • Received Date: November 02, 2009
  • Revised Date: December 08, 2009
  • Published Date: May 04, 2010
  • Abstract: In this paper, we examine an emerging variation of the classification problem, known as the inverse classification problem. In this problem, we determine the features to be used to create a record that will result in a desired class label. Such an approach is useful in applications where the objective is to determine a set of actions that guide the data mining application towards a desired solution. This system can be used for a variety of decision support applications which have pre-determined task criteria. We show that the inverse classification problem is a powerful and general model which encompasses a number of different criteria. We propose a number of algorithms for the inverse classification problem, which use an inverted list representation as the intermediate data structure and for classification. We validate our approach on a number of real datasets.
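To make the problem statement above concrete, the following is a minimal sketch of one way inverse classification could be approached on categorical data. It is not the authors' algorithm: the greedy selection rule, the toy records, and the names `build_inverted_lists`, `desired_fraction`, and `fill_missing` are all assumptions introduced for illustration. The only ideas taken from the abstract are the goal (choose feature values that lead to a desired class label) and the use of inverted lists as the intermediate data structure.

```python
from collections import defaultdict


def build_inverted_lists(records):
    """Map each (attribute index, value) pair to the ids of records containing it."""
    lists = defaultdict(set)
    for rid, (features, _label) in enumerate(records):
        for attr, value in enumerate(features):
            lists[(attr, value)].add(rid)
    return lists


def desired_fraction(rids, records, desired):
    """Fraction of the given records that carry the desired label (0.0 if empty)."""
    if not rids:
        return 0.0
    hits = sum(1 for rid in rids if records[rid][1] == desired)
    return hits / len(rids)


def fill_missing(partial, records, desired):
    """Greedily choose values for None entries so matching records favor `desired`.

    Illustrative heuristic only (an assumption, not the paper's method).
    """
    lists = build_inverted_lists(records)

    # Start from the training records consistent with the known attributes.
    support = set(range(len(records)))
    for attr, value in enumerate(partial):
        if value is not None:
            support &= lists.get((attr, value), set())

    completed = list(partial)
    for attr, value in enumerate(partial):
        if value is not None:
            continue
        candidates = sorted({v for (a, v) in lists if a == attr})

        def score(v):
            matched = support & lists[(attr, v)]
            # Prefer a high share of the desired class, then larger support.
            return (desired_fraction(matched, records, desired), len(matched))

        best = max(candidates, key=score)
        completed[attr] = best
        support &= lists[(attr, best)]

    return completed, desired_fraction(support, records, desired)


# Hypothetical toy data: (income level, has guarantor) -> loan decision.
records = [
    (("high", "yes"), "approve"),
    (("high", "no"), "reject"),
    (("low", "yes"), "reject"),
    (("high", "yes"), "approve"),
    (("low", "no"), "reject"),
]

completed, confidence = fill_missing(("high", None), records, "approve")
print(completed, confidence)
```

Here each inverted list is a set of record ids, so restricting the training data to records that agree with a candidate value is a single set intersection. A real system would also need to handle numeric attributes, ties, and empty support, none of which this sketch covers.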
