Chapter 1 Data Mining Lecture Note



Introduction to Data Mining
Data Mining and Warehousing, Chapter One: Introduction

Why Data Mining?

- The explosive growth of data: from terabytes to petabytes
  - Data collection and data availability
    - Automated data collection tools, database systems, the Web, a computerized society
  - Major sources of abundant data: business, science, and society
- Computing power is available and affordable
- Commercial data mining (DM) products and machine learning algorithms are available
- Competitive pressure is very strong
  - How to gain a competitive advantage?
  - How to control a volatile market?
  - How to satisfy the needs of customers (prosumers)?
- "Necessity is the mother of invention": data mining is the automated analysis of massive data sets
- We are drowning in data, but starving for knowledge!
- We are data rich, but information poor.

What Is Data Mining?

- Data mining (knowledge discovery from data)
  - Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data
  - Data mining: a misnomer?
- Alternative names
  - Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
- What is not data mining?
  - Simple search and query processing
  - Expert systems or small statistical programs
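
The distinction between simple query processing and data mining can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy `transactions` data, the function names `simple_query` and `frequent_pairs`, and the 0.5 support threshold are hypothetical and not part of the lecture material. A query retrieves records that match a condition the user already knows; the mining step surfaces co-occurrence patterns the user did not specify in advance.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set is one customer's shopping basket.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]


def simple_query(baskets, item):
    """Not data mining: look up the baskets that contain a known item."""
    return [basket for basket in baskets if item in basket]


def frequent_pairs(baskets, min_support):
    """A very small mining step: find item pairs whose support
    (fraction of baskets containing both items) is at least min_support."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }


# The user already knows what to ask for.
matches = simple_query(transactions, "beer")

# The pairs are not specified in advance; they emerge from the data.
patterns = frequent_pairs(transactions, min_support=0.5)

if __name__ == "__main__":
    print(len(matches), "baskets contain beer")
    for pair, support in sorted(patterns.items()):
        print(pair, support)
```

Counting pairs exhaustively is only practical for tiny data sets; it is used here to keep the contrast visible, not as a recommended algorithm for "huge amounts of data".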
