Data mining, or Knowledge Discovery in Databases (KDD), is of little benefit to commercial enterprises unless it can be carried out efficiently on realistic volumes of data. Operational factors also dictate that KDD be performed within the context of a standard DBMS. Fortunately, relational DBMSs have a declarative query interface (SQL) that has allowed designers of parallel hardware to exploit data parallelism efficiently. Thus, an effective approach to efficient KDD is to arrange for KDD tasks to execute on a parallel SQL server. In this paper we devise generic KDD primitives, map them to SQL, and present some results of running these primitives on a commercially available parallel SQL server.