Knowledge discovery is the process of extracting previously unknown or hidden nuggets of information from the large volumes of data held in repositories. It uses statistical and mathematical techniques that correlate the information to find trends and patterns in the data pool, and so provides insight into the plethora of otherwise unorganized data. In knowledge discovery, raw data is first transformed into an understandable form, and trends and patterns are then recognized using various techniques and algorithms. The process includes the following operations:
1. Data Collection: Structured, semi-structured and unstructured data (web mail, phone records, audio, graphics, images etc.) is integrated from disparate sources and collected in a repository such as a database or data warehouse. This large body of data is then pre-processed to obtain the target data for reporting and analysis.
2. Data Cleaning: Real-world data is dirty: it has missing values, inconsistent entries, unclear attributes etc. Cleaning is an important part of the knowledge discovery process because it removes noise and inconsistency from the data and smooths it using techniques such as binning, regression and clustering.
3. Data Integration: It normalizes the clean data into cubes, flat files etc. so that unknown patterns can be identified from the target data.
4. Data Transformation: It is the process of mapping instances and schemas according to one-to-one or one-to-many transformation rules in order to minimize redundancy.
5. Data Categorization: Aggregation, selection and clustering techniques are used to categorize the data and reduce its volume, so that the reduced data set still holds a complete representation of the original.
6. Data Mining: It is a complete technology that uses heuristic methods to discover the trends and patterns within a cluster of data or among different clusters of data. Widely used data mining techniques include neural networks, linear regression, decision trees, Bayesian networks, support vector machines and clustering.
7. Data Evaluation and Prediction: The patterns discovered by data mining techniques are evaluated against the historical facts available in the data pool to predict "what next". Statistics make it possible to perform predictive analytics on the trends and patterns obtained from data mining.
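Three of the steps above (cleaning, mining and prediction) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names and the sample values are hypothetical and are not taken from Matrix Mapper or any other product. Cleaning is shown as filling missing values and smoothing by bin means, mining as a least-squares linear regression, and prediction as extrapolating the fitted line one step ahead.

```python
from statistics import mean


def fill_missing(values):
    """Data cleaning: replace missing readings (None) with the mean of the known ones."""
    known = [v for v in values if v is not None]
    fallback = mean(known)
    return [fallback if v is None else v for v in values]


def smooth_by_bin_means(values, bin_size):
    """Data cleaning: sort the values, split them into equal-size bins,
    and replace every value by the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        current_bin = ordered[start:start + bin_size]
        smoothed.extend([mean(current_bin)] * len(current_bin))
    return smoothed


def fit_line(xs, ys):
    """Data mining: ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(slope, intercept, x):
    """Prediction: evaluate the fitted trend at a new point ("what next")."""
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical noisy prices, smoothed in bins of three values.
    prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
    print(smooth_by_bin_means(prices, 3))  # [9, 9, 9, 22, 22, 22, 29, 29, 29]

    # Hypothetical monthly sales with one missing record.
    months = [1, 2, 3, 4, 5]
    sales = fill_missing([10, 12, None, 16, 18])
    slope, intercept = fit_line(months, sales)
    print(predict(slope, intercept, 6))  # 20.0
```

Filling a gap with the overall mean is the simplest possible choice and is used here only to keep the sketch short; a real pipeline would pick the cleaning and mining techniques to suit the data.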
APPLICATIONS: Knowledge discovery is an interdisciplinary field with widespread applications in domains such as the financial, retail, telecommunication and aviation industries, medical research, and security agencies.
Aarken Technologies, a sophisticated IT company that deals in top-of-the-line engineering solutions, proposes a tool called Matrix Mapper that uses Knowledge Discovery, Data Mining and Predictive Analytics as its major components. The Matrix Mapper (MM) is a product that would mirror the capabilities of the Analyst’s Notebook (AN), an application created by i2, an IBM subsidiary, which cannot be sold anywhere outside the US and Russia and is subject to stringent US Export Control (EC) regulations.