Data Quality Mining
We live in a world of abundant, readily accessible information. We have the technology to acquire any type of information, yet we still face the challenge of extracting the valuable knowledge it contains. Data analysis and mining can be severely impaired when data are corrupted by noise, ambiguity, and distortion. This paper aims to provide a systematic procedure for cleaning single-file data sources that lack a schema and may be affected by the most common data problems. The methodology is guided by the dimensions of data quality standards and focuses on enabling reasonable subsequent statistical analyses.