Pratheesh, R and Divya, V. (2024) Feature Extraction to Evaluate the Quality of Data Using Machine Learning Technique. In: 2024 Third International Conference on Distributed Computing and Electrical Circuits and Electronics (ICDCECE), Ballari, India.
Full text not available from this repository.

Abstract
Data quality analysis is a crucial step in ensuring the reliability and accuracy of data used in machine learning models. The proposed approach uses machine learning algorithms to identify outliers in the data. Outliers can indicate errors or anomalies that might affect the quality of the dataset. Data profiling is used to identify patterns and distributions. Missing data imputation predicts the missing values. Data validation and cleansing, data duplication detection, and inconsistency detection are considered the primary analyses to be performed. The feature extraction process reads the sample test dataset and pre-processes it by cleaning the data of duplication and inconsistency through dynamic algorithms. Principal component analysis (PCA), linear discriminant analysis (LDA), and independent component analysis (ICA) are considered here. The extracted features are then passed on for pattern validation using the Isolation Forest algorithm. The performance of the system is evaluated using accuracy, precision, and recall. The LDA model performs best, with an accuracy of 99% on the given test set.
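The cleaning and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`drop_duplicates`, `impute_mean`, `evaluate`) are hypothetical, and mean imputation is an assumption, since the abstract does not say which imputation method is used. It uses only the standard library, assumes every column has at least one known value, and omits the PCA/LDA/ICA and Isolation Forest stages.

```python
from statistics import mean


def drop_duplicates(rows):
    """Remove exact duplicate rows, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_mean(rows):
    """Replace None with the mean of the known values in the same column.

    Mean imputation is an assumed stand-in; the abstract names no method.
    """
    columns = list(zip(*rows))
    fills = [mean(v for v in col if v is not None) for col in columns]
    return [
        [fills[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


def evaluate(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, 1 = anomalous."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy data: one duplicated row and one missing value.
rows = [[1.0, 2.0], [1.0, 2.0], [3.0, None], [5.0, 6.0]]
clean = impute_mean(drop_duplicates(rows))

# Toy labels: ground truth versus the flags produced by some detector.
metrics = evaluate([1, 0, 1, 0], [1, 0, 0, 0])
```

In a fuller pipeline, the cleaned rows would feed a feature extraction step and an anomaly detector, and `evaluate` would score the detector's flags against labelled anomalies.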
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Subjects: | Computer Science Engineering > Machine Learning |
Divisions: | Computer Science |
Depositing User: | Mr IR Admin |
Date Deposited: | 08 Oct 2024 11:20 |
Last Modified: | 08 Oct 2024 11:20 |
URI: | https://ir.vistas.ac.in/id/eprint/9497 |