Applying Principal Component Analysis and Autoencoders for Dimensionality Reduction in Data Stream

Authors

  • Mayur Prakashrao Gore, Principal Software Engineer, CGI Inc, Austin, Texas
  • Amol Ashokrao Shinde, Lead Software Engineer, Mastech Digital Technologies Inc, Pittsburgh, PA, United States
  • Amit Choudhury, Department of Information Technology, Dronacharya College of Engineering, Gurgaon

Keywords:

Dimensionality Reduction, Data Streams, Principal Component Analysis (PCA), Autoencoders, Real-Time Processing, High-Dimensional Data

Abstract

This research examines the role and efficiency of Principal Component Analysis (PCA) and autoencoders, used separately and in combination, for dimensionality reduction in large-scale data. Data from sources such as IoT sensors and social media platforms is more complex than before, so effective dimensionality reduction is critical for real-time analysis. The methods are evaluated on synthetic and real datasets, and their explained variance, reconstruction error, and processing time are compared to identify the optimal configuration. The results show that PCA alone is fast on linear data, while autoencoders capture nonlinear dependencies at a slightly higher time cost. The combined PCA-autoencoder model preserves considerable variance with a reasonable reconstruction error, which makes it well suited to dynamic environments while incurring less computational expense than alternative PCA models. This work shows that suitable combinations of dimensionality reduction methods can improve real-time data stream analysis, especially in applications that demand both high accuracy and low latency.
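The three evaluation criteria named in the abstract (explained variance, reconstruction error, and processing time) can be illustrated with a minimal sketch. It covers only the PCA stage and is not the authors' implementation: the synthetic data, the choice of three components, the random seed, and the helper names `pca_fit` and `pca_reconstruct` are all assumptions made for illustration.

```python
import time
import numpy as np

def pca_fit(X, k):
    """Fit PCA via SVD of the centered data (hypothetical helper).

    Returns the feature means, the top-k principal directions, and the
    fraction of total variance those k directions explain.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = S ** 2 / (X.shape[0] - 1)
    ratio = float(variances[:k].sum() / variances.sum())
    return mean, Vt[:k], ratio

def pca_reconstruct(X, mean, components):
    """Project X onto the components and map back to the input space."""
    Z = (X - mean) @ components.T      # reduced representation
    return Z @ components + mean       # reconstruction

# Assumed synthetic data: 3 latent factors mixed linearly into 20 features,
# plus a small amount of Gaussian noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 20))
X = latent @ mixing + 0.05 * rng.normal(size=(500, 20))

start = time.perf_counter()
mean, components, explained = pca_fit(X, k=3)
X_hat = pca_reconstruct(X, mean, components)
elapsed = time.perf_counter() - start

mse = float(np.mean((X - X_hat) ** 2))  # reconstruction error
print(f"explained variance ratio: {explained:.4f}")
print(f"reconstruction MSE:       {mse:.6f}")
print(f"processing time (s):      {elapsed:.6f}")
```

In a combined setup of the kind the abstract describes, an autoencoder could be trained on the reduced representation `Z` rather than on the raw features, and the same three metrics could then be recorded for each configuration.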

Published

2024-11-03

How to Cite

Gore, M. P., Shinde, A. A., & Choudhury, A. (2024). Applying Principal Component Analysis and Autoencoders for Dimensionality Reduction in Data Stream. International Journal of Innovative Research in Engineering and Management, 11(5), 121–126. Retrieved from http://ijirem.irpublications.org/index.php/ijirem/article/view/85

Section

Articles
