Research on Real-time Multilingual Transcription and Minutes Generation for Video Conferences Based on Large Language Models

Authors

  • Gaike Wang, Department of Computer Engineering, New York University, NY, USA
  • Qiwen Zhao, Department of Computer Science, University of California San Diego, CA, USA
  • Zhongwen Zhou, Department of Computer Science, University of California San Diego, CA, USA

Keywords

Large Language Models, Multilingual Speech Recognition, Automated Minutes Generation, Real-time Video Conferencing.

Abstract

This paper presents a system for real-time multilingual transcription and minutes generation in video conferences based on Large Language Models (LLMs). The system combines speech recognition with natural language processing to address the challenges of multilingual communication in virtual meetings. Its hierarchical architecture uses transformer-based models for speech recognition and rhetorical structure modeling for automated minutes generation. The system achieves an average Word Error Rate (WER) of 4.2% across supported languages and a ROUGE-L score of 0.825 for minutes generation. Adaptive resource allocation and selective forwarding reduce bandwidth consumption by 35% while keeping processing latency under 150 milliseconds. The evaluation framework combines automated metrics with human assessment and shows robust performance across a range of operating conditions. Compared with baseline systems, experimental results show a 28% improvement in transcription accuracy and a 25% improvement in resource utilization efficiency. The system processes five major languages simultaneously while maintaining consistent performance across different meeting scenarios. The work contributes a scalable and efficient solution for real-time multilingual communication and documentation in video conferencing.
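
For readers unfamiliar with the transcription metric reported above, Word Error Rate is conventionally defined as (S + D + I) / N, where S, D, and I are the numbers of word substitutions, deletions, and insertions needed to turn the reference transcript into the system output, and N is the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not the authors' evaluation code. The function name `word_error_rate` and the whitespace tokenization are assumptions made for the example; languages without whitespace word boundaries would need a different tokenizer.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # prefix processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of five reference words.
    print(word_error_rate("the meeting starts at nine",
                          "the meeting start at nine"))
```

Under this definition, a WER of 4.2% corresponds to roughly four word-level errors per hundred reference words, averaged over the evaluated languages.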

 



Published

2024-12-02

How to Cite

Gaike Wang, Qiwen Zhao, & Zhongwen Zhou. (2024). Research on Real-time Multilingual Transcription and Minutes Generation for Video Conferences Based on Large Language Models. International Journal of Innovative Research in Engineering and Management, 11(6), 8–20. Retrieved from http://ijirem.irpublications.org/index.php/ijirem/article/view/89

