Real-time, low-audio-latency, AI-powered application architecture design
DOI: https://doi.org/10.32968/psaie.2025.1.5

Keywords: OpenAI, low-latency audio, parallel processing, WebSocket

Abstract
This paper presents the design and implementation of a mobile application that provides users with an interactive conversational experience powered by OpenAI's language model. A key feature of the application is real-time text response streaming, coupled with synchronized audio synthesis using Azure's text-to-speech (TTS) services. The architecture includes a Node.js backend server that handles OpenAI communication in streaming mode, performs sentence segmentation for response buffering, and delegates TTS conversion to a dedicated, multithreaded audio service. Parallelized WebSocket communication enables high-throughput, real-time coordination between the backend and the audio service. The paper explores the system's architecture, implementation challenges, performance evaluation, and potential applications in education, accessibility, and virtual assistants.
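To illustrate the sentence-segmentation step mentioned in the abstract, the sketch below shows one plausible way a Node.js backend could buffer streamed text deltas from OpenAI and flush complete sentences to a TTS service as soon as they close. The class name, callback, and splitting heuristic are assumptions for illustration, not the paper's actual implementation.

```javascript
// Hypothetical sentence buffer: accumulates streamed text deltas and
// invokes a callback (e.g. "send this sentence to the audio service")
// whenever a complete sentence becomes available.
class SentenceBuffer {
  constructor(onSentence) {
    this.buffer = "";
    this.onSentence = onSentence;
  }

  // Called for each streamed text chunk from the language model.
  push(delta) {
    this.buffer += delta;
    // Heuristic: a sentence ends at ., !, or ? followed by whitespace.
    let match;
    while ((match = this.buffer.match(/^(.*?[.!?])\s+/s)) !== null) {
      this.onSentence(match[1].trim());
      this.buffer = this.buffer.slice(match[0].length);
    }
  }

  // Called when the stream closes, to emit any trailing partial sentence.
  flush() {
    if (this.buffer.trim()) this.onSentence(this.buffer.trim());
    this.buffer = "";
  }
}

// Usage: chunks arrive incrementally, sentences are emitted as they complete.
const sentences = [];
const buf = new SentenceBuffer((s) => sentences.push(s));
buf.push("Hello there. How ");
buf.push("are you? I am");
buf.flush();
// sentences → ["Hello there.", "How are you?", "I am"]
```

Flushing per sentence rather than per token lets TTS synthesis start while the model is still generating, which is one way the streaming and audio pipelines described in the abstract could overlap.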