AI-based Service Development and Operations

Background

Developed a service that transcribes speech in real time, then automatically summarizes the transcript and answers questions about it.

Key Features

Provides a domain-specific speech recognition model to improve transcription performance.
Enables real-time speech transcription.
Summarizes transcription results using an LLM and answers questions about them using retrieval-augmented generation (RAG).
Extracts keywords from the transcript and visualizes them as a knowledge graph.
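
As a rough illustration of the keyword graph feature, the sketch below builds a co-occurrence graph from transcript sentences: keywords are nodes, and an edge counts how often two keywords appear in the same sentence. The stopword list, the function names, and the co-occurrence approach itself are assumptions made for this example, not the service's actual method.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "for", "on", "with"}


def extract_keywords(sentence):
    """Lowercase, strip punctuation, drop stopwords; return unique keywords in order."""
    keywords = []
    for raw in sentence.lower().split():
        word = raw.strip(".,!?;:\"'()")
        if word and word not in STOPWORDS and word not in keywords:
            keywords.append(word)
    return keywords


def build_cooccurrence_graph(sentences):
    """Map each alphabetically sorted keyword pair to its sentence co-occurrence count."""
    edges = Counter()
    for sentence in sentences:
        for a, b in combinations(sorted(extract_keywords(sentence)), 2):
            edges[(a, b)] += 1
    return edges


transcript = [
    "The model is trained with QLoRA.",
    "QLoRA reduces memory for the model.",
]
graph = build_cooccurrence_graph(transcript)
for (a, b), weight in graph.most_common(3):
    print(f"{a} -- {b} (weight {weight})")
```

The edge weights could then be passed to any graph visualization tool; a production system would more likely rely on a proper keyword extractor than on a stopword filter.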

Achievements

Achieved 452 Monthly Active Users (MAU).
Selected as an outstanding project in the Software Maestro program, leading to participation in CES, a tour of Silicon Valley companies, and mentoring with developers for about 5 weeks.
The technology used in the service was transferred to a software solutions provider.

My Role

Built a data collection pipeline based on the YouTube API for domain-specific training of the speech recognition model.
Trained a domain-specific speech recognition model using QLoRA.
Served the speech recognition model on-premises using Ray and Kubernetes to reduce costs.
Built a RAG pipeline using LlamaIndex and Pinecone.
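
The retrieval step of a RAG pipeline can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector store such as Pinecone, and all function names are hypothetical. It does not use the LlamaIndex or Pinecone APIs.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (token -> count); a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k transcript chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=2):
    """Assemble the retrieved context and the question into a prompt for an LLM."""
    context = "\n".join(retrieve(question, chunks, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "the speaker explains how qlora fine-tuning works",
    "lunch will be served at noon",
    "ray serve handles model deployment on kubernetes",
]
top = retrieve("how does qlora fine-tuning work", chunks, top_k=1)
print(build_prompt("how does qlora fine-tuning work", chunks, top_k=1))
```

In a real pipeline the prompt would be sent to the LLM, and the chunks would come from the transcription output rather than a hard-coded list.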

Project Overview

Architecture diagram of the service.