AI-based Service Development and Operations
Background
Developed a service that transcribes speech in real time, then automatically summarizes the transcript and answers questions about it.
Key Features
- Provides a domain-specific speech recognition model to improve transcription performance.
- Enables real-time speech transcription.
- Summarizes transcripts with an LLM and answers questions about them using retrieval-augmented generation (RAG).
- Extracts keywords from the transcribed audio and visualizes them as a knowledge graph.
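The keyword graph feature can be illustrated with a minimal sketch. It assumes a simple approach in which keywords that appear in the same transcript segment are linked by an edge. The stopword list, the function names, and the co-occurrence rule are illustrative assumptions, not the extraction method the service actually uses.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stopword list; a real service would likely use a proper keyword extractor.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "on", "for", "with"}


def extract_keywords(segment: str) -> set[str]:
    """Lowercase tokens, strip punctuation, and drop stopwords and very short tokens."""
    tokens = (tok.strip(".,!?;:").lower() for tok in segment.split())
    return {tok for tok in tokens if len(tok) > 2 and tok not in STOPWORDS}


def build_graph(segments: list[str]) -> Counter:
    """Count how often each keyword pair co-occurs within a single segment.

    Each key is an edge (keyword_a, keyword_b) in alphabetical order;
    each value is the edge weight.
    """
    edges: Counter = Counter()
    for segment in segments:
        keywords = sorted(extract_keywords(segment))
        edges.update(combinations(keywords, 2))
    return edges


# Two made-up transcript segments for illustration.
segments = [
    "Ray schedules the model replicas.",
    "Kubernetes hosts the Ray cluster.",
]
graph = build_graph(segments)
```

The resulting edge weights could be passed to any graph visualization tool; sorting the keywords before pairing keeps each edge in one canonical orientation.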
Achievements
- Achieved 452 Monthly Active Users (MAU).
- Selected as an outstanding project in the Software Maestro program, leading to participation in CES, a tour of Silicon Valley companies, and mentoring with developers for about 5 weeks.
- The technology used in the service was transferred to a software solutions provider.
My Role
- Built a data collection pipeline based on the YouTube API for domain-specific training of the speech recognition model.
- Trained a domain-specific speech recognition model using QLoRA.
- Served the speech recognition model on-premises using Ray and Kubernetes to reduce costs.
- Built a RAG pipeline using LlamaIndex and Pinecone.
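The retrieval step of a RAG pipeline like the one listed above can be sketched with the standard library alone. This is an illustrative stand-in, not the LlamaIndex or Pinecone API: the bag-of-words embed function replaces a real embedding model, the in-memory list replaces the vector store, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; stands in for a real embedding model."""
    return Counter(tok.strip(".,!?;:").lower() for tok in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k transcript chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, chunks: list[str], top_k: int = 2) -> str:
    """Assemble the prompt that would be sent to the LLM (the LLM call is omitted)."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, top_k))
    return f"Answer using only this transcript context:\n{context}\n\nQuestion: {question}"


# Made-up transcript chunks for illustration.
chunks = [
    "The deadline for the report is Friday.",
    "Lunch will be served at noon.",
    "The budget was approved last week.",
]
prompt = build_prompt("When is the report deadline?", chunks, top_k=1)
```

In a production setup the chunks would be embedded once and stored in a vector database, so that only the question needs to be embedded at query time.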
Architecture diagram of the service.