Easy-to-use speech toolkit including self-supervised learning models, SOTA/streaming ASR with punctuation, streaming TTS with a text frontend, a speaker verification system, end-to-end speech translation, and keyword spotting. Won the NAACL 2022 Best Demo Award.
A 1300-hour English speech and text corpus of parliamentary debates for streaming ASR training and benchmarking, speech data filtering and speech data verbatimization.
Faster-Whisper Transcription Server & API is a production-ready speech-to-text microservice stack that wraps faster-whisper with a streaming FastAPI server, a Celery/Redis background queue, and optional Docker deployment. It provides real-time or batch audio transcription with low latency and simple webhook integration.
OpenAI-compatible proxy bridging the Doubao/Volcengine ASR 2.0 (Seed-ASR) WebSocket protocol to /v1/audio/transcriptions; works with Spokenly and other OpenAI-compatible clients.
PhD Thesis: "Automatic speech recognition and machine translation with deep neural networks for open educational resources, parliamentary contents and broadcast media" (2024)
Low-latency voice AI agent platform with streaming ASR/TTS, FSM-based dialog management, and microservices architecture. Built with FastAPI, LangGraph, vLLM, and F5-TTS.