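Text-to-speech (TTS) has moved to the AIGC large-model paradigm, and zero-shot voice cloning trained on large-scale, high-quality data is becoming an increasingly popular technical trend. Zero-shot speech synthesis has broad applications in the entertainment industry, general-purpose AI platforms, and AIGC, including film and television dubbing and narration, audiobooks, character voicing, virtual anchors, and voice navigation. 晴數智慧 has proactively released the "High-Quality Dataset for Voice Replication." The data is sampled at 48 kHz, was collected from more than ten thousand speakers in clean recording environments, contains natural and diverse content, and totals roughly ten thousand hours, which makes it well suited to zero-shot speech synthesis. In particular, the diversity of the corpus content helps a model learn coarticulation.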
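Language (region): English (Hong Kong China, mainland China, India, Malaysia, Philippines, Singapore, Thailand, Turkey)
Data style: conversational and read speech
Audio format: PCM
Sampling rate: 48 kHz
Bit depth: 16-bit
Channels: 1 (mono)
Number of speakers: 17,392
Duration: 10,758 hours
Number of audio files: 5,979,570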
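As a rough sanity check on these figures, the sketch below derives a few quantities that the page does not state directly: the raw storage footprint, the average clip length, and the average amount of audio per speaker. It assumes the audio is stored as uncompressed PCM with no container overhead and that the listed totals are exact. The variable names are illustrative and are not part of any published tooling for this dataset.

```python
# Back-of-the-envelope figures derived from the published specification.
# Assumption: uncompressed PCM, no header or container overhead.

SAMPLE_RATE_HZ = 48_000   # 48 kHz sampling rate
BIT_DEPTH = 16            # 16 bits per sample
CHANNELS = 1              # mono
SPEAKERS = 17_392
TOTAL_HOURS = 10_758
CLIPS = 5_979_570

total_seconds = TOTAL_HOURS * 3600

# Bytes of raw audio produced per second of recording.
bytes_per_second = SAMPLE_RATE_HZ * (BIT_DEPTH // 8) * CHANNELS

# Raw size of the whole corpus under the assumption above.
raw_bytes = total_seconds * bytes_per_second

# Averages only; they say nothing about how clip lengths or
# per-speaker amounts are actually distributed.
avg_clip_seconds = total_seconds / CLIPS
avg_minutes_per_speaker = TOTAL_HOURS * 60 / SPEAKERS
avg_clips_per_speaker = CLIPS / SPEAKERS

print(f"raw size: {raw_bytes / 1e12:.2f} TB")
print(f"average clip length: {avg_clip_seconds:.2f} s")
print(f"average audio per speaker: {avg_minutes_per_speaker:.1f} min")
print(f"average clips per speaker: {avg_clips_per_speaker:.0f}")
```

Under these assumptions, the corpus is a few terabytes of raw audio made up of short clips of several seconds each. The actual download size depends on the file container and any compression used for delivery, neither of which is specified here.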
As a new trend in AIGC, zero-shot speech synthesis has wide-ranging applications, including voice assistants, audiobooks, video game character voices, podcast creation, and real-time voice changers. 晴數智慧 has proactively designed and developed the "High-Quality Dataset for Voice Replication," which offers a 48 kHz sampling rate and diverse content from more than 17,000 speakers, making it an excellent resource for zero-shot speech synthesis. In particular, the diverse content of this data can help a model learn coarticulation.
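Certified to ISO/IEC 27001 and ISO/IEC 27701:2019
Multimodal data: audio, text, image, and audio-video
Conversational, read, and spontaneous speech data covering multiple domains
High-precision annotation through human-machine collaboration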