Use pronunciation assessment




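Article, 11/16/2023

In this article, you learn how to evaluate pronunciation with speech to text through the Speech SDK. To get pronunciation assessment results, you apply the PronunciationAssessmentConfig settings to a SpeechRecognizer object.

Note

As a baseline, usage of pronunciation assessment costs the same as speech to text, for both pay-as-you-go and commitment tier pricing. If you purchase a commitment tier for speech to text, the spend for pronunciation assessment goes towards meeting the commitment.

For pricing differences between scripted and unscripted assessment, see the pricing note.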

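Pronunciation assessment in streaming mode

Pronunciation assessment supports uninterrupted streaming mode. With the Speech SDK, the recording time can be unlimited. As long as you don't stop recording, the evaluation process doesn't finish, and you can pause and resume the evaluation conveniently. In streaming mode, AccuracyScore, FluencyScore, ProsodyScore, and CompletenessScore vary over time throughout the recording and evaluation process.

For how to use pronunciation assessment in streaming mode in your own application, see the sample code.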
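Because the scores change as more audio is recognized, an application usually keeps a running summary. The following is a minimal sketch of one way to do that, assuming each recognized segment yields an accuracy score, a fluency score, and a word count. The SegmentScore and RunningAssessment names are hypothetical, and the word-count weighting is an illustrative choice, not the service's own aggregation.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentScore:
    """Scores for one recognized segment (hypothetical container)."""
    accuracy: float
    fluency: float
    word_count: int


@dataclass
class RunningAssessment:
    """Keeps word-weighted running averages across streamed segments."""
    segments: list = field(default_factory=list)

    def add(self, segment: SegmentScore) -> None:
        self.segments.append(segment)

    def _weighted(self, attr: str) -> float:
        # Weight each segment by its word count (an assumption of this sketch).
        total_words = sum(s.word_count for s in self.segments)
        if total_words == 0:
            return 0.0
        weighted_sum = sum(getattr(s, attr) * s.word_count for s in self.segments)
        return weighted_sum / total_words

    @property
    def accuracy(self) -> float:
        return self._weighted("accuracy")

    @property
    def fluency(self) -> float:
        return self._weighted("fluency")


# Made-up values standing in for two recognized segments.
running = RunningAssessment()
running.add(SegmentScore(accuracy=90.0, fluency=80.0, word_count=4))
running.add(SegmentScore(accuracy=70.0, fluency=100.0, word_count=6))
print(running.accuracy, running.fluency)
```

In a real application, add would be called from the recognizer's event handler each time a segment is recognized.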

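Configuration parameters

Note

Pronunciation assessment isn't available with the Speech SDK for Go. You can read about the concepts in this guide, but you must select another programming language for implementation details.

In the SpeechRecognizer, you can specify the language that you're learning or practicing to improve pronunciation. The default locale is en-US if not otherwise specified. To learn how to specify the learning language for pronunciation assessment in your own application, see the sample code.

Tip

If you aren't sure which locale to set when a language has multiple locales (such as Spanish), try each locale (such as es-ES and es-MX) separately. Evaluate the results to determine which locale scores higher for your specific scenario.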
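The comparison described in the tip can be as simple as the following sketch. It assumes you already ran the same recording against each locale and collected one overall score per locale; the best_locale helper and the score values are hypothetical.

```python
def best_locale(scores_by_locale):
    """Return the locale with the highest score; ties keep the first one seen."""
    if not scores_by_locale:
        raise ValueError("no locale scores provided")
    return max(scores_by_locale, key=scores_by_locale.get)


# Made-up overall scores from two trial runs of the same recording.
trial_scores = {"es-ES": 82.5, "es-MX": 88.0}
chosen = best_locale(trial_scores)
print(chosen)
```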

You must create a PronunciationAssessmentConfig object. To enable prosody assessment, you configure it on the PronunciationAssessmentConfig object. This feature assesses aspects like stress, intonation, speaking speed, and rhythm, which helps you understand the naturalness and expressiveness of the speech. For content assessment (part of the unscripted assessment for the speaking language learning scenario), you also configure the PronunciationAssessmentConfig object. By providing a topic description, you can enhance the assessment's understanding of the specific topic being spoken about, which results in more precise content assessment scores.

C#:

var pronunciationAssessmentConfig = new PronunciationAssessmentConfig(
    referenceText: "",
    gradingSystem: GradingSystem.HundredMark,
    granularity: Granularity.Phoneme,
    enableMiscue: false);
pronunciationAssessmentConfig.EnableProsodyAssessment();
pronunciationAssessmentConfig.EnableContentAssessmentWithTopic("greeting");

C++:

auto pronunciationConfig = PronunciationAssessmentConfig::Create(
    "",
    PronunciationAssessmentGradingSystem::HundredMark,
    PronunciationAssessmentGranularity::Phoneme,
    false);
pronunciationConfig->EnableProsodyAssessment();
pronunciationConfig->EnableContentAssessmentWithTopic("greeting");

Java:

PronunciationAssessmentConfig pronunciationConfig = new PronunciationAssessmentConfig(
    "",
    PronunciationAssessmentGradingSystem.HundredMark,
    PronunciationAssessmentGranularity.Phoneme,
    false);
pronunciationConfig.enableProsodyAssessment();
pronunciationConfig.enableContentAssessmentWithTopic("greeting");

Python:

pronunciation_config = speechsdk.PronunciationAssessmentConfig(
    reference_text="",
    grading_system=speechsdk.PronunciationAssessmentGradingSystem.HundredMark,
    granularity=speechsdk.PronunciationAssessmentGranularity.Phoneme,
    enable_miscue=False)
pronunciation_config.enable_prosody_assessment()
pronunciation_config.enable_content_assessment_with_topic("greeting")

JavaScript:

// JavaScript has no named arguments, so the values are passed positionally:
// referenceText, gradingSystem, granularity, enableMiscue.
var pronunciationAssessmentConfig = new sdk.PronunciationAssessmentConfig(
    "",
    sdk.PronunciationAssessmentGradingSystem.HundredMark,
    sdk.PronunciationAssessmentGranularity.Phoneme,
    false);
pronunciationAssessmentConfig.enableProsodyAssessment = true;
pronunciationAssessmentConfig.enableContentAssessmentWithTopic("greeting");

Objective-C:

SPXPronunciationAssessmentConfiguration *pronunciationConfig =
    [[SPXPronunciationAssessmentConfiguration alloc] init:@""
                                           gradingSystem:SPXPronunciationAssessmentGradingSystem_HundredMark
                                             granularity:SPXPronunciationAssessmentGranularity_Phoneme
                                            enableMiscue:false];
[pronunciationConfig enableProsodyAssessment];
[pronunciationConfig enableContentAssessmentWithTopic:@"greeting"];

Swift:

let pronAssessmentConfig = try! SPXPronunciationAssessmentConfiguration(
    "",
    gradingSystem: .hundredMark,
    granularity: .phoneme,
    enableMiscue: false)
pronAssessmentConfig.enableProsodyAssessment()
pronAssessmentConfig.enableContentAssessment(withTopic: "greeting")

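The following table lists some of the key configuration parameters for pronunciation assessment.

Parameter | Description
ReferenceText | The text that the pronunciation is evaluated against. The ReferenceText parameter is optional. Set the reference text if you want to run a scripted assessment for the reading language learning scenario. Don't set the reference text if you want to run an unscripted assessment for the speaking language learning scenario. For pricing differences between scripted and unscripted assessment, see the pricing note.
GradingSystem | The point system for score calibration. The FivePoint system gives a 0-5 floating point score, and HundredMark gives a 0-100 floating point score. Default: FivePoint.
Granularity | Determines the lowest level of evaluation granularity. Scores for levels greater than or equal to the minimal value are returned. Accepted values are Phoneme (shows the score on the full text, word, syllable, and phoneme level), Syllable (shows the score on the full text, word, and syllable level), Word (shows the score on the full text and word level), or FullText (shows the score on the full text level only). The provided full reference text can be a word, sentence, or paragraph, depending on your input reference text. Default: Phoneme.
EnableMiscue | Enables miscue calculation when the pronounced words are compared to the reference text. Enabling miscue is optional. If this value is True, the ErrorType result value can be set to Omission or Insertion based on the comparison. Accepted values are False and True. Default: False. To enable miscue calculation, set EnableMiscue to True. For reference, see the configuration code snippets in this section.
ScenarioId | A GUID that indicates a customized point system.

Get pronunciation assessment results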

When speech is recognized, you can request the pronunciation assessment results as SDK objects or as a JSON string.

C#:

using (var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig))
{
    pronunciationAssessmentConfig.ApplyTo(speechRecognizer);
    var speechRecognitionResult = await speechRecognizer.RecognizeOnceAsync();

    // The pronunciation assessment result as a Speech SDK object
    var pronunciationAssessmentResult =
        PronunciationAssessmentResult.FromResult(speechRecognitionResult);

    // The pronunciation assessment result as a JSON string
    var pronunciationAssessmentResultJson = speechRecognitionResult.Properties.GetProperty(
        PropertyId.SpeechServiceResponse_JsonResult);
}

C++:

Word, syllable, and phoneme results aren't available through SDK objects with the Speech SDK for C++. Word, syllable, and phoneme results are only available in the JSON string.

auto speechRecognizer = SpeechRecognizer::FromConfig(speechConfig, audioConfig);
pronunciationAssessmentConfig->ApplyTo(speechRecognizer);
auto speechRecognitionResult = speechRecognizer->RecognizeOnceAsync().get();

// The pronunciation assessment result as a Speech SDK object
auto pronunciationAssessmentResult =
    PronunciationAssessmentResult::FromResult(speechRecognitionResult);

// The pronunciation assessment result as a JSON string
auto pronunciationAssessmentResultJson = speechRecognitionResult->Properties.GetProperty(
    PropertyId::SpeechServiceResponse_JsonResult);

Java:

For Android application development, the word, syllable, and phoneme results are available through SDK objects with the Speech SDK for Java. The results are also available in the JSON string. For Java Runtime (JRE) application development, the word, syllable, and phoneme results are only available in the JSON string.

SpeechRecognizer speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);
pronunciationAssessmentConfig.applyTo(speechRecognizer);
Future<SpeechRecognitionResult> future = speechRecognizer.recognizeOnceAsync();
SpeechRecognitionResult speechRecognitionResult = future.get(30, TimeUnit.SECONDS);

// The pronunciation assessment result as a Speech SDK object
PronunciationAssessmentResult pronunciationAssessmentResult =
    PronunciationAssessmentResult.fromResult(speechRecognitionResult);

// The pronunciation assessment result as a JSON string
String pronunciationAssessmentResultJson = speechRecognitionResult.getProperties().getProperty(
    PropertyId.SpeechServiceResponse_JsonResult);

// Release the SDK resources
speechRecognizer.close();
speechConfig.close();
audioConfig.close();
pronunciationAssessmentConfig.close();
speechRecognitionResult.close();

JavaScript:

var speechRecognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);
pronunciationAssessmentConfig.applyTo(speechRecognizer);

speechRecognizer.recognizeOnceAsync((speechRecognitionResult) => {
    // The pronunciation assessment result as a Speech SDK object
    var pronunciationAssessmentResult =
        SpeechSDK.PronunciationAssessmentResult.fromResult(speechRecognitionResult);

    // The pronunciation assessment result as a JSON string
    var pronunciationAssessmentResultJson = speechRecognitionResult.properties.getProperty(
        SpeechSDK.PropertyId.SpeechServiceResponse_JsonResult);
}, (error) => {
    console.error(error);
});

Python:

speech_recognizer = speechsdk.SpeechRecognizer(
    speech_config=speech_config,
    audio_config=audio_config)
pronunciation_assessment_config.apply_to(speech_recognizer)
speech_recognition_result = speech_recognizer.recognize_once()

# The pronunciation assessment result as a Speech SDK object
pronunciation_assessment_result = speechsdk.PronunciationAssessmentResult(speech_recognition_result)

# The pronunciation assessment result as a JSON string
pronunciation_assessment_result_json = speech_recognition_result.properties.get(
    speechsdk.PropertyId.SpeechServiceResponse_JsonResult)

Objective-C:

SPXSpeechRecognizer *speechRecognizer =
    [[SPXSpeechRecognizer alloc] initWithSpeechConfiguration:speechConfig
                                          audioConfiguration:audioConfig];
[pronunciationAssessmentConfig applyToRecognizer:speechRecognizer];
SPXSpeechRecognitionResult *speechRecognitionResult = [speechRecognizer recognizeOnce];

// The pronunciation assessment result as a Speech SDK object
SPXPronunciationAssessmentResult *pronunciationAssessmentResult =
    [[SPXPronunciationAssessmentResult alloc] init:speechRecognitionResult];

// The pronunciation assessment result as a JSON string
NSString *pronunciationAssessmentResultJson =
    [speechRecognitionResult.properties getPropertyById:SPXSpeechServiceResponseJsonResult];

Swift:

let speechRecognizer = try! SPXSpeechRecognizer(
    speechConfiguration: speechConfig,
    audioConfiguration: audioConfig)
try! pronAssessmentConfig.apply(to: speechRecognizer)
let speechRecognitionResult = try? speechRecognizer.recognizeOnce()

// The pronunciation assessment result as a Speech SDK object
let pronunciationAssessmentResult = SPXPronunciationAssessmentResult(speechRecognitionResult!)

// The pronunciation assessment result as a JSON string
let pronunciationAssessmentResultJson = speechRecognitionResult!.properties?.getPropertyBy(
    SPXPropertyId.speechServiceResponseJsonResult)

To learn how to specify the learning language for pronunciation assessment in your own application, see the sample code for your programming language.

Result parameters

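Depending on whether you use scripted or unscripted assessment, you get different pronunciation assessment results. Scripted assessment is for the reading language learning scenario, and unscripted assessment is for the speaking language learning scenario.

Note

For pricing differences between scripted and unscripted assessment, see the pricing note.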

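Scripted assessment results

The following table lists some of the key pronunciation assessment results for the scripted assessment (reading scenario), and the supported granularity for each.

Parameter | Description | Granularity
AccuracyScore | Pronunciation accuracy of the speech. Accuracy indicates how closely the phonemes match a native speaker's pronunciation. Syllable, word, and full text accuracy scores are aggregated from the phoneme-level accuracy score and refined with assessment objectives. | Phoneme level, syllable level (en-US only), word level, full text level
FluencyScore | Fluency of the given speech. Fluency indicates how closely the speech matches a native speaker's use of silent breaks between words. | Full text level
CompletenessScore | Completeness of the speech, calculated as the ratio of pronounced words to the input reference text. | Full text level
ProsodyScore | Prosody of the given speech. Prosody indicates how natural the given speech is, including stress, intonation, speaking speed, and rhythm. | Full text level
PronScore | Overall score that indicates the pronunciation quality of the given speech. PronScore is aggregated from AccuracyScore, FluencyScore, and CompletenessScore with weight. | Full text level
ErrorType | Indicates whether a word is omitted, inserted, improperly inserted with a break, or missing a break at punctuation, compared to the reference text. It also indicates whether a word is badly pronounced, or whether the utterance is monotonically rising, falling, or flat. Possible values are None (no error on this word), Omission, Insertion, Mispronunciation, UnexpectedBreak, MissingBreak, and Monotone. The error type can be Mispronunciation when the pronunciation AccuracyScore for a word is below 60. | Word level

Unscripted assessment results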

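The following table lists some of the key pronunciation assessment results for the unscripted assessment (speaking scenario), and the supported granularity for each.

Note

The VocabularyScore, GrammarScore, and TopicScore parameters roll up to the combined content assessment.

Content and prosody assessments are only available in the en-US locale.

Response parameter | Description | Granularity
AccuracyScore | Pronunciation accuracy of the speech. Accuracy indicates how closely the phonemes match a native speaker's pronunciation. Syllable, word, and full text accuracy scores are aggregated from the phoneme-level accuracy score and refined with assessment objectives. | Phoneme level, syllable level (en-US only), word level, full text level
FluencyScore | Fluency of the given speech. Fluency indicates how closely the speech matches a native speaker's use of silent breaks between words. | Full text level
ProsodyScore | Prosody of the given speech. Prosody indicates how natural the given speech is, including stress, intonation, speaking speed, and rhythm. | Full text level
VocabularyScore | Proficiency in lexical usage. It evaluates the speaker's effective use of words and their appropriateness within the given context to express ideas accurately, as well as the level of lexical complexity. | Full text level
GrammarScore | Correctness in using grammar and variety of sentence patterns. Grammatical errors are jointly evaluated by lexical accuracy, grammatical accuracy, and diversity of sentence structures. | Full text level
TopicScore | Level of understanding and engagement with the topic, which provides insight into the speaker's ability to express thoughts and ideas effectively and to engage with the topic. | Full text level
PronScore | Overall score that indicates the pronunciation quality of the given speech. It's aggregated from AccuracyScore, FluencyScore, and CompletenessScore with weight. | Full text level
ErrorType | Indicates whether a word is badly pronounced, improperly inserted with a break, or missing a break at punctuation, or whether the utterance is monotonically rising, falling, or flat. Possible values are None (no error on this word), Mispronunciation, UnexpectedBreak, MissingBreak, and Monotone. | Word level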

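The following table describes the prosody assessment results in more detail:

Field | Description
ProsodyScore | Prosody score of the entire utterance.
Feedback | Feedback on the word level, including Break and Intonation.
Break ErrorTypes | Error types related to breaks, including UnexpectedBreak and MissingBreak. The current version doesn't provide the break error type. You need to set thresholds on the "UnexpectedBreak - Confidence" and "MissingBreak - Confidence" fields, respectively, to decide whether there's an unexpected break or a missing break before the word.
UnexpectedBreak | Indicates an unexpected break before the word.
MissingBreak | Indicates a missing break before the word.
Thresholds | The suggested threshold for both confidence scores is 0.75. That means, if the value of "UnexpectedBreak - Confidence" is larger than 0.75, you can decide that there's an unexpected break. If the value of "MissingBreak - Confidence" is larger than 0.75, you can decide that a break is missing. If you want variable detection sensitivity on these two breaks, assign different thresholds to the "UnexpectedBreak - Confidence" and "MissingBreak - Confidence" fields.
Intonation | Indicates intonation in speech.
ErrorTypes | Error types related to intonation. Currently only Monotone is supported. If "Monotone" exists in the "ErrorTypes" field, the utterance is detected as monotonic. Monotone is detected on the whole utterance, but the tag is assigned to all the words. All the words in the same utterance share the same monotone detection information.
Monotone | Indicates monotonic speech.
Thresholds (Monotone Confidence) | The "Monotone - SyllablePitchDeltaConfidence" fields are reserved for user-customized monotone detection. If you're unsatisfied with the provided monotone decision, you can adjust the thresholds on these fields to customize the detection according to your preferences.

JSON result example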

The scripted pronunciation assessment results for the spoken word "hello" are shown as a JSON string in the following example. Here's what you should know:

- The phoneme alphabet is IPA.
- The syllables are returned alongside the phonemes for the same word.
- You can use the Offset and Duration values to align syllables with their corresponding phonemes. For example, the starting offset (11700000) of the second syllable ("loʊ") aligns with the third phoneme ("l"). The offset represents the time at which the recognized speech begins in the audio stream, measured in 100-nanosecond units. To learn more about Offset and Duration, see response properties.
- There are five NBestPhonemes, corresponding to the number of spoken phonemes requested.
- Within Phonemes, the most likely spoken phoneme was "ə" instead of the expected phoneme "ɛ". The expected phoneme "ɛ" only received a confidence score of 47. Other potential matches received confidence scores of 52, 17, and 2.

{
  "Id": "bbb42ea51bdb46d19a1d685e635fe173",
  "RecognitionStatus": 0,
  "Offset": 7500000,
  "Duration": 13800000,
  "DisplayText": "Hello.",
  "NBest": [
    {
      "Confidence": 0.975003,
      "Lexical": "hello",
      "ITN": "hello",
      "MaskedITN": "hello",
      "Display": "Hello.",
      "PronunciationAssessment": { "AccuracyScore": 100, "FluencyScore": 100, "CompletenessScore": 100, "PronScore": 100 },
      "Words": [
        {
          "Word": "hello",
          "Offset": 7500000,
          "Duration": 13800000,
          "PronunciationAssessment": { "AccuracyScore": 99.0, "ErrorType": "None" },
          "Syllables": [
            { "Syllable": "hɛ", "PronunciationAssessment": { "AccuracyScore": 91.0 }, "Offset": 7500000, "Duration": 4100000 },
            { "Syllable": "loʊ", "PronunciationAssessment": { "AccuracyScore": 100.0 }, "Offset": 11700000, "Duration": 9600000 }
          ],
          "Phonemes": [
            {
              "Phoneme": "h",
              "PronunciationAssessment": {
                "AccuracyScore": 98.0,
                "NBestPhonemes": [
                  { "Phoneme": "h", "Score": 100.0 },
                  { "Phoneme": "oʊ", "Score": 52.0 },
                  { "Phoneme": "ə", "Score": 35.0 },
                  { "Phoneme": "k", "Score": 23.0 },
                  { "Phoneme": "æ", "Score": 20.0 }
                ]
              },
              "Offset": 7500000,
              "Duration": 3500000
            },
            {
              "Phoneme": "ɛ",
              "PronunciationAssessment": {
                "AccuracyScore": 47.0,
                "NBestPhonemes": [
                  { "Phoneme": "ə", "Score": 100.0 },
                  { "Phoneme": "l", "Score": 52.0 },
                  { "Phoneme": "ɛ", "Score": 47.0 },
                  { "Phoneme": "h", "Score": 17.0 },
                  { "Phoneme": "æ", "Score": 2.0 }
                ]
              },
              "Offset": 11100000,
              "Duration": 500000
            },
            {
              "Phoneme": "l",
              "PronunciationAssessment": {
                "AccuracyScore": 100.0,
                "NBestPhonemes": [
                  { "Phoneme": "l", "Score": 100.0 },
                  { "Phoneme": "oʊ", "Score": 46.0 },
                  { "Phoneme": "ə", "Score": 5.0 },
                  { "Phoneme": "ɛ", "Score": 3.0 },
                  { "Phoneme": "u", "Score": 1.0 }
                ]
              },
              "Offset": 11700000,
              "Duration": 1100000
            },
            {
              "Phoneme": "oʊ",
              "PronunciationAssessment": {
                "AccuracyScore": 100.0,
                "NBestPhonemes": [
                  { "Phoneme": "oʊ", "Score": 100.0 },
                  { "Phoneme": "d", "Score": 29.0 },
                  { "Phoneme": "t", "Score": 24.0 },
                  { "Phoneme": "n", "Score": 22.0 },
                  { "Phoneme": "l", "Score": 18.0 }
                ]
              },
              "Offset": 12900000,
              "Duration": 8400000
            }
          ]
        }
      ]
    }
  ]
}
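The Offset and Duration alignment in the example above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard json module on a trimmed copy of the "hello" word entry; the align_phonemes helper is hypothetical and assumes a phoneme belongs to the syllable whose time range contains the phoneme's starting offset.

```python
import json

TICKS_PER_SECOND = 10_000_000  # Offset and Duration use 100-nanosecond units

# Word entry trimmed from the JSON example above (scores omitted for brevity).
WORD_JSON = """
{
  "Word": "hello",
  "Syllables": [
    {"Syllable": "hɛ", "Offset": 7500000, "Duration": 4100000},
    {"Syllable": "loʊ", "Offset": 11700000, "Duration": 9600000}
  ],
  "Phonemes": [
    {"Phoneme": "h", "Offset": 7500000, "Duration": 3500000},
    {"Phoneme": "ɛ", "Offset": 11100000, "Duration": 500000},
    {"Phoneme": "l", "Offset": 11700000, "Duration": 1100000},
    {"Phoneme": "oʊ", "Offset": 12900000, "Duration": 8400000}
  ]
}
"""


def align_phonemes(word):
    """Map each syllable to the phonemes whose start offset falls inside it."""
    aligned = {}
    for syllable in word["Syllables"]:
        start = syllable["Offset"]
        end = start + syllable["Duration"]
        aligned[syllable["Syllable"]] = [
            p["Phoneme"] for p in word["Phonemes"] if start <= p["Offset"] < end
        ]
    return aligned


word = json.loads(WORD_JSON)
alignment = align_phonemes(word)

# Convert the second syllable's offset from 100-ns ticks to seconds.
second_syllable_start = word["Syllables"][1]["Offset"] / TICKS_PER_SECOND
print(alignment, second_syllable_start)
```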

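You can get pronunciation assessment scores for:

- Full text
- Words
- Syllable groups
- Phonemes in SAPI or IPA format

Note

The syllable group, phoneme name, and spoken phoneme features of pronunciation assessment are currently only available for the en-US (United States) locale. For information about availability of pronunciation assessment, see supported languages and available regions.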

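Syllable groups

Pronunciation assessment can provide syllable-level assessment results. Grouping in syllables is more legible and aligned with speaking habits, because a word is typically pronounced syllable by syllable rather than phoneme by phoneme.

The following table compares example phonemes with the corresponding syllables.

Sample word | Phonemes | Syllables
technological | teknələdʒɪkl | tek·nə·lɑ·dʒɪkl
hello | hɛloʊ | hɛ·loʊ
luck | lʌk | lʌk
photosynthesis | foʊtəsɪnθəsɪs | foʊ·tə·sɪn·θə·sɪs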

To request syllable-level results along with phonemes, set the Granularity configuration parameter to Phoneme.
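The reason Phoneme also yields syllable scores is that each granularity value returns its own level plus every coarser level, as listed in the configuration parameters. The following sketch models that rule; the levels_returned helper is hypothetical and isn't part of the Speech SDK.

```python
# Granularity values ordered from finest to coarsest.
LEVELS = ["Phoneme", "Syllable", "Word", "FullText"]


def levels_returned(granularity):
    """Return the score levels reported for a given Granularity setting."""
    if granularity not in LEVELS:
        raise ValueError(f"unknown granularity: {granularity!r}")
    # The requested level and all coarser levels are returned.
    return LEVELS[LEVELS.index(granularity):]


print(levels_returned("Phoneme"))
print(levels_returned("Word"))
```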

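Phoneme alphabet format

For the en-US locale, the phoneme name is provided together with the score, which helps you identify which phonemes were pronounced accurately or inaccurately. For other locales, you can only get the phoneme score.

The following table compares example SAPI phonemes with the corresponding IPA phonemes.

Sample word | SAPI phonemes | IPA phonemes
hello | h eh l ow | h ɛ l oʊ
luck | l ah k | l ʌ k
photosynthesis | f ow t ax s ih n th ax s ih s | f oʊ t ə s ɪ n θ ə s ɪ s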
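To make the correspondence in the table concrete, the following sketch converts a space-separated SAPI phoneme string to IPA. The mapping covers only the symbols that appear in the table above, so it's an illustration rather than a complete SAPI-to-IPA converter; the SAPI_TO_IPA dictionary and sapi_to_ipa helper are hypothetical names.

```python
# Symbol pairs taken from the three sample words in the table above.
SAPI_TO_IPA = {
    "h": "h", "eh": "ɛ", "l": "l", "ow": "oʊ",
    "ah": "ʌ", "k": "k", "f": "f", "t": "t",
    "ax": "ə", "s": "s", "ih": "ɪ", "n": "n", "th": "θ",
}


def sapi_to_ipa(sapi_phonemes):
    """Convert space-separated SAPI phonemes to IPA.

    Raises KeyError for any symbol outside the small table above.
    """
    return " ".join(SAPI_TO_IPA[symbol] for symbol in sapi_phonemes.split())


print(sapi_to_ipa("h eh l ow"))
print(sapi_to_ipa("f ow t ax s ih n th ax s ih s"))
```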

To request IPA phonemes, set the phoneme alphabet to "IPA". If you don't specify the alphabet, the phonemes are in SAPI format by default.

C#:

pronunciationAssessmentConfig.PhonemeAlphabet = "IPA";

C++:

auto pronunciationAssessmentConfig = PronunciationAssessmentConfig::CreateFromJson(
    "{\"referenceText\":\"good morning\",\"gradingSystem\":\"HundredMark\",\"granularity\":\"Phoneme\",\"phonemeAlphabet\":\"IPA\"}");

Java:

PronunciationAssessmentConfig pronunciationAssessmentConfig = PronunciationAssessmentConfig.fromJson(
    "{\"referenceText\":\"good morning\",\"gradingSystem\":\"HundredMark\",\"granularity\":\"Phoneme\",\"phonemeAlphabet\":\"IPA\"}");

Python:

pronunciation_assessment_config = speechsdk.PronunciationAssessmentConfig(
    json_string="{\"referenceText\":\"good morning\",\"gradingSystem\":\"HundredMark\",\"granularity\":\"Phoneme\",\"phonemeAlphabet\":\"IPA\"}")

JavaScript:

var pronunciationAssessmentConfig = SpeechSDK.PronunciationAssessmentConfig.fromJSON(
    "{\"referenceText\":\"good morning\",\"gradingSystem\":\"HundredMark\",\"granularity\":\"Phoneme\",\"phonemeAlphabet\":\"IPA\"}");

Objective-C:

pronunciationAssessmentConfig.phonemeAlphabet = @"IPA";

Swift:

pronunciationAssessmentConfig?.phonemeAlphabet = "IPA"

Spoken phonemes

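With spoken phonemes, you can get confidence scores that indicate how likely the spoken phonemes matched the expected phonemes.

For example, to obtain the complete spoken sound for the word "Hello", you can concatenate the first spoken phoneme (the one with the highest confidence score) for each expected phoneme. In the following assessment result, when you speak the word "hello", the expected IPA phonemes are "h ɛ l oʊ". However, the actual spoken phonemes are "h ə l oʊ". In this example, there are five possible candidates for each expected phoneme. The assessment result shows that the most likely spoken phoneme was "ə" instead of the expected phoneme "ɛ". The expected phoneme "ɛ" only received a confidence score of 47. Other potential matches received confidence scores of 52, 17, and 2.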

{ "Id": "bbb42ea51bdb46d19a1d685e635fe173", "RecognitionStatus": 0, "Offset": 7500000, "Duration": 13800000, "DisplayText": "Hello.", "NBest": [ { "Confidence": 0.975003, "Lexical": "hello", "ITN": "hello", "MaskedITN": "hello", "Display": "Hello.", "PronunciationAssessment": { "AccuracyScore": 100, "FluencyScore": 100, "CompletenessScore": 100, "PronScore": 100 }, "Words": [ { "Word": "hello", "Offset": 7500000, "Duration": 13800000, "PronunciationAssessment": { "AccuracyScore": 99.0, "ErrorType": "None" }, "Syllables": [ { "Syllable": "hɛ", "PronunciationAssessment": { "AccuracyScore": 91.0 }, "Offset": 7500000, "Duration": 4100000 }, { "Syllable": "loʊ", "PronunciationAssessment": { "AccuracyScore": 100.0 }, "Offset": 11700000, "Duration": 9600000 } ], "Phonemes": [ { "Phoneme": "h", "PronunciationAssessment": { "AccuracyScore": 98.0, "NBestPhonemes": [ { "Phoneme": "h", "Score": 100.0 }, { "Phoneme": "oʊ", "Score": 52.0 }, { "Phoneme": "ə", "Score": 35.0 }, { "Phoneme": "k", "Score": 23.0 }, { "Phoneme": "æ", "Score": 20.0 } ] }, "Offset": 7500000, "Duration": 3500000 }, { "Phoneme": "ɛ", "PronunciationAssessment": { "AccuracyScore": 47.0, "NBestPhonemes": [ { "Phoneme": "ə", "Score": 100.0 }, { "Phoneme": "l", "Score": 52.0 }, { "Phoneme": "ɛ", "Score": 47.0 }, { "Phoneme": "h", "Score": 17.0 }, { "Phoneme": "æ", "Score": 2.0 } ] }, "Offset": 11100000, "Duration": 500000 }, { "Phoneme": "l", "PronunciationAssessment": { "AccuracyScore": 100.0, "NBestPhonemes": [ { "Phoneme": "l", "Score": 100.0 }, { "Phoneme": "oʊ", "Score": 46.0 }, { "Phoneme": "ə", "Score": 5.0 }, { "Phoneme": "ɛ", "Score": 3.0 }, { "Phoneme": "u", "Score": 1.0 } ] }, "Offset": 11700000, "Duration": 1100000 }, { "Phoneme": "oʊ", "PronunciationAssessment": { "AccuracyScore": 100.0, "NBestPhonemes": [ { "Phoneme": "oʊ", "Score": 100.0 }, { "Phoneme": "d", "Score": 29.0 }, { "Phoneme": "t", "Score": 24.0 }, { "Phoneme": "n", "Score": 22.0 }, { "Phoneme": "l", "Score": 18.0 } ] }, "Offset": 
12900000, "Duration": 8400000 } ] } ] } ] }
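The concatenation described above can be sketched as follows. The phoneme list is copied from the result above, with each NBestPhonemes list trimmed to its top candidates for brevity; the spoken_pronunciation helper is a hypothetical name, and picking the candidate with the highest Score is the only logic involved.

```python
# Expected phonemes for "hello" with their top NBestPhonemes candidates,
# copied (and trimmed) from the assessment result above.
phonemes = [
    {"Phoneme": "h", "PronunciationAssessment": {"NBestPhonemes": [
        {"Phoneme": "h", "Score": 100.0}, {"Phoneme": "oʊ", "Score": 52.0}]}},
    {"Phoneme": "ɛ", "PronunciationAssessment": {"NBestPhonemes": [
        {"Phoneme": "ə", "Score": 100.0}, {"Phoneme": "l", "Score": 52.0},
        {"Phoneme": "ɛ", "Score": 47.0}]}},
    {"Phoneme": "l", "PronunciationAssessment": {"NBestPhonemes": [
        {"Phoneme": "l", "Score": 100.0}, {"Phoneme": "oʊ", "Score": 46.0}]}},
    {"Phoneme": "oʊ", "PronunciationAssessment": {"NBestPhonemes": [
        {"Phoneme": "oʊ", "Score": 100.0}, {"Phoneme": "d", "Score": 29.0}]}},
]


def spoken_pronunciation(phoneme_results):
    """Join the highest-scoring spoken candidate for each expected phoneme."""
    best = []
    for result in phoneme_results:
        candidates = result["PronunciationAssessment"]["NBestPhonemes"]
        best.append(max(candidates, key=lambda c: c["Score"])["Phoneme"])
    return " ".join(best)


expected = " ".join(p["Phoneme"] for p in phonemes)
spoken = spoken_pronunciation(phonemes)
print("expected:", expected)
print("spoken:  ", spoken)
```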

To indicate whether, and how many, potential spoken phonemes to get confidence scores for, set the NBestPhonemeCount parameter to an integer value such as 5.
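Several of the SDKs accept these settings as a JSON string. The following sketch shows one way to assemble that string in Python with basic validation, instead of escaping it by hand. The build_config_json helper is hypothetical; the key names are the ones used in this article's JSON snippets, and the allowed values are taken from the configuration parameters described earlier.

```python
import json

GRADING_SYSTEMS = {"FivePoint", "HundredMark"}
GRANULARITIES = {"Phoneme", "Syllable", "Word", "FullText"}
PHONEME_ALPHABETS = {"SAPI", "IPA"}


def build_config_json(reference_text, grading_system="HundredMark",
                      granularity="Phoneme", phoneme_alphabet="IPA",
                      nbest_phoneme_count=5):
    """Build a pronunciation assessment configuration as a JSON string."""
    if grading_system not in GRADING_SYSTEMS:
        raise ValueError(f"unsupported grading system: {grading_system!r}")
    if granularity not in GRANULARITIES:
        raise ValueError(f"unsupported granularity: {granularity!r}")
    if phoneme_alphabet not in PHONEME_ALPHABETS:
        raise ValueError(f"unsupported phoneme alphabet: {phoneme_alphabet!r}")
    if not isinstance(nbest_phoneme_count, int) or nbest_phoneme_count < 0:
        raise ValueError("nbest_phoneme_count must be a non-negative integer")
    return json.dumps({
        "referenceText": reference_text,
        "gradingSystem": grading_system,
        "granularity": granularity,
        "phonemeAlphabet": phoneme_alphabet,
        "nBestPhonemeCount": nbest_phoneme_count,
    }, ensure_ascii=False)


config_json = build_config_json("good morning")
print(config_json)
```

The resulting string could then be passed as json_string to speechsdk.PronunciationAssessmentConfig, in the same way as the Python snippets in this article.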

C#:

pronunciationAssessmentConfig.NBestPhonemeCount = 5;

C++:

auto pronunciationAssessmentConfig = PronunciationAssessmentConfig::CreateFromJson(
    "{\"referenceText\":\"good morning\",\"gradingSystem\":\"HundredMark\",\"granularity\":\"Phoneme\",\"phonemeAlphabet\":\"IPA\",\"nBestPhonemeCount\":5}");

Java:

PronunciationAssessmentConfig pronunciationAssessmentConfig = PronunciationAssessmentConfig.fromJson(
    "{\"referenceText\":\"good morning\",\"gradingSystem\":\"HundredMark\",\"granularity\":\"Phoneme\",\"phonemeAlphabet\":\"IPA\",\"nBestPhonemeCount\":5}");

Python:

pronunciation_assessment_config = speechsdk.PronunciationAssessmentConfig(
    json_string="{\"referenceText\":\"good morning\",\"gradingSystem\":\"HundredMark\",\"granularity\":\"Phoneme\",\"phonemeAlphabet\":\"IPA\",\"nBestPhonemeCount\":5}")

JavaScript:

var pronunciationAssessmentConfig = SpeechSDK.PronunciationAssessmentConfig.fromJSON(
    "{\"referenceText\":\"good morning\",\"gradingSystem\":\"HundredMark\",\"granularity\":\"Phoneme\",\"phonemeAlphabet\":\"IPA\",\"nBestPhonemeCount\":5}");

Objective-C:

pronunciationAssessmentConfig.nbestPhonemeCount = 5;

Swift:

pronunciationAssessmentConfig?.nbestPhonemeCount = 5

Next steps

- Learn about our quality benchmark.
- Try pronunciation assessment in Speech Studio.
- Check out an easy-to-deploy pronunciation assessment demo and watch the video demo of pronunciation assessment.
