I’ve been integrating the Microsoft Cognitive Services Text-to-speech thingy, to make a little web API that can speak Vietnamese.
Take a look at this sample code for using the microsoft-cognitiveservices-speech-sdk npm package (I’ve snipped some stuff for brevity):
var sdk = require("microsoft-cognitiveservices-speech-sdk");

// This example requires environment variables named "SPEECH_KEY" and "SPEECH_REGION"
var audioFile = "audio.wav";
const speechConfig = sdk.SpeechConfig.fromSubscription(process.env.SPEECH_KEY, process.env.SPEECH_REGION);
const audioConfig = sdk.AudioConfig.fromAudioFileOutput(audioFile);

// The language of the voice that speaks.
speechConfig.speechSynthesisVoiceName = "en-US-JennyNeural";

// Create the speech synthesizer.
var synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);
var text = "You have selected Microsoft Sam as the Computer's default voice.";

// Start the synthesizer and wait for a result.
synthesizer.speakTextAsync(text,
  function (result) {
    if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
      console.log("Synthesis finished.");
    } else {
      console.error("Speech synthesis canceled, " + result.errorDetails);
    }
    synthesizer.close();
    synthesizer = null;
  },
  function (err) {
    console.trace("err - " + err);
    synthesizer.close();
    synthesizer = null;
  }
);
console.log("Now synthesizing to: " + audioFile);
The problem
The example synthesizer object only produces values in its callbacks. If you wrap this code sample in a speak(text) function, then it will return before the values you want are “ready”. You can’t directly return the outcome; you’d have to either match its old-school vibes and pass a cb callback argument, or if you were hoisting it into express, perhaps you’d pass in the express response object so that your speak function can directly return a web response.
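To see the timing problem in isolation, here's a tiny sketch that swaps the SDK for a hypothetical `fakeSynthesize` built on `setTimeout` (my stand-in, not part of the speech SDK). The naive wrapper returns before the callback has fired:

```javascript
// Hypothetical stand-in for the SDK: hands its result to a callback later.
function fakeSynthesize(text, onSuccess) {
  setTimeout(() => onSuccess({ resultId: "abc123", text }), 10);
}

// Naive wrapper: tries to return the value produced inside the callback.
function speakNaive(text) {
  let outcome; // still undefined at the moment we return
  fakeSynthesize(text, (result) => {
    outcome = result; // runs later, after speakNaive has already returned
  });
  return outcome;
}

const returned = speakNaive("xin chào");
console.log(returned); // undefined
```

The callback does eventually run, but by then nobody is listening for the return value.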
Rubbish solutions!!
Obviously we want to wrap it in a Promise so that we can provide a dead simple awaitable interface for an async route handler to use:
app.get("/create/:text", async (req, res) => {
  const result = await speak(req.params.text);
  return res.json(result);
});
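One wrinkle with that handler: if the awaited promise rejects, the `await` throws, and as far as I know Express 4 doesn't catch rejections from async handlers on its own. Here's a sketch of one way to deal with it. It assumes a hypothetical `speakStub` in place of the real `speak`, and a hand-rolled `res` object, so it runs without Express or a speech key:

```javascript
// Hypothetical stub: rejects on empty text, resolves otherwise.
const speakStub = (text) =>
  text
    ? Promise.resolve({ Status: "OK", Text: text })
    : Promise.reject({ Status: "Error", Error: "No text" });

// The route handler pulled out into a named function, with a try/catch added.
const createHandler = async (req, res) => {
  try {
    const result = await speakStub(req.params.text);
    return res.json(result);
  } catch (error) {
    // A rejected promise lands here as a thrown value.
    return res.status(500).json(error);
  }
};

// Minimal fake of the bits of `res` the handler uses.
const makeRes = () => ({
  statusCode: 200,
  body: null,
  status(code) { this.statusCode = code; return this; },
  json(payload) { this.body = payload; return this; },
});

const res = makeRes();
createHandler({ params: { text: "" } }, res).then(() => {
  console.log(res.statusCode, res.body.Status); // 500 Error
});
```

In a real app you'd register it with `app.get("/create/:text", createHandler)`.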
Trading callbacks for a Promise
But how do we get a Promise out of this mess?
- Wrap the insides of the new function in return new Promise((resolve, reject) => { ... })
- Anywhere you want to return a success, invoke resolve(someResponseObject)
- Anywhere you want to handle an error, invoke reject(someErrorObject)
That’s it! Your new method is then-able and async!
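Here's the recipe on a toy example first, as a sketch rather than the real thing. `fakeSpeakTextAsync` is a hypothetical stand-in that mimics the two-callback shape of `speakTextAsync` (text, success callback, error callback), so it runs without a speech key:

```javascript
// Hypothetical stand-in with the same shape as speakTextAsync:
// (text, successCallback, errorCallback).
function fakeSpeakTextAsync(text, onSuccess, onError) {
  setTimeout(() => {
    if (typeof text !== "string" || text.length === 0) {
      onError("No text to speak");
    } else {
      onSuccess({ resultId: "abc123" });
    }
  }, 10);
}

// Wrap in a Promise, resolve on success, reject on error.
const speakToy = (text) => {
  return new Promise((resolve, reject) => {
    fakeSpeakTextAsync(
      text,
      (result) => resolve({ Status: "OK", Text: text, ResultID: result.resultId }),
      (err) => reject({ Status: "Error", Error: err })
    );
  });
};

// Then-able:
speakToy("xin chào").then((r) => console.log(r.Status)); // OK

// ...and awaitable, with rejections surfacing as exceptions:
(async () => {
  try {
    await speakToy("");
  } catch (e) {
    console.log(e.Status); // Error
  }
})();
```

And now the same thing with the actual SDK: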
import * as sdk from "microsoft-cognitiveservices-speech-sdk";

export const speak = (text) => {
  return new Promise((resolve, reject) => {
    const audioFileName = "audio.wav";

    // Bail out early if the credentials are missing.
    if (!process.env.SPEECH_KEY || !process.env.SPEECH_REGION) {
      const resultModel = {
        "Status": "Error",
        "Error": "Unable to connect to speech server"
      };
      reject(resultModel);
      return; // without this, we'd carry on into the try block below
    }

    try {
      // Connect SDK
      const speechConfig = sdk.SpeechConfig.fromSubscription(
        process.env.SPEECH_KEY,
        process.env.SPEECH_REGION
      );
      const audioConfig = sdk.AudioConfig.fromAudioFileOutput(audioFileName);
      speechConfig.speechSynthesisVoiceName = "en-US-JennyNeural";

      // Create the speech synthesizer.
      var synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);

      synthesizer.speakTextAsync(
        text,
        function (result) {
          synthesizer.close();
          synthesizer = null;

          // As in the sample above, this callback can also report a canceled synthesis.
          if (result.reason !== sdk.ResultReason.SynthesizingAudioCompleted) {
            reject({
              "Status": "Error",
              "Error": result.errorDetails
            });
            return;
          }

          const resultModel = {
            "Status": "OK",
            "Text": text,
            "Audio": audioFileName,
            "ResultID": result.resultId
          };
          resolve(resultModel);
        },
        function (err) {
          synthesizer.close();
          synthesizer = null;
          const resultModel = {
            "Status": "Error",
            "Error": err
          };
          reject(resultModel);
        }
      );
    } catch (error) {
      // Anything thrown synchronously (e.g. while building the config) ends up here.
      const resultModel = {
        "Status": "Error",
        "Error": error
      };
      reject(resultModel);
    }
  });
};
Isn’t that cool?
As usual, I basically got this all from the great Promise examples on MDN.
I wish you growth and harmony this summertime 🌼