Using the Deepgram API typically requires you to maintain a server, but I have made this project totally serverless. Read on to learn more.
Here's a brief demo of the entire app in action:
Submission Category:
Analytics Ambassadors
Link to Code on GitHub
The entire app is open sourced - try it out and also feel free to contribute to this project 😉 :
A cross-platform mobile app that helps you to generate transcripts either from a voice recording or by uploading an audio file. The project uses a totally serverless architecture.
Generate transcript from audio recording & audio file using Deepgram API.
Cloud-sync for syncing across multiple devices using the same account.
Transcript confidence map view.
Export as PDF and share with anyone.
Architecture
I'm using a totally serverless architecture for this project 🤯. Let's have a look at how it works:
The mobile app is created using Flutter which is integrated with Firebase. I have used Firebase Cloud Functions to deploy the backend code required for communicating with the Deepgram API.
I have deployed the following function to Firebase:
    const functions = require("firebase-functions");
    const { Deepgram } = require("@deepgram/sdk");

    exports.getTranscription = functions.https.onCall(async (data, context) => {
      try {
        const deepgram = new Deepgram(process.env.DEEPGRAM_API_KEY);
        const audioSource = {
          url: data.url,
        };

        const response = await deepgram.transcription.preRecorded(audioSource, {
          punctuate: true,
          utterances: true,
        });

        console.log(response.results.utterances.length);

        const confidenceList = [];
        for (let i = 0; i < response.results.utterances.length; i++) {
          confidenceList.push(response.results.utterances[i].confidence);
        }

        const webvttTranscript = response.toWebVTT();

        const finalTranscript = {
          transcript: webvttTranscript,
          confidences: confidenceList,
        };

        const finalTranscriptJSON = JSON.stringify(finalTranscript);
        console.log(finalTranscriptJSON);

        return finalTranscriptJSON;
      } catch (error) {
        console.error(`Unable to transcribe. Error ${error}`);
        throw new functions.https.HttpsError("aborted", "Could not transcribe");
      }
    });
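The getTranscription function takes an audio URL, generates the transcript using the Deepgram API along with the confidence of each utterance, and returns the data as a JSON string with two fields, transcript (the transcript in WebVTT format) and confidences (the list of utterance confidences), which can be parsed within the app.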
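To make the returned format concrete, here is a minimal sketch of how a client could unpack the string that getTranscription returns. The field names transcript and confidences come from the function above; the helper name parseTranscriptionResult and the sample payload are hypothetical and only illustrate the shape of the data, not the app's actual parsing code.

```javascript
// Hypothetical helper: unpack the JSON string returned by getTranscription.
function parseTranscriptionResult(jsonString) {
  const parsed = JSON.parse(jsonString);
  // Validate the two fields the Cloud Function is known to produce.
  if (typeof parsed.transcript !== "string" || !Array.isArray(parsed.confidences)) {
    throw new TypeError("Unexpected getTranscription payload");
  }
  return { transcript: parsed.transcript, confidences: parsed.confidences };
}

// Made-up sample payload with the same shape as the function's output.
const samplePayload = JSON.stringify({
  transcript: "WEBVTT\n\n00:00:00.000 --> 00:00:02.500\nHello there.\n",
  confidences: [0.98],
});

const parsedResult = parseTranscriptionResult(samplePayload);
console.log(parsedResult.confidences.length);
```

Because the function returns a string rather than an object, the client has to decode it once before using the fields.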
App screens
The Flutter application consists of the following pages/screens:
Login Page
Register Page
Dashboard Page
Record Page
Upload Page
Transcription Page
The Login and Register pages authenticate the user inside the app. Authentication gives each user a unique account, which is required for storing the generated transcripts in Firestore and for cloud-sync.
The Dashboard Page displays a list of all the transcripts currently present on the user's account. It also has two buttons - one for navigating to the Record Page and the other for navigating to the Upload Page.
The Record Page lets you record audio using the device microphone and then transcribe it using Deepgram. You can always re-record if you think the last recording wasn't good.
From the Upload Page, you can choose any audio file present on your device and generate a transcript from it.
The Transcription Page is where the entire transcript can be viewed. It has synchronized audio-transcript playback that highlights the part of the transcript matching the audio that is currently playing.
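Since the Cloud Function returns the transcript as WebVTT, one way to drive this kind of highlighting is to parse the cues and look up the one that covers the current playback position. The sketch below is written in JavaScript to match the function above and is not the app's Flutter code; the names vttTimeToSeconds, parseWebVtt, and activeCueIndex are hypothetical, and the sample text is made up. It assumes timestamps of the form HH:MM:SS.mmm or MM:SS.mmm.

```javascript
// Convert a WebVTT timestamp ("HH:MM:SS.mmm" or "MM:SS.mmm") to seconds.
function vttTimeToSeconds(timestamp) {
  const parts = timestamp.trim().split(":").map(Number);
  // Fold hours, minutes, seconds left to right: ((h * 60) + m) * 60 + s.
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Extract cues ({ start, end, text }) from WebVTT text; blocks without timing are skipped.
function parseWebVtt(vtt) {
  const cues = [];
  const blocks = vtt.replace(/\r\n/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n").filter((line) => line.length > 0);
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // header or NOTE block
    const [startRaw, endRaw] = lines[timingIndex].split("-->");
    cues.push({
      start: vttTimeToSeconds(startRaw),
      // Ignore any cue settings that follow the end timestamp.
      end: vttTimeToSeconds(endRaw.trim().split(/\s+/)[0]),
      text: lines.slice(timingIndex + 1).join("\n").trim(),
    });
  }
  return cues;
}

// Index of the cue covering positionSeconds, or -1 if no cue is active.
function activeCueIndex(cues, positionSeconds) {
  return cues.findIndex(
    (cue) => positionSeconds >= cue.start && positionSeconds < cue.end
  );
}

// Made-up sample transcript for illustration.
const sampleVtt = [
  "WEBVTT",
  "",
  "00:00:00.000 --> 00:00:02.500",
  "Hello there.",
  "",
  "00:00:02.500 --> 00:00:05.000",
  "This is a test.",
].join("\n");

const cues = parseWebVtt(sampleVtt);
console.log(activeCueIndex(cues, 3.1)); // index of the cue to highlight
```

A player would call activeCueIndex on every position update and restyle only the cue whose index changed.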
You can also see a confidence map for each part of the transcript, which shows how accurate the transcription of that part is estimated to be (darker means higher confidence).
You can also easily print the generated transcript or share it as a PDF.
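The "darker is higher confidence" idea can be sketched as a simple linear mapping from a confidence value to a shade. This is only an illustration in JavaScript, not the app's actual rendering code: the function names, the 0.25 to 1.0 opacity range, and the sample values are all assumptions.

```javascript
// Map a confidence in [0, 1] to an opacity: higher confidence gives a darker shade.
// The default opacity range is an assumption chosen for illustration.
function confidenceToOpacity(confidence, minOpacity = 0.25, maxOpacity = 1.0) {
  const clamped = Math.min(1, Math.max(0, confidence));
  return minOpacity + clamped * (maxOpacity - minOpacity);
}

// Pair each transcript part with its shade; a missing confidence is treated as 0.
function buildConfidenceMap(texts, confidences) {
  return texts.map((text, i) => ({
    text,
    opacity: confidenceToOpacity(confidences[i] ?? 0),
  }));
}

// Made-up sample values.
const confidenceMap = buildConfidenceMap(
  ["Hello there.", "This is a test."],
  [1.0, 0.5]
);
console.log(confidenceMap);
```

Clamping keeps out-of-range values from producing invalid opacities, and the non-zero minimum keeps low-confidence parts visible.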
Deepgram
Overview of my Deepgram dashboard (completed the mission, Get a Transcript via API or SDK):
Usage analytics of the Deepgram API:
Log of one of the API calls for transcribing from audio: