This video is a technical deep dive on the demo presented in https://youtu.be/I_hqzdqQ5vE, where I run multilingual voice queries on financial documents, using two state-of-the-art Transformer models for speech-to-text and semantic search, in fewer than 100 lines of Python. It covers:
— Dataset preparation, based on SEC filings downloaded with an AWS SDK. See https://youtu.be/SU1L6f0N6iw for details.
— Experimentation in a Jupyter notebook, available at https://gitlab.com/juliensimon/huggingface-demos/-/tree/main/voice-queries
— Implementation of a Hugging Face Spaces application, available at https://huggingface.co/spaces/juliensimon/voice-queries