In this video tutorial we will go over how to do client-side inferencing in the browser with ONNX Runtime Web. Below is a video on how to understand and use a QuickStart template to start building out a static web app with an open source computer vision model. You can also find a written step-by-step tutorial in the onnxruntime.ai docs here. First, let's learn a bit more about the library, ONNX Runtime (ORT), which lets us run inference in many different languages.
What are ORT and ORT-Web?
ONNX Runtime (ORT)
is a library to optimize and accelerate machine learning inferencing. It has cross-platform support, so you can train a model in Python and deploy it with C#, Java, JavaScript, Python, and more. Check out all the supported platforms, architectures, and APIs here.
ONNX Runtime Web (ORT-Web)
enables JavaScript developers to run and deploy machine learning models client-side. With ORT-Web you can choose either the WebGL backend for GPU processing or the WebAssembly (WASM) backend for CPU processing. If you want to do server-side inferencing in JavaScript with Node.js, check out the onnxruntime-node library.
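To make the backend choice concrete, here is a minimal sketch of loading a model and running inference with ORT-Web. The model path, input name (`input`), and tensor shape are placeholders, not values from the template; use whatever your exported model expects.

```js
import * as ort from 'onnxruntime-web';

async function run() {
  // Create an inference session; swap 'wasm' for 'webgl' to use the GPU backend.
  // './model.onnx' is a placeholder path for your own ONNX model file.
  const session = await ort.InferenceSession.create('./model.onnx', {
    executionProviders: ['wasm'],
  });

  // Build a float32 input tensor with the shape the model expects
  // (the [1, 3, 224, 224] shape here is just an example).
  const data = Float32Array.from({ length: 1 * 3 * 224 * 224 }, () => Math.random());
  const input = new ort.Tensor('float32', data, [1, 3, 224, 224]);

  // Run the model; feeds are keyed by the model's input names,
  // and the results are keyed by its output names.
  const results = await session.run({ input });
  console.log(results);
}

run();
```

The same pattern applies on the server with onnxruntime-node; the main difference is importing from 'onnxruntime-node' instead of 'onnxruntime-web'.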
Video tutorial:
Written tutorial:
Check out the written tutorial here: ONNX Runtime Web Docs tutorial
Resources
- Start using the template now by going to the NextJS ORT-Web Template on GitHub.
- ONNX Runtime Web Docs tutorial
- ONNX Runtime docs
- ONNX Runtime GitHub
- Deploy with Azure Static Web Apps