We all know how great AI is. However, there are still two major problems: data privacy and cost.
Most applications that use AI today rely on cloud APIs. These APIs log prompts and contexts, and in some cases they use that data to train models. That means any sensitive data you send to them is potentially exposed.
Most web applications integrate AI features using the following schema:
The problem here is that the application servers need to send the user's data to the AI API. Since that API belongs to a third party, we cannot really know what will happen with the data.
But why don't we just run the AI on the user's device instead of in the cloud? I have been testing this for a few weeks with amazing results. I found three main advantages:
- The user's data is never sent to a third party. It always remains on the user's device.
- It's free for the app developer: you don't pay for inference, because it runs directly on the user's device.
- Capacity scales with your user base, because every new user brings their own computing power.
Let's take a quick look at how the previous schema changes when we offload the AI computation to the users:
It's a very simple concept. The user uses the web application as always, but when a task requires AI computation, instead of calling a third-party API, we send the task to the user, and their device performs the computation locally, which is the most secure way to do it.
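To make the flow concrete, here is a minimal TypeScript sketch of the idea. Every name in it (`AiTask`, `LocalEngine`, `buildPrompt`, `runOnDevice`) is hypothetical and made up for illustration; it is not the Offload SDK's API. It also assumes the server ships only a prompt template, and that the sensitive values are filled in on the device. A stub engine stands in for a real in-browser model so the sketch runs without downloading anything.

```typescript
// Hypothetical names throughout; this is NOT the Offload SDK's API.

// Anything that can run a prompt on the user's device.
interface LocalEngine {
  generate(prompt: string): Promise<string>;
}

// What the app server sends to the browser: a template, never the user's data.
interface AiTask {
  id: string;
  template: string; // uses {{field}} placeholders
}

// Fill the template with data that only exists on the device.
function buildPrompt(task: AiTask, userData: Record<string, string>): string {
  return task.template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    const value = userData[key];
    if (value === undefined) {
      throw new Error(`missing field: ${key}`);
    }
    return value;
  });
}

// Build the prompt and run it locally; nothing here touches the network.
async function runOnDevice(
  task: AiTask,
  userData: Record<string, string>,
  engine: LocalEngine
): Promise<string> {
  const prompt = buildPrompt(task, userData);
  return engine.generate(prompt);
}

// Stub engine (an assumption for the demo): a real one would wrap an
// in-browser model instead of echoing the prompt back.
const echoEngine: LocalEngine = {
  async generate(prompt: string): Promise<string> {
    return `[local] ${prompt}`;
  },
};
```

Calling `runOnDevice({ id: "t1", template: "Summarize: {{note}}" }, { note: userNote }, echoEngine)` keeps `userNote` on the device from start to finish. Swapping `echoEngine` for a real local model would not change the calling code.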
This is not just a dream. It already works, and I created a platform called Offload so that anyone can adopt this architecture by changing just a few lines of code. The SDK handles everything behind the scenes: it downloads a model that fits the user's device, helps you manage prompts, evaluates prompt responses locally, and sends the evaluation results back to you without exposing the user's data. All of this happens transparently with a single function invocation.
I am looking for web developers who could benefit from this, even if only for hobby projects. If you like this approach, ping me! I would love to help you set it up in your application, and you will see that migrating is simple and takes only minutes.
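As a rough illustration of two of those steps, the TypeScript sketch below picks a model that fits a device and builds an evaluation report that leaves out the raw response. This is my own simplified sketch, not Offload's actual implementation or API. The catalog entries, memory thresholds, and all type and function names are hypothetical. In a browser, the device profile could plausibly be derived from hints such as `navigator.deviceMemory` and the presence of `navigator.gpu`, but the code takes a plain object so it runs anywhere.

```typescript
// Hypothetical types and values; not Offload's real API or model catalog.
interface ModelOption {
  id: string;
  minMemoryGB: number;  // illustrative threshold, not a measured requirement
  needsWebGPU: boolean;
}

interface DeviceProfile {
  memoryGB: number;
  hasWebGPU: boolean;
}

// Return the most demanding model the device can run, or null if none fits.
function pickModel(catalog: ModelOption[], device: DeviceProfile): ModelOption | null {
  const usable = catalog.filter(
    (m) => m.minMemoryGB <= device.memoryGB && (!m.needsWebGPU || device.hasWebGPU)
  );
  if (usable.length === 0) {
    return null;
  }
  return usable.reduce((best, m) => (m.minMemoryGB > best.minMemoryGB ? m : best));
}

// The only thing that leaves the device: no prompt text, no response text.
interface EvaluationReport {
  promptId: string;
  passed: boolean;
  latencyMs: number;
}

// Check the response on the device against a simple keyword rule
// (an assumed evaluation criterion, chosen only to keep the sketch short).
function evaluateLocally(
  promptId: string,
  response: string,
  mustInclude: string[],
  latencyMs: number
): EvaluationReport {
  const text = response.toLowerCase();
  const passed = mustInclude.every((term) => text.includes(term.toLowerCase()));
  return { promptId, passed, latencyMs };
}
```

The design point is that `EvaluationReport` carries only an identifier, a pass/fail flag, and a timing, so the developer can track prompt quality without ever seeing what the user typed or what the model answered.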