How have I ensured the privacy and anonymity of my website users? 🤔

Alex Fedorov - Dec 17 '19 - Dev Community

Over the course of the past month, I’ve built the first version of FelloWage—a website that allows users to share their salary information and view information shared by others.

Of course, I wanted to keep the information of my users private and anonymous. When somebody is looking at a wage entry, they shouldn’t be able to tell who it belongs to.

On the other hand, to keep the quality of the shared data high, users should be able to update their salary entries when they change over time. Thus the wage records need to be somehow connected to the respective user accounts.

This poses an issue: if I implement it in the most obvious way (a database foreign key relation), then I, as the website operator, can read the connection between a user account and a salary record. Legal authorities coming with a court order would be able to see it as well, and in the unfortunate case of a successful cyber attack, the attackers would get their hands on this data too.
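To illustrate the problem, here is a minimal sketch of the "obvious" foreign key design, using an in-memory SQLite database. The table and column names (`users`, `wage_entries`, `user_id`, and so on) are hypothetical and chosen only for this example; they are not FelloWage's actual schema.

```python
import sqlite3

# In-memory database with a plain foreign key from wage entries to users.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL
    );
    CREATE TABLE wage_entries (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        job_title TEXT NOT NULL,
        yearly_salary INTEGER NOT NULL
    );
    INSERT INTO users (id, email) VALUES (1, 'alice@example.com');
    INSERT INTO wage_entries (id, user_id, job_title, yearly_salary)
    VALUES (10, 1, 'Engineer', 70000);
    """
)

# Anyone with database access can de-anonymize every entry with one JOIN.
rows = conn.execute(
    """
    SELECT users.email, wage_entries.yearly_salary
    FROM wage_entries
    JOIN users ON users.id = wage_entries.user_id
    """
).fetchall()
print(rows)  # [('alice@example.com', 70000)]
```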

Privacy problem

This is not good enough! How can we do better?

We are looking for a solution where:

  • the wage entry is readable by all users of the system,
  • the connection between wage entry and the user account is readable ONLY by the user account owning that entry,
  • user accounts are readable to the system (at least partially) for the purposes of the login system.

In my scenario, since the user sign-up verification is a manual process, I needed the system to be able to write this connection without the presence of the user.

To sum up: only the user should be able to read a connection record, while the system should be able to write this record (but not read it).

Asymmetric encryption

This sounds to me like asymmetric encryption, where the system knows the user’s public key, and the user knows their own private key. The system uses the public key to encrypt the information when it needs to write it, and the user can read that information using their private key.
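Here is a minimal sketch of that write-only idea in Python. It assumes RSA with OAEP padding from the third-party `cryptography` package; the algorithm, the key size, and the function names `write_connection` and `read_connection` are illustrative assumptions, not necessarily what FelloWage uses.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# OAEP padding parameters shared by encryption and decryption.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def write_connection(public_key, wage_entry_id: int) -> bytes:
    """System side: encrypt the wage entry id using only the public key."""
    return public_key.encrypt(str(wage_entry_id).encode("utf-8"), OAEP)


def read_connection(private_key, ciphertext: bytes) -> int:
    """User side: recover the wage entry id using the private key."""
    return int(private_key.decrypt(ciphertext, OAEP).decode("utf-8"))


# Hypothetical key pair for a single user.
user_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
user_public_key = user_private_key.public_key()

# The system can write the connection without the user being present...
connection = write_connection(user_public_key, 42)

# ...but only the holder of the private key can read it back.
print(read_connection(user_private_key, connection))  # prints 42
```

In this sketch the stored `connection` is an opaque blob: the public key alone is enough to produce it, but not to decrypt it.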

Of course, the next challenge is UX. We can’t have the users handle private keys every time they want to log in.

That’d be too clunky.

Passphrase-encrypted private key

Then it struck me: what if I do the same thing SSH keys do when you set them up with a passphrase?

Now the system will store both types of keys:

  • the public key, and
  • the private key encrypted using the user’s password.

Now, even if I were to drop into the raw SQL in my database console, I wouldn’t be able to tell who owns which wage entries anymore! And the user can still see and manage their own entry.
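A minimal sketch of the passphrase-protected key storage could look like this. It again assumes the third-party `cryptography` package and RSA keys; the helper names `create_user_keys` and `unlock_private_key` are hypothetical, and the exact cipher is whatever `BestAvailableEncryption` selects.

```python
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa


def create_user_keys(password: str) -> tuple[bytes, bytes]:
    """Return (public key PEM, password-encrypted private key PEM)."""
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_pem = private_key.public_key().public_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PublicFormat.SubjectPublicKeyInfo,
    )
    # The private key is only ever stored in encrypted form.
    encrypted_private_pem = private_key.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.BestAvailableEncryption(
            password.encode("utf-8")
        ),
    )
    return public_pem, encrypted_private_pem


def unlock_private_key(encrypted_private_pem: bytes, password: str):
    """Decrypt the stored private key; raises ValueError on a wrong password."""
    return serialization.load_pem_private_key(
        encrypted_private_pem, password=password.encode("utf-8")
    )


# Both values can be stored next to the user account.
public_pem, encrypted_private_pem = create_user_keys("correct horse battery staple")

# The private key becomes usable only while the user supplies the password.
private_key = unlock_private_key(encrypted_private_pem, "correct horse battery staple")
```

One consequence of a design like this: a password change has to re-encrypt the private key, and a forgotten password means the encrypted connections can no longer be read.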

As a bonus, I’ve found this method of securing the user’s data quite convenient, and I used it for other information as well, where I’m sure that only the user will need access to this data.

Of course, we still need to make sure that the users create strong passwords that are not vulnerable to dictionary attacks and have not appeared in any known breach. I’ve used the zxcvbn library by Dropbox and the “Have I Been Pwned” API for that.
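As a rough sketch of the breach check, the snippet below queries the Pwned Passwords range endpoint of “Have I Been Pwned”, which accepts the first five characters of the password's SHA-1 hash and returns matching hash suffixes with counts. Whether FelloWage uses this exact endpoint is an assumption; the function names and the `User-Agent` value are hypothetical.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password: str) -> tuple[str, str]:
    """Return the (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(range_body: str, suffix: str) -> int:
    """Find the suffix in a 'SUFFIX:COUNT' response body; 0 if it is absent."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def is_pwned(password: str) -> bool:
    """Ask the range API about the password; only the hash prefix leaves the machine."""
    prefix, suffix = split_sha1(password)
    request = urllib.request.Request(
        RANGE_URL + prefix, headers={"User-Agent": "privacy-sketch"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return breach_count(body, suffix) > 0
```

A sign-up form could reject a password when `is_pwned(password)` returns `True`, in addition to a strength check such as zxcvbn.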

Thank you for reading!

I’m glad you’ve gotten to the end of this post! If you are interested in more behind-the-scenes posts like this, you should subscribe to our newsletter.

The next post that is in the making is a deeper dive into the implementation details and challenges of the solution from this article. Don’t miss it, grab our newsletter here! 🚀

Thank you for your support!
