Some guide on what makes a good ID
To me, one of the small fascinating things about working with Oracle Cloud here at Mythics is the ID system Oracle uses to identify its resources. Those familiar will recognize Oracle Cloud IDs (OCIDs) that look like these:
Tenancy (main cloud account):
ocid1.tenancy.oc1..aaaaaaaaba3pv6wkcr4jqae5f44n2b2m2yt2j6rx32uzr4h25vqstifsfdsq
Compute Instance:
ocid1.instance.oc1.phx.abuw4ljrlsfiqw6vzzxb43vyypt4pkodawglp3wqxjqofakrwvou52gb6s5a
Notice they follow a predictable pattern. Referencing Oracle's article, an OCID can be broken down into these items:
ocid1.<RESOURCE TYPE>.<REALM>.[REGION][.FUTURE USE].<UNIQUE ID>
where:
- <RESOURCE TYPE> identifies the type of resource, e.g., vault, user, autonomousdatabase
- <REALM> identifies the realm (a set of regions), e.g., oc1 for commercial, oc2 for Government
- [REGION] optionally identifies the region, e.g., iad, short for Dulles airport, which is the closest major airport to Ashburn
- <UNIQUE ID> is the actual unique portion, whose format varies
As I'm developing applications for Mythics, I look to the work of others to improve our standards, and the OCID stood out as an excellent guide on how to make good IDs when creating resources, whether those are database row IDs, object IDs, response IDs, and so on. So what makes a good ID, generally?
1. Unique
Obviously. A good ID cannot have an identical twin identifying a different component in the same realm. Using a person's birthdate or name as their ID leads to problems down the road.
A corollary follows: each ID must be strongly bound to the resource it identifies. That means the ID must be trusted to uniquely identify the intended resource. A user ID cannot stand in for the ID of a device owned by the user, for example, as the user can have more than one device.
2. Identifiable
Second, the ID system should be identifiable.
When you see an OCID, you know instantly that it is an Oracle resource identifier. You know that because of its long length, its strict format, and of course, the ocid1 prefix string that denotes it came from Oracle Cloud. What's more, you can accurately identify the specific kind of resource behind the ID, whether that is a VCN, an autonomous database, an instance, or something else.
This is not only important if you randomly find that ID somewhere and need to know what it is for, but it also helps parsers and data scientists properly extract information from raw data.
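As a rough illustration of how that predictable structure helps a parser, here is a minimal Python sketch that splits an OCID into the fields of the pattern shown earlier. The `Ocid` type and `parse_ocid` function are hypothetical names, not part of any Oracle SDK, and the sketch assumes the unique portion contains no dots and that any future-use segment sits between the region and the unique portion.

```python
from typing import NamedTuple, Optional


class Ocid(NamedTuple):
    """Fields of an OCID-shaped string (hypothetical helper type)."""
    version: str
    resource_type: str
    realm: str
    region: Optional[str]
    future_use: Optional[str]
    unique_id: str


def parse_ocid(text: str) -> Ocid:
    """Split an OCID-shaped string into its dot-separated fields."""
    parts = text.split(".")
    # Assumption: at least five fields, with an empty region allowed.
    if len(parts) < 5 or parts[0] != "ocid1":
        raise ValueError(f"not an OCID: {text!r}")
    version, resource_type, realm, region = parts[:4]
    unique_id = parts[-1]
    if not resource_type or not realm or not unique_id:
        raise ValueError(f"OCID has an empty required field: {text!r}")
    # Anything between the region and the unique portion is kept as-is.
    future_use = ".".join(parts[4:-1]) or None
    return Ocid(version, resource_type, realm, region or None, future_use, unique_id)


tenancy = parse_ocid(
    "ocid1.tenancy.oc1..aaaaaaaaba3pv6wkcr4jqae5f44n2b2m2yt2j6rx32uzr4h25vqstifsfdsq"
)
instance = parse_ocid(
    "ocid1.instance.oc1.phx.abuw4ljrlsfiqw6vzzxb43vyypt4pkodawglp3wqxjqofakrwvou52gb6s5a"
)
print(tenancy.resource_type, tenancy.region)
print(instance.resource_type, instance.region)
```

Because the resource type and region sit in fixed positions, a log scraper can sort IDs by kind without ever calling the cloud API.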
Some other examples are nine-digit SSNs that follow the pattern 012-34-5678, phone numbers that follow the pattern +1 (xxx) xxx-xxxx, Georgia Tech IDs that start with 90xxxxxxx, and the ubiquitous credit cards, whose leading digits have the bonus of identifying the payment card provider (with the other digits carrying some more details).
3. Public
Despite their distaste for federal government intervention, Americans have a unique national identification number: the Social Security Number (SSN). In general, that is. It's unique, but it's neither secure nor held by all Americans (citizens or non-citizens).
But by far my most hated characteristic of this number is that it must be kept very secret or you risk your identity and financial data being stolen. Just by knowing a 9-digit number that you can't change.* Which is in the hands of countless companies and probably memorized by your closest family members as well.
It fails to be public in that the mere mention of this number is treated as sacred. You must use this number for identification for governmental, medical, financial, educational, and career reasons, yet one data leak can cause permanent damage and demand lifelong identity theft protection.
If an ID is to be used, it must be shareable.
Examples include OCIDs, emails, usernames, and dates of birth. Even your card number is still more shareable given that you need the expiry date, CVV, and address to make it complete.
4. Cheap to produce
This is where the standards start to get specific. For me, generating a new ID should not demand an expensive query to the backend, e.g., checking the latest ID and adding 1, nor should it take a long time, e.g., collecting randomness from the cosmic microwave background.
Again, in general. Extra steps should be taken in generating secure cryptographic keys but most of the time, security and identifiability are not synonymous. Same for seriality.
A SHA1 hash and a simple random number generator are good examples.
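To make that concrete, here is a small Python sketch of both approaches using only the standard library. The function names `random_id` and `content_id` and the `prefix.` layout are my own assumptions for illustration, not a standard. Neither function needs to ask a backend what the latest ID was.

```python
import hashlib
import secrets


def random_id(prefix: str = "item") -> str:
    """Build an ID locally from 16 random bytes (32 hex characters)."""
    return f"{prefix}.{secrets.token_hex(16)}"


def content_id(payload: bytes) -> str:
    """Derive an ID from the content itself using a SHA-1 digest."""
    return hashlib.sha1(payload).hexdigest()


print(random_id("user"))           # different on every call
print(content_id(b"hello world"))  # same input always gives the same ID
```

The random version suits new resources, while the hash version suits cases where identical content should map to the same ID. As noted above, neither is meant as a recipe for cryptographic keys.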
5. Secure
Secure here means the ID can check itself, for example with a checksum function. Running the ID through an isolated, inexpensive algorithm should tell you whether it is valid.
An OCID is valid because it follows a structured pattern taken from a specified set of constants. An awsS3 is not a valid resource type within the OCID, for example. An email is valid only if it contains an @ and a .domain. A card number can be validated by running it through the Luhn algorithm. A person should not be able to randomly change any one digit and still have a completely valid ID.
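For the card number case, a minimal Python sketch of the Luhn check looks like this. The function name `luhn_is_valid` is my own, and the sketch assumes spaces and hyphens are the only separators worth tolerating. The sample numbers are well-known test values, not real cards.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_is_valid("4111 1111 1111 1111"))
print(luhn_is_valid("4111 1111 1111 1112"))
```

The check is cheap and needs no lookup, and changing any single digit of a valid number makes it fail, which is exactly the property described above.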
6. Easily written down
In general, across all fields, IDs should be easy to write. To share. And consequently, to memorize.
There are times when technology cannot read or write the ID for you, like when a barcode is broken, when you write down your card number, or when you read a serial number off a computer's case. This demands care over how long and complex the ID can be, so our users can use the ID in many different ways.
Conclusion
Again, in general. There are some cases where niche requirements demand some bending of the rules, but I find these good starting guides. The OCID is certainly a beautiful example of what makes a good ID in its area, and I encourage you, dear reader, to find more ways it can be improved.
Some Package Recommendations
Some JavaScript packages that I like are the native crypto property, which can generate UUIDs, and NanoID, which generates customizable, secure ID strings.
Safe harbor statement
The information provided on this channel/article/story is solely intended for informational purposes and cannot be used as a part of any contractual agreement. The content does not guarantee the delivery of any material, code, or functionality, and should not be the sole basis for making purchasing decisions. The postings on this site are my own and do not necessarily reflect the views or work of Oracle or Mythics, LLC.
This work is licensed under a Creative Commons Attribution 4.0 International License.