What is a UUID?
A Universally Unique Identifier (UUID) is a 128-bit label, originally defined in RFC 4122 and now specified by RFC 9562. It is designed to be unique across space and time. Collisions are theoretically possible, but the probability is so small that it is considered negligible in practice.
Use cases
- Database primary keys: UUIDs work well in distributed systems and microservices because they can be generated independently, without a central authority.
- API identifiers: random UUIDs are less predictable than sequential numbers, which reduces risks such as enumeration attacks.
- File/object names: handy for ensuring unique names in storage systems like S3.
- Tracing/correlation IDs: useful in logging and observability to track requests across services.
Trade-offs
A UUID takes 16 bytes in binary form, two to four times the size of an 8-byte or 4-byte sequential integer. This makes tables and indexes bigger.
Some UUID versions hurt index clustering in databases: random UUIDs (like v4) scatter inserts across the index, which can reduce performance compared to sequential IDs.
They are also harder for humans to read, debug, or compare.
In some scenarios, sequential numbers are faster.
Versions
RFC 9562 defines eight UUID versions, numbered 1 to 8.
version 1 and 6
These are based on a timestamp and the MAC address of the generating device.
Because they embed the device's MAC address, they are considered less secure: in theory, one can identify the device that generated the identifier.
The difference between versions 1 and 6 is that version 6 stores the timestamp with its most significant bits first, so the identifiers sort in creation order. This is an advantage for database indexes, which work more efficiently with ordered keys.
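To illustrate the privacy concern, here is a minimal sketch that reads the node and timestamp back out of a version 1 UUID using Python's standard uuid module. The node value 0x001122334455 and the helper name inspect_v1 are made up for this example; a real generator would normally use the host's own address.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 timestamps count 100-nanosecond intervals since 1582-10-15 (UTC).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def inspect_v1(u: uuid.UUID) -> tuple[str, datetime]:
    """Return the embedded MAC address and creation time of a v1 UUID."""
    # The node field is the low 48 bits; format it as six hex octets.
    mac = ":".join(f"{(u.node >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))
    # Convert 100-ns ticks to microseconds and add them to the epoch.
    created = GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)
    return mac, created


# Hypothetical node value, passed explicitly so the example is reproducible.
EXAMPLE_NODE = 0x001122334455
sample = uuid.uuid1(node=EXAMPLE_NODE, clock_seq=0)
mac, created = inspect_v1(sample)
print(sample, mac, created.isoformat())
```

Anyone who sees the identifier can run the same extraction, which is why these versions are usually avoided for public-facing IDs.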
version 2
These are based on a timestamp, the MAC address, and a DCE Security local identifier (such as a POSIX user or group ID). Few libraries implement this version.
version 3 and 5
These identifiers are derived from a given namespace and name, so the result is always the same UUID for the same namespace and name.
The difference between versions 3 and 5 is the hash function: version 3 uses MD5 and version 5 uses SHA-1.
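The determinism of name-based UUIDs can be checked with a short sketch using Python's standard uuid module. The name "example.com" is only an illustrative input; NAMESPACE_DNS is one of the predefined namespaces the module ships with.

```python
import uuid

NAMESPACE = uuid.NAMESPACE_DNS  # predefined namespace for DNS names
NAME = "example.com"            # illustrative input, any string works

# Version 3 hashes namespace + name with MD5, version 5 with SHA-1.
v3_first = uuid.uuid3(NAMESPACE, NAME)
v3_second = uuid.uuid3(NAMESPACE, NAME)
v5_first = uuid.uuid5(NAMESPACE, NAME)
v5_second = uuid.uuid5(NAMESPACE, NAME)

# A different name in the same namespace yields a different UUID.
v5_other = uuid.uuid5(NAMESPACE, "example.org")

print(v3_first == v3_second)  # same inputs, same UUID
print(v5_first == v5_second)  # same inputs, same UUID
print(v5_first == v5_other)   # different name, different UUID
```

This makes versions 3 and 5 a reasonable fit when the same logical object must map to the same identifier on every machine, without any shared state.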
version 4
These identifiers are random: apart from 6 bits that encode the version and variant, the remaining 122 bits come from a random source.
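As a rough sketch of what a version 4 generator does, the following builds one by hand from 16 random bytes with Python's standard library and compares it with uuid.uuid4(). The names FIXED_BITS_MASK, manual, and builtin are introduced here for illustration only.

```python
import os
import uuid

# The 4 version bits (76-79) and 2 variant bits (62-63) are fixed by the format.
FIXED_BITS_MASK = (0xF << 76) | (0x3 << 62)

random_bytes = os.urandom(16)  # 128 bits from the OS random source

# Passing version=4 overwrites the version and variant bits, nothing else.
manual = uuid.UUID(bytes=random_bytes, version=4)

# The standard helper produces the same kind of value in one call.
builtin = uuid.uuid4()

# Bits that differ between the raw random input and the finished UUID.
changed = manual.int ^ int.from_bytes(random_bytes, "big")

print(manual, manual.version)
print(builtin, builtin.version)
print(bin(FIXED_BITS_MASK).count("1"))  # number of fixed bits
```

Only the six fixed bits can differ from the raw random input, which leaves 122 bits of randomness per identifier.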
version 7
These identifiers combine a Unix timestamp in milliseconds (the first 48 bits) with random bits. Because the timestamp comes first, they sort by creation time, which makes them efficient keys for database indexes.
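The layout can be sketched by hand with Python's standard library. This is an illustration of the bit layout only, not a production generator: the function name uuid7_sketch is hypothetical, and it makes no attempt to keep identifiers created within the same millisecond in order.

```python
import os
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a version 7 style UUID: 48-bit ms timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                           # top 12 bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: random
    value |= 0b10 << 62                           # bits 63..62: variant
    value |= rand_b                               # bits 61..0: random
    return uuid.UUID(int=value)


# Fixed timestamps make the ordering easy to see.
earlier = uuid7_sketch(unix_ms=1_000)
later = uuid7_sketch(unix_ms=2_000)
print(earlier, later, earlier < later)
```

Because the timestamp occupies the most significant bits, comparing two identifiers compares their creation times first, which is the property that keeps index inserts close together.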
version 8
Apart from the version and variant bits, the layout of the UUID is up to the vendor.
Which one to use
Today, v4 and v7 are the most commonly used. v4 is purely random, while v7 combines randomness with timestamp ordering, making it efficient for databases.
Structure of UUID
UUIDs have two common representations:
- String form, 36 characters (32 hexadecimal digits plus 4 hyphens):
550e8400-e29b-41d4-a716-446655440000
- Binary form, 16 bytes, which is more efficient in databases.
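The two representations can be converted back and forth with Python's standard uuid module. A small sketch using the example value above; the variable names are illustrative.

```python
import uuid

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

text = str(u)    # canonical string form with hyphens
raw = u.bytes    # 16-byte binary form, suitable for a binary column
compact = u.hex  # 32 hex digits without hyphens

# Both forms carry the same value and round-trip without loss.
from_raw = uuid.UUID(bytes=raw)
from_text = uuid.UUID(text)

print(len(text), len(raw), len(compact))
print(from_raw == u, from_text == u)
```

Storing the 16-byte form rather than the 36-character string is where most of the space saving in databases comes from.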
Collisions
Rough analogy: with 122 random bits per UUIDv4, you'd need to generate a billion UUIDs per second for about 86 years to reach a 50% chance of a single collision.
For comparison: you are more likely to win the lottery jackpot several times in a row than to see a UUIDv4 collision in normal usage.
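The estimate can be reproduced with the standard birthday approximation, p ~= 1 - exp(-n^2 / (2N)), where N is the number of possible values. A minimal sketch, assuming all 122 random bits of a UUIDv4 are uniformly random; the function names are introduced here for illustration.

```python
import math

RANDOM_BITS = 122           # random bits in a version 4 UUID
SPACE = 2 ** RANDOM_BITS    # number of possible values, N
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def collision_probability(n: float) -> float:
    """Birthday approximation: p ~= 1 - exp(-n^2 / (2N))."""
    return -math.expm1(-(n * n) / (2 * SPACE))


def count_for_probability(p: float) -> float:
    """Inverse of the approximation: n ~= sqrt(2N * ln(1 / (1 - p)))."""
    return math.sqrt(2 * SPACE * -math.log1p(-p))


# How many UUIDs give a 50% chance of at least one collision,
# and how long that takes at one billion UUIDs per second.
n_half = count_for_probability(0.5)
years_at_1e9_per_second = n_half / 1e9 / SECONDS_PER_YEAR

print(f"{n_half:.3e} UUIDs, about {years_at_1e9_per_second:.0f} years")
print(f"p after one billion UUIDs: {collision_probability(1e9):.3e}")
```

For realistic volumes, such as a few billion identifiers in total, the probability stays far below anything that needs handling in application code.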
Special values
The following special values exist for UUIDs:
- Nil UUID (minimum value, all bits set to zero):
00000000-0000-0000-0000-000000000000
- Max UUID (maximum value, all bits set to one):
FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF
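Both values can be constructed directly from their integer form with Python's standard uuid module. A short sketch; the constant names NIL_UUID and MAX_UUID are chosen for this example.

```python
import uuid

# All 128 bits zero, and all 128 bits one.
NIL_UUID = uuid.UUID(int=0)
MAX_UUID = uuid.UUID(int=(1 << 128) - 1)

# Parsing is case-insensitive; str() always returns lowercase hex.
parsed_max = uuid.UUID("FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF")

print(NIL_UUID)
print(MAX_UUID)
print(parsed_max == MAX_UUID)
```

These values are typically used as sentinels, for example to mean "no value" or as the bounds of a range query.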
Security
While UUIDs are useful for obscuring identifiers, they are not a substitute for proper authentication and authorization. Never use them as passwords, API keys, or session tokens.
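For secrets such as session tokens, a dedicated generator is the usual choice. A minimal sketch, assuming Python's standard secrets module fits the use case; the function names new_session_token and new_public_id are hypothetical.

```python
import secrets
import uuid


def new_session_token() -> str:
    """Return a secret token: 32 random bytes, URL-safe base64 encoded."""
    return secrets.token_urlsafe(32)


def new_public_id() -> str:
    """Return a UUID for use as a public identifier, not as a credential."""
    return str(uuid.uuid4())


token = new_session_token()
public_id = new_public_id()
print(len(token), public_id)
```

The identifier can appear in URLs and logs, while the token stays secret and is checked by the authentication layer.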
Future
As UUIDv7 adoption grows, developers get both time-ordered keys for databases and randomness for uniqueness, making it the emerging default choice.