SQL Server 2008 R2 Books Online clearly spells out why to consider the primary key as the authenticator when encrypting a column value in SQL Server:
If an authenticator parameter is specified when data is encrypted, the same authenticator is required to decrypt the data. At encryption time, a hash of the authenticator is encrypted together with the plaintext. At decryption time, the same authenticator must be passed to the decryption function. If the two do not match, the decryption will fail, indicating that the value has been moved since it was encrypted. We recommend using the primary key of the table in which the result will be stored as an authenticator. – http://msdn.microsoft.com/en-us/library/ms365192(v=sql.105).aspx
Thus the authenticator effectively protects the data from substitution attack vectors. If a unique authenticator column is used, e.g. the primary key, the encrypted column in one row cannot be substituted for that same column from another row, unless of course the attacker knows to move the primary key too. Granted, that has its own set of challenges but nonetheless can be done. Or, in some cases, could not the attacker swap all columns except the encrypted column and the primary key to successfully perpetrate the attack an authenticator is supposed to prevent? It would surely come to the surface faster, as the executive would suddenly get a laborer's wage and squawk real loud. Further, in the BOL example, where all rows contain personally identifiable information (PII) stored in clear text, how much protection does encrypting one column provide anyway?
The “Authenticators” page referenced above seems to have been removed from the SQL Server 2012 BOL. That is too bad; it needed to be strengthened, not removed. Instead the keyword is now indexed to the EncryptByKey() help page – which admittedly already had much of the same information, and I certainly understand the need to consolidate. However, the recommendation on the new landing page is somewhat different:
We recommend using a column containing a unique and unchanging value as the authenticator.
I agree the authenticator is easier to use if unchanging, but making that the default usage goes too far toward one extreme, and it is a dangerous recommendation absent an understanding of the user data. Just as there are other possibilities as valid as the primary key, there are other possibilities than a unique and unchanging value. Yes, there is protection from a cut-n-paste attack if a unique value is used as the authenticator. Is this also true if the letter ‘a’ is always appended to the value before encrypting? Close, but not quite. The encryption operation will have a randomizing effect that could thwart the dictionary attacker by obfuscation even when a static salt is added, but the cipher text is portable to that column in any row containing a value that can be decrypted using the same authenticator value – even without any nasty hacker gyrations like trying to swap key values. This easily justifies the additional processing required to increase the uniqueness of the authenticator compared to using a static value like “a”, but the authenticator does not have to be unique to realize the protection. Other values with high selectivity are also potential candidate authenticators. The less unique the authenticator, the more resistant to dictionary attack in the stolen database scenario; the more unique, the more resistant to a substitution attack. Somebody must finally decide the balance. It saves a lot of time if salt is understood and decided before development begins.
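To make the mechanics concrete, here is a minimal sketch of the BOL-recommended pattern. The table, key and certificate names (dbo.Employee, SalaryKey, SalaryCert, Employee_Id) are invented placeholders, not the documentation's actual sample objects:

```sql
-- Hypothetical sketch: encrypt a salary using the primary key as the
-- authenticator, per the BOL recommendation. All names are invented.
OPEN SYMMETRIC KEY SalaryKey DECRYPTION BY CERTIFICATE SalaryCert;

UPDATE dbo.Employee
SET EncryptedSalary = EncryptByKey(
        Key_GUID('SalaryKey'),
        CONVERT(varbinary(128), Salary),
        1,                                       -- add_authenticator flag
        CONVERT(varbinary, Employee_Id));        -- authenticator value

-- Decryption must pass the same authenticator; a mismatch returns NULL,
-- which is how a moved (substituted) value is detected.
SELECT CONVERT(money, DecryptByKey(
        EncryptedSalary, 1, CONVERT(varbinary, Employee_Id))) AS Salary
FROM dbo.Employee;

CLOSE SYMMETRIC KEY SalaryKey;
```

Swapping EncryptedSalary between two rows leaves each row's Employee_Id unchanged, so the swapped-in value decrypts to NULL rather than to another employee's salary.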
The requirement that the value used for an authenticator cannot change is also a bit overzealous. All that is absolutely necessary is to re-create the cipher text using the authenticator’s new value whenever the authenticator column is updated. Not something you would want to do many times a second, but of far less consequence if done in a singleton query with execution frequencies as high as 1000-2000 times a day. The question to answer is how often the authenticator column is updated across all rows, not whether or not it is ever updated. Remember, the algorithm itself does the randomizing, not the uniqueness of the authenticator.
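That re-creation can happen in the same singleton statement that changes the authenticator. A hedged sketch, again with invented names (dbo.Employee, SalaryKey, SalaryCert), relying on the fact that column references in a SET clause see the pre-update values:

```sql
-- Hypothetical sketch: change the authenticator column and re-encrypt the
-- protected column in one UPDATE. Names and key values are placeholders.
DECLARE @OldId int = 42, @NewId int = 4242;

OPEN SYMMETRIC KEY SalaryKey DECRYPTION BY CERTIFICATE SalaryCert;

UPDATE dbo.Employee
SET Employee_Id     = @NewId,
    EncryptedSalary = EncryptByKey(
        Key_GUID('SalaryKey'),
        -- Employee_Id here still holds the old value within the UPDATE
        DecryptByKey(EncryptedSalary, 1, CONVERT(varbinary, Employee_Id)),
        1,
        CONVERT(varbinary, @NewId))
WHERE Employee_Id = @OldId;

CLOSE SYMMETRIC KEY SalaryKey;
```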
There is also a concern to me that when the authenticator is a clear text value stored with the data – regardless how unique – swapping the encrypted value and the authenticator together, or alternately all columns but those two, could still be an effective attack. Furthermore, if the database or a backup is stolen, even a brute force attack could benefit from awareness that there are unique un-encrypted column(s) in the row – and even more if it is known which column is used as the authenticator. It is even something of a leg up for the database thief that the SQL Server documentation makes that exclusive recommendation. Doesn’t that make this configuration a likely vector for attackers? Maybe this has something to do with the removal of that “Authenticators” BOL page? I mean, if you use salt but it is not a secret, exactly how is that risk less than not using salt at all, particularly in the stolen database scenario?
Imagine now that the first developer to write T-SQL encryption for an application either thinks a bit like me (poor bastard) or has no idea what to use for an authenticator, so picks the only column that can be used as an authenticator parameter without a type conversion. For example, the Standard_Title column is chosen from the salary swapping example’s table included in the documentation pages referenced above, instead of the primary key Employee_Id column. Standard corporate job titles are not very likely to change, and there is not likely to be a large number of folks with the same job title – but it can and does happen that people share a job title. Then along comes someone who spots the discrepancy between what was done and what Microsoft, another authority, or a compliance department recommends or requires. Of course several people decide to weigh in on the matter, and eventually someone decides what to do from there. Regardless of the answer, the resulting interruption takes time and slows the general progress of the application. The simple application of cryptographic science and a standard of excellence – as opposed to the more common and proven-not-secure “good enough” software practices of our time – from conceptual design and then throughout the software lifecycle is essential. Making such a change after the application is live will bring distress to the application and will almost certainly be instigated by a breach.
I intend to dig more into choosing and using authenticators in the next post in the series. This post must first turn more toward the burdens T-SQL (all data layer?) encryption adds to foundational architectural decisions and application design: specifically in regard to business continuity planning. This planning should be business reason focused and may include:
- standards and compliance targets
- criteria for identifying crypto-technology candidates
- communication plan
- division of responsibility/ownership
- forward-recovery expectations
- key management
The Open Web Application Security Project’s (OWASP) “Cryptographic Storage Cheat Sheet”, for example, may be a good general instrument to kick off the discussion.
Assuring the ability to change keys as required is essential. Disaster recovery is important, but there are many requirements for key changes that are not calamity based, e.g. key lifecycle, SLAs, compliance, key expiration, organizational security best practices, etc.; and some calamities requiring a key change need not be disasters, e.g. key corruption, newly uncovered vulnerabilities, intruders, hardware failures, etc.
When a column in the row is used as the authenticator, all the machine pieces required to reverse engineer the cipher text are included in the database – and every full database backup also includes the encryption key, the encrypted value and the authenticator. There may still be protection by passphrase, however that is most likely the easiest value to determine by brute force. This is precisely the reason the database is not often the stage for encryption of application values. It is worthwhile to consider authorizing access to the encryption hierarchy through physical presentation of a fob, blob, key file or other public/private key pair(s) that never spend time in SQL Server. This affords more protection than a phrase, but can also add to the security burden and certainly adds risk to recoverability.
One strength of the T-SQL authenticator parameter is that a hash of the authenticator value is encrypted with the value. It is not clear how this internal hashing operation compares to a HASHBYTES() operation. It is clear only that the relationship between the authenticator value and the encrypted value has to be based upon some system-generated one-way digest of the authenticator, so it is more random than simple concatenation of the character ‘a’ as salt or authenticator. An attacker may also need to reproduce the hashing operation applied to the authenticator to crack the encryption, whether the key or any other value – no matter how static or random – is used as the authenticator. For better and worse, hashes are deterministic digests, so results are reproducible on another SQL Server instance – if not anywhere and on demand.
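Notably, the BOL samples themselves sidestep the question by passing a HASHBYTES() digest of the authenticator column rather than the raw value. A sketch of that pattern, with invented names (SalaryKey, dbo.Employee, Employee_Id); whether the internal hash then differs materially from this explicit one is undocumented:

```sql
-- BOL-style pattern: hand EncryptByKey a one-way digest of the authenticator
-- column instead of the raw clear text value. Names are placeholders.
SELECT EncryptByKey(
        Key_GUID('SalaryKey'),
        CONVERT(varbinary(128), Salary),
        1,
        HASHBYTES('SHA1', CONVERT(varbinary, Employee_Id)))
FROM dbo.Employee;
```

The decrypt call must of course compute the identical digest, so the hashing step buys obfuscation of the stored authenticator relationship, not secrecy.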
Few details about the authenticator and the hashing operation used are exposed superficially, but the completeness of the cryptographic keys revealed can be used to seed an attacker’s dictionary automation if the database is successfully stolen. If the database is stolen or compromised with elevated authority access, all data – encrypted or not – should be considered compromised and appropriate action initiated.
Once a malicious user is able to connect to the database, especially with elevated authority, prevention of data theft is unlikely. Layers of authentication, authorization, firewalls, proxies and intrusion detection external to the database server are necessary. Audit, including Common Criteria or C2, and alerts offer only after-the-fact forensic protection and can therefore also be manipulated maliciously from the inside with adequately elevated authority. It’s the old trick of putting a picture of an unoccupied but brimming-full bank vault in front of the security camera while you clean the place out.
The best protection once data has been stolen or compromised through a valid database connection is cell level encryption… and perhaps self-destructing data – whatever that is. Knowing what needs to be done to change keys, and how long it will take to re-key and get your application back on line when compromised, is of utmost importance to close a breach in a live environment.
An example might be the Zappos hack of Jan 16, 2012, concerning the compromise of the address and last 4 digits of the credit card of apparently all(?) 24 million customers. Check out the coverage in the Onion. Sounds like they only got unencrypted data to me. Maybe even only a replica in the DMZ, if they are lucky? No way to tell for sure from here. People cannot easily change their address, and the cards will surely all have to be replaced even if the encrypted credit card numbers were not also handed over. What good did it do to encrypt any data when enough to steal the identity was left exposed? Did Zappos notify the 24 million customers directly, and if so, what did they say to them? Or were these people left to find out, the next time they swiped the invalidated card, that the compromised account could not be used? Or did Zappos just wait for the embarrassment to die down and then blow it off, leaving the 24 million never able to prove this breach had anything to do with future identity theft nightmares?
The ability to execute a complete coordinated re-key (aka HJ or Hotel Juliet to cold war cryptographers of the mid 20th century) is often among the best, if not the only, available actions to close a breach. Applications incorporating planned end-to-end key rotation at the onset are superior in their ability to re-key on demand. Retrofitting a key rotation plan is likely to involve some time offline.
The need to figure out how to re-key while under attack may be dooming. With the database there is no possibility to go high and blind like they did back in the day to reset the keys manually. A forced re-key, i.e. without access to the old key, will result in the loss of all encrypted data stored in the database. A safe re-key must re-encrypt cipher text using the old key to decrypt and the new key to encrypt. This can be a long running operation that may leave the database temporarily degraded or possibly temporarily unusable.
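A safe cell-level re-key, reduced to its essentials, looks something like the following hedged sketch. The names (OldKey/NewKey, OldCert/NewCert, dbo.Employee, Employee_Id) are invented; a production version would batch the UPDATE to limit log growth and lock duration:

```sql
-- Hypothetical safe re-key sketch: decrypt with the old key, re-encrypt
-- with the new. Both keys must be open; DecryptByKey locates the correct
-- open key from the cipher text header. All names are placeholders.
OPEN SYMMETRIC KEY OldKey DECRYPTION BY CERTIFICATE OldCert;
OPEN SYMMETRIC KEY NewKey DECRYPTION BY CERTIFICATE NewCert;

BEGIN TRANSACTION;
UPDATE dbo.Employee
SET EncryptedSalary = EncryptByKey(
        Key_GUID('NewKey'),
        DecryptByKey(EncryptedSalary, 1, CONVERT(varbinary, Employee_Id)),
        1,
        CONVERT(varbinary, Employee_Id));
COMMIT TRANSACTION;

CLOSE SYMMETRIC KEY OldKey;
CLOSE SYMMETRIC KEY NewKey;
-- Only after verifying the re-encrypted data should OldKey be dropped.
```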
The need and urgency of communicating the breach to the data owners/customers is a clear and unequivocal responsibility. Without an established protocol this can be chaotic and often destructive for the organization that has been breached. Much better outcomes can be expected if individual responsibilities in such communications are clearly laid out in the business continuity plan.
All Planning Begins Now – Planning to Plan is not a Plan
The bottom line is the same here as with most software design issues. Decisions made early are highly advantageous if correct or correctable, and very costly if incorrect or delayed. Depending upon the criticality of, and the level of malicious interest in, the data, a bad decision can be fatal. For the same reasons, no security strategy should be the work of one mind or be implemented overnight. The more perspectives that can be brought to bear, and experience gained with this topic before and while software is built, the more likely the end result will meet the requirements. Key rotation must be a measurable and measured task from the onset.
The front loaded thought of planning can consider many recovery scenarios. A retrofit key change instead must often have as its primary goal figuring out a way to pick up the pieces and get the application running again pronto. Another – perhaps more obtuse – way to say it:
- If you don’t need a business continuity focused forward recovery plan for your application, it is hard to imagine why you would bother with T-SQL encryption.
- If you don’t have a business continuity plan but need a forward recovery plan, work on fleshing out the forward recovery plan as – or better yet before – you take on T-SQL encryption.
It’s also hard to imagine how a forward recovery plan for the application could be devised without an overarching business/organizational continuity plan, unless the organization can absorb a whole lot of quickly made and often risky assumptions when under attack.
Forward Recovery Considerations by Key Type
Service Master Key (SMK) – Even applications that do not think they use encryption in SQL Server may rely upon the SMK to encrypt linked server passwords. Backup of the SMK may be critical to forward recovery and should be one of the first post-installation steps for most SQL Servers. DMKs, certificates and asymmetric keys need not be encrypted into the SQL Server encryption hierarchy, though the SMK will always be used by CREATE MASTER KEY, which must run in the database before any encryption objects other than the DEK can be created there. The DMK can be configured without SMK encryption, but the CREATE DDL will only and always encrypt a copy of the DMK into the master database and use the provided phrase to encrypt a copy into the database; an ALTER MASTER KEY DROP ENCRYPTION BY SERVICE MASTER KEY can then remove the copy in master. (The documentation shows bogus options for dropping the encryption by phrase. It doesn’t work.)
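The backup itself is a one-liner; the file path and password below are placeholders, and the restore shown is what a forward-recovery test on another instance would exercise:

```sql
-- One of the first post-install steps: get the SMK to offline storage.
-- Path and password are placeholders.
BACKUP SERVICE MASTER KEY
    TO FILE = 'E:\keys\SMK.bak'
    ENCRYPTION BY PASSWORD = '<StrongPassword1!>';

-- Forward-recovery test (on a rebuilt or standby instance):
RESTORE SERVICE MASTER KEY
    FROM FILE = 'E:\keys\SMK.bak'
    DECRYPTION BY PASSWORD = '<StrongPassword1!>';
```

Store the file and its password separately from the database backups; a backup set that travels with its own SMK defeats the purpose.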
In SQL Server 2012 the DPAPI-protected SMK uses an AES encryption algorithm. Does this mean SQL Server 2012 will use AES-NI (Xeon onboard hardware acceleration)? It is also an interesting upgrade scenario: an in-place upgrade to SQL Server 2012 RC0 leaves the old 3DES SMK in place. To upgrade to AES it is necessary to run ALTER SERVICE MASTER KEY REGENERATE. This can be a long running operation if TDE is enabled; otherwise it could just be a matter of re-encrypting the copies of DMKs that are stored in master.
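You can check which algorithm an upgraded instance is actually using before deciding to regenerate; the SMK is visible in the catalog of master:

```sql
-- Inspect the SMK algorithm after an in-place upgrade.
SELECT name, algorithm_desc
FROM master.sys.symmetric_keys
WHERE name = '##MS_ServiceMasterKey##';

-- If it still reports the legacy 3DES algorithm, re-keying to AES requires:
ALTER SERVICE MASTER KEY REGENERATE;
```

Expect the REGENERATE to re-encrypt everything the SMK protects, so schedule it like any other HJ event rather than running it casually.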
A side-by-side upgrade may be able to keep most of the REGENERATE time from becoming down-time if the re-key is coordinated with the migration. A simple example scenario would be to keep the old server online in read-only mode during the REGENERATE phase, then go offline a moment to switch to the new server when the upgrade is complete. It could well be that no matter what, if you want AES encryption after an upgrade there is going to be some down-time. Consider yourself warned on that one. This is a good reason to use the same process in the upgrade that is used for an HJ event and for normal periodic REGENERATEs.
DES has been vulnerable for a long time. Triple DES is better, but growing weaker by the day. In fact the expected life of AES was just over 15 years when they started thinking about it over 10 years ago, and blind quantum cryptography – where the data is encoded onto photons – is on the horizon. Today, in the lab, a bit can be stored in about a dozen atoms; on your hard drive it is more like a million atoms per bit.
Consider a quarterly test restore of the SMK, beginning soon after the application is ready for an integration testing environment and continuing until the last row of sensitive data is no longer needed. That is too often to test a live system, but the live system should still be tested to make sure the key backup is good each time the key is REGENERATEd. Once a year may not be often enough to REGENERATE to assure an aggressive attacker cannot break your encryption; by the same token, once a week may not be enough. The lack of transparency and the resistance to collaboration/cooperation around known cryptographic vulnerabilities make it difficult to really know for sure.
Database Encryption Key (DEK) – The DEK is encrypted by a certificate or asymmetric key in the master database. That certificate or key can be built within the SQL Server encryption hierarchy (SMK–>DMK–>private key of the certificate or asymmetric key), from file source(s), or from an externally managed key hierarchy using EKM.
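A minimal TDE setup shows the dependency chain; the certificate and database names (TdeCert, MyAppDb) are invented for illustration:

```sql
-- Sketch: the DEK lives in the user database but is protected by a
-- certificate in master. Names are placeholders.
USE master;
CREATE CERTIFICATE TdeCert WITH SUBJECT = 'TDE protector certificate';

USE MyAppDb;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TdeCert;

ALTER DATABASE MyAppDb SET ENCRYPTION ON;
```

Back up TdeCert (with its private key) immediately; without it, no backup of MyAppDb can ever be restored elsewhere.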
Regeneration is always an online operation for the DEK. The documentation warns about a potential performance hit, but when data change happens one row at a time the added overhead may go unnoticed on a properly sized SQL Server. I am pretty sure I read somewhere that a 5-10% hit on CPU is expected. Compare that to an ALTER INDEX REBUILD of about the same number of bytes as the data that must be re-encrypted by ALTER DATABASE ENCRYPTION KEY REGENERATE. Just a guess though. I did see Kevin Finke’s post reporting about 46GB an hour to regenerate a DEK on SQL Server 2008 in late 2009, hardware unknown. He does not describe performance issues; instead he shows the query he used to monitor the progress.
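The regenerate and a progress-monitoring query along the lines of the one he describes might look like this sketch:

```sql
-- Online DEK re-key:
ALTER DATABASE ENCRYPTION KEY REGENERATE WITH ALGORITHM = AES_256;

-- Watch the background encryption scan work through the database:
SELECT DB_NAME(database_id) AS database_name,
       encryption_state,          -- 2 = in progress, 3 = encrypted
       percent_complete
FROM sys.dm_database_encryption_keys;
```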
Like everything else about TDE, key re-generation is relatively painless compared to real cell level encryption. I hope only that data owners and DBAs are fully cognizant of the features and foibles of TDE…
One thing I really like about TDE is that you must have the certificate used to create the DEK to restore the database on another server. All backups taken while TDE is enabled in the database are encrypted, no matter where they are stored. This is about as safe as SQL Server database backups can get. I say use it if you have it, but also consider file system encryption if you don’t. File system encryption can actually provide more protection: it covers everything, including swap files, cores, backups and all other user created files on the server that might unexpectedly expose the clear text of cipher text or other secrets when the server is shut down, even though – unlike TDE – it cannot protect database backups wherever they go. The biggest downside to file system encryption is that if a file is copied to an unencrypted network location the encryption is lost.
Database Master Key (DMK) – The DMK shares no direct dependencies with the DEK. The DMK is always encrypted into the database by a phrase. As mentioned above, a copy of the key is also encrypted into master by the SMK. This copy enables the DMK to open automatically when needed. Without the copy in master, the phrase must always be provided to open the DMK and use any cell encryption certificates that are encrypted by the DMK into the database. Conversely, a database without a DMK copy in master can be restored and used on another SQL Server and can provide privacy even from members of the sysadmin fixed server role.
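The life cycle just described reduces to a few statements; the passphrase is of course a placeholder:

```sql
-- Create the DMK (phrase-encrypted copy in the database, SMK-encrypted
-- copy in master by default):
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<DmkPhrase1!>';

-- Remove the automatic-open copy from master; sysadmins of another
-- instance that restores this database now also need the phrase:
ALTER MASTER KEY DROP ENCRYPTION BY SERVICE MASTER KEY;

-- Thereafter, dependent certificates and keys require an explicit open:
OPEN MASTER KEY DECRYPTION BY PASSWORD = '<DmkPhrase1!>';
-- ... use certificates/keys encrypted by the DMK ...
CLOSE MASTER KEY;
```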
I have seen a few troubling search strings seeking information about the DMK that managed to find my blog. Among them I can recall off hand:
“deadlock symmetric_key_id = 101”
“steal database master key”
Hopefully nothing to be concerned about. I certainly have not experienced a DMK deadlock to my knowledge, nor have I seen one in sys.locks that I can recall. Of course, to steal one all you need is CONTROL permission on the database to execute a BACKUP MASTER KEY, or administrator access to the server.
A REGENERATE of a DMK will re-encrypt all dependent certificates and keys. This too can be a long running process.
Like the DEK, the application should not have to interact directly with the DMK unless the copy in master is dropped. In that case the DMK will have to be opened with the phrase before certificates encrypted by the DMK can be opened.
Certificates – T-SQL certificates are X.509 public/private key pairs. The BACKUP CERTIFICATE DDL is similar to that for the SMK and DMK. The big difference is that a certificate export produces two files: one for the public key, one for the private key.
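The two-file export looks like this; the certificate name, paths and password are placeholders:

```sql
-- Certificate export: the .cer holds the public key, the .pvk holds the
-- private key under its own password. All values are placeholders.
BACKUP CERTIFICATE SalaryCert
    TO FILE = 'E:\keys\SalaryCert.cer'
    WITH PRIVATE KEY (
        FILE = 'E:\keys\SalaryCert.pvk',
        ENCRYPTION BY PASSWORD = '<PvkPassword1!>');
```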
EncryptByCert() and VerifySignedByCert() require only the public key; DecryptByCert(), SignByCert() and ADD SIGNATURE require the private key (and its password, if it has one). Other T-SQL methods for checking signatures (IS_OBJECTSIGNED, sys.fn_check_object_signatures()) may only be looking in the system catalog to verify a signature.
If you only need a certificate to EncryptByCert() or verify signatures, it is much safer to export the certificate and then drop the private key. This alone can thwart the database thief from learning the secrets. For a certificate used to create signatures or to decrypt, never drop the private key.
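Dropping the private key after export is a single statement, assuming the invented SalaryCert from the sketches above has already been backed up somewhere safe:

```sql
-- After BACKUP CERTIFICATE has written the private key to offline storage,
-- remove it from the database. Encrypt/verify still work; a thief who
-- steals the database can no longer decrypt or sign with this certificate.
ALTER CERTIFICATE SalaryCert REMOVE PRIVATE KEY;
```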
CREATE CERTIFICATE accepts a user-specified START_DATE and EXPIRY_DATE, however SQL Server does not do anything with the values other than store and expose them. With T-SQL encryption, if you want rotation automation you must supply the mechanism that uses these attributes: not so much to know when it is time to rotate as a way to tell which certificate is active in scenarios where it makes sense to rotate keys without the need to REGENERATE, by using ‘virtual partitions’ by certificate, perhaps aligned with database PARTITIONs. The application will need to know which certificate is active only to encrypt data and for signing. T-SQL DecryptByCert() and DecryptByKey() will use the correct key – but only if it is already open or does not need to be opened before use.
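A rotation mechanism might store a crypto-period in those attributes and query for the active certificate itself; names, dates and the naming convention here are all invented:

```sql
-- Stamp a rotation window on the certificate (stored but not enforced):
CREATE CERTIFICATE SalaryCert_2012Q1
    WITH SUBJECT = 'Salary encryption, Q1 2012',
         START_DATE = '2012-01-01',
         EXPIRY_DATE = '2012-04-01';

-- The application's own lookup for the currently active certificate:
SELECT name, start_date, expiry_date
FROM sys.certificates
WHERE name LIKE 'SalaryCert[_]%'
  AND GETDATE() >= start_date
  AND GETDATE() <  expiry_date;
```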
A crypto-period is an interval of time that a key should be available for use before it is considered too likely to have been compromised. Ideally this would be the information placed in the certificate’s date attributes. A crypto-period is often taken – somewhat off-hand – to be 1 year. If that ever was more effective than the “duck and cover” nuclear defense strategy I was taught in second grade, it is not at this time. If I understand the evolution of things correctly, it looks like 1 year may originally have come from the arbitrary decision by the Certificate Authorities (CA) to collect an annual fee. While one year may never have had much empirical relevance to key compromise risk, annual renewal is still often the norm, and still the fee interval. Hmmm. I have a feeling the right interval for many (most?) security focused expirations could be somewhere on the short side of annual these days; it was probably way too frequent at the time the practice started. (Moore’s Law, right?)
The strength of any cryptographic key can be characterized statistically by its Shannon entropy. From that entropy, the time to recover the clear text of a cipher by brute force can be estimated. Just for example, Wikipedia estimates the time to unscramble data encrypted with a 56 bit key as 1 second, the same algorithm with a 64 bit key about 4 and a half minutes, and a 128 bit key almost 150 trillion years! This is oversimplified, but the far less costly 64 bit key may still be useful for data that must live only a few seconds, if the performance advantage outweighs the security risk. Based upon the Wikipedia calculation I am unable to fabricate even one valid scenario for a 56 bit key. If the data is compelling enough to the attacker, parallel processing from a rack of comparable hardware can likely cut all those times. Additionally, the reality is that some 128 bit algorithms are in fact already broken: most through some variant of the padding oracle attack and/or government mandated backdoors. Conversely, Seagate argues that 128 bit AES is about as far beyond the ability of current technology to crack as 256 bit AES. That is evidence that the algorithm has a major role in key strength, and I have every reason to believe there may someday be a better rule than “the stronger and longer the better”. Until there are application performance concerns related to encryption overhead, though, “the stronger and longer the better” seems the better rule of thumb as far as I can tell. One thing is more clear: AES is the most widely recognized by government and industry standards. Still, I will not be surprised when someone publishes a previously unknown 60 second AES hack that works.
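The doubling arithmetic behind those estimates is easy to check: each additional key bit doubles the search space, so everything scales by powers of two from the 56 bit baseline. A quick sanity check in T-SQL:

```sql
-- If brute-forcing a 56 bit key takes ~1 second, a 64 bit key of the same
-- algorithm takes 2^(64-56) = 256 seconds, a bit over 4 minutes:
SELECT POWER(CAST(2 AS float), 64 - 56) / 60.0 AS minutes_for_64_bit;

-- A 128 bit key takes 2^(128-56) seconds, on the order of 1.5e14 years,
-- matching the "almost 150 trillion years" figure:
SELECT POWER(CAST(2 AS float), 128 - 56) / (3600.0 * 24 * 365)
       AS years_for_128_bit;
```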
Symmetric Keys – Symmetric keys can be encrypted by certificates, asymmetric keys or even other symmetric keys. Those methods produce a key that cannot be moved to another SQL Server unless the entire database is moved. Symmetric keys can also be created by specifying source and identity attributes. This technique can be used to produce the same symmetric key on two or more instances, or in two or more databases of the same instance. However, this is not recognized – even by the often overly generous BOL documentation – as producing a key with particularly strong entropy.
I repeat, there is no BACKUP SYMMETRIC KEY statement in T-SQL. Don’t plan on moving a symmetric key to another instance unless you move the complete database. On the other hand, symmetric key encryption/decryption is remarkably faster than encryption/decryption by certificate.
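The source/identity technique mentioned above looks like this; the phrases are placeholders, and the entropy caveat applies – the key is only as unpredictable as the KEY_SOURCE phrase:

```sql
-- Running the same statement on two instances (or two databases) yields
-- an identical key, since the key material is derived from KEY_SOURCE
-- and matched by IDENTITY_VALUE. Phrases and names are placeholders.
CREATE SYMMETRIC KEY SalaryKey
    WITH ALGORITHM = AES_256,
         KEY_SOURCE = '<long secret passphrase kept out of source control>',
         IDENTITY_VALUE = '<phrase identifying this key>'
    ENCRYPTION BY CERTIFICATE SalaryCert;
```

This is effectively the only way to "move" a symmetric key between instances without moving the database, at the cost of trading generated entropy for a passphrase you must now protect like any other secret.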
Asymmetric Keys – Asymmetric keys are similar to certificates. I don’t think this is true in other settings, but in SQL Server the asymmetric key is the functional equivalent of the certificate; they simply require different sources from which the public/private key pair is built. Certificates must be imported from X.509 certificate files, while asymmetric keys can come from strong named assemblies, executables, key files and CRYPTOGRAPHIC PROVIDERs (EKM). Unlike the certificate, the asymmetric key is not stored in an X.509 compliant format; like the certificate, the asymmetric key is an RSA public/private key pair. Like the symmetric key, there is no BACKUP ASYMMETRIC KEY statement.
This Laurentiu Cristofor SQL 2005 SQL Security blog post recommends certificates over asymmetric keys, mostly because the certificate can be exported. It occurs to me that exports can be an advantage both for forward recoverability and for the data thief.
No matter the technical details of any pattern used to encrypt, sign or hash a cell, it is critical to first establish the overarching forward recoverability guidance. Skipping this step guarantees a waterfall methodology for data protection. That is about where we are today: locked in an endless dance between breaches and patches that (try to) protect as much as possible against the last breach. The details may not be reachable in early design, but without a framework they may never be reached.
One critical benefit of planning your stay at the Hotel Juliet is a chance to experience, and ideally eliminate, your weaknesses before you lose your data. A significant improvement over what most have now.