T-SQL Cryptographic Patterns – part 2: Hotel Juliet

SQL Server 2008 R2 Books Online clearly spells out why to consider the primary key as the authenticator when encrypting a column value in SQL Server:

If an authenticator parameter is specified when data is encrypted, the same authenticator is required to decrypt the data. At encryption time, a hash of the authenticator is encrypted together with the plaintext. At decryption time, the same authenticator must be passed to the decryption function. If the two do not match, the decryption will fail, indicating that the value has been moved since it was encrypted. We recommend using the primary key of the table in which the result will be stored as an authenticator. – http://msdn.microsoft.com/en-us/library/ms365192(v=sql.105).aspx 

The authenticator effectively protects the data from substitution attack vectors. The encrypted column in one row cannot be substituted for that same column from another row if a unique authenticator column is used, e.g. the primary key, unless of course the attacker(s) know to move the primary key to. Granted, that has it’s own set of challenges but nonetheless can be done. Or, in some cases, could not the attack swap all columns except the encrypted column and the primary key to successfully perpetrate the attack this is supposed to prevent? It would surely come to the surface faster as the executive would suddenly get a laborers wage and squawk real loud. Further, in the BOL example where all rows contain personally identifiable information (PII) stored in clear text how much protection does encrypting one column provide anyway?

The “Authenticators” page referenced above seems to be removed from the SQL Server 2012 BOL. That is too bad. It needed to be strengthened not removed. Instead the keyword is now indexed to the EncryptByKey() page – that admittedly already had much of the same information, and I certainly understand the need to consolidate. The recommendation on the new landing page is somewhat different:

We recommend using a column containing a unique and unchanging value as the authenticator.

I agree the authenticator must be unchanging but it still goes too far to one extreme to be the default usage or a dangerous recommendation without understanding the user data. Just as there are other possibilities as valid as using the primary key, there are other possibilities than a unique and unchanging value. Yes, there is protection from a cut-n-paste attack if a unique value is used as the authenticator. Is this also true if  the letter ‘a’ is always appended to the value before encrypting. Not quite. The encryption operation will a have a randomizing effect that could thwart the dictionary attacker by obfuscation even when a static salt is added, but it is portable to that column in any row that contains a value that can be decrypted using the same authenticator value – even without any nasty gyrations like trying to swap key values. This easily justifies the additional processing required to increase the uniqueness of the authenticator when compared to using a static value like “a”, but it does not have to be unique to realize the protection. Other values with high selectivity are also candidate authenticators. The less unique the authenticator the more resistant to dictionary attack in the stolen database scenario, the more unique the more resistant to a substitution attack. Somebody must finally decide the balance. Saves a lot of time if that happens before development begins.

The requirement that the value used for an authenticator cannot change too is a bit over zealous. All that is absolutely necessary is to re-create the cypher text using the authenticator’s new value when ever the authenticator column is updated. Not something you would want to do many times a second, but of far less consequence if done in a singleton query 1000-2000 times a day. The question to answer is how often the authenticator column is updated for all rows, not whether or not it is updated. Remember, the algorithm itself does the randomizing, not the uniqueness of the authenticator.

There is also a concern to me that when the authenticator is a clear text value stored with the data – regardless how unique -swapping the encrypted value and the authenticator or alternately all columns but those two could still be an effective attack. Furthermore, if the database or a backup is stolen, even a brute force attack could benefit from awareness that there are unique un-encrypted column(s) in the row – and even more if it is known which column is used as the authenticator. It is even something of a leg up for the database thief that the SQL Server documentation makes that exclusive recommendation. Doesn’t that make this configuration a likely vector for attackers? Maybe this has something to do with the removal of that Authenticators page for the documentation? I mean, if you use salt but it is not a secret, exactly how is that different than not using salt? And in particular in the stolen database scenario?

Imagine now that the first developer to write T-SQL encryption for an application either thinks a bit like me (poor bastard) or has no idea what to use for an authenticator so picks the only column that can be used as an authenticator parameter without a type conversion. For example, the Standard_Title column is chosen from the salary swapping example’s table included in the documentation pages referenced above instead of the primary key Employee_Id column. Standard Corporate Job Titles are not very likely to change and there is not likely to be a large number of folks with the same job title – but it can and does happen that people may share a job title. Then along comes someone that spots the discrepancy between what was done and what Microsoft recommended. Of course several people decide to weigh in on the matter and eventually someone decides what to do from here. Regardless the resulting interruption takes some time and slows the general progress of the application. Making such a change after the application is live can bring even more pain.

I intend to dig more into choosing and using authenticators in the next post in the series. This post must first turn more toward the burdens T-SQL encryption adds to foundational architectural decisions and application design: specifically in business continuity planning. This planning should be business reason focused and may include:

    • standards and compliance targets
    • criteria for identifying crypto-technology candidates
    • communication plan
    • division of responsibility/ownership 
    • forward-recovery expectations
    • SLAs
    • key management

The  Open Web Application Security Project’s (OWASP) “Cryptographic Storage Cheat Sheet“, for an example, may be a good general instrument to kick off the discussion.

Assuring the ability to change keys as required is essential. Disaster recovery is important but there are many requirements for key-changes that are not calamity based, e.g. key lifecycle, SLAs, compliance, key expiration, organizational security best practices, etc.; and some calamities requiring a key change that need not be a disaster, e.g., key corruption, newly uncovered vulnerabilities, intruders, hardware failures, etc..      

When a column in the row is used as the authenticator, all the machine pieces required to reverse engineer the cipher text are included in the database. And all full database backups also include the encryption key, the encrypted value and the authenticator. There may still be protection by passphrase, however that is most likely the easiest value to determine brutally. This is precisely the reason the database is not often the stage for encryption of application values. It is worthwhile to consider authorizing access to the encryption hierarchy through physical presentation of a fob, blob, key file or other public/private key-pair(s) that will never spend time in SQL Server. This affords more protection than a phrase but can also add to the security burden and adds some risk to recoverability. 

One strength of the T-SQL authenticator parameter is that a hash of the authenticator value is encrypted with the value. It is not clear how this hashing operation compares to a HASHBYTES() operation. It is clear only that the relationship between the authenticator value and the encrypted value has to be based upon some system generated one-way digest of the authenticator so is more random than simple concatenation of the character ‘a’ as salt or authenticator. An attacker may also need to reproduce the hashing operation applied to the authenticator to crack the encryption whether the key or any other value – no matter how static or random – is used as the authenticator.  For better and worse, hashes are deterministic digests so results are reproduce-able on another SQL Server instance – if not anywhere and on-demand.

The completeness of the cryptographic keys revealed can be used to seed an attacker’s dictionary automation if the database is successfully stolen, though few details about the authenticator and the hashing operation used are exposed superficially. If the database is stolen or compromised with elevated authority access, all data – encrypted or not- should be considered compromised and appropriate action initiated. 

Once a malicious user is able to connect to the database, especially with elevated authority, prevention of data theft is unlikely. Layers of authentication, authorization, firewalls, proxies and intrusion detection external to the database server are necessary. Audit , including Common Criteria or C2, and alerts offer only after-the-fact forensic protection and can also therefore also be manipulated maliciously from the inside with adequately elevated authority. It’s the old put a picture of an empty bank vault in front of the security camera while you clean the place out trick. 

The best protection once data has been stolen or compromised through a valid database connection is cell level encryption… and perhaps self-destructing data – what ever that is.  Knowing what needs to be done to change keys and how long it will take to re-key and get your application back on line when thusly compromised is of utmost importance to stop the breech in a live environment.  

An example might be the Zappos breech of Jan 16, 2012 concerning the compromise of  the address and last 4 digits of credit card from apparently all(?) 24 million customers. Check out the coverage in the Onion. Sounds like they only got unencrypted data to me. Maybe even only a replica in the DMZ if they are lucky? No way to tell for sure from here. People cannot change their address, the cards will all surely have to be replaced even if the encrypted credit card number was not also stolen. What good did it do to encrypt any data when enough to steal the identity was left exposed?

What I still wonder is if they notified the 24 million customers directly and if so what they said them? Or were these people left to find out the next time they swiped the card after being invalidated?

The ability to execute a complete coordinated re-key (aka HJ or Hotel Juliet to cold war cryptographers in the mid 20th century) is often among the best if not only available actions to close a breech. Application incorporating planned key rotation at the onset are superior in their ability to re-key on demand. Retrofitting a key rotation plan is likely to involve some time offline. 

The need to figuring out how to rekey while under attack may be dooming. With the database there is no possibility to go high & blind like they did back in the day to reset the keys manually. A forced re-key, i.e. without access to the old key, when data is stored in the database, will result in loss of all encrypted data. A safe re-key must re-encrypt cipher text using the old key to decrypt and the new key to encrypt. This can be a long running operation that may leave the database temporarily degraded or possibly temporarily unusable.

The need and urgency of communicating the breach to the data owners/customers is a clear and unequivocal responsibility.  Without an established protocol this can be chaotic and often destructive for the organization that has been breached. Much better if individual responsibilities in such communications are clearly laid out in the plan.

Planning  Begins Now

The bottom line is the same here as with most software design issues. Decisions made early are highly advantageous if correct or correct-able and very costly if incorrect or delayed. Depending upon the criticality and  level of malicious interest in the data, a bad decision can be fatal. No security strategy should be the work of one mind or be implemented over night. The more perspectives that can be shed and experience gained with this topic before and while software is built, the more likely the end result will meet the requirements. Key rotation should be a measurable and measured task from the onset.

The front loaded thought of planning can consider many recovery scenarios.  A retrofit key change instead must often have as its primary goal figuring out a way to pick up the pieces and get the application running again pronto. Another – perhaps more obtuse – way to say it:

  • If you don’t need a business continuity focused forward recovery plan for your application, it is hard to imagine why you would bother with T-SQL encryption.
  • If you don’t have a business continuity plan but need a forward recovery plan work on fleshing out the forward recovery plan as – or better yet before – you take on T-SQL encryption.

Its also hard to image how a forward recovery plan for the application could be devised without an overarching business/organizational continuity plan unless the organization can absorb a whole lot of quickly made and often risky assumptions.

Forward Recovery Considerations by Key Type

Service Master Key (SMK) – Even applications that do not think they use encryption in SQL Server may reply upon the SMK to encrypt linked server passwords. Backup of the SMK may be critical to forward recovery and should be one of the first post installation steps for most SQL Servers. DMKs, Certificates and Asymmetric keys need not be encrypted into the SQL Server Encryption Hierarchy, though the SMK will always be used to CREATE MASTER KEY of the database before any other encryption objects other than the DEK can be created in that database. The DMK can be configured without SMK encryption, but the CREATE DDL will only and always encrypt a copy of the DMK into the master database and use the provided phrase or key to encrypt a copy into the database, then an ALTER MASTER KEY DROP ENCRYPTION BY SERVICE MASTER KEY can remove the copy in master. (The documentation shows bogus options for dropping the encryption by phrase. It doesn’t work.)

In SQL Server 2012 the DPAPI uses an AES encryption algorithm. Does this mean SQL Server 2012 will use AES-NI (Xeon onboard hardware acceleration)? It is also an interesting upgrade scenario. An in-place upgrade to SQL Server 2012 RC0 leaves the old 3DES algorithm’s SMK in place. To upgrade to AES it is necessary to run ALTER SERVICE MASTER KEY REGENERATE. This can be a long running operation if TDE is enabled. Otherwise it could just be a matter of re-encrypting the copies of DMKs that are stored in master.

A side-by-side upgrade may be able keep most of the REGENERATE time off line if the re-key is coordinated with the migration. A simple example scenario would be to keep the old server on line in read-only mode for the REGENERATE phase, then go off-line a moment to switch to the new server when the upgrade is complete. It could well be that no matter what, if you want AES encryption after an upgrade there is going to be some down-time. Consider yourself warned on that one. It would be good for the business continuity forward recovery plan if it were used  co Try to use the same process that is planned for periodic REGENERATEs.

DES has been vulnerable a long time. Triple DES is better but growing weaker by the day. In fact the expected life of AES was just over 15 years when they started thinking about it over 10 years ago and blind quantum mechanics – where the data is encoded onto photons – in cryptography is on the horizon. Today we can fit a bit in 12 bytes on an atom in the lab. More like a million bytes on your hard drive.

Consider a quarterly test restore of the SMK beginning soon after the application is ready for an integration testing environment and continue until the last row of sensitive data is no longer needed. That is too often to test a live system but the live system should still be tested to make sure the key backup is good each time it is REGENERATED. Once a year may not be often enough to REGENERATE to assure an aggressive attacker breaks your encryption… 

Database Encryption Key (DEK) – The DEK is encrypted by a certificate or asymmetric key in the master database. That certificate or key can be built with the SQL Server encryption hierarchy (SMK–>DMK–>private key of cert or asymmetric key), or from a file source(s), or externally managed key hierarchies using EKM.

Regeneration is always  an online operation for the DEK. The documentation warns about a potential performance hit, but when data change happens one row at a time the added overhead may go unnoticed on a properly sized SQL Server. Pretty sure I read somewhere it is expected to be a 5-10% hit on CPU. Compare that to an ALTER INDEX REBUILD of about the same number of bytes as the data that must be re-encrypted in the ALTER MASTER KEY REGENERATE. Just a guess though. I did see Kevin Finke’s post reporting about 46GB an hour to regenerate a DEK on SQL Server 2008 in late 2009, unknown hardware. He does not describe performance issues. Instead he shows the query he used to monitor the progress.

Like everything else about TDE, key re-generation is relatively painless compared to real cell level encryption. I hope only that data owners and DBAs are fully cognizant of the features and foibles of TDE

One thing I really like about TDE is that you have to have the certificate used to create the DEK to restore the database on another server. All backups are encrypted when TDE is enabled in the database. And no matter where they are stored. This is about as safe as SQL Server databases backups can get. I say use it if you have it, but also consider file system encryption if you don’t. File system encryption can actually provide more protection. Everything, including swap files, cores, backups and all other user created files on the server that might unexpectedly expose the clear text of cipher text or other secrets when the server is shut down even thought the database backups cannot be used freely and anywhere. The biggest downside to file system encryption is that if a file if copied to an unencrypted network location the encryption is lost.

Database Master Key (DMK) – The DMK shares no direct dependencies with the DEK. The DMK is always encrypted into the database by a phrase. As mentioned above, a copy of the key is encrypted into master by the SMK. This copy enables the DMK to automatically open when needed. Without the copy in master, the phrase must always be provided to open the DMK and use and cell encryption certificate that are encrypted by the DMK into the database. Conversely, the database without a DMK copy in master can be restored and used on another SQL Server and can provide privacy even from members of the sysadmin fixed server.

I have seen a few troubling search strings seeking information about the DMK that managed to find on my blog. Among them I can recall off hand:

“deadlock symmetric_key_id = 101″

“steal database master key”

Hopefully it is nothing to be concerned about. I certainly have not experienced a DMK deadlock to my knowledge. Of course to steal one all you need is ALTER DATABASE permissions to execute a BACKUP MASTER KEY.

A REGENERATE of a DMK will re-encrypt all dependent certificates, and keys. This too can be a long running process.

Like the DEK, the application should not have to interact directly with the DMK unless the copy in master is dropped. In that case it will have to be opened with the phrase before certificates encrypted by DMK can be opened.

Certificates – T-SQL Certificates are X.509 public key/private key pairs. The BACKUP CERTIFICATE DDL is similar to the SMK and DMK. The big difference is that there are two files output from a certificate export: one for the public key, one for the private key.

DECRYPTBYCERT() and VERIFYSIGNEDBYCERT() requires only the public key. Encryption and Signing need to used the private key. Other T-SQL methods for checking signatures (IS_OBJECTSIGNED, sys.fn_check_object_signatures() ) may only be looking in the system catalog access to verify a signature. The private key must first be opened to SIGNEBYCERT() or ADD SIGNATURE.

If you only need a certificate for DECRYPTBYCERT() it is much safe to export the certificate then drop the private key. This alone can thwart the database thief from learning the secrets. For a certificate used to verify signatures, never drop the private key.  

CREATE CERTIFICATE accepts a user specified [StartDate] and [ExpiryDate] however SQL Server does not do anything with the values other than store them and expose them. With T-SQL encryption, if you want rotation automation you must supply the mechanism to use these attributes. Not so much to know when it is time to rotate as a way to tell which is active in scenarios where it makes sense to rotate keys without the need to REGENERATE by using ‘virtual partitions’ by certificate, perhaps aligned with database PARTITIONSs. The application will need to know which certificate is active only to encrypt data and for signing. T-SQL DECRYPTBYCERT and DECRYPTBYKEY will use the the correct key – but only if it is already open or does not need to be opened before use.   

A crypto-period is an interval of time that a key should be available for use before it is considered too likely it has been compromised. Ideally this would be the information placed in the certificate’s date attributes A crypto-period is often taken – somewhat off-hand – to be 1 year. If that ever was more effective than the “duck and cover” nuclear defense strategy I was taught in second grade, it is not at this time. If I understand the evolution of things correctly, it looks like 1 year may originally have come from the arbitrary decision by the Certificate Authorities (CA) to collect an annual fee. While one year may have never have had much empirical relevance to key compromise risk, annual renewal is still often the norm, and still the fee interval. hmmm. I have a feeling the right interval for many (most?) security focused expirations could be somewhere on the short side of annually these days. That was probably way to frequent at the time the practice started. (Moore’s Law right?)

The strength of any cryptographic key is statistically known as the Shannon Entropy. From that entropy, the time to guess the clear text of a cipher by brute force can be estimated. Just for example, Wikipedia estimates the time to unscramble data when a 56 bit encryption key is used as 1 second, the same algorithm with a 64bit key about 4 and half minutes and a 128 bit key almost 150 trillion years! This is oversimplified, but the far less costly 64 bit may still be useful for data that must live only a few seconds if the performance advantage out-weights the security risk. I am unable to even fabricate a valid scenario for a 56 bit key. However if the data is compelling to the attacker, parallel processing can cut all the times. Additionally, the reality is that some 128bit algorithms are in fact already broken: most through some variant of the padding oracle attack. Conversely, Seagate argues that 128 bit AES encryption is about equally beyond the ability of current technology to crack as 256 bit AES. That is evidence that the algorithm as a major role in key strength. And I have every reason to believe that there may be a better way than “the stronger and longer the better” some day. Until there are application performance concerns related to encryption overhead, “the stronger and longer the better” seems a better rule of thumb as far as I can tell. One thing is more clear: AES is the most widely recognized by government and industry standards. Still, I will not be surprised when someone publishes a previously unknown 60 second AES hack that works.        

Symmetric Keys – Symmetric keys can be encrypted by certificates, asymmetric keys or even other symmetric keys. These methods produce a key that cannot be moved to another SQL Server unless the entire database is moved. Symmetric keys can also be created by specifying the source and identity attributes. This technique can be used to produce the same symmetric key on two or more instances or 2 or more database of the same instance. However, this is not recognized – even by the often overly generous BOL documentation – for producing a key with particularly strong entropy.  

I repeat, there is no BACKUP SYMMETRIC KEY statement in T-SQL. Don’t plan on moving a symmetric key to another instance unless you move the complete database. Symmetric keys are remarkably faster than encryption/decryption by certificate.

Asymmetric Keys – Asymmetric keys are similar to certificates. I don’t think this is true in other settings, but in SQL Server the asymmetric key is the functional equivalent of the certificate. They require different sources from which the public/private key pair is built. Certificates must be imported from X.509 certificate files while asymmetric keys can come from strong named assemblies, executables, key files and CRYPTOGRAPHIC PROVIDERS (EKM). Unlike the certificate, an asymmetric key is not an x.509 compliant key pair format. Like the certificate the asymmetric key is and RSA public/private key pair. Like the symmetric key, there is no BACKUP ASYMMETRIC KEY statement.

This Laurentiu Cristofor SQL 2005 SQL Security blog post recommends certificates over asymmetric keys mostly because the certificate can be export. That is an advantage for both forward recoverability and the database thief.

-

No matter what the the technical details for any patterns used to encrypt, sign or hash a cell it is critical to first establish the overarching forward recoverability guidance. Skipping this step guarantees a waterfall methodology for data protection. That is about where we are today: locked in an endless dance between breaches and patches that (try to) protect as much as possible against the last beach. The details may not be reachable in early design but without a framework they may never be reached.

One critical benefit of planning your stay at the Hotel Juliet is a chance to experience, and ideally eliminate,  your weaknesses before you lose your data. A significant improvement over what most have now.   

Posted in Secure Data | Tagged | Leave a comment

T-SQL Cryptographic Patterns – Part 1: hashed, scrambled or over easy?

Even for the laggards among us, the time to recognize that we have not been building adequate security into most data stores is long past. Now is the time to set things straight. Predictably, mastery of the fundamentals is almost all of what is needed.

Consider the principals that are consistently left undercooked in database applications in purported pursuit of the performance and usability Sasquatch:

  • set-based processing
  • audit
  • principle of least privilege
  • defense-in-depth 
  • division of responsibility
  • protection for sensitive data

Not only is this application ruining bigfoot an indefensible justification for reduced security, but strong data security need not inhibit performance, usability or innovation. These principals are far too important to a good outcome to forego. We so easily get caught up in making and managing the organizational bias between strong security and best usability that we loose sight of the real possibility to have both. Seems to me that most of us would prefer to wait for the careful transaction at the bank or store or pharmacy  – even if impatiently – than to find out we are 5 bucks short or just swallowed a capsule of wart remover compound expecting it to be prescription pain medication. About the same is true for software. Methodical secure processing is the most direct route to secure results all other things being equal.

(Instead, two slight imperfections in the legacy from our base ancestral instincts expose our economic system as as unsustainable as any other. Greed and corruption rear their heads to value Capitalist dogma (e.g. cash flow, P/E  and me too features) above innovation and fundamental security obligations. Compliance becomes an ever more ineffective stand-in for security. The difference between leaders and shady followers, swindlers and psychopaths blurrs. But I digress…)      

Cryptographic methods will generally be leavened and baked into the security disciplines listed above to create patterns in the application. Cryptography from the Windows perspective is a set of low level system services known collectively as the CryptoAPI. SQL Server relies upon the CryptoAPI for all intrinsic encryption capabilities. The implementation of these services is an application decision. Furthermore – and almost without exception – if there is a choice between encrypting in the application layer or encrypting at the data layer using T-SQL, encrypting at the application layer has at least a couple of advantages in security. 

  1. Encrypting sensitive data at the moment of origination is the best moment to confidently prevent leaks of the clear text value.
  2. Encryption certificates and keys are stored with the encrypted data in SQL Server.

Data that originates at the database may indeed be better encrypted at the database. That is just not where most data comes from. Data that enters the system in bulk may also be candidate for encryption at the database layer but I have to wonder why it isn’t encrypted already if it needs to be encrypted upon bulk load? In this last case the larger security burden may be assuring the secure import of an encryption key to first decipher such data before scrambling it with the local key. Most data cannot quite meet the test of proximity to origin when encrypted at the database server. User entered data is best encrypted and decrypted only while in the context of a man-in-the-middle protected session where an authorized user enters, requests or otherwise requires access to the clear text data. To be in the moment of origination all transactions are best committed or rolled-back at or within the context of that user’s secure and encrypted session i.e. SSL, IPSEC, etc..

If a T-SQL encrypted database is stolen or compromised, the thief gets the keys and the data. EKM, and to a lesser extent keys that require a very strong and well protected phrase to open weaken this argument favoring application layer cell encryption via a secure encryption services layer (HSM, encryption appliance, certificate authority, etc.) over T-SQL with the keys stored in the database and encrypted under the Service Master Key. Another challenge for keys that live with the data is restore-recovery. Restores may be undermined or made more complex by necessary key rotation practices. Unresolvable overlaps in the division of responsibility may arise when encryption uses keys from the database. On the other hand key rotation can be easily automated for off hours and there will be no gaps in responsibility if at least some of the DBA staff controls both keys and data.

The CryptoAPI functionality available to T-SQL consumption is actually quite narrow when compared to the universe of properties, methods and objects exposed by the CryptoAPI. That is good and bad. It means there are limits to what can be done in T-SQL. It also means most developers will not find the learning curve to get productive and effective with T-SQL encryption too steep. Also T-SQL encryption’s advantage, it allows and supports piecemeal implementation and includes a number of features and options that may be appealing when tenable:

  • Familiar T-SQL object administration and DDL syntax
  • Built-in cryptographic functions
  • Selection of encryption algorithm 
  • Auto-decryption of the Database Master Key for authorized users by default   
  • Keys and certificates included in all full database backups  
  • More efficient utilization of CPU resources on the typically underutilized CPUs of dedicated SQL Server database servers
  • Optional passphrase only decryption of the Database Master Key for division of responsibility fidelity that overrides both database and local administrator authority

What I most wish to convey in this series, are some recipes for baking encryption into T-SQL. This is done with the presumption that you have fairly weighed all application layer alternatives and come to the empirical decision that T-SQL encryption is right for your application.

All examples are from SQLClue’s Encryption Hierarchy Backup & Restore that uses SQL Azure for secure offsite storage. The installer is a not much more than than a check list and a few T-SQL scripts at this time so can be installed without any other SQLClue components. I’ll include snippets from the installer to demonstrate several cryptographic implementation patterns. To see how the pattern is fully implemented, review the T-SQL or install the Encryption Hierarchy Backup & Restore component script in an evaluation SQL Server 2012 environment to create the schema. You can download a zip file or grab a fork of my SSMS project at https://github.com/bwunder/EHAdmin.

Most data originates in the database in this application so it meets the sniff test for using T-SQL cryptography. The schema is based upon the script in the Encryption Hierarchy Administration post. That script made only limited use of encryption. In the version now under consideration, the cryptography yields a translucent database state: where even the sysadmin role members need not have a clear text view of all data. This is very powerful with regard to establishing an enforceable separation of responsibilities.

SeparationOfResponsibilites

The example application is also interesting because it uses SQL Azure for secure offsite storage. It does involve a trick: exporting certificate and key backups to an encrypted VHD and then importing the VHD’s container file into the database to move it to a central SQL Azure hub for safekeeping and recall on demand. SQL Azure is a reasonably secure offsite repository for the heterogeneous layers of encryption bundled around the data before it is sent offsite:

  • The container file for the VHD is encrypted by a key file or phrase. Once dismounted and loaded into the database it cannot be mounted from the database location. It must be restored to the file system and the phrase or key file required supplied just to get visibility of the backup files.
  • The certificate and key export files are encrypted onto the VHD by a phrase with a strong – and configurable – phrase strength policy. Once written to the VHD the phrase is required to open. 
  • The copy of the user database’s Database Master Key is dropped from the master database. When this copy is removed, an application enforced strength policy compliant phrase must be provided.
  • All key and certificate exports created are encrypted into the output file using a key that meets the application enforced strength policy.
  • All keys and certificates used by the application require a key or certificate to be open that can only be opened by providing a strength policy compliant phrase to open the Database Master Key, a certificate or a symmetric key.

The sysadmin running the install can select either change tracking or change data capture on supported SKUs along with a linked server to an ODBC DSN. The DSN can point directly to a SQL Azure instance or to a locally configured hub and then SQL Azure’s Sync services used to maintain the offsite mirror of all keys and certificates in local environment(s). With the ODBC option the secure offsite can as easily be SQL Azure as another database on the local network or a private cloud location that serves as a hub for n number of SQL Server’s. The offsite schema does not require any ability to encrypt or decrypt data. In fact the data on the hub may be comingled with rows from other SQL instances that use different keys. This allows a common storage and recovery scheme at the hub to support multiple SQL instances with an ability to prevent one instance from accessing artifacts taken from another instance – unless you explicitly ask for it that way.  

some proposed encryption rules of thumb

Use symmetric keys for cell encryption/decryption in T-SQL. Symmetric keys add far less overhead to IO processing than public key/private key encryption/decryption with a certificate or asymmetric key. Instead use the certificate or asymmetric key to encrypt a symmetric key. 

Use each key and certificate for only one purpose. Sure you can do it all with one key but if that key is compromised, the entire data set is compromised. In this usage you might have one certificate with a sole purpose of encrypting symmetric keys used for cell encryption and under that certificate might be a symmetric key for each encrypted column.

Never trust a DML object to verify its own signature. There is no way to get a monolithic procedure to reliably tell you if its signature is in place. If hacked in an elevation of authority or insider attack, what’s to stop the hacker that modifies the stored procedure from saying the signature is just fine even if now gone? For elevated security at the database to validate the authenticity of stored procedure all stored procedures are built WITH ENCRYPTION, each stored object, e.g., stored procedure, function, trigger, view, is signed and all stored procedures call a second stored procedure also protected by WITH ENCRYPTION and a signature from the same certificate as the called stored procedure that shares the validation processing with the primary called stored procedures. The processing cannot proceed unless this two part schema authentication processing succeeds.

design objectives

The main purpose of this application is in making sure no database objects are surreptitiously added, modified or removed from the schema to assure the integrity of the data.  The aim is to create a background server side application that is able to enforce a division of responsibilities among administrators. Unauthorized users, even if they can connect to the SQL instance as a member of the sysadmin server role, cannot access stored secrets used to create the key and certificate backups because they are unable to open the Database Master Key. In contrast, authorized administrators can also chose to store additional secrets in the secure passphrase manager and to password protect any private secrets as desired. File system paths are also encrypted to thwart unauthorized user’s from mapping files to database objects.

Even if someone does figure out which files(s) there are after, they will still need the phrases used for backup encryption of each to load the key and then perhaps even another phrase to open the key.  

patterns covered in upcoming posts

  • Column encryption using the primary key column as authenticator
  • Column encryption with clear text of another encrypted column in the row as authenticator
  • Schema object signing run-time verification 
  • Using WTH ENCRYPTION to safely embedded certificate phrases in stored procedures  
  • Hashing a random seed to generate a random salt.
  • Authenticating by a hash re-using a salt persisted to an encrypted column (secret salt)  
  • Encryption of passed variables into the audit trail as data administrators and developers cannot decrypt because they cannot access the phrase encrypted key. 
  • In-flight row signature for message authentication and schema validation
  • TRY-CATCH error event data encryption to prevent unauthorized access and attack vector exploitation of exception information 
  • Sharable symmetric key that can be implemented on many instances and securely conveyed by a combination of phrase, fob and/or file  
  • In-flight encryption in a table variable/table valued parameter
  • Signing rows in sys.messages (or any other system or user table)
  • Signing a file’s VARBINARY(MAX) image
Posted in Backups, Encryption Hierarchies, Secure Data | Leave a comment

Throw FORMATMESSAGE() in the SQL Server 2012 Tool Bag

SQL Server 2012 chucks THROW into Try-Catch block error handling brewery. Throw is designed as part of the TRY-CATCH contraption giving THROW the ability to reliably relay an exception down the T-SQL call stack. However, the inability of a THROWn message to accept replacement values is a show stopper without FORMATMESSAGE() in the current implementation.

FORMATMESSAGE() will see more utilization as THROW catches on.

T-SQL’s FORMATMESSAGE() function is too often overlooked anyway. The script below shows that the Denali CTP3RC0 FORMATMESSAGE() is a little different beast than we’ve had before. Looks like it gets a lot more like a .NET String.Format() method. For one thing, it lets me stuff CASE and system functions in the parameter list with abandon. (If it would let me pass a column or variable from the local database as the format string it would be even better. I can work around that with nesting, i.e. FORMATMESSAGE() can be used in a parameter of a FORMATMESSAGE()).  More noticeably, it allows me to specify a format string in addition to referencing a message_id in sys.messages. Not that there would be anything you could do with the formatting string that you cannot do in sys.messages, just that sys.messages is in master so using messages from the table binds the application to the SQL instance. FORMATMESSAGE() has long been effective way to assure type when you must combine disparate types into a single NVARCHAR value. FORMATMESSAGE() will see more utilization as THROW catches on. Error messaging, e.g. RAISERROR(), is the obvious use case. So much so that it is easy to overlook other great places to benefit from a low level (high performance?) system method:

  • code generation
  • dynamic SQL (almost the same principles as code generation)
  • cryptography – FORMATMESSAGE() can be used in-line as the clear text or authenticator but no implicit conversions like QUOTENAME() or REPLACE()
  • output formatting – columns can be aligned and trimmed for display
  • display formatting (almost the same principles as output formatting) 
  • messaging – a native alternative to XML
  • logging 
  • data cleansing
  • data conversion

Of course the most important usage will continue to be in error handling.

November 24 Update – the T-SQL in this post continues to work with the RC0 bits (11.0.1750.32) but still no update to the documentation confirming any new powers for FORMATMESSAGE().

Application stored procedures that experience runtime type and conversion exceptions are good candidates for FORMATMESSAGE() conformity. My personal favorite example conversion exception in this moment is:

SELECT CAST(-1234567890 AS NVARCHAR(10) )

which returns only the error message;

Msg 8115, Level 16, State 2, Line 1
Arithmetic overflow error converting expression to data type nvarchar.

Cracks me up! At the same time it is embarrassing to admit how often I have cast INTs to VARCHAR(10)s for formatting into a string. Negative numbers less than -999999999  need 11 characters for display. I have benignly ignored this fact for years. I remain confounded about how that is an arithmetic overflow though.

The same problem using FORMATMESSAGE() instead of  the mushy SQL Server native type system is no problem at all.

SELECT FORMATMESSAGE('%d',-1234567890);

and the result is -1234567890 of type NVARCHAR.

FORMATMESSAGE() also adds protection against the escape sequence or truncation attack vectors that aim to execute malicious T-SQL.

Is This Really FORMATMESSAGE()?

The documentation is very clear that FORMATMESSAGE works only with messages stored in sys.messages. The Denali documentation is not changed from the SQL Server 2008 documentation – so far. Nonetheless, the example successfully uses a message string instead of a message_id even though it is not documented to do so:

FORMATMESSAGE ( msg_number , [ param_value [ ,...n ] ] )

Notice that there is no option for text instead of a msg_number. Denali CTP3 RC0 Books OnLine even adds a special Note, “FORMATMESSAGE works only with messages created using sp_addmessage,” yet here I am with the CTP3 able to use a message instead. I guess that says things could change by RTM. On a SQL Server 2008 instance The script below will fail with the message

Msg 8116, Level 16, State 1, Line 17
Argument data type varchar is invalid for argument 1 of formatmessage function.

As mentioned earlier, SQL Server 2012 introduces THROW for use in Try-Catch blocks. A THROW statement or a system generated error in the TRY block transfers control to the CATCH BLOCK.  That’s about like RAISERROR. A THROW in the CATCH block throws the error down the call stack. Again no so difficult to get that behavior out of RAISERROR. The real difference is most obvious when an unexpected exception occurs in the CATCH block. A RAISERROR implementation is likely to throw that error on the floor while a THROW implementation will abort the batch. FORMATMESSAGE() may get more attention as THROW catches on, especially given the more elegant, i.e. less prone to failure and exploitation usage profile when unexpected data is encountered.

THROW does not support the printf formatting available with RAISERROR at all. Denali Books-On-Line – the web version – shows an example using the FORMATMESSAGE() function to format a message_id from sys.messages before passing it to the THROW.

FORMATMESSAGE(), at least in the guise of RAISERROR() has been in the database engine throughout the evolution of SQL Server. In SQL Server 2005/2008/2012 FORMATMESSAGE offers some help to ‘load’ up to 20 values of disparate type into an NVARCHAR. It is necessary to specify the type of the inputs using a subset of the printf formatting codes of the standard C language. (read all about printf in your favorite C header - hint: search local drives for a file containing the text “#define printf”).

Binary data remains more difficult to work with. More examples in the documentation would probably be helpful.

As an example, here is a FORMATMESSAGE testing script that uses FORMATMESSAGE to ‘dog food’ FORMATMESSAGE statements for each of the 42 combinations of printf flag and type supported by SQL Server.  Specify a value to match the data you supply, the @Type of that data and the desired @Format of the output string format:

-- printf formatting  %[[flag][width][.precision]]type
DECLARE @Format CHAR(1)         = ''   -- '',+,-,0,#,' '
      , @Width INT              = 6    -- display field size
      , @Precision INT          = 3    -- characters to display
      , @Type CHAR(1)           = 'X'  -- s, d, i, u, o, x, X
      , @Binary VARBINARY(128)  = 0x1239
      , @Int INT                = 11
      , @String NVARCHAR(128)   = N'test';
DECLARE @$QL NVARCHAR(256) = ( SELECT FORMATMESSAGE(
N'SELECT FORMATMESSAGE(''%%s -->%%%s*.*%s<--'', @$QL, %i, %i, %s)'
                      , s.Flag
                      , f.Type
                      , @Width
                      , @Precision
                      , CASE WHEN @Type = 's'
                             THEN FORMATMESSAGE('''%s''', @String)
                             WHEN @Type = 'd'
                             THEN FORMATMESSAGE('%d', @Int)
                             WHEN @Type = 'i'
                             THEN FORMATMESSAGE('%i', @Int)
                             WHEN @Type = 'u'
                             THEN FORMATMESSAGE('%u', @Int)
                             WHEN @Type = 'o'
                             THEN FORMATMESSAGE('%o', @Int)
                             WHEN @Type = 'x'
                             THEN FORMATMESSAGE('%x', @Binary)
                             WHEN @Type = 'X'
                             THEN FORMATMESSAGE('%X', @Binary)
                             END )
   FROM (SELECT           ('') AS Flag
         UNION ALL SELECT ('+')
         UNION ALL SELECT ('-')
         UNION ALL SELECT ('0')
         UNION ALL SELECT ('#')
         UNION ALL SELECT (' ') ) AS s
   CROSS JOIN (SELECT           ('s') AS Type
               UNION ALL SELECT ('d')
               UNION ALL SELECT ('i')
               UNION ALL SELECT ('u')
               UNION ALL SELECT ('o')
               UNION ALL SELECT (LOWER('x'))
               UNION ALL SELECT (UPPER('X')) ) as f
   WHERE CHARINDEX(@Format,s.Flag) > 0
   AND ASCII(f.Type) = ASCII(@Type) );
EXEC sp_executesql @$QL, N'@$QL NVARCHAR(256)', @$QL;

You get the statement generated and the result as output. A few of samples:

  • SELECT FORMATMESSAGE(‘%s –>% *.*X<–’, @$QL, 6, 3, 1239)   –>   4D7<–
  • SELECT FORMATMESSAGE(‘%s –>%#*.*x<–’, @$QL, 10, 6, 1239)  –>  0x0004d7<–
  • SELECT FORMATMESSAGE(‘%s –>%-*.*o<–’, @$QL, 6, 3, 13)     –>015   <–
  • SELECT FORMATMESSAGE(‘%s –>%0*.*d<–’, @$QL, 10, 6, 11)    –>    000011<–
  • SELECT FORMATMESSAGE(‘%s –>% *.*d<–’, @$QL, 3, 6, 11)     –> 000011<–
  • SELECT FORMATMESSAGE(‘%s –>%+*.*i<–’, @$QL, 4, 7, 11)     –>+0000011<–
  • SELECT FORMATMESSAGE(‘%s –>%+*.*s<–’, @$QL, 6, 3, ‘test’) –>   tes<–

You can paste the result into a query window, change @$QL to blank or ‘test’ and try similar scenarios to fine tune or develop into more complex FORMATMESSAGE statements.

Too bad the script doesn’t work in SQL Server 2008…

Here is a script that generates a BACKUP CERTIFICATE statement. The point would be to store passwords and other sensitive values as ciphered-text yet be able to use them when needed, e.g. you’d want to be able to recall the passwords in the event a forward recovery from a corrupted certificate is needed.

--------------------------------------------------------------------------
-- Using FORMATMESSAGE to generate BACKUP CERTIFICATE STATEMENT
--------------------------------------------------------------------------
-- user defined table type for encrypted name/value pair
IF TYPE_ID('NAMEVALUETYPE') IS NULL
  CREATE TYPE NAMEVALUETYPE AS TABLE
      ( Name VARBINARY(8000) NOT NULL
      , Value VARBINARY(8000) NOT NULL );
GO
--temp key available to session only until session is closed
--not useful for persisted data unless source,identity & phrase persisted
IF KEY_GUID('#TestKey') is NULL
  CREATE SYMMETRIC KEY #TestKey
  WITH ALGORITHM = AES_256
     , KEY_SOURCE = 'testing 1,2,3'
     , IDENTITY_VALUE = 'This is only a test'
  ENCRYPTION BY PASSWORD = 'test';
OPEN SYMMETRIC KEY #TestKey DECRYPTION BY PASSWORD = 'test';
GO
-- setup the required inputs
DECLARE @CertificateName NVARCHAR(128) = 'CertificateName'
      , @CipherType NCHAR(2) = 'MK' -- 'NA' (EKM), 'MK' , 'PW'
      , @DbName NVARCHAR(128) = 'DbName'
      , @BackupName VARBINARY(8000) = ENCRYPTBYKEY(key_guid('#TestKey')
                                                      , N'BackupName')
      , @DMKPhraseName VARBINARY(8000) = ENCRYPTBYKEY(key_guid('#TestKey')
                                                      , N'DMKPhraseName')
      , @KeyPhraseName VARBINARY(8000) = ENCRYPTBYKEY(key_guid('#TestKey')
                                                      , N'KeyPhraseName')
      , @PublicKeyFileExt NVARCHAR(10)  = '.cer'
      , @PrivateKeyFileExt NVARCHAR(10) = '.prv'
      , @UseHash BIT                    = 1
      , @BackupNameBucket INT
      , @BackupPath VARBINARY(8000)
      , @Backuptvp NAMEVALUETYPE
      , @DMKtvp NAMEVALUETYPE
      , @Keytvp NAMEVALUETYPE;
-- checksum truncation of the random lambda
SET @BackupNameBucket = ABS(CHECKSUM(HASHBYTES('SHA2_256'
    , RIGHT(CAST(DECRYPTBYKEY(@BackupName) AS NVARCHAR(448))
    , FLOOR(LEN(CAST(DECRYPTBYKEY(@BackupName) AS NVARCHAR(448)))/2)))));
SET @BackupPath = ENCRYPTBYKEY( key_guid('#TestKey')
                              , N'Z:\BackupPath\', 1, @DbName);
INSERT @Backuptvp (Name, Value)
VALUES ( @BackupName
       , ENCRYPTBYKEY( key_guid('#TestKey'), N'@Backuptvp.Value', 1
                     , CAST(DECRYPTBYKEY(@BackupName) AS NVARCHAR(448))));
IF @DMKPhraseName IS NOT NULL
  INSERT @DMKtvp (Name, Value)
  VALUES ( @DMKPhraseName
         , ENCRYPTBYKEY( key_guid('#TestKey'), N'@DMKtvp.Value', 1
                 , CAST(DECRYPTBYKEY(@DMKPhraseName) AS NVARCHAR(448))));
IF @KeyPhraseName IS NOT NULL AND @CipherType = 'PW'
  INSERT @Keytvp (Name, Value)
  VALUES ( @KeyPhraseName
         , ENCRYPTBYKEY( key_guid('#TestKey'), N'@Keytvp.Value', 1
                 , CAST(DECRYPTBYKEY(@KeyPhraseName) AS NVARCHAR(448))));
--------------------------------------------------------------------------
-- decryption with FORMATMESSAGE
SELECT FORMATMESSAGE
        ( 'USE %s;%sBACKUP CERTIFICATE TO FILE = ''%s%s%s'' %s;%s'
        , @DbName
        , CASE WHEN @DMKPhraseName IS NOT NULL -- need to open master key
              THEN (SELECT FORMATMESSAGE
                      ( 'OPEN MASTER KEY DECRYPTION BY PASSWORD = ''%s'';'
                      , CAST( DECRYPTBYKEY( Value
                          , 1
                          , CAST ( DECRYPTBYKEY( Name ) AS NVARCHAR(448) )
                          ) AS NVARCHAR(128) ) )
                    FROM @DMKtvp )
              ELSE '' END
        , CAST(DECRYPTBYKEY( @BackupPath, 1, @DbName ) AS NVARCHAR(1024) )
        , CASE WHEN @UseHash = 1
                THEN CAST( @BackupNameBucket AS NVARCHAR(448) )
                ELSE CAST( DecryptByKey( @BackupName ) AS NVARCHAR(448) )
                END
      , @PublicKeyFileExt
      , CASE WHEN @CipherType <> 'NA'
             THEN
FORMATMESSAGE
('WITH PRIVATE KEY ( FILE=''%s%s%s'', ENCRYPTION BY PASSWORD=''%s'' %s)'
      , CAST(DecryptByKey(@BackupPath, 1, @DbName ) AS NVARCHAR(1024))
      , CASE WHEN @UseHash = 1
              THEN CAST( @BackupNameBucket AS NVARCHAR(448) )
              ELSE CAST( DecryptByKey( @BackupName ) AS NVARCHAR(448) )
              END
      , @PrivateKeyFileExt
      , ( SELECT CAST( DECRYPTBYKEY( Value
                           , 1
                           , CAST ( DECRYPTBYKEY( Name ) AS NVARCHAR(448) )
                           ) AS NVARCHAR(128) )
          FROM @BackupTvp )
      , CASE WHEN @CipherType = 'PW'
              THEN (SELECT FORMATMESSAGE
                            ( ', DECRYPTION BY PASSWORD = ''%s'''
                            , CAST( DECRYPTBYKEY( Value
                                       , 1
                                       , CAST ( DECRYPTBYKEY( Name )
                                                  AS NVARCHAR(448) )
                                       ) AS NVARCHAR(128) ) )
                    FROM @KeyTvp )
              ELSE '' END )
            ELSE '' END

        , CASE WHEN @DMKPhraseName IS NOT NULL -- open master key
                THEN 'CLOSE MASTER KEY;'
                ELSE '' END );
GO

Some sample output for the three different @CipherType values  (after pasting from the results window to the query window and adding a few CR-LFs):

-- @CipherType = 'NA', @DMKPhraseName = NULL, @UseHash = 0
USE DbName;
BACKUP CERTIFICATE TO FILE = 'Z:\BackupPath\BackupName.cer' ;

-- @CipherType = 'PW', @DMKPhraseName = NULL, @UseHash = 1
USE DbName;
BACKUP CERTIFICATE TO FILE = 'Z:\BackupPath\116889000.cer'
WITH PRIVATE KEY ( FILE='Z:\BackupPath\116889000.prv'
                 , ENCRYPTION BY PASSWORD='@Backuptvp.Value' );

-- @CipherType = 'MK', @DMKPhraseName = 'phrase', @UseHash = 1
USE DbName;
OPEN MASTER KEY DECRYPTION BY PASSWORD = '@DMKtvp.Value';
BACKUP CERTIFICATE TO FILE = 'Z:\BackupPath\116889000.cer'
WITH PRIVATE KEY ( FILE='Z:\BackupPath\116889000.prv'
                 , ENCRYPTION BY PASSWORD='@Backuptvp.Value' );
CLOSE MASTER KEY;

Posted in Code Review | Tagged , | Leave a comment

Encrypting XML Typed Data in SQL Server

Two issues:

  1. W3C XML Encryption is broken by most accounts.
  2. SQL Server encryption functions cannot digest XML (pun intended).

A double whammy!

The first issue is a bigger deal. If you are encrypting sensitive data with a web service – and you probably are – then your data is vulnerable. If that data is personally identifiable information about individuals your are doing business with then it is your duty and responsibility to stop using those solutions until the XML Encryption standard is corrected. Not until the vendors roll out their fixes and the application is appropriately updated should the application be re-enabled. To knowingly continue to rely upon the broken encryption is a criminal activity in my opinion.

Here’s why. In 2003 the Federal Trade Commission (FTC) reported that over 27 million Americans were “victims of identity theft”. In 2007 ars Technica reported that 159 million people were “affected by data breaches” between 2005 and 2007. The Privacy Rights Clearing House reports 542,355,201 records breached from 2747 data breaches between 2005 and 2011. According to the New York Daily News Facebook alone is attacked 600,000 times a day. Almost all Facebook has to offer is your person information . The trend is much more than obvious. At the same time Reuters reports that MasterCard hits an all-time high on the stock market, “MasterCard Inc shares jumped 7.4 percent to $359.12 after the credit card processor reported its quarterly profit easily beat estimates on double-digit increases in volumes.”

The Information Commissioner of the UK, Christopher Graham told the Telegraph the other day that database breaches are up 58% in the UK in the last year. FWIW, the commissioner can levy fines up to £500,000 under the Kingdom’s Data Protection Act (DPA). Sounds like they have the possibility to agree with my sentiment regarding the undeniable responsibility to protect other people’s information. Sadly, I don’t think the penalty has been leveraged much – if at all – judging by the obvious parallel trend in the USA and the UK.

The broken status quo of capitalism that places profit above human decency, human dignity and often even human rights is rampant and unraveling around the world regardless of corporate hegemony and political corruption. Infosys co-chair Kris Gopalakrishnan seems to agree with that appraisal – calling out a systemic income imbalance that advantages the wealthiest 3%-5% in western and emerging economies. Infosys is the largest IT organization in the world. He also states that the middle class in the west is already being diminished in reciprocation to the rise of the middle class in the emerging economies and that the decline in the west is needed to achieve sustainability for the world population. Another double whammy for IT workers in the US, where the distribution of income grows ever more skewed and the downward spiral of the middle class is at least on the edge if not fully out of control. The wealthiest amass more wealth while the middle class has now lost the collateral it blew to Reagan Up: consume to help the economy, and now we have nothing to do but wait for some corporation to pee its record profits down on us. And wait. And wait. How much more obvious does it need to be for intelligent people to understand that the wealth that continues to accumulate in the accounts of a few – that have no practical purpose for more wealth – is coming not from vapor but out of the pockets of workers that too often don’t quite have enough and deserve better?  

In the USA the worst that can happen to an entity that carelessly exposes other’s private and/or personal information is a bit of hate mail and perhaps a few underfunded civil suits by folks who’s lives have been ruined and accounts depleted as a direct result of corporate negligence and stonewalling. There are rumblings about accountability measures from the government. What ever the government does, it will not be adequate: too many politicians are in the pockets of the people that would make a bundle from that unsecure data. Enough is enough!. Anytime someone can waltz in and steal other people’s personally identifiable information from a data center or business network, the  systems owners and administrators were carelessly negligent. If the theft recurs or is ongoing that negligence has gone beyond the point of criminal culpability. For effective deterrence, the penalty must match the crime. That’s all I’m saying.

If the next 5 businesses that failed to protect personally identifiable details that had been demanded from customers were shut down for even a week by the government the landscape would be changed significantly for the better for as long as the threat of meeting a similar fate for similar negligence might endure. This is what the government used to do in the mines in Wyoming where I worked in the 70s when safety would get too far out of hand in the push to produce, but even then someone often had to die to get the attention of the government inspectors.  An equitable solution would be to require the leaker to appropriately compensate the folks that were violated but such fairness will not find its way into the legal system in my lifetime I am sure. The leaker, of course, could pursue the attacker to recover costs.  The essential ingredient in a solution is unavoidable accountability. effective regulation could create accountability were the bureaucracy not so corrupt. Even the DOJs National Institute for Justice cannot uncover the real cost of identity theft noting, “A serious lack of data on these issues inhibits research into possible intervention strategies that could reduce the harm.” Until the effective costs to corporations is established as more than the cost to not protect data as it to do the right thing and protect our information, the necessary change is unlikely as long as the effective costs for corporations can be avoided. According to Mike Vizard, as he lays out a a survey based accounting in “The Cost of Insecurity” (CTOEdge, DEC 2010), “… the majority of the IT organizations surveyed by [IT security vendor] Lumension have yet to deploy any of these security technologies, even as their operational security costs continue to rise.” Follow the article link ink to see the secure computing adoption rate charts by technology and 12 other mind blowing charts built from a survey of IT shops.

The simple fact is, the technologies needed to protect data and prevent the vast majority of hacker access to information is at hand. Sure it is more costly to implement those technologies than to skip them, but how can it be negative for the profit motive of a corporation when that corporation takes genuine and necessary steps to increase the customer/consumer confidence in that corporation with a comparatively small incremental rise in development costs? The payoff for a corporation that takes the high road is insufficient in today’s capitalism. It just simply isn’t done. I would argue that it is a perversion of the profit motive that we have ever allowed contrivance and deception to drive profit. Profit is not inherently bad. Dishonesty is bad. To profit from dishonesty is perverse. Corporations know they cut corners on data security. That is one of the main reasons they are corporations. So the individuals calling the shots can hide from personal responsibility. Customers/consumers know that corporations cut corners on data security. Government knows that corners are being cut on data security. Nothing changes. That is sick. And inadequate data security in corporate data centers is a primary reason people’s information is exposed and can subsequently be used to do harm to that person.

I am certainly not saying all that is necessary is to punish violations vigorously in this post to solve the worlds data security problems. Secure data solutions will be more complex with extensive defense-in-depth and well worth the effort and cost. They will probably always be under attack as well, requiring on-going hardening. The goal I advocate is to simply do what is right. There is no gray area around what of your personal information I have the right to be careless with, let alone negligently so…

Update January 13, 2012:

Consumer Reports story today on Yahoo indicates that the financial industry has found a way to profit from the fear over identity theft: http://finance.yahoo.com/news/debunking-hype-over-id-theft-080000873.html

The  story also breaks down some identity theft data for 2010. Highlights:

  • 50 million Americans paid $120 -$300 a year (that’s $6-$15 billion) for identity theft protection in 2010 – for liability protection the banks already must provide
  • Identity fraud was down 27% in 2010 to a mere 8.1 million victims
  • Personal identity theft – “in which someone uses your name, birth date, and Social Security number to open new credit accounts, tap your health insurance, earn taxable income, or commit crimes in your name” was at 765,000 households in 2010

On the second issue, the incompatibility of SQL Server ‘s encryption functions (ENCRYPTBYASYMKEY, ENCRYPTBYCERT, ENCRYPTBYKEY, ENCRYPTBYPASSPHRASE) and the XML data type is not such a big deal. Even before SQL Server had the XML type, XML has been stored in the database as VARCHAR. That conversion between the XML type and the VARCHAR type is really all that is necessary to encrypt XML at the database. From there the door is open to sending encrypted oData or XML document that may or may not rely upon XML Encryption. That’s right! You could actually work around the XML Encryption vulnerability now without breaking the web services in many case by encrypting the XML at the database and decrypting only where needed.

In addition to missing XML – and UNIQUEIDENTIFIER, and CLR types btw – compatibility, SQL Server encryption functions are also limited to VARBINARY(8000) output in length. That is not a problem for the error handler but may be for many XML documents. Large XML documents are where much of the pain of the broken XML encryption will be felt.  To work around this limitation consider encrypting only sensitive XML attributes at the database and then providing public decryption keys in a secure manner. only to those that must decrypt that data. This is more or less how W3C XML Encryption goes after the problem. One risk in a piecemeal roll-ur-own XML document encryption like I’m suggesting is that the XML attributes not encrypted may still reveal enough information for a successful attack that eventually breaks the encryption. It is of utmost importance that XML encryption is ‘fixed’ ASAP. Using SQL Server or another work-around to plug the hole should not even pretend to replace XML Encryption. SQL Server should be considered only as an augmenting and/or interim solution until the better suited technology is made trustworthy. 

When data originates in the application, SQL Server is almost always a less desirable place for encryption. In part this is because the data must move to the database unprotected.  However, there are many SOA scenarios where the data enters the distributed environment via the database. (i.e. peer to peer transfers, XML feeds, performance data, database errors, etc.) When the database is a de facto gateway for the data into the SOA then the database is a great place for the encryption.

To see how this might work to relieve the XML Encryption woes, consider the following error handler that writes errors to a SPARSE XML column of a logging table. This is data that absolutely originates at the database so is very clearly the best place to encrypt any sensitive data in the error’s message. An encrypted error message is also an effective way to combat padding oracle attacks where-in the messages returned from unsuccessful decryption attempts are mined and analyzed to eventually crack the encryption. The padding oracle attack is proving especially effective against cipher-block chained (CBC) digests such as is used in the XML Encryption . SQL Server also uses CBC digest, yet is malleable to a more resistant and somewhat slower hash based digest. (see http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation for an overview of block cipher modes. See the post https://bwunder.wordpress.com/2011/10/07/repel-dictionary-attacks-with-checksum-truncation-of-the-random-lambda/ for some of my thoughts on hash based digests.)

I don’t mean to exclude streaming modes in the possibilities here, there just isn’t any support for that possibility in SQL Server. At some level streaming encryption makes sense for XML, especially for humongous documents. In that scenario, a streaming resume-able – possibly even transactional but that may be only wishful thinking – encryption would be ideal.

The following example is based loosely on the error handler modeled in the TRY-CATCH topic of Books On-Line.  The example is intended to be extend such that the error event is persisted to a SPARSE XML column in an activity logging table. Once encrypted, the column becomes be a SPARSE VABINARY(8000) column.   

DECLARE @Id NCHAR(36);
SET @Id = NEWID();
BEGIN TRY
  RAISERROR('test', 16, 1);
END TRY
BEGIN CATCH

  SELECT ERROR_NUMBER() AS ErrorNumber
       , ERROR_SEVERITY() AS ErrorSeverity
       , ERROR_STATE() as ErrorState
       , ERROR_PROCEDURE() as ErrorProcedure
       , ERROR_LINE() as ErrorLine
       , ERROR_MESSAGE() as ErrorMessage FOR XML RAW;

  OPEN SYMMETRIC KEY ErrorKey
  DECRYPTION BY CERTIFICATE ErrorCertificate
  WITH password = 'Au&6Gf% 3Fe14CQAN@wcf?';
    SELECT ENCRYPTBYKEY ( KEY_GUID( 'ErrorKey' )
            , ( SELECT ERROR_NUMBER() AS ErrorNumber
                      , ERROR_SEVERITY() AS ErrorSeverity
                      , ERROR_STATE() as ErrorState
                      , ERROR_PROCEDURE() as ErrorProcedure
                      , ERROR_LINE() as ErrorLine
                      , ERROR_MESSAGE() as ErrorMessage
                FOR XML RAW )
            , 1
            , @Id );

    SELECT ERROR_NUMBER() AS ErrorNumber
         , ERROR_SEVERITY() AS ErrorSeverity
         , ERROR_STATE() as ErrorState
         , ERROR_PROCEDURE() as ErrorProcedure
         , ERROR_LINE() as ErrorLine
         , sys.fn_varbintohexstr(
              ENCRYPTBYKEY ( KEY_GUID( 'ErrorKey' )
                           , ERROR_MESSAGE()
                           , 1
                           , @Id ) ) AS ErrorMessage
    FOR XML RAW;

  CLOSE SYMMETRIC KEY ErrorKey;
END CATCH

The resulting  XML from the first of the three error message formats is clear text:

<row ErrorNumber="50000"
     ErrorSeverity="16"
     ErrorState="1"
     ErrorLine="5"
     ErrorMessage="test" />

The second is completely encrypted (I added line feeds for display rendering):

0x00335906BE45EA4AB8BF0D94BFB6FC9D010000005E2DAA7F20DA4D827612ECFF
506ABBCBB7BF182711576E4E71E6B3C019123A88E2A7BD47F14A9EA45FB3F55BCB
642B25CF3A5BD9B44C0F58024113774D582C77F3D87D5816E877D643C19230CFDC
70D6A8FDE3269CF136A7111C362E6CB934800DA4D28FEE3E7425F53E14330816E7
C0407E798006FC3B434EC127BD05FCA61597E66150A5E2D8F4616070711CB2C27F
A2B51D7FE242E8F9A6ABB58C6697B6A4AF5B1CFC423E5927E983E4E1D7F0B8571D
7DDCC70184016F4B7EC0D823D4DF615F30ABC5B114738D373FDC0D973DB20EB977
E5CC8246CA7F3CF2E487456DA6C1BC95851BC60890C9FD290DFBAC46B8B2

And the third is a hybrid that encrypts only one sensitive XML attribute (again I added the CR+LFs so you could see the result):

<row ErrorNumber="50000"
     ErrorSeverity="16"
     ErrorState="1"
     ErrorLine="5"
     ErrorMessage="0x00335906be45ea4ab8bf0d94bfb6fc9d010000
       00537a48339b1669c2de290be82c0895acca90a74f19011a4d00
       6b09adb0930d49b116993af0a3be9e23172b5d0cd9855b379ce8
       85a510e3ca46556ef5bf15fcf2" />

The third option represents a middle ground between encryption and storage requirements. This demonstrates the ‘divide and conquer’ tactic that may be necessary to apply the available tools to a workable solution. Chunking for SQL Server’s VARBINARY(8000) encryption result limitation may be necessary. 

(And of course in the example it is important to persist that @Id to the logging table as well if you ever want to decrypt the value. In the solution I lifted the example from the procedure name is captured by the row so instead of keeping it here as redundant encrypted data, ErrorProcedure() is not captured in the XML column.)

Posted in Code Review, Encryption Hierarchies, Secure Data | Tagged , , , , | Leave a comment

Considering Cryptography in SQL Server Database Architectures

It is useful to have a ‘big picture’ understanding of the possible layers of encryption available for data-at-rest in a data-centric application. A visual that conveys those layers is challenging due to the number of dimensions that must be considered. Here is my best shot at the visual – for today anyway. If you find it easy to understand the diagram at a glance there is probably nothing of interest in what follows (and there is a jar in the Smithsonian with your brain’s name on it):

EncryptionInDepth

On the other hand, if your are like me it may look confusing and way too busy. Makes me want another cup of coffee.  Nonetheless, I happen to know it will make perfect sense to all once broken down into 3  coherent parts. 

I will begin with the middle part. You may notice that a user database with a data file group and a FILESTREAM file group – dangling precariously off to the side of the database -is depicted. Along side the user database are the four system databases. The only piece of this picture that may be unfamiliar to some is the FILESTREAM file group that is part of the database but actually lives outside of the database. This is intended to represent a typical FILESTREAM enabled configuration. Nothing special.

  EncryptionInDepthDbgeneric db (select to view full size)

The databases are enveloped in various cryptographic building blocks:

Behind, architecturally underlying the databases are SQL Server’s Transparent Data Encryption (TDE), a file encryption container that is less than the physical volume in size (e.g. an encrypted file that is able to present a VHD) and also a fully encrypted system (e.g a drive, storage array, or complete file system).  These encryption layers are designed to encrypt everything placed in storage in the background. No application change is supposed to be required. Also, the encryption protection cannot be avoided or forgotten in development. In fact, implementation is a configuration issue and should be done by the database’s Administrator.

EncryptionInDepthContainerscontainer scoped encryption layers

The main thing to notice here is that I have stretched and skewed the building blocks to cover as many scenarios as possible in the diagram. With 3 blocks 9 dimensional states can be represented: state 1 being no encryption and at the other end state 9 – the darkest pinkest rectangle toward the center in the image above – being TDE of database pages into an encrypted file exposed as a VHD on top of an encrypted file system. In between are all the combinations visible as overlap of the blocks in the image.

Perhaps of interest, these container scoped layers are not available to a SQL Azure database.

In front of, or structurally within the database are several useful database object and column level cryptographic layers. These layers scramble data, verify that nothing has changed, protect intellectual property, protect personally identifiable information,  determine authenticity and extend the usability of data – primarily by creating HASHBYTES() or other checksum lookup surrogates. These layers are generally implemented by the application developer. They are also the most important layers in terms of the strength of data protection and therefore deserve consideration for integration at the the application layer. In almost all cases, if there is an option, cell value encryption at the application with key storage external to the database server will yield a more secure and scalable implementation (all other things being equal).  

EncryptionInDepthCellscell scoped encryption layers

With the exception of HASHBYTES(), SQL Server cryptographic functions are not available in SQL Azure. Certainly designs that rely upon equivalent client side methods are able to store encrypted values in a properly typed column in SQL Azure, but good to keep in mind that the server-side approach to protecting data in SQL Azure inadequate at this time.

There are an undetermined number of ways to successfully implement cell or column encryption. To my way of thinking this is good. If everyone does it a bit different, all the more difficult to get past for the malicious attacker: a little obscurity actually can improve a good layered defense but cannot be a replacement. There are a wide assortment of cell and column encryption schemes in the introductory Encryption Hierarchy Administration template post and even more in the hardened Encryption Hierarchy Administration 2.0 template.

In this post the goal is to create a visual that shows the possibilities. For that purpose I overlay the three explained images to arrive at that diagram that started this post. The difference is that now the many ways that the cryptographic elements can be used together to build various depths of encryption should jump right out at you. Completely demystified as promised. Toward the center of the diagram are solutions where all encryption layers might be in play. Toward the periphery of the diagram, the layers fall away leaving only clear text and unprotected storage. Performance and usability are examples of the hard to control centrifugal forces pushing application designs to the periphery, while security, authenticity and compliance are core drivers to concentrate applications toward the center.  The size, shape and location of the encryption dimensions allow a two dimensional view of the architectural possibilities. 

This is not a diagram of a dynamic and configurable run time environment mind you. Instead, this is a visual aid to explore adequately robust cryptographic solutions when designing and developing a data application with encryption requirements. The divisions and overlaps in the diagram indicate what is possible to create in all the combinations and permutations not what must be created in all scenarios. The application defines the requirements not the encryption technology. This is a tool to help define the environment needed to support the requirements.

Spend a moment contemplating the diagram now that you have some background of the diagrams structure and purpose. I expect that you will find your mind exploring some highly secure possibilities. Better yet, the next time you are working on a project with high data protection requirements take a few minutes to contemplate the diagram at that time in the context of your requirements. I am equally confident the diagram can help identify strong defense-in-depth strategies quickly and with ease when used for that purpose. 

Posted in Code Review, Encryption Hierarchies, Secure Data | Tagged , , , , , , , | Leave a comment

Repel Dictionary Attacks With Checksum Truncation of The Random lambda

One brute force attack is an all out assault that tries every possibility to guess the content of a cell. One brute force target is a ‘word’ stored in a hashed cell. A cell in SQL Server cell-level encryption is generally taken as a column.  A dictionary attack is a digital commando raid: a smaller more coordinated attack; and very likely to be more effective than brute force.  Both can target hash values stored in SQL Server whether SQL Server did the hashing (HASHBYTES) or the application generated the values. The dictionary attacker assembles the most likely phrases or keys – the dictionary, then ‘joins’ the dictionary to the values to be hacked to identify the key or phrase used to generate the hash. The dictionaries can be built off-line and before-hand to create a swift and powerful attack. The more information exposed to the attacker a priori, the more precise the possible attack. The brute force attacker uses every possible key or phrase so is usually slower. 

Dictionary attacking seems to be a popular past-time judging by the number of freely downloadable dictionary attacking applications available. There does not appear to be reliable research to be certain of the prevalence of brute force and dictionary attacks against hashed values in SQL server. My sense is that there may be a disturbing number of attacks that go completely undetected and/or unreported. And as more applications necessarily wade into the cryptographic swamp, attacks will surely increase.

It is important to mention that without encryption attackers have a cake walk. Information is lost the instant the attacker makes that first connection. No need to build a dictionary or try even a single possibility of brute force. Poor data governance will reduce or eliminate the number of brute force and dictionary attacks on your systems!

A simple hash of a value is particularly vulnerable to a dictionary attack. A simple hash being one that simply applies the encryption algorithm to the value.

HASHBYTES( N'SHA1', @Value );

Salting a value before hashing helps to confound the dictionary attacker essentially by injecting nonsense into the value. If you can use it, the more salt the better. A salt of comparable size to the value being hashed is recommended.

HASHBYTES( N'SHA1', @Value + @Salt);

With salt, the dictionary attack is useless provided the attacker does not have access to the salt. The salt must be carefully protected. A brute force attack may still succeed though may take significantly longer to complete than without salt. Salt exposed as clear text in another column of the table or in another table or even hard coded creates vulnerability to the dictionary attack once discovered by the attacker.

This underscores the relentless high-stakes battle being waged between data application owners and hackers over hashing and encryption. If a reasonably skilled hacker is determined to crack your encryption, they can eventually do so. SQL Server is particularly vulnerable because the encryption keys are stored with the data. Good defense from dictionary and brute force attacks include:

  • strong, well tested, multi-dimensional security (defense-in-depth)
  • strongest encryption possible (security above performance)
  • anonymous lookups – using salted HASHBYTES()  (decrypt only when needed)
  • securely store and backup all salt – (store local and remote copies) 
  • securely share public keys (never expose private keys)

  • encrypt data at rest and all data backups (TDE, file system encryption) 
  • backup encryption keys (store local and remote copies)

  • securely backup all encryption passphrases (store local and remote copies) 
  • rotate encryption keys and passphrases (force brute force attack restarts)

  • encryption by combination of keys and passphrases (separation of responsibilities) 
  • very strong passphrase strength

  • stringent access approval requirements (separation of responsibilities)
  • continuously test and update roll-back and roll-forward recovery plans

Dynamic key management can get messy and/or expensive real quick in recovery scenarios. However, every time you rotate in a new key most attacks reset to square one. The benefit of having a known good backup when you need it is priceless. The headache of sound key management is worth the benefit. Rotate keys faster than the secrets can be guessed and you should be good. How big that window is depends on the strength of the security, strength of the encryption algorithm, strength of the salt as well as the general level of interest in the data and the sophistication of the attacker. 

Hashing only part of the value is a common way to further explode the attacker’s dictionary. This approach includes some unknown part of the value – of length lambda – in the hash. The excluded part of the value in the hash – along with the salt – makes it that much more difficult for the attacker to assemble the dictionaries. 

HASHBYTES( N'SHA1'
         , LEFT( @Value
               , FLOOR( LEN( @Value ) / 2.0 )  + @Salt)

Hashing also increases the risk that two rows with different values will produce the same hash. It is a small possibility but that risk is always there with HASHBYTES() or any function that tries to uniquely identify a value using a type with a domain too narrow to uniquely identify all possible values. 15 years ago DATETIME columns with a 1/3 mille-second minimum resolution granularity were being used to uniquely identify rows with some success. Few would even consider a DATETIME as a candidate key today. In less than 15 years the same will be true for HASHBYTES.  It is not safe in T-SQL to presume a hash absolutely and uniquely identifies the value represented by the hashed value unless the hash is defined as a unique constraint or unique index of the table where the encrypted value is stored. When the uniqueness of the hash is enforced by DRI, it is also a good idea to always handle the duplicate key error. It is more likely that other columns must be evaluated to determine if a match on the hash is in fact a match on the value. Often – sometimes even intentionally – another column is available to resolve the query; others time – and again sometimes even intentionally – an encrypted value may need to be unencrypted to complete the join. Having to un-encrypt to fully identify the row can cause application pain if queries must decrypt large sets, but nicely reduces the vulnerability to attack. It is possible in almost all cases to first get the rows that match a specified hash value into a derived table and then to decrypt only the few resulting rows to find the one requested by a query that knows more about the value it wants than just the hash. The bigger the encrypted blobs, the more the encryption of a few resulting rows can become a significant effort for both process and database server. In this regard it is essential to attenuate the hash processing to the application. Probably a good idea to do a sanity check about the value of the hash for a large cell. In general, it is more likely that full-text query processing is needed than an exact match on a 1GB blob.

In light of this unmitigated potential for conflict a de-duplication external to the hashing operation is needed. Never treat a hash of a value as unique unless you put a constraint on the value that it must be unique in the set. I’m thinking, if the hash cannot be considered unique anyway and the logic must deal with possibility of a duplicate and the data I intend to encrypt is huge (< million rows of customer info is not huge) or under exceptionally heavy load (<5 short running concurrent batches per second is not a heavy load for my laptop), it seems reasonable to me to reduce the hash one step further and define a bucket of hashes that can be represented by the same BINARY_CHECKSUM of the hash. It is not necessary to explicitly put any values into the buckets. That will (may) happen by itself as the hashed value is persisted to the database. In most of my use cases the chance of a duplicate even at the integer level of narrowing is somewhere close to the chance of moon falling out of the sky. The important aspect to recognize is that the buckets do not necessarily represent a single hash just as the hash does not necessarily represent as single value; and then to assure that when duplicates are found in the bucket there is enough information in the query or schema to resolved which of the multiple matches is the one desired. Narrowing the type of the hash to an integer will result in an easier number to work with for the developer, a reduced execution context footprint and reduced read-ahead requirements.

The narrowing method used here also opens the door to randomly generating row identity values at the INT level – or why not SMALLINT or BIGINT or NUMERIC(7,5) or any primitive SQL type? – so that the sequence of creation is not revealed in the surrogate key.

Raul Garcia blogged a hash like Message Authentication Code (MAC) scheme that protected against dictionary attacks a while back. In Mr Garcia’s example the salt appended to the value is the hash of an encrypted NEWID(). As he points out, the encryption of the NEWID produces a non-deterministic value that is a much better random number generator than the SQL Server NEWID() or RAND() functions.  The statement he uses to compute a MAC is:

HASHBYTES( N'SHA1'
         , ENCRYPTBYKEY( KEY_GUID( N'ValueKey' )
                                 , CONVERT( VARBINARY(100)
                                          , NEWID() ) ) )

The new GUID is cast to a VARBINARY because SQL Server encryption functions  do not have the UNIQUEIDENTIFIER in its vocabulary. The GUID cast as VARBINARY is encrypted to produce a nondeterministic and therefore random value. The encryption function returns a random VARBINARY(8000) that is subsequently hashed with the SHA1 algorithm to produce a VARBINARY(20) digest that represents the encrypted new GUID in a way that has no means to be reverse-engineered. The result is non-deterministic so may be interesting as an unordered identity column – but there remains always that chance of a duplicate hash value.

CRYPT_GEN_RANDOM can also be used but the name I would have to give the technique is not near as fun. Instead of “checksum truncation of the random lambda” we get “VARBINARY truncation of the random random.” My guess is this might not scale quite as well either due to the overhead of encryption functions versus hashing functions – but I have not test that, just guessing…

CAST(
  CRYPT_GEN_RANDOM
       ( 4
       , ENCRYPTBYKEY( KEY_GUID('ValueKey')
                     , N'type align with NEWID cast'
                     , 1
                     , CAST( NEWID() AS NCHAR(36) ) )
        ) AS INT );

Instead of substring truncation as described above I now propose in most scenarios to use a BINARY_CHECKSUM of the hash to capture an integer representation of the original GUID (Correct me if I am wrong but is not that still deterministic truncation akin SUBSTRING, but in a way that considers the complete underlying value to reduce the probability of any brute force guess being correct instead of throwing part of the values unique identity on the floor?). An index of 4 bytes INT values will fit on 1/10th the number of pages as would an index of the same number of rows with about 40 bytes per row in a VARBINARY(8000). Integers will get you to about the same place in terms of selectivity of the hash value with a significantly reduced total scan length requiring  less read-ahead and therefore less memory pressure. Data keyed on whole numbers  is endlessly easier to humanly work with than data keyed on a large binary value.

For the MAC that looks something like:

BINARY_CHECKSUM( HASHBYTES( N'SHA1'
                          , ENCRYPTBYKEY( KEY_GUID( N'ValueKey' )
                                         , CONVERT( VARBINARY(100)
                                                  , NEWID() ) ) ) )

And for a deterministic hashed matching value more like:

BINARY_CHECKSUM(
        HASHBYTES( N'SHA1'
                 , LEFT( @Value
                       , FLOOR( LEN( @Value ) / 2.0 ) + @Salt) )

;where @SALT is a well protected static value, perhaps generated as a MAC akin to Raul Garcia’s original proposal?

The per row storage advantage of the INT returned by BINARY_CHECKSUM over the VARBINARY(20) returned by HASHBYTES using the ‘SHA1′ algorithm is tiny. But it will add-up in a table with millions and millions of rows. Plus, the overhead difference between the INT and the VARBINARY is significant in other ways. As already mentioned, indexes and therefore scan lengths will be somewhat shorter. Sorting will be slightly faster. Development and testing are also marginally easier with the INT type. I’m spending too much time here to be certain that any performance benefit of INT of VARBINARY is not exaggerated – if it even exists.      

It may also be controversial to claim CHECKSUM as a truncation method. CHECKSUM bucketing is a well established and accepted practice however the de facto truncation method at this time is based upon the left most (higher order) bits of the value.  The NIST publication SP-800-133, “Recommendation for Cryptographic Key Generation” seems to at least allow for some flexibility in truncation of the hash (I highlighted a few spots in red):

Let K be the directly-generated key or the seed to be used to generate a key. K is a bit string value of the following form:
K = U ⊕ V, (1)
where
• U and V are of the same length as K,
• U and V are independent of each other, and
• U is derived from the output of an approved Random Bit Generator (RBG)
within the key-generating module that is generating K; the output of the RBG may have been transformed by an approved post-processing method (see Section 5.2) to obtain U.

FIPS 198-1, “Keyed-hash Message Authentication Code (HMAC)” specification – the more recent NIST darling over the classic Cipher Block Chaining MAC (CBC-MAC) that SQL Server uses – has something to say about truncation as well:

A well-known practice with MACs is to truncate their outputs (i.e., the length of the MACs used is less than the length of the output of the HMAC function L). Applications of this standard may truncate the outputs of the HMAC. When truncation is used, the λ leftmost bits of the output of the HMAC function shall be used as the MAC. For information about the choice of λ and the security implications of using truncated outputs of the HMAC function, see SP 800-107

Note that in the HMAC specification, the bits to use are stated as an option-less directive! (“…shall be used.“) SP-800-107, “Recommendation for Applications Using Approved Hash Algorithms”  seems pretty clear that the standard truncation method for a hash is only a recommendation for the readers convenience. (Is this really how standards are born? One person writes down an example to try to be helpful. Later someone needs to borrow the captured helpful thought. e viola!  In the process of the borrow, the convenience morphs into an edict. Right or wrong, a standard is born.)

SP-800-107 says:

5  Cryptographic Hash Function Usage
5.1      Truncated Message Digest

Some applications may require a message digest that is shorter than the (full-length) message digest provided by an approved cryptographic hash function specified in [FIPS 180-3]. In such cases, it may be appropriate to use a subset of the bits produced by the cryptographic hash function as the (shortened) message digest.
For application interoperability, a standard method for truncating cryptographic hash function outputs (i.e., message digests) is provided strictly as a convenience for implementers and application developers. The proper use of a truncated message digest is an application-level issue.
Let the shortened message digest be called a truncated message digest, and let λ be its desired length in bits. A truncated message digest may be used if the following requirements are met:
1. If collision resistance is required, λ shall be at least twice the required collision resistance strength s (in bits) for the truncated message digest (i.e., λ ≥ 2s).
2. The length of the output block of the approved cryptographic hash function to be used shall be greater than λ (i.e., L >  λ).
3. The λ left-most bits of the full-length message digest shall be selected as the truncated message digest

In Raul Garcia’s MAC the lambda – the length of bits being hashed- is not exploited to forge the authentication code. It is based on a new GUID at its core, cast to VARBINARY(100) – somewhat arbitrarily using that type and size – because the UNIQUEIDENTIFIER type cannot be used in the hashing function.

In the NIST SP-800-133 sample method the left-most lambda bytes of the value the code authenticates are used. The protection from dictionary attacks is about as perfect as it gets in Mr. Garcia’s example, however the NIST method can be reproduced as needed in  hashed password authentication schemes. Raul’s method may be better for use in queries where the MAC’s value is always part of the question (e.g. DRI, joins and indexes). The NIST recommended method is the only option of the two that could correctly be used in a uniqueness test that reproduces the hash and searches for a match in the data set – or at least a pre-test to get it down to one bucket of MACs that must be cracked open to finally determine the desired row.

In all cases, the only way to assure that a hash or hash-bucket collision in T-SQL does not crash or corrupt the application it to code for collisions. The duplicate hash value collision  is a fundamental keyed set dilemma comparable to the old school design mistake of using a DATETIME as a unique key. When a data type cannot describe all the unique values that it are assigned in the set that type can be described as narrowing. Hashed values are narrowing columns for clear text values, buckets are narrowing columns for hashes. A collision occurs when a duplicate value is actually placed in a narrowing column. When a narrowing column is a candidate key an additional piece or two of uniquely identifiable information must be included to resolve the collision. The same hash value must be usable to describe two or more unique values or the application must be open to spurious duplicate key errors. Specifically, a table with a narrowing column must include additional columns in any unique index or key to assure uniqueness. A narrowing column requires query and indexing that do not presume uniqueness of hash or bucket columns along with DRI that can correctly disambiguate a collision.

Whether and how a collision resolver deals with duplicates at the VARBINARY(20) bucket size or the INT (4 bytes) bucket size is not meaningfully different. The beauty of using BINARY_CHECKSUM  as a truncation method is that instead of using a fixed subset of bytes from the value as the original value, a subset of each byte is used. With the NIST substring method, the hacker can still perpetrate a dictionary attack on the hash, but is left with the task of mapping the part of the original value not reached by the lambda. That could take a while. With Raul’s random method, a dictionary attack using his MAC is not possible. The only relationship they have is that the MAC was created because the value already existed.

Using CHECKSUM instead of BINARY_CHECKSUM will reduce the number of buckets as will removing the sign (ABS()).

Interestingly, RSA defines MAC as synonymous with checksum: http://www.rsa.com/rsalabs/node.asp?id=2177

Until convinced otherwise, I plan to make use of checksum truncation for bucketing of lookups in my encrypted data where I can.

Posted in Encryption Hierarchies, Secure Data | Tagged , , , , , , , , , , | Leave a comment

Using Authenticity, Signing, Nesting in T-SQL to Protect Sensitive Data from (even) Highly Privileged Users

Digital Non-Repudiation is about distributed communication authenticity. Non-repudiation in the digital security context is a proactive design concept intended to simulate the authentication of a hand written authorization. Evidence that this is a hard problem is found in the still widespread requirements for notarized hand written signatures in the year 2011.

Non-repudiation as a data layer design element has been widely rejected as unnecessary and invalid because the database is implicitly a trusted internal resource: why would anyone want to verify their own signature and what real use could it serve? Most argue that protecting the endpoint is protecting the data layer. While there is some short term profitability in that position,  the breakdown in data layer authenticity is rife with nightmare stories. It would be silly to ignore the risk just because non-repudiation alone does not solve it yet that seems to be the status quo.

Elevation of authority attacks are wide spread. Furthermore, the sophistication of attack as well as identified vectors of attack are expanding.

One major authenticity risk area is forgery or fakery of referenced objects.   Even in a change controlled SQL Server database, objects and configuration can be modified in a ways that are not reflected in the current version in the source control system. It is common for shops that develop T-SQL to require periodic re-synchronization between the “current” source control “version” and the actual objects on a SQL Server. It is essential to eliminate this drift. No matter how astute the data access monitoring tools (DAM), if you do not have the T-SQL source truly under control you don’t have much in the way of authenticity.

Modes of Authenticity

To combat forgery and fakery a method to determine the authenticity of the other ‘players’ in a given communication. Players in a database communication could include client applications, users, SQL Server instances, databases, tables,stored procedures, functions, triggers, queues, etc. Unless your DAM can do this you are not doing enough. Consider this table that lists several ways that are used to determine when a communication is authentic:

authority authenticator weakness strength
verbal trust deniable convenient
signature trust deniable durable
signed in triplicate signed copy deniable signature durable
signed and notarized Public Notary inconvenient + trusted authenticator
network communication trust deniable convenient
signed network communication trust broker posers & rogues non-repudiation
data access SQL principal hierarchy
posers & rogues KERBEROS 
DML objects SQL catalog posers & rogues principal of least privilege
signed DML objects SQL encryption hierarchy rogues schema authenticity

Whether schema authenticity at the data layer is non-repudiation at the network layer is unimportant provided there is some acceptance for using the strengths and weaknesses listed in identifying relative value of authenticity in various architectural styles of communication and at various structural layers to the degree that looking at non-repudiation helps provide a stronger conceptual basis to establish adequate strong authenticity at the data layer. 

Given:

that a T-SQL attacker that obtains an ability to connect with elevated sysadmin or dbo credentials is unstoppable from carrying out their will

and,

that users with dbo database credentials – including the sysadmin server role members that are automatically made dbo when they enter any database, have implicit permissions to change all database configuration and object definitions in ways that will alter results returned to the application yet go undetected by the application or by internal database signature validation of the objects directly involved in legitimate database operations

and,

that it is always a plausible scenario that a currently trusted user with elevated authority can unexpectedly become an attacker using sysadmin or dbo authority

  then,

signature based non-repudiated protocols alone cannot assure the authenticity of the simplest of database operations. 

In non-repudiation, about all that is required is common access of the two parties to a public/private key pair (CREATE CERTIFICATE) to certify the communication. The objective of non-repudiation is mostly to assure that everyone gets the message at a pragmatic level. Non-repudiation doesn’t care what the two parties do after the communication or if the information exchange was useful; only that both parties are who they say they are and perhaps the content of – or altered state resulting from – the communication.

“Non-repudiation protocols are designed for verifying that, when two parties exchange information over a network, neither one nor the other can deny having participated to this communication.”  (Klay & Vigneron)

Obviously, the ‘parties’ in a database operation that runs completely within the SQL Server’s memory space do not normally exchange information over a network; but they can if someone wants that functionality — or injects that functionality. Likewise, there should be no argument that the ‘two parties’ in a database operation are really two software components of a common host most of time and in fact are more likely to be three or more components than the always dyadic constraint of communication between so called network endpoint . The party can grow quite large when you start looking at tall the database objects that are involved – and so can be manipulated maliciously – in a database operation. Limiting the communication at the database to two parties is beyond unrealistic. This goes to the very important to a point I wish to clearly establish above all else: a malicious user with elevated authority in the database can easily modify the unprotected database in ways that can significantly alter the database and/or the data returned.

At the sports book where I did a little DBA work one clever syndicate (organized group of thieves and crooks) figured out how to flip their bets to the winner on the over and under at just the moment between the end of a game and when the business layer automation could settled the wagers for the event. As far as I know it was never possible to say they were actually caught… It is nothing short of astonishing that this could have been done without insider assistance. If you saw the complexity of what they had done to manipulate wagers and the thoroughness used to cover the tracks you would likely agree. Yet “it was outsiders” seems to be the conclusion they (the management team mostly but Interpol and even Scotland Yard were involved somehow) reached in the investigation. This all even before the crooks started to get an idea anyone was on to them, but nabbing international attackers is not such an easy thing to do; to some (large?) extent due to corruption. Another part of this particular scam involved playing poker with the dishonest winnings against another syndicate member and losing intentionally to launder the ill-gotten winnings from the tainted account where the bit had been flipped into an account not directly tied to the hack. (Probably a good table to sitting at for the honest gambler too!) I was never really in on the skinny but did hear a few of the details: investigators somehow decided that there was no insider involved. I even saw a drivers license photo of the top guy in the syndicate from the other side of the world. I also finally got the permission I had been so far hopelessly seeking to reduce the permissions on some of the legacy application level SQL Logins. Still could not get them to budge on switching to Windows Authentication because they could not get their act together to move away from the 1 ginormous database connected directly to scads of web servers in the DMZ architecture. No idea how it ended with the attackers.  It was the sort of thing that went on continuously in that industry but these guys were a little more insidious and a little more successful than most.

I predict that the state of siege that the on-line sports book has been under for the last 10 years is a precursor to what will be common in the near future.  

It is the rare application that is able to detect let alone respond to the morphing or injection of database objects and the resulting bogus operations at run time. Pretty sure that statement can be applied to all flavors of SQL. For T-SQL application DML to be aware of the integrity of the data layer I know to be both quite possible and almost never done. The accepted practice instead is an existential presumption of authenticity of referenced database objects. (e.g. tables, stored procedures, functions, triggers, views, permissions, users, roles, certificates, etc.)? For the most part, this presumption relies upon the intrinsic hierarchical role based security model and absolute confidence in the integrity of the system(s). This is most realistic when only a trusted few are able to access the database from a role at the top of the hierarchy and at the lower rungs of the hierarchy the power to make change is highly restricted or revoked.

I’m thinking that is backward. It is obviously necessary to vigilantly assure that all relevant schema objects and configuration settings have not changed. Especially when changed in such a way that the result of database operation are altered? Only when there is certainty about the integrity all relevant schema objects would it be possible to certify the authenticity of the database operation. I need an example to even feel like I am explaining my premise adequately. 

A Facetious, Fictitious, Farcical Far Too Likely to be True Example

Consider this schema that inserts a Boolean value using a stored procedure. There is no other functionality. Period.

create table dbo.tally
  ( id int identity(1,1) PRIMARY KEY
  , user_choice BIT );
go
create proc dbo.submit_choice
  ( @choice BIT )
as
  insert dbo.tally (user_choice)
  values (@choice);
go

Let’s say this is a temporary polling station set up in a location restricted to only employees yet openly accessible to all employees of a business. The poll  promises an anonymous result yet with a stated request (rule of trust) that everyone only vote once and assurance of an anonymous result: there is no way to tell who placed any vote. The kitchen user’s are asked if they would like to discontinue the currently catered snacks and lunches in the kitchen in response to a widespread outcry after a recent outbreak of food born illness among workers when the potato salad was left out over the weekend. A polling station is a laptop placed on the end of the kitchen counter by the toaster with text instructions for the one question survey loaded into Notepad telling kitchen users how to open the SSMS query tool, connect to the correct database and place a vote by executing a stored procedure from an SSMS query connection. Examples of the exact syntax for yes and no that can be copied from notepad and pasted to SSMS are included as is a request for “each user to PLEASE! vote only once” and the query that will be used to determine the result. Not a second thought is given to security or user interface due to the high confidence that only trusted users could access the location.

-- YES - paste this to query window and execute to vote yes
exec dbo.submit_choice 1;
-- NO - paste this to query window and execute to vote no
exec dbo.submit_choice 0;
-- this query will determine the winner
select TOP(1) user_choice as [winner]
     , count(*) AS votes
from dbo.tally
group by user_choice
order by votes desc; 

Drop in an few votes an see how things look:

exec dbo.submit_choice 0;
exec dbo.submit_choice 0;
exec dbo.submit_choice 1;
exec dbo.submit_choice 0;
select TOP(1) user_choice as [winner]
     , count(*) AS votes
from dbo.tally
group by user_choice
order by votes desc;

The result is

winner votes
------ -----------
0      3

That looks right to me.

What no one could have known is that a militant raw vegan PETA loony colleague – a web developer with adequate SQL skills and exceptional attitude that actually did not use the kitchen much because of her now 4 year running raw vegan ‘fad’ diet – would be one of the first to enter the kitchen after the voting begins. This person’s naked agenda is the abolition of meat and meat by products from the human diet by all non-violent means possible. She pops open the connection to the database as described in the note and needs only a few seconds to understand the schema and the stored procedure, a moment to think and a few keystrokes to add a trigger that would assure the outcome yet risk minimal additional scrutiny that might be caused by a completely outlandish results.


create trigger dbo.MSIndexMaintenance
on dbo.tally
with encryption 

instead of insert   

as
set nocount on;
insert dbo.tally (user_choice)
select case when ( t.now < .55 )
            then 1 else i.user_choice end
from inserted as i
cross join
  (select sum(cast(user_choice as float))/count(*) as now
   from dbo.tally ) as t;

after running the exact same set of inserts again (remember the other 4 votes are already there)

exec dbo.submit_choice 0;
exec dbo.submit_choice 0;
exec dbo.submit_choice 1;
exec dbo.submit_choice 0;
select TOP(1) user_choice as [winner]
     , count(*) AS votes
from dbo.tally
group by user_choice
order by votes desc;

There are now 8 rows in the table. Two (2) rows should hold the value zero (0). The other six (6) should be 1. Right? The result from the summary query is now:

winner votes
------ -----------
1      4

Oops…

This may explain why so many corporate kitchens have gone vegan! What’s worse (better?), signature checking of the compiled objects to establish a non-repudiated communication would not have helped the bottom line of a single feedlot or hot dog factory. The signature of a SQL Server table is unaffected by the addition, modification or removal of a trigger or a permission placed on that table.

Adding a trigger is not the only hi-jinx possible. A Synonym pointing to a Linked Server location that replaces the table is another possibility for ‘stealing’ votes. Or how about simply adding the trigger logic to the stored procedure? Other possibilities may occur to the reader. Many exist.

The example shows the ease and stealth with which the T-SQL outcome can be manipulated. The result is made invalid and any claim of authenticity left looking foolish by the facts after a hack if steps are not taken to assure schema authenticity. This hole in the signature seems like the utility company allowing you an electrical outlet that does not go through the meter (guess where I plug everything in). One thing is for sure: the hackers will continue to bang away at Microsoft implementations in search of vulnerabilities as long as Windows is a leader in the server space. And if there is a problem it will continue to become known before too many people get stomped on. I can only hope it does not become known with my data… Notice I mentioned only Windows and not SQL Server. That is because an elevation of authority attack will most likely occur before the stolen highly privileged user credentials are use to gain database access. The only other real possibilities are an insider attack or SQL Injection combined with patience. 

Of course the base farcical ‘mistake’ in the example application is reliance upon the honesty of people. Almost every application makes this mistake of foolish trust to one degree or another, it is admittedly presented as self portrait in caricature in the example. In computer applications trust is used as an acceptable calculate risk in the interest of efficiency and supportability. On the Internet, even that friend, loved one, colleague or boss can be convincingly spoofed. The only rule you need to stay digitally safe, especially when the Internet is in the picture is, “trust no-one”. Most of us step over that boundary far to often and don’t even realize it. We forget that computers are not people.

And there are many other errors in design in the made up example. Some even intentional. For example, the presumption that the data is safe because it is in a secure physical location is not only fallacious but proved to be an invalid assumption based solely on hope. Speculative design is not likely to produce authentic results.

The status quo is that database operations do no need to meet the test of non-repudiation. OK, fine. What I cannot agree with though is the unspoken consequential decision to simply not worry about authenticity at the data layer. It makes my head spin to think too long about all the ways the integrity of a database operation can be maliciously corrupted or compromised. Although non-repudiation sets a pretty low bar for authenticity because it presumes that communication is dyadic and because it has no interest in the process, only the participants.

Simplistic shared secret verification a la non-repudiation is inadequate to assure communication authenticity in SQL Server. Many changes to an object’s component attributes (indexes, triggers, permissions, etc.) and dependencies (called modules and referenced tables) are not encapsulated by an object’s signature when signed using T-SQLs ADD SIGNATURE statement. For example, object signing alone on a table does not see any CREATE/ALTER/DROP of triggers or indexes on the table or any GRANT/DENY/REVOKE permission applied to that table. Likewise changing the table in a way that does invalidate the signature would do nothing to the T-SQL signature of a stored procedure that references the table or a trigger attached to the table.

Toward Authenticity at the Data Layer

Actions to gain confidence in the authenticity of database operation include:

  • Control Change. Allow only schedule, reviewed and well tested changes. Be unequivocal about this. Shoot for only planned change too, but be willing to schedule the occasional one-off as long as it meets the established review and testing standards. Be willing to adjust the standard to improve results.
  • Limit change. Limit the number of individuals that can apply changes to the environment and limit the the number of changes any individual can make to those needed to do what they do. Always align the permissions of the database user with the access required by the user as precisely as practical (principle of least privilege).  
  • Digitally sign all objects in the schema at creation. Use a carefully guarded key and then ‘continuously’ verify that the signature of self and all legitimately referenced objects are in tact at run-time. The private key of a certificate can be removed from the SQL Server when not needed. Object signatures can be verified using T-SQLs IS_OBJECTSIGNED() function without the private key yet a signature cannot be reapplied until the private key is restored.  Consider a password signed certificate with a very strong password when security requirements granularity for the key is not aligned with the role/principal hierarchy.   
  • Silently monitor for changes. An asynchronous process that captures all change is essential for monitoring and forensics. One method is to use EVENT NOTIFICATIONS that include all DDL_DATABASE_LEVEL_EVENTS. Log and protect the log from unauthorized access. Maintain high confidence that not only the message but also the message delivery system is not compromised. Any process able to leverage dbo or sysadmin authority can compromise any SQL Server change control system; including event notifications.  
  • Proactively prevent unauthorized change.  The intersections of authority for a role based provisioning model and the permissions based access model can create threat holes. Some users end up with more access than they need. The most likely suspect is when an group from the domain or tree is plopped into a role in SQL Server. Most never know it. The easiest way to become convinced is to try to plug that hole. A DDL trigger that rolls back all DDL_DATABASE_LEVEL_EVENTS. stop all change by all users. And it does except for one change for any user with ALTER ANY DATABASE DDL TRIGGER permission can make: DROP TRIGGER on that DDL trigger. ALTER ANY DATABASE DDL TRIGGER   If you add another identical triggers it is not possible to drop either trigger without first disabling the other trigger. That is not very much defense since the user that can ALTER a trigger can also disable the trigger and there is no event available to the trigger that can detect the disable trigger event. SQL Trace can see DISABLE TRIGGER events. An alert from a custom trace event watching for the text “DISABLE TRIGGER” could be configured to protect against DISABLE TRIGGER. This actually could be effective in restricting a dbo authorized attacker in some scenarios but would be just another trivial obstacle for a sysadmin authorized attacker. 
  • Encrypt Sensitive Data. Use SQL Server encryption hierarchy(ies) that require strong password(s) or encryption based on keys not available to other database users. It is recommended that secure on-site and off-site backups of all keys be maintained at all times. Restrict key management to a very small group of trusted users. 
  • Verify schema integrity at run time. The most straightforward verification would be to compare the object tree in the schema to a hardcoded static list encapsulated in an encrypted executable object. The object cannot be changed by other users without the loss of the signature as the object is changed. Other users cannot reapply the well protected signature thus reducing the check for change to an inexpensive IF EXISTS  on the catalog view of encrypted objects filtered by the key/cert of your signature. In other scenarios it my provide more confidence to include signed data value validation between executable database objects a la non-repudiation – especially for critical operations. Complementary to object signing, there should be some assurance that the object signature or key used are not compromised. Also the verification must include all reference objects – expected or unexpected. Consider signing a parameter or introducing signatures based on a different well protected key than the one used to sign the objects between called database objects to build greater confidence in the objects authenticity. The key of the data signature gives you an additional common shared local resource that is expected to be difficult to compromise for cross-validation of the objects’ authenticity.

  • Use system encryption for all objects (WITH ENCRYPTION). This helps prevent the uninvited from following along and provides the added benefit of protecting strong and long secrets that can be compiled into encrypted objects without risk of exposure as clear text. The combination of DML object Encryption and SQL Trace’s built-in encryption statement masking are key to effective run-time secret protection at this time. (Windows Debugger attackers may still be able to steal secrets from buffers where they might be placed in clear text. It should be possible to make sure live systems do not have the necessary debugger symbols.) As a DBA there is nothing more frustrating than trying to support a 3rd party database with encrypted objects. I concur and will do what I can to avoid that thankless task. Encrypting yourself is not so painful because you can unencrypt in development to debug. Wipe that unencrypted database clean when done debugging and protect it extra carefully while it is visible. 
  • Disable database-chaining and trustworthy database options. This will reduce the exposure to cross database elevation of authority attacks.

  • Don’t help hackers with error details. Watch the @@NESTLEVEL value in CATCH block and always use try-catch error handling. When @@NESTLEVEL = 0 then the error is on the command line and should not be displayed. WHEN the NESTLEVEL = 1 then the caller is a command line and any error information should not be returned to the command line call. When @@NESTLEVEL > 1 then return just enough information to the caller for locating the logged error. 

This is a listing of the minimum necessary to be confident in the authenticity at the data layer.  Without meeting all of these measure in combination, database data is not safe from posers ( including elevation of authority attackers, brute force password attackers, SQL Injection attackers, etc.) – and the rogue insiders. There is more that could be done. For example in highly secured settings it may be worthwhile to sign object interactions by using a technique that looks a lot like non-repudiation.  Eventually even strong security must succumb to the law of diminishing returns and the level of perceived risk. I think it sufficient to acknowledge that perfect security is not attainable but must remain always the objective. 

A revised application with much higher assurance of object authenticity based upon that insert into a bit column would look a bit different. The example revision explodes the original applications 10 lines script to create the database objects  to over 200 lines of T-SQL. 

:SETVAR SIGNATURE_PASSWORD "a_$1LLY-n-str0nG_fraze"
CREATE DATABASE test;
ALTER DATABASE test SET TRUSTWORTHY OFF;
ALTER DATABASE test SET DB_CHAINING OFF;
GO
USE test;
-- silent monitor
CREATE QUEUE DDLChangesQueue WITH RETENTION = ON;
CREATE SERVICE DDLChangesService
ON QUEUE DDLChangesQueue
  ( [http://schemas.microsoft.com/SQL/Notifications/PostEventNotification] );
CREATE EVENT NOTIFICATION DDLChanges
ON DATABASE
FOR DDL_DATABASE_LEVEL_EVENTS
TO SERVICE 'DDLChangesService', 'current database' ;
GO
ALTER AUTHORIZATION ON DATABASE::test TO sa;
GO
CREATE ROLE testAdministrator;
-- add current user to role
DECLARE @CreateUserDDL NVARCHAR(512);
SET @CreateUserDDL = N'CREATE USER [' + ORIGINAL_LOGIN()
                   + N']; EXEC sp_addrolemember ''testAdministrator'', '''
                   + ORIGINAL_LOGIN() + N'''' ;
EXEC sp_executesql @CreateUserDDL;
GO
-- cert encrypted by DMK
CREATE CERTIFICATE SigCert
ENCRYPTION BY PASSWORD = '$(SIGNATURE_PASSWORD)'
WITH SUBJECT = 'object signatures' ;
GO
CREATE TABLE dbo.tally
  ( Id UNIQUEIDENTIFIER NOT NULL -- int identity(1,1)
  , User_choice BIT );
GO
ADD SIGNATURE TO dbo.tally
BY CERTIFICATE SigCert
WITH PASSWORD = '$(SIGNATURE_PASSWORD)';
GO
-- does not affect signature of table
CREATE CLUSTERED INDEX ixc_tally ON dbo.tally(user_choice);
GO
CREATE PROC dbo.submit_choice
  ( @choice BIT )
AS
BEGIN
  BEGIN TRY
    IF (SELECT IS_OBJECTSIGNED('OBJECT'
                              , @@PROCID
                              , 'CERTIFICATE'
                              , thumbprint )
             + IS_OBJECTSIGNED('OBJECT'
                              , OBJECT_ID( 'dbo.tally', 'U' )
                              , 'CERTIFICATE'
                              , thumbprint )
             + IS_OBJECTSIGNED('OBJECT'
                              , OBJECT_ID( 'dbo.trg_tally', 'TR' )
                              , 'CERTIFICATE'
                              , thumbprint )
             +  ( SELECT COUNT(*) AS [NbrTriggers]
                  FROM sys.triggers
                  WHERE parent_id = OBJECT_ID( 'dbo.tally', 'U' )
                  AND object_id <> OBJECT_ID( 'dbo.trg_tally', 'TR' ) )
        FROM sys.certificates
        WHERE name = 'SigCert'
        AND object_name(@@PROCID) = 'submit_choice' ) <> 3
      RAISERROR('unable to verify object signatures',16,1);
    INSERT dbo.tally (user_choice)
    VALUES (@choice);
  END TRY
  BEGIN CATCH
    SELECT  ERROR_NUMBER() AS ErrorNumber
          , ERROR_SEVERITY() AS ErrorSeverity
          , ERROR_STATE() AS ErrorState
          , ERROR_PROCEDURE() AS ErrorProcedure
          , ERROR_LINE() AS ErrorLine
          , ERROR_MESSAGE() AS ErrorMessage;
  END CATCH
END
GO
ADD SIGNATURE to dbo.submit_choice
BY CERTIFICATE SigCert
WITH PASSWORD = '$(SIGNATURE_PASSWORD)';
GO
CREATE PROC dbo.get_results
AS
  SELECT user_choice as choice
        , rank() OVER
          (ORDER BY COUNT(*) DESC) AS rank
        , count(*) AS votes
  FROM dbo.tally
  GROUP BY user_choice
  ORDER BY rank DESC;
GO
ADD SIGNATURE to dbo.get_results
BY CERTIFICATE SigCert
WITH PASSWORD = '$(SIGNATURE_PASSWORD)';
GO
-- instead trigger fires BEFORE constraints
-- constraints are applied before after triggers
CREATE TRIGGER dbo.trg_tally
ON dbo.tally
INSTEAD OF INSERT, UPDATE, DELETE
AS
BEGIN
  BEGIN TRY
    -- only inserts, only 1 row at a time
    IF (SELECT COUNT(*) FROM inserted) > 1
      RAISERROR('no multi-row', 16, 1);
    IF (SELECT COUNT(*) FROM deleted ) > 0
      IF (SELECT COUNT(*) FROM inserted) = 1
        RAISERROR('no update', 16, 1);
      ELSE
        RAISERROR('no delete', 16, 1);
    INSERT dbo.tally  (id, user_choice)
    SELECT NEWID(), user_choice FROM inserted;
  END TRY
  BEGIN CATCH
    SELECT  ERROR_NUMBER() AS ErrorNumber
          , ERROR_SEVERITY() AS ErrorSeverity
          , ERROR_STATE() AS ErrorState
          , ERROR_PROCEDURE() AS ErrorProcedure
          , ERROR_LINE() AS ErrorLine
          , ERROR_MESSAGE() AS ErrorMessage;
  END CATCH
END
GO
ADD SIGNATURE to dbo.trg_tally
BY CERTIFICATE SigCert
WITH PASSWORD = '$(SIGNATURE_PASSWORD)';
GO
CREATE TRIGGER trg_ddlChanges1
ON DATABASE
WITH ENCRYPTION
FOR DDL_DATABASE_LEVEL_EVENTS
AS
BEGIN
DECLARE @User NVARCHAR(128);
  BEGIN TRY
    SET @User = USER_NAME();
    -- error if user making the change is not DataAdministrator
    -- even though IS_MEMBER('DataAdministrator') would be false
    IF NOT EXISTS
      ( SELECT *
        FROM (SELECT ddl.event.value('LoginName[1]'
                                 , 'NVARCHAR(128)') AS LoginName
              FROM
                (SELECT EVENTDATA() AS change) this
              CROSS APPLY
                change.nodes('/EVENT_INSTANCE') AS ddl(event) ) q
        JOIN sys.database_role_members r
        ON q.LoginName = USER_NAME(r.member_principal_id)
        WHERE r.role_principal_id = USER_ID('DataAdministrator') )
    AND @User <> 'cdc'
      RAISERROR('Invalid User %s',16,1,@User) WITH LOG;
  END TRY
  BEGIN CATCH
    ROLLBACK;
    RAISERROR('fatal error',20,1, @User) WITH LOG;
  END CATCH
END
GO
DISABLE TRIGGER ALL ON DATABASE
GO
CREATE TRIGGER trg_ddlChanges2
ON DATABASE
WITH ENCRYPTION
FOR DDL_DATABASE_LEVEL_EVENTS
AS
BEGIN
DECLARE @User NVARCHAR(128);
  BEGIN TRY
    SET @User = USER_NAME();
    -- error if user making the change is not DataAdministrator
    -- even though IS_MEMBER('DataAdministrator') would be false
    IF NOT EXISTS
      ( SELECT *
        FROM (SELECT ddl.event.value('LoginName[1]'
                                 , 'NVARCHAR(128)') AS LoginName
              FROM
                (SELECT EVENTDATA() AS change) this
              CROSS APPLY
                change.nodes('/EVENT_INSTANCE') AS ddl(event) ) q
        JOIN sys.database_role_members r
        ON q.LoginName = USER_NAME(r.member_principal_id)
        WHERE r.role_principal_id = USER_ID('DataAdministrator') )
    AND @User <> 'cdc'
      RAISERROR('Invalid User %s',16,1,@User) WITH LOG;
  END TRY
  BEGIN CATCH
    ROLLBACK;
    RAISERROR('fatal error',20,1, @User) WITH LOG;
  END CATCH
END
GO
GO
ENABLE TRIGGER ALL ON DATABASE
GO
SELECT O.NAME
     , ISNULL(IS_OBJECTSIGNED( 'OBJECT'
                             , o.object_id
                             , 'CERTIFICATE'
                             , c.thumbprint ), 0) as [IsSigned]
     , c.name
FROM sys.objects o
LEFT JOIN sys.crypt_properties cp
ON o.object_id = cp.major_id
LEFT JOIN sys.triggers t
ON o.object_id = t.object_id
LEFT JOIN sys.certificates c
on cp.thumbprint = cp.thumbprint
WHERE o.is_ms_shipped = 0
GO

 

The best prevention for unauthorized sysadmin level access to the database is to keep anyone and anything that should not be doing admin level work from connecting with admin level credentials. Limiting the number of sysadmins is also nice, but the reality is pretty much anyone that can set another users password or access the local system running SQL Server as a local Windows Administrator can find a way to connect to a SQL Server with sysadmin access. Once a connection to a SQL Server is established using valid credentials, there are no permissions, privileges, rights or roles that cannot be overridden by that sysadmin empowered connection. At this time only the protection from compromise of sensitive data by unauthorized sysadmin members looks possible. Any sysadmin can delete complete data rows, drop tables and even drop databases or corrupt the data: no matter what is done to prevent data compromise. It is only possible to prevent the sysadmin from knowing the meaning of encrypted data. A mildly determined malicious user with sysadmin authority and the intent to destroy data cannot be stopped from that goal. They can be delayed and – in most cases – identified after the fact. Delaying features are also useful to help show intent during Forensic . The best protections against the malicious insider are to hire trustworthy employees and  to collect adequate forensic data into user protected containers. Ideally, an audit trail that cannot be modified by a sysadmin privileged user is needed.

The information protection of encryption/signing is durable provided the private key is safe. Encryption scrambles data with a public key that can later be unscrambled with a private key. In contrast, signing scrambles data with the private key of a key pair and later unscrambles the signature with the public key to verify the data. In either case, the encryption keys – especially the private key – must be  safeguarded.

The simple act of successfully sharing a well protected key to sign and verify a signature is often considered adequate to establish non-repudiation. A weakness in this approach is the ease with which T-SQL can be manipulated without violating the signature. It is also necessary to verify that all other relevant schema elements continue to sport their signatures, that unsigned objects have not been introduces, and that none of the expected signed objects are missing.

A digital object that is expected to check its own signature is fatally flawed. This would be like airport security using the ‘honor system’ or nuclear power regulators deferring to the plant operators to establish safety levels or airline pilots left to judge their own soberness before flight.  As we now know – self-policing strategies are effective in well over 99% of cases. And very cost effective… until there is an event that falls into the range of that other way less than 1 percent. A significantly more reliable and consistent outcome results when the signature is always verified by a trusty outsider. (Ultimately this is the reason that outside accounting firms are needed to count the votes in an awards show, there are observers at ballot locations, “fact checking” has become a political pastime and neighbors are nosey: simply to bear witness to the obvious and expose – or get mysteriously wealthy – over occasional anomalies.) 

A higher assurance that any entity is what it claims it is is will come from a reliable source external and independent from that entity than from that entity itself. This is why auditors are in great demand and the words “Fact Checking” garner such attention in the title of an article or news story.” This external check could be done – in our scenario – by:

  1. introducing a trigger behind the table
  2. via a called module – a stored procedure – placed between the external request and the table operation
  3. as a check constraint or default value.

Also, to be able to claim the current process as the scope for verification of trust, the validity of each called module and referenced object needs to be verified before use and the signature of each calling module ought to be verified by the caller before returning control. That leaves defaults and constraints less useful since they may not occur with every process that uses the data.  

A called stored procedure or a trigger will do. A view cannot be signed.  It may be a little less confounding for the developer not to have to create two stored procedures to do one thing.  Using a trigger also allows a division of responsibility and prevents everyone from having to know every business rule to succeed. Perhaps the DBA takes ownership of the triggers or called procedure(s) that verify authenticity of the participants allowing the developers to continue to code mostly unaffected by and unaware of the signature trigger logic and other defensive layers enveloping the application logic. All in all, a trigger is probably a better way to impose corroboration than the addition of a second stored procedure IMHO, but not by much. One of the few scenarios when triggers are in fact appropriate: to assure the encryption complexity does not get in the way of progress!  On the other hand, if the database supports abstraction layers (ORM, OOP)  that will suffer serious side effects or degradation from a trigger an additional stored procedure or view introduced into this now obviously secondary secure application is mandated to assure the authenticity of the communication.

If you are looking for this level of assurance of authenticity through your ORM – fagetaboutit. Man-in-the-middle protection at the ORM data access layer is essential but must be done well outside the bounds of the data layer. You still want to do everything described to protect and monitor at the database server. In addition, it is necessary to apply similar protection at the ORM layer. 

FWIW: the original hack, tacking on an instead trigger no longer works due to the one instead trigger per action limitation. The legitimate trigger uses up the one allowed trigger on all three actions.

Posted in Encryption Hierarchies, Secure Data | Tagged , , , , | Leave a comment