Translucent Databases (9780967584416) by Peter Wayner (2002) describes a database environment that reaches for security by separating information from data. Sort of a Hotel SQL Server: the data goes in but the information cannot get out. The result is an encryption driven environment where no single view inside or outside of the organization exposes all the data in the database in an unencrypted state at one time. The land where only the hash of your password is buffered, passed or persisted. Never the password. (Imagine! You, and of course that dude across the room that can memorize any 50 keystrokes you type from across the room, are the only one(s) that know your passwords! Yes there are such people…) A place where different parts of the data are encrypted with different keys, and keys belongs to disparate organizational units. Where the DBA sees a customer table with mostly VARBINARY columns and perhaps other somehow index-able values that the various key authorities or application layer indexers have generated as that authority encrypts data. Wayner, in 2002 called it Translucent Data Encryption. Was he way off base since we have not vigorously pursued the mechanisms he postulated? I don’t think so. Instead, what has happened is the standards we strive for are too weak. Furthermore, organizations fear security strategy that works because they fear the secure environment will be grinding slow and unusable, so they default to compliance. And then the organizations point helplessly to the standards while being hacked mercilessly: giving up our personally identifiable information in the process.
I suspect that RSA and Siemens and Sony and Citi and Sega and all the others were not far out of compliance with current standards yet millions of people have had pieces of their identity stolen recently in the small number of disclosed events. One is left to ponder just how many organizations are dishonest enough not to disclose breaches. (Update February 2, 2012 – you doubt it? VeriSign admits they were “successfully hacked several times in 2010” but saw no need for the admission until today. Wonder when they will get around to telling us about what happened in 2011? There must be some reason they decided to come clean all of the sudden.) Not that big of deal to disclose I suppose. The only recourse we have is whether to save, trash or recycle the sympathy card from the organization that may have just cost a customer a lifetime of financial misery. Why have organizations accepted the weak standards? Why do the organizations now verbalized that that strong security is a pipe dream? hmmm….
“Translucent Databases: A Precursor to Privacy Sensitive Databases (.pdf)“, a paper I stumbled on recently by students Palaka Bhattacharya and Morteza Karya from an advanced survey of security course at Harvard from 2004 is drawn upon Wayner’s ideas. Database scenarios for cancer research, rape victims and a suggestion box are thoughtfully and creatively considered in the paper. Several good ideas about extending the Translucent Database to the specific security requirements particular to personally identifiable information are described. That students in a survey course can so lucidly map a convergence between the real world problems and the concepts from a book says good things about the students to be sure, something about the validity of the concepts in the book, and something about how little real progress we’ve made in the last decade on data security and in particular the ever-shrinking respect for the privacy of the customer/user.
Here we are many years later:
- People steadfastly clinging to an obviously false belief that privacy policies are there to safeguard our data and that we are using best of breed solutions on problems that we must ultimately accept as unsolvable. Those privacy policies often do just the opposite. Facebook is a good example of an identity puppy mill. Google is no better. I can’t prove it from here but I am quite certain I got clobbered with a man-in-the-middle exploit when I linked my WordPress Blog to my Linked-In account. Strangely, the next day my yahoo email contact list was compromised (fixed it with a password change).
- Organizations continuing to pretending that compliance & self regulation is as good as security gets. What they really know all it is is as cheap as it gets. Real security infrastructure costs considerably more than the false front we see. Sadly, in too many cases all that is behind the false front is a couple of long wobbly sticks wedged to the ground to prop it up.
As a bit of proof that we aren’t that good at even recognizing good security standards, I offer up the message I receive when I try to upload to the BLOG from the Windows Live Write client when the local Security Policy is checking for FIPS compliance:
WindowsLiveWriter,1.8800,Fail,00025,22-Jun-2011 16:43:05.969,
"System.TypeInitializationException:
The type initializer for 'WindowsLive.Writer.CoreServices.HttpRequestHelper' threw an exception.
---> System.TypeInitializationException:
The type initializer for 'WindowsLive.Writer.CoreServices.WsseAuthenticationModule' threw an exception.
---> System.InvalidOperationException:
This implementation is not part of the Windows Platform FIPS
validated cryptographic algorithms.
at System.Security.Cryptography.SHA1Managed..ctor()
at WindowsLive.Writer.CoreServices.WsseAuthenticationModule..cctor()
The fix: disable the FIPS compliance policy to upload to the blog. Not sure if other Microsoft Cloud technology will have this problem, have not tested it.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Transparent Data Encryption (TDE) is an encryption technique offered first for Oracle 10g Databases then later introduced into Microsoft SQL Server 2008. SQL Server TDE encrypts only at the database level while the Oracle TDE can also handle column and tablespace encryption strategies. TDE is included only in SQL Server 2008 Enterprise, Data Center, and of course Developer flavors. Correct me if I am wrong, but I think you can add Oracle TDE to any 10g/11i database but it is priced separately. I mention this competition to help establish why SQL Server even offers TDE and also clarify why the technology is available only in the high end SQL Server SKUs. I think it fair to say TDE came to SQL Server as “me too” technology.
“me too”
How corporate competition stifles software innovation
Say Company O has a product in wide use. Company M enters the market with a lower cost, easier to use product that serves almost the same purpose for the customer but with fewer and different features and without interchangeable compatibility with Company O’s product. Instead of purchasing the less expensive product, customers tell Company M’s sales staff that the product is just not as robust as Company O’s product. The sales staff collects the information and funnels it back to Company M HQ. The potential customers either choose not to switch from Company O’s product or to purchase from Company O instead of Company M in more cases than not. As things get desperate, Company M’s sales & marketing folks initiate ever more urgent and decisive enhancement requests for the ‘features needed to compete’! The risk assessment and/or cash flow and/or leadership egos do not allow time for vibrant R&D and anyway, the sales staff is adamant that the risk of upsetting customer’s with educational requirements for new or significantly different product features is too great. Development becomes fixated on delivering a functional – even if not yet fully usable – version 1 feature set to match targeted competitor features. The strategic purpose is to be seen as an equal competitor to Company O – at least in the silly mirror. The message from leadership is that this vacuous short term play will solve the direct and immediate threat to long term product viability. Thus leaving the technology staff no choice but to make hay of it. Before you know it, when customers complain, account managers can say we have that feature too now. At this point the commonly practiced “wait for SP1″ avoidance behavior of the (ill-conceived?) feature(s) can begin. Eventually a sufficiently robust competitive feature list evolves while the competitor moves on as well.
The time to deliver those iterations is taken from time available for any other R&D investments. In this way, free market competition leads to a product targeting a competitor’s feature list rather than any real customer need and time available for true innovation is diminished product wide. The net effect will hopefully include increased short term earnings for corporations and, unfortunately, fewer real choices for consumers of the product.
There is likely no way to even make a good correlation between “me-too” R&D and mid or long term benefit to the corporation. Nonetheless, a business strategy that aims to beat others at their own game is risky, expensive, and ultimately counter-productive. It’s the opposite of a game changer.
(Ok, I’ve already exceeded my knowledge of Oracle’s TDE so I will limit my remarks to Microsoft SQL Server’s TDE from here.)
SQL Server 2008 Transparent Data Encryption
The Encryption Hierarchy is equally available in all SQL Server SKUs. However, TDE is included only in Enterprise, Developer and Data Center editions. TDE is an emerging version 1 “me too” technology. It has seen a couple service releases. Predictably, the adoption rate was delayed at introduction but is now picking up some steam. Fortunately, I suspect there is not a lot of code involved in TDE so the unexpected consequences for jumping in early were small. Roy Ernest found a few good ones. Given the lack of disaster stories – somewhat confirming that the code changes were not large – I can only speculate as to why the technology in not available in all $KUs. The additional logic is to implement CryptoAPI hooks into SQL Server Buffer Pool management and plop in a CREATE and ALTER DDL statement – but no DROP – to manage the Database Encryption Key (DEK) stored in a small and always unencrypted space within the encrypted database. If you are the DBA and you enable TDE and manage the keys – or nobody manages the keys – don’t expect your risk exposure to go down by much. Most of what TDE is about is removing implicit need to trust DBAs and other administrators. What DBAs can benefit from is the always encrypted backups from a TDE database and as result a mechanism to allow the organization to better control who can use that encrypted backup and where. The other main protection that TDE provides – data at rest encryption – is pretty close to never in play for a service like SQL Server that is typically running 24×7, has a lock on the file at all times, and exposes all data only in unencrypted form to all user database connections while running anyway. Most of what you need encryption for – protecting sensitive data – cannot be done with TDE. Lets face it, TDE is a low cost compliance alternative or supplement to proven application layer encryption, hashing and cryptographic signing approaches to data protection.
Compliant corporations absolve liabilities, litigate and lobby. Secure corporations protect your personally identifiable information with all their might.
SQL Server TDE occurs in low level IO processing. The data is encrypted just before the bus ride to secondary storage and then decrypted just as it transits back to primary storage. The implementation should benefit from XEON encryption acceleration. The documentation state that TDE uses the, “Windows Cryptographic API (CAPI)”. Interestingly, the number one hit in my Google for, “Windows Cryptographic API (CAPI)” is a sardonic wikipedia entry who’s author(s) seems to have shared my longstanding confusion about all these names for Microsoft Cryptographic technologies being bantered about. The second result is the link to the “Microsoft CryptoAPI“. Not one nor the other ever mentions or uses the phrase “Windows Cryptographic API” – not even the one that shows a long list of synonyms for the CryptoAPI. Lucky for me, the search engine seems to know what I want. I think this is a case where the SQL Server Documentation adds confusion where there was already enough. (It’s almost like the documentation team was trying to make the Wikipedia volunteer look ill-informed by using a name not listed. They are coming after me next I sure.)
The main benefits of TDE over other encryption techniques are:
- the encryption requires no modification to the application
- the encryption is generally recognized as compliant to PCI-DSS Data at Rest requirements
- TDE database backups are also encrypted using the DEK
- no data is exposed as clear text if files are copied, stolen or probed in-situ - unless the probing is done via a SQL Server connections, in which case all data is exposed as clear text.
Nothing about the data being better protected from application level attacks!? This is because TDE exposes only decrypted data to all application and ad hoc user connections to the database. Authenticated users connected to the database – whether for legitimate or malicious purposes – see decrypted data in all query results. There is no option to view the cipher text. The encryption is at the page level so the page must be decrypted to even find out much about the rows on that page I would imagine. There is no need for the user to provide credentials other than those presented for a ‘normal’ connection without TDE enabled. From the outside world the database is the same as it ever was. The data is encoded as it moves from primary, i.e. SQL Server data buffers to secondary storage, i.e. a RAIN (redundant array of independent NAND) of 750,000 IOP Micron p320hs SSDs or a few direct attached storage enclosures stuffed with commodity SAS drives or a zone of Brocade 96 gigabits per second encrypted LUNs.
The Database Encryption Key (DEK) is the core secret of TDE. It is not mentioned in any catalog views and is not accessible in the same way as other keys in the database. From Books Online:
“The database encryption key cannot be exported from the database. It is available only to the system, to users who have debugging permissions on the server, and to users who have access to the certificates that encrypt and decrypt the database encryption key.”
There should be no reason for application user’s to access the DMK or DEK. Execution of all DMK DDL requires CONTROL permission on the database and VIEW DEFINITION permission on any asymmetric key or certificate mentioned in the DDL. If any application user requires CONTROL of the database they inherit control the DEK. Application users requiring elevated database permission increase the of risks of a breach. (HINT: Applications that require CONTROL of the database for anything are not secure by design.)
If all the application does at the database is INSERT, UPDATE, SELECT and maybe DELETE data, yet has authority – explicitly or implicitly – to mess with the DEK then consider tuning the security dial toward the principal of least privilege. IMHO, application users should not have direct INSERT, UPDATE, or DELETE table permissions. This is even more true when the data is at all sensitive or private. Use stored procedures instead. Even for SELECT queries stored procedures are recommended – prepared statements are almost as good but the dependency chain is strongest when only stored procedures are used and they need direct table permission. With stored procedures, the only permission the application user needs is EXECUTE on the procedure. The application user should never own tables, schemas or databases. Only the rarest of application users should be a member of any Fixed Server Role. The access level of the application users must be carefully considered because application users are the most likely way bad guys will initially access the system.
Any database user can still view/copy/modify any data maliciously that they are also able to view/copy/modify legitimately. This is equally true if the user identity is taken over for malicious use or if the user identity is otherwise compromised. This is true whether the data store is protected by TDE or file system encryption (an alternative to TDE described below). TDE encryption protects the data from a file copy attack and some other trust exploitation attacks. TDE also encrypts all backup of the database. By transparent is meant: applications and ad hoc SQL Server connections never know the data at rest is encrypted. Chris Kranz, in “A NetApp Technical Diary”, blogs a similar concern about the real benefit of storage level encryption from a systems perspective.
All secrets and certificates that are used to configure the TDE must be protected in order to protect the TDE. There are more than a couple important secrets to protect. The actual number can vary depending up the tactics selected for encryption. If encryption provides a choice between a certificate and secret, most of time I choose the certificate. Ideally, the data administrator must provide one password and the HSM or key vault administrator must provide one other password in order to restore a TDE database backup anywhere but the SQL Server that generated that backup.
The architecture and configuration options for TDE are nicely covered in Zubair Ahmed Mughal’s post on “Database Encryption Key Management” in the MSDN SQL Server Security weblog. This should be in Books online.
The configuration options for hashing, signing and encryption of identified data values are not as easily condensed to a single link. Look for Raul Garcia’s posts in the SQL Server Security weblog for insights into SQL Server’s implementation. But even before you go there, a decision whether to encrypt in the application or in the database is needed. A developer/architect starting point is this “VISA BEST PRACTICES: Data Field Encryption” (.pdf) that advocates , “data field encryption as a complement to, rather than a substitute for PCI-DSS compliance requirements”
As with the EKM + HSM solution, if encryption occurs at the point of data creation or entry, the data administrators need not know the unencrypted values to administer the data. She probably cannot help you much with searching or sorting based on that data values though. Further, the data administrator is not involved in the administration of certificates or secrets other than Certificates encrypted by database master keys .
TDE is useless against 90% of the attacks on your data!
Hopefully I have painted the picture enough to clearly show that TDE really isn’t so much protection in the grand scheme of things. According to this 2009 ACE Team weblog post that provides 6 “Simple Rules to Stop the Bad Guys” 90% of attacks come at the application layer. TDE does nothing to stop that 90%. TDE targets some part of the other 10%. It’s strength is in embargoing the insider attack and securing the backup, but is only effective if the organization has the fortitude to truly protect the encryption key from the DBA and all other administrators. Few take the necessary steps. TDE in that context can help prevent data from walking out the back door. A DBA managed TDE DEK should decelerate the speed with which the back door swings wide open. Any careless use of TDE will increase the CPU load and can result in loss of data if a corruption issue requiring a restore from last known good backup is incurred only to have an untested restore process fail. The good news is, TDE is pretty much plug and play as long as nothing goes wrong.
Column level encryption, implemented in such a way that only authorized users can access the keys required to decrypt the column – perhaps in conjunction with TDE or file system encryption of the database file(s) – is a more realistic way to protect sensitive data at rest because it provides protection from application layer attacks. If the key is stored in the SQL Server it cannot be secured from all data administrator and may be more vulnerable to hacking. Keeping the encryption keys in a physical store that administrators of the database, database server OS and database subnet do not control will improve protection of the data from internal attacks. At a minimum, a good separation of responsibilities strategy would require that at least two admins be in cahoots to be able to steal a backup.
EKM enables separation of responsibilities by moving the certificate store to a Hardware Security Modules (HSM), and thereby breaking away from the single principle in control condition of the SQL Server Encryption Hierarchy. At least a few currently available HSMs advertise that they have the required Cryptographic Provider .dll for use with the SQL Server EKM. I have no hands on experience with an HSM. Unless I am mistaken, the 5 HSMs reviewed in this Internet Infrastructure Foundation commissioned HSM review (.pdf) all serve keys to the EKM? A good place to jump in and learn a little anyway.
Are there alternatives when you do not have a TDE enable SQL Server SKU, cannot change the application, and yet have important and sensitive data that must be protected when at rest for compliance?
Lets set the bar low and say we want to be able to use SQL Server 2005 EXPRESS or better. Better meaning that the solution will also work on other SQL Server 2005 SKUs or any SQL Server 2008 SKU, or any SQL Server 2008 R2 SKU -only better because SQL Server 2005 EXPRESS is the poodle in the wolf pack.
Microsoft has published a white paper titled, “Database Encryption in SQL Server 2008 Enterprise Edition“. Mostly it is a yet another how-to on implementing TDE. Though it does give a bit of valid consideration to several options. The most interesting of those options is file system encryption.
Transparent File System Encryption -vs- TDE
One option to encrypt the file system is BitLocker integrated file system encryption included with Windows 7+ Vista + Server 2008. Or its predecessor the Encrypting File System in Windows 2000+XP-Pro+Server 2003, or even one of the 3rd party disk encryption software. One vendor, gosecure, provides a nice comparison of their product to a couple other encryption software products with a list price and to TrueCrypt, an open source, free, on-the-fly and transparent encryption software that could even encrypt your MySQL databases that runs on a Linux box if you were so inclined. I’ll focus on TrueCrypt because I have actually used the product.
TrueCrypt allows three kinds encryption containers. Files, non-boot operating system logical drives, and the boot device. Encrypting the all drives including the boot device – and entering the strong password to complete the encryption key during each system startup – is the best encryption protection because the pagefile.sys and any installation and configuration files that get created on the boot device and do reveal clues about sensitive data are found on the boot device. This a far more protection than TDE affords. Encrypting all drives including the boot device also carries the most overhead. Encrypting non-boot Volumes is attractive in a SQL Server direct attached storage configuration. The optimal SQL Server configuration with separate physical RAID arrays for data, log, and tempdb could be encrypted in this configuration. On a SQL Server I would expect the addition of the boot device to the encryption scheme would not add so much burden except at start-up since the IO on a database server is almost all about the data cache. (If a SQL Server is writing to the OS paging file it is already in performance hell. If you can’t fix that problem, I recommend you not mess with even open source, open license encryption solutions until you lay your hands on adequate memory.)
A SQL Server database device cannot be encrypted by encryption software. Instead, the .mdf and .ldf files are placed in already created encryption containers. If tempdb is to be encrypted place it in a TrueCrypt devices as well. Another subtle difference is that anyone that can read from a mounted TrueCrypt drive can get to the unencrypted data. Not the encrypted container created by TrueCrypt, that is definitely encrypted. I’m referring to the drive letter that shows up in Windows Explorer when a TrueCrypt container is mounted in the file system. Although, anyone with enough access to initiate such access would have a much easier time of it and a lower risk of being caught using the same credentials and T-SQL.
Unlike BitLocker, TrueCrypt can encrypt the boot drive or a file container. There is a nicely detailed comparison of BitLocker and TrueCrypt at Tom’s Hardware.
TrueCrypt exposes the decrypted data when the encrypted container is accessed through the mounted file system device. This is an intended characteristic of transparent encryption. Once the mounting process acquires a drive letter, the drive must be secured at least as well as the SQL Server installation. If you are a Member of the Super Users Group on the machine, when you mount the container file, the permissions could allow other Super Users to access your containers.
TrueCrypt VHDs must be mounted and dismounted in the correct sequence. For encrypted disk to protect the database, a commitment to detach the database with sensitive data or alternately shut down the SQL Server Service and dis-mount the TrueCrypt device when the database is not in use and then remount and re-attach (or restart as the case may be) on demand may be a good option. Easy enough to automate…
TrueCrypt-ed data can be viewed in clear text. The TDE encrypted database is presented to everyone that connects to the database as unencrypted data. This is also true when the disk is encrypted except that in addition, decrypted data can also be exposed through the logical file system device as unencrypted data – for example if the file is copied using the logical drive to unencrypted storage the encryption is removed. The way to move encrypted TrueCrypt data is to un-mount the logical device and copy the encrypted container file. This does not integrate well with SQL Server. Better to have a good plan to start. To be safe from prying eyes, the logical device must have very restrictive (principle of least privilege) ACLs . With BitLocker a pre-boot password is required. With TrueCrypt it may only mean that a file is mounted to the already running local system before the SQL Server can be started. Although TrueCrypt can also be configured to encrypt the boot disk – and would also require the pre-boot password in this context.
TrueCrypt cannot encrypt existing files. Nor can it encrypt a SQL Server data file. Instead, the SQL Server data file must be placed inside the TrueCrypt created file container after it is mounted. When mounted, the encrypted file is assigned a logical drive letter and is used like any other drive. Care must be taken that the same drive letters are used for each device else the attach script will need to be edited before each use.
TrueCrypt is open source and free. TDE is integrated with SQL Server. TrueCrypt container encryption does not use SQL Server in any way. In both cases the encryption algorithm, certificate and password are user specified. And in both cases the FIPs complaint AES encryption algorithms and SHA1 hashing algorithm are available for selection. (Why use anything else? You want to be compliant don’t you?)
TrueCrypt uses Intel’s Advanced Encryption Standard New Instructions (AES-NI) hardware acceleration. By some accounts AES-NI hardware acceleration is over 4x faster than without. No reason not to believe that even TDE is using the hardware acceleration although I do not see much propaganda to that effect. TrueCrypt identifies AES-NI capable CPUs and allows enable/disable configuration of AES hardware acceleration.
TrueCrypt is not integrated with SQL Server. I said this already. It deserves repeating because there are so many pros and cons to that condition. With TDE there is no intermediate step like mounting a file and starting the SQL Server as is required when a TrueCrypt file container is used. SQL Server can auto-starts with the OS as designed with TDE. TrueCrypt will prevent SQL server from starting and could result in a database being marked suspect if the service is able to start but the database file cannot be found. The TrueCrypt piece of the startup/shutdown process can be automated and can include the startup/shutdown of the SQL Server service if the automation user is able to start services.
TrueCrypt can prevent users with CONTROL of the SQL Server from overriding the encryption. The password needed to mount the TrueCrypt device could, for example, belong to the data center operations group (Ops) and Ops could be excluded from CONTROL of the server and from access to the database.
With TDE you could use the EKM Cryptographic Provider (.dll) from an HSM. TDE with EKM in a configuration where the HSM administrator does not have access to database subnet or a login to the SQL server is the clear winner but the cost difference is very significant. The scenario where domain administrators do not control the SQL Server and the database administrators are not local administrators on the SQL Server box is almost achievable. (There is always a way for a local administrator to gain authorized access to a SQL Server. For this reason, it is better to entrust the key management to a group that has no elevated authority as a database user, as a local system user, or on the domain and all other ACLs of the database subnet.)
As with my recommendation for TDE: if the SQL instance can work with an encrypted file system you may as well use it. The main costs are a CPU overhead that will never bother most SQL Servers, and the time the DBA and any external key store owners must invest to verify, document and test the various security configurations and restore scenarios.
T-SQL Encryption
That VISA document above says to consider encrypting the data for defense in depth along with TDE. As if data value encryption had little or no protection to offer or was at most a supplemental option. I’m here to tell you that is 100% backwards! And the TDE documentation seems to agree with me. I want to take it one step farther and state that encryption efforts must be redoubled in scope and hardness and TDE can be used as the supplemental technology to protect the database data at rest when the SQL Server is down. If we don’t roll up our sleeves and do as has been advocated by security experts and damn soon, surfing the Internet is going to feel like a ride on Charon’s boat. To be clear, the TDE documentation gets it right. The first sentence of the article, “Understanding Transparent Data Encryption (TDE)” states,
You can take several precautions to help secure the database such as designing a secure system, encrypting confidential assets, and building a firewall around the database servers. …
Whether the SQL Server supports TDE or file encryption must be used to protect the data files and the backups, the increase level of security possible with data value level encryption is far more than the net security benefit of TDE or encrypted files. Most security guidelines recommend using transparent encryption along with data value encryption of identified sensitive values.
Encrypting at the application layer provides the most protection al la Translucent Data Encryption. Accurately identifying personally identifiable information seem to be tricky as well. You just have to do your homework. Do you care if the primary key to the customer table is exposed as clear text as long as the name, address and or the news story BLOBS that are served up to all Internet users on demand? If you were dumb enough to build customer IDs that contain any meaningful tidbits – say a three letter prefix that can be used to group customers you just may have to encrypt the primary key to be secure. If all data access occurs by stored procedure then the encoding and decoding can pretty easily be done in the stored procedures, but I’m telling you this is almost for sure not the best way, even if it is better than not encrypting data. At least the rest of the application will not have to know anything about the encrypted data, but you still have that problem with a single principal responsible for the encryption hierarchy.
In summary, until the focus is turned toward security and away from compliance in lieu of security expect many a profitable company to give up your personally identifiable data unabated and unrequited but don’t expect them to tell you it happened.