Encrypting XML Typed Data in SQL Server

Two issues:

  1. W3C XML Encryption is broken by most accounts.
  2. SQL Server encryption functions cannot digest XML (pun intended).

A double whammy!

The first issue is a bigger deal. If you are encrypting sensitive data with a web service – and you probably are – then your data is vulnerable. If that data is personally identifiable information about individuals you are doing business with, then it is your duty and responsibility to stop using those solutions until the XML Encryption standard is corrected. The application should not be re-enabled until the vendors roll out their fixes and the application is appropriately updated. To knowingly continue to rely upon broken encryption is a criminal activity in my opinion.

Here’s why. In 2003 the Federal Trade Commission (FTC) reported that over 27 million Americans were “victims of identity theft”. In 2007 Ars Technica reported that 159 million people were “affected by data breaches” between 2005 and 2007. The Privacy Rights Clearinghouse reports 542,355,201 records breached in 2,747 data breaches between 2005 and 2011. According to the New York Daily News, Facebook alone is attacked 600,000 times a day, and almost all Facebook has to offer is your personal information. The trend is much more than obvious. And at the same time Reuters reports that MasterCard hit an all-time high on the stock market: “MasterCard Inc shares jumped 7.4 percent to $359.12 after the credit card processor reported its quarterly profit easily beat estimates on double-digit increases in volumes.”

The Information Commissioner of the UK, Christopher Graham, told the Telegraph the other day that database breaches are up 58% in the UK in the last year. FWIW, the commissioner can levy fines up to £500,000 under the Kingdom’s Data Protection Act (DPA). Sounds like they have the possibility to agree with my sentiment regarding the undeniable responsibility to protect other people’s information. Sadly, I don’t think the penalty has been used much – if at all – judging by the obvious parallel trend in the USA and the UK.

The broken status quo of capitalism that places profit above human decency, human dignity and often even unalienable human rights is rampant and unraveling around the world regardless of which corporate hegemony and political corruption is in vogue. Infosys co-chair Kris Gopalakrishnan seems to agree with that appraisal – calling out a systemic income imbalance that advantages the wealthiest 3%-5% in western and emerging economies. Infosys, by the way, is one of the largest IT services organizations in the world. He also states that the middle class in the West is already being diminished in reciprocation to the rise of the middle class in emerging economies, though he loses me when he claims that the decline of the middle class in the West is needed to achieve sustainability for the world population. His elitist perspective compounds the double whammy for workers in the West, where the distribution of income grows ever more skewed and the downward spiral of the middle class is at least on the edge of control, if not fully out of it. In the same breath it implicitly reassures Western oligarchs that the wealthy are not at risk in his equation. The wealthiest amass more wealth while the middle class has now lost the collateral it blew to Reagan Up: consume to help the economy, and now we have nothing to do but wait for some corporation to pee its record profits down on us. And wait. And wait. How much more obvious does it need to be for intelligent people to understand that the wealth that continues to accumulate in the accounts of a few – that have no practical purpose for more wealth – is coming not from vapor but out of the pockets of workers that too often don’t quite have enough and deserve better?

In the USA the worst that can happen to an entity that carelessly exposes others’ private and/or personal information is a bit of hate mail and perhaps a few underfunded – so unsuccessful – civil suits by folks whose lives have been ruined and accounts depleted as a direct result of corporate negligence and stonewalling. There are rumblings about accountability measures from the government. But whatever the government does, it will not be adequate: too many politicians are in the pockets of the people that would make a bundle from all that insecure data.

Enough is enough! Anytime someone can waltz in and steal other people’s personally identifiable information from a data center or business network, the system’s owners were carelessly negligent. If thefts recur or are ongoing then that negligence has gone beyond the point of criminal culpability. For effective deterrence, the penalty must match the crime. That’s all I’m saying. Yet, in our out of whack situation, only the occasional administrator is held to account. (Better to open your eyes on that one since 99% of the folks reading this fall into the administrator category.)

If the next 5 businesses that failed to protect personally identifiable details that had been demanded from customers were shut down for even a week by the government, the landscape would be changed significantly for the better, but only for as long as the threat of meeting a similar fate for similar negligence might endure. This is a carryover from what the government used to do in the mines in Wyoming where I worked in the 1970s. Back then, safety would get too far out of hand in the push to produce, over and over and over again. Usually someone had to die to get the attention of the government inspectors, and a remedy very specific to the event would result: one that doesn’t solve the real issue but might thwart an exact repeat. A more equitable solution would be to require whoever profits from the leak – or from the mine worker’s death – to appropriately compensate the folks that were violated. But such fairness is actively purged by the legal authorities. The leaker, of course, could pursue the attacker to recover his costs. The essential – and steadfastly absent – ingredient in a solution is unavoidable accountability. Effective regulation could create accountability were the bureaucracy not so corrupt. Even the DOJ’s National Institute of Justice is so mired in corruption that they cannot uncover the real cost of identity theft, noting evasively that, “A serious lack of data on these issues inhibits research into possible intervention strategies that could reduce the harm.” Until the effective cost to corporations of not protecting data is established – before any thefts occur – as more than the cost of doing the right thing and protecting our information, the necessary change is unlikely. Corporations can avoid costs and penalties by hiding behind unfair corporate legal protections.

According to Mike Vizard, as he lays out a survey-based accounting in “The Cost of Insecurity” (CTOEdge, DEC 2010), “… the majority of the IT organizations surveyed by [IT security vendor] Lumension have yet to deploy any of [the state of the art] security technologies, even as their operational security costs continue to rise.” (Follow the article link to see the current secure computing adoption rate charts by technology and 12 other mind-blowing charts built from survey data from many US IT shops.)

The simple fact is, the technologies needed to protect data and prevent the vast majority of hacker access to information are at hand. Sure it is more costly up front to implement those technologies than to continue to skip them, but how can it be negative for the profit motive of an ongoing concern in the long run when necessary steps are taken to increase the customer/consumer confidence in that entity with a comparatively small incremental rise in development costs? Shamefully, it simply isn’t done. The payoff for a corporation that takes the high road is insufficient in today’s crapitalism.

I would argue that it is a perversion of the profit motive that we have ever allowed contrivance and deception to drive profit. Profit is not inherently bad. Dishonesty is bad. To profit from dishonesty is perverse. Corporations know they cut corners on data security. That is one of the main reasons they are corporations: so the individuals calling the shots can hide from personal responsibility. Customers/consumers know that corporations cut corners on data security. Yet so many mindlessly buy into the “I’ve got nothing to hide” illusion. Governments know that corners are being cut on data security. How else, they must reason, are the oligarchs supposed to buy us politicians? And so, nothing changes. That is pathetic. Intentionally inadequate data security in corporate data centers is a primary reason people’s information is exposed and can subsequently be used to do harm to those persons.

I am certainly not saying that all that need be done is punish violations to solve the world’s data security problems. I merely assert that such explicit culpability is the first step in the direction of our common full potential: human excellence. Secure data solutions will be more complex and will require extensive defense-in-depth. They will be well worth the effort and cost. They will probably always be under attack as well, requiring ongoing hardening and improvement. The near-term goal I advocate is to simply do what is right. There is no gray area around what of your personal information I have the right to be careless with, let alone negligently so…

Update January 13, 2012:

Consumer Reports reports that the financial industry has found a way to increase profit by fear-mongering over identity theft: http://finance.yahoo.com/news/debunking-hype-over-id-theft-080000873.html

The story also breaks down some identity theft data for 2010. Highlights:

  • 50 million Americans paid $120-$300 a year (that’s $6-$15 billion) for identity theft protection in 2010 – for liability protection the banks already must provide without charge
  • Identity fraud in the US was down 27% in 2010 to a mere 8.1 million victims
  • Personal identity theft – “in which someone uses your name, birth date, and Social Security number to open new credit accounts, tap your health insurance, earn taxable income, or commit crimes in your name” was at 765,000 households in 2010.

Even if there is no will to do our best, nobody could deny we can surely do better.

On to the second whammy. The incompatibility of SQL Server’s encryption functions (ENCRYPTBYASYMKEY(), ENCRYPTBYCERT(), ENCRYPTBYKEY(), ENCRYPTBYPASSPHRASE()) and the XML data type is not such a big deal. Even before SQL Server had the XML type, XML was stored in the database as VARCHAR. Forcing that conversion between the XML type and the VARCHAR type is really all that is necessary to encrypt XML at the database. From there the door is open to sending encrypted OData or XML documents that may or may not also employ XML Encryption. That’s right! You could actually work around the XML Encryption vulnerabilities now, without breaking the web services in many cases, by encrypting the XML at the database – and then applying the best practice of decrypting only and exactly when and where needed.
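As a minimal sketch of that conversion, the batch below round-trips a small XML value through ENCRYPTBYPASSPHRASE(). The sample document, the variable names and the passphrase are all made up for illustration, and ENCRYPTBYPASSPHRASE() is used only because it needs no key setup; a real solution would more likely use a key from the encryption hierarchy.

```sql
-- Sketch only: sample XML, variable names and passphrase are illustrative.
DECLARE @doc XML = N'<customer id="42"><ssn>000-00-0000</ssn></customer>';
DECLARE @passphrase NVARCHAR(128) = N'illustrative passphrase only';

-- The encryption functions do not accept the XML type,
-- so convert to a character type explicitly first.
-- NVARCHAR(4000) is enough for this small sample document.
DECLARE @cipher VARBINARY(8000) =
    ENCRYPTBYPASSPHRASE(@passphrase, CAST(@doc AS NVARCHAR(4000)));

-- Decryption returns VARBINARY: convert back to NVARCHAR, then to XML.
DECLARE @roundTrip XML =
    CAST(CAST(DECRYPTBYPASSPHRASE(@passphrase, @cipher) AS NVARCHAR(4000)) AS XML);

SELECT @cipher    AS EncryptedXml
     , @roundTrip AS DecryptedXml;
```

The cast to a character type before encryption and the double cast after decryption are the whole trick; nothing else about the XML changes.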

In addition to missing XML type encryption capabilities (and UNIQUEIDENTIFIER, and CLR types btw), SQL Server encryption functions are also limited to 8000 bytes of input. That is not a problem for the error handler but may be for larger XML documents, and large XML documents are where much of the exposure from the broken XML Encryption will remain. To work around this database limitation, consider encrypting only sensitive XML attributes at the database and then providing decryption keys in a secure manner, and only to those that must decrypt that data. This is more or less how W3C XML Encryption goes after the problem.
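One way to encrypt only a sensitive attribute might look like the following sketch. The document shape, attribute names and passphrase are assumptions for illustration, and CONVERT() with binary style 2 stands in for any hex rendering of the cipher text.

```sql
-- Sketch only: document shape, names and passphrase are illustrative.
DECLARE @doc XML = N'<customer id="42" ssn="000-00-0000" city="Laramie" />';
DECLARE @passphrase NVARCHAR(128) = N'illustrative passphrase only';

-- Pull out just the sensitive attribute.
DECLARE @ssn NVARCHAR(11) = @doc.value('(/customer/@ssn)[1]', 'NVARCHAR(11)');

-- Encrypt it and render the cipher text as hex so it can live inside the XML.
DECLARE @hex VARCHAR(8000) =
    CONVERT(VARCHAR(8000), ENCRYPTBYPASSPHRASE(@passphrase, @ssn), 2);

-- Swap the clear value for the hex cipher text, leaving the rest readable.
SET @doc.modify('replace value of (/customer/@ssn)[1] with sql:variable("@hex")');

SELECT @doc AS PartiallyEncryptedXml;

-- Reverse the steps to recover the clear value when and where it is needed.
SELECT CAST( DECRYPTBYPASSPHRASE( @passphrase
           , CONVERT( VARBINARY(8000)
                    , @doc.value('(/customer/@ssn)[1]', 'VARCHAR(8000)')
                    , 2 ) ) AS NVARCHAR(11) ) AS Ssn;
```

Only the one attribute value ever passes through the encryption function, so the 8000 byte limit applies to that value rather than to the whole document.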

One risk in any piecemeal roll-your-own XML document encryption like I have just suggested is that the XML attributes not encrypted may still reveal enough information for a successful attack that eventually breaks the encryption. It is therefore also of utmost importance that XML Encryption is ‘fixed’ ASAP. Using SQL Server or another work-around to plug the hole should not even pretend to replace secure and proper XML Encryption. In all scenarios SQL Server should be considered only as an augmenting and/or interim solution until the better suited cryptography is made trustworthy. When data originates in the application, SQL Server is almost always a less than ideal place for encryption. In part this is because the data must move to the database unprotected and exposed. That is akin to putting on your seat belt because the air bags just popped out. However, there are many scenarios where the data enters distributed environments via the database (e.g., peer to peer transfers, XML feeds, performance data, database errors, etc.). When the database is a de facto gateway for the data into an SOA, then the database is a better possibility for the encryption.

To see how this might work to relieve the XML Encryption woes, consider the following error handler that writes errors to a SPARSE XML column of a logging table. Database errors are data that absolutely originates at the database, so the database is clearly the best place to obfuscate any sensitive data in those errors’ messages.

An encrypted error message is also an effective way to combat padding oracle attacks, wherein the messages returned by failed decryption attempts are mined and analyzed to eventually crack the encryption. Padding oracle attacks are proving especially effective against cipher block chaining (CBC) mode encryption such as is used in XML Encryption. While SQL Server encryption also uses CBC, it can be adapted to a more resistant and somewhat slower hash-based digest. (See http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation for an overview of block cipher modes. See the post https://bwunder.wordpress.com/2011/10/07/repel-dictionary-attacks-with-checksum-truncation-of-the-random-lambda/ for some of my notions about secure hash-based digests.)

I don’t mean to exclude streaming modes from the possibilities here; there simply isn’t any support for them in SQL Server. At some level streaming encryption makes sense for XML and other BLOB data types. In those scenarios, a streaming, resumable encryption – possibly even transactional, but that may be only wishful thinking – would be ideal.

The following example is based loosely on the error handler modeled in the TRY-CATCH topic of Books Online. The example is intended to be extended such that the error event is persisted to a SPARSE XML column in an activity logging table. That means log records without an error will not consume unnecessary, highly variable-length storage allocations.

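DECLARE @Id UNIQUEIDENTIFIER;

BEGIN TRY
  SET @Id = NEWID();
  RAISERROR('test', 16, 1);
END TRY
BEGIN CATCH
  -- 1. clear text
  SELECT ERROR_NUMBER() AS ErrorNumber
       , ERROR_SEVERITY() AS ErrorSeverity
       , ERROR_STATE() AS ErrorState
       , ERROR_PROCEDURE() AS ErrorProcedure
       , ERROR_LINE() AS ErrorLine
       , ERROR_MESSAGE() AS ErrorMessage FOR XML RAW;

  -- The line that opened the key is missing from this listing. ErrorCert is a
  -- placeholder name; the key may instead be protected by an asymmetric key.
  OPEN SYMMETRIC KEY ErrorKey
  DECRYPTION BY CERTIFICATE ErrorCert
  WITH PASSWORD = 'Au&6Gf% 3Fe14CQAN@wcf?';

  -- 2. the whole XML fragment encrypted, with @Id as the authenticator
  SELECT ENCRYPTBYKEY ( KEY_GUID( 'ErrorKey' )
            , ( SELECT ERROR_NUMBER() AS ErrorNumber
                      , ERROR_SEVERITY() AS ErrorSeverity
                      , ERROR_STATE() AS ErrorState
                      , ERROR_PROCEDURE() AS ErrorProcedure
                      , ERROR_LINE() AS ErrorLine
                      , ERROR_MESSAGE() AS ErrorMessage
                FOR XML RAW )
            , 1
            , @Id );

  -- 3. hybrid: only the sensitive ErrorMessage attribute is encrypted
  SELECT ERROR_NUMBER() AS ErrorNumber
         , ERROR_SEVERITY() AS ErrorSeverity
         , ERROR_STATE() AS ErrorState
         , ERROR_PROCEDURE() AS ErrorProcedure
         , ERROR_LINE() AS ErrorLine
         , sys.fn_varbintohexstr(
              ENCRYPTBYKEY ( KEY_GUID( 'ErrorKey' )
                           , ERROR_MESSAGE()
                           , 1
                           , @Id ) ) AS ErrorMessage
  FOR XML RAW;

  CLOSE SYMMETRIC KEY ErrorKey;
END CATCH;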


The resulting XML from the first of the three error message formats is clear text:

<row ErrorNumber="50000" 
     ErrorMessage="test" />

The second is completely encrypted (I added line feeds for display rendering):


And the third is a hybrid that encrypts only one sensitive XML attribute (again I added the CR+LFs just so you could see the result on this page):

<row ErrorNumber="50000" 
       85a510e3ca46556ef5bf15fcf2" />

The third option represents a middle ground between encryption and field size. This demonstrates a ‘divide and conquer’ tactic that may be necessary to apply the available tools to a workable near-term solution. Chunking may also be necessary to get SQL Server BLOBs into a set of VARBINARY(8000) encryption results. However, in most scenarios a need for chunking will render SQL Server encryption unusable.
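For a sense of what chunking involves, here is a rough sketch that splits a long character value into pieces small enough to encrypt, stores each piece in sequence, and then reassembles them. The table variable, the chunk size of 3000 characters and the passphrase are assumptions chosen for illustration, not a recommendation.

```sql
-- Sketch only: names, chunk size and passphrase are illustrative.
DECLARE @passphrase NVARCHAR(128) = N'illustrative passphrase only';
DECLARE @clear NVARCHAR(MAX) =
    REPLICATE(CAST(N'<row note="padding" />' AS NVARCHAR(MAX)), 1000);

DECLARE @chunkChars INT = 3000;  -- 6000 bytes: headroom under the 8000 byte limit
DECLARE @chunks TABLE ( Seq INT PRIMARY KEY
                      , Cipher VARBINARY(8000) NOT NULL );
DECLARE @pos INT = 1, @seq INT = 1;

-- Encrypt one chunk at a time, keeping the sequence for reassembly.
WHILE @pos <= DATALENGTH(@clear) / 2
BEGIN
    INSERT @chunks (Seq, Cipher)
    VALUES ( @seq
           , ENCRYPTBYPASSPHRASE( @passphrase
               , CAST(SUBSTRING(@clear, @pos, @chunkChars) AS NVARCHAR(4000)) ) );
    SET @pos += @chunkChars;
    SET @seq += 1;
END;

-- Decrypt the chunks in sequence order and stitch the value back together.
DECLARE @rebuilt NVARCHAR(MAX) = N'', @i INT = 1;
WHILE @i < @seq
BEGIN
    SELECT @rebuilt += CAST(DECRYPTBYPASSPHRASE(@passphrase, Cipher) AS NVARCHAR(4000))
    FROM @chunks
    WHERE Seq = @i;
    SET @i += 1;
END;

SELECT CASE WHEN @rebuilt = @clear THEN 'match' ELSE 'mismatch' END AS RoundTrip;
```

The bookkeeping alone (sequence numbers, a row per chunk, a loop in each direction) hints at why a need for chunking so often makes database encryption impractical.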

(And of course in the example it is important to persist that @Id value to the logging table as well if you ever want to decrypt the value. Also note that in the script I lifted the above example from, the procedure name is captured by the row, so instead of keeping it here as redundant encrypted data, ErrorProcedure is simply left out of the captured XML.)
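To illustrate why the @Id must be kept, the sketch below uses ENCRYPTBYPASSPHRASE() with an authenticator instead of the symmetric key from the example, purely so it can run without any key setup. The table variable and passphrase are hypothetical stand-ins for the logging table and key.

```sql
-- Sketch only: @log and the passphrase stand in for the logging table and key.
DECLARE @passphrase NVARCHAR(128) = N'illustrative passphrase only';
DECLARE @log TABLE ( Id UNIQUEIDENTIFIER PRIMARY KEY
                   , ErrorMessage VARBINARY(8000) );
DECLARE @Id UNIQUEIDENTIFIER = NEWID();

-- Persist the Id alongside the value it authenticates.
INSERT @log (Id, ErrorMessage)
VALUES ( @Id
       , ENCRYPTBYPASSPHRASE( @passphrase, N'test', 1, CAST(@Id AS NVARCHAR(36)) ) );

-- Decrypting with the stored Id recovers the message; a different
-- authenticator is expected to yield NULL instead of the clear text.
SELECT Id
     , CAST( DECRYPTBYPASSPHRASE( @passphrase, ErrorMessage, 1
                                , CAST(Id AS NVARCHAR(36)) )
             AS NVARCHAR(2048) ) AS WithStoredId
     , CAST( DECRYPTBYPASSPHRASE( @passphrase, ErrorMessage, 1
                                , CAST(NEWID() AS NVARCHAR(36)) )
             AS NVARCHAR(2048) ) AS WithWrongId
FROM @log;
```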

This entry was posted in Code Review, Encryption Hierarchies, Secure Data.
