Granted, there are a few data concurrency controls that are more effective when implemented at the application or data access layer. Examples here would include fairly sophisticated custom processing logic for handling optimistic concurrency collision, retry/recovery processing for otherwise failed transactions, and connection attributes. Far more concurrency controls are implemented within the SQL Server instance – e.g. blocking, deadlocks and resource bottlenecks, configuration, design, indexing, etc. This post attempts to provide comparative contention cost measurements for various some identified options.
Previous posts in the series provide comparative results that are useful and representative of a broad spectrum of hardware configurations. Concurrency test results are a different beast. The concurrency capabilities of the laptop test platform are less than stellar. I can assure you that server class hardware will produce substantially different results. To measure and evaluate concurrency testing must be done on hardware representative of the intended live system. The laptop hits the wall immediately in terms of disk-IO and multithreading at all concurrency levels above 1. This is good in a way because it allows the reader to see a problematic concurrency profile. I considered adding a couple of SATA II SDDs to my ExpressCard eSATA adapter, but decided that the $1000 for two RAID1 enclosures and the 4 drives that should easily saturate the eSATA bus would not provide much useful information. (hmmm… maybe data and log on separate devices is not necessary for SDD? So much to learn….) It is essential that final concurrency testing be done using representative hardware and representative hardware configurations.
SQL Server concurrency controls are presented at the statement, object, database and server scopes.
- statement – Hints may be applied to query statements to manipulate lock type, duration, and escalation characteristics. Hinting is well covered in part 5. One consideration about hinting not mentioned already is the possibility to implement hinting via pass-through from the application as embedded SQL. It could get pretty messy in a hurry, but if only one or two persistence operations need better throughput and the application is already generating the SQL, adding hints may be a reasonable and elegant solution. Hinting is the only available control managing or restricting lock escalations from shared to exclusive (X) during persistence activities.
- object – Compile objects can set the Transaction Isolation Level for any and all transactions contained in the object or called objects. By default, locks escalate from row to page to a partition or table granularity depending upon the number of rows in the able and the number of affected by the current query. The ability to take a table level lock can be disabled, or the partition level lock can be explicitly disabled even if the table is partitioned as specified in the table definition. Row locks and page locks can be enabled or disabled in the index definition. The table highlights the configuration matrix settings needed to take control of lock granularity escalations.
TABLE DDL INDEX DDL INDEX DDL
Row Page Partition Table AUTO ON ON X X X X AUTO ON OFF X X X AUTO OFF ON X X X AUTO OFF OFF X X TABLE ON ON X X X TABLE ON OFF X X TABLE OFF ON X X TABLE OFF OFF X DISABLE ON ON X X DISABLE ON OFF X DISABLE OFF ON X DISABLE OFF OFF
- database – A default Transaction Isolation Levels is defined within each database. always makes me wonder why they called it a Transaction Isolation Level. True enough, any transaction can override a database’s default Transaction Isolation Level, although even non-transactional SELECTs play by the rules and the choice to use READ_COMMITED locking isolation or READ_COMMITTED_SNAPSHOT row versioning isolation must be made at the database scope. Read-only only is not an isolation level but maybe should be treated like one. No locks are taken when the database is READ-ONLY. Database data and log file placement and count are proven to increase server throughput.
- server – Obviously, hardware configurations that can supports the required throughput requirement are most important for throughput. Likewise, the number of concurrent threads of execution the server can support is directly tied to the database’s concurrency capabilities. Soft concurrency options at the server level include Max Concurrent connections, Non-uniform Memory Access (NUMA), Max Worker Threads, Optimize for ad hoc Workloads, Max Degree of Paralellism, and even Query Governor to some extent. Many server configuration options (see sp_configure) will affect concurrency in some scenarios.
In order to objectively identify the right locking strategy a controlled environment on reference hardware where identified options can be manipulated one at a time is needed. The options to be tested are best selected as a design stage thought problem. When too many options emerge to realistically test and evaluate them all, eliminating options that really don’t make sense for the application before time is invested to test those options is time and money well not spent. On the other hand, when not so many options are obvious when thinking about the locking strategy, including interesting options that may not make sense for the application yet expose a capability of the database server that is presently not exploited by application should not be ignored.
The following is a quick brainstorming list of high level options that could be considered in the thought problem:
- Database Server level
- IO sub-system
- disc partition alignment
- No e512 (disk sector alignment? will it never end?)
- instant file initialization
- competing processes
- SQL Server instance level
- Query Governor
- recovery interval
- optimize for ad hoc workloads
- network packet size
- defaults (fill factor, recovery mode, ANSI, NUMA, etc.)
- audit overhead
- Database level
- storage configuration
- transaction isolation level
- row versioning
- recovery model
- auto statistics
- full text search
- Object level
- sparse columns
- full text search
- Query Statement
- Common Table Expressions
- derived tables
While far from complete or exhaustive, the list will always be is too long to spend relatively expensive test cycles on each option identified. Thought tools – what ever it takes to transform the existing application and platform using the experience and knowledge of the design team members into an actionable collaboration – are most effective for paring the list to a reasonable set of options to test. This is general a software development best practice. Nothing new here. It is worth stating that the listing of possible options should be well fleshed out to assure nothing important is overlooked. Only then should the work of selecting the options to be further scrutinized and tested begin.
It is important to keep the testing focused on the database instance. As demonstrated in part 6, the overhead of even a well integrated client like SSIS can obscure test results. Client side asynchronous processing may fool the user but does not and cannot reduce the total processing. If anything, background processing will incur a net total cost increase. Nonetheless, the user experience must be a considered vantage point when optimizing the database. Just not the only vantage point.
With some amazing success the insert command of the ADO.NET DataAdapter or – to a much lesser degree – the Entity Model DataContext can be overloaded to improve throughput. Caution is advised with this approach because it is a custom solutions. Custom solutions carry unknown optimization and scalability maintenance costs. In most cases there is less risk and lower cost to keep concurrency optimizations contained within the database server.
Standardizing on compiled database objects (stored procedures, views, functions) – particularly for CRUD operations – provides an excellent encapsulating container for database optimizations without unnecessary coupling of database optimizations to upstream application layers. It is not always possible to avoid application changes to improve database throughput. It is a best practice with one important exception.
Application managed transactions are more robust and reliable than T-SQL transactions. The introduction of TRY-CATCH error handling to T-SQL certainly reduced the disparity, though by no means should T-SQL TRY-CATCH be considered on par with .NET TRY-CATCH in terms of robust error handling. There are still far too many run-time ‘fatal-to-the-batch’ exceptions that will kill a SQL Server process before it can execute the next line of T-SQL after the exception. I like to go back – here and in most considerations – to using the database to do what databases do well: SELECT, INSERT, UPDATE and occasionally DELETE. As valid as the reasoning may seem to be for using only database transactions in any application, the first time a T-SQL transaction goes viral in the connection pool the reason for application originated transactions is obvious.
In part 7 SQL Server is seen to unpredictably deadlock when two or more persistence operations are executed in parallel. Experience tells me there is reason to expect that explicit locking tactics can prevent that deadlocking. Consequently there is a valid reason to fully evaluate that expectation by testing to determine which of the several options considered will be most effective.
Using the information about row-at-a-time persistence methods and table-valued parameter persistence methods explored in part 3 and presuming an advantage for classic upserts over the new MERGE statement as illuminated in part 4 to help pare down what to test; a test plan that documents parallel execution of complete test batteries at various rates of parallel execution is determined to be appropriate.
The tests will be built from the RowUpsert and the TVPUsert test batteries. All tests are conducted on the same other-wise idle system. A complete battery of concurrency tests will involve a stand-alone execution of the test cycle – the same test used in previous posts – along with parallel executions at levels of 3, 5 and 10 concurrent executions.
In these tests an explicit hint based locking strategy will require that exclusive locks on all target objects at the beginning of the transaction wrapping each batch or set upsert. In this way the risk of deadlocks when a query attempts to escalate from a Shared lock to eXclusive lock is eliminated. The strategy will have serializing effect on rows and/or pages affected by the upsert but most of the already existing data should remain available to other queries.
Table level locks are not included in this test because previous test cycles have shown that the potential for blocking latencies and deadlock invoked rollbacks are unavoidable when table level granularity is used in the test scenario. In fact, previous results indicate that it may be desirable to disable lock escalation to the table level depending on the over-all query load seen by the target hierarchy. Table locks are most likely to be useful when a staging table is used and concurrency requirements are low.
In the test, Lock management will be hint based. Between the database transaction isolation level – pretending for now that READ-ONLY is an isolation level ALLOW_DATABASE_SNAPSHOT options, the table LOCK_ESCALATION property and the ALLOW_ROW_LOCKS and ALLOW_PAGE_LOCKS relational index switches it is possible to configure the database to prevent/reduce granularity lock escalation. When tuning 3rd party applications, code generators, ORMs and Cloud Interfaces) and complex data models it can be helpful to have these options available. In scenarios where lock escalation control is desired that allow the possibility to use them; query statement hints applied only to identified queries are less difficult to adequately evaluate and test.
ALTER DATABASE name
SET ( LOCK_ESCALATION = AUTO | TABLE | DISABLE )
ALTER TABLE name
SET ( LOCK_ESCALATION = AUTO | TABLE | DISABLE )
ALTER INDEX name ON table/view
WITH (ALLOW_ROW_LOCKS = ON|OFF|ALLOW_ROW_LOCKS = ON|OFF)
Comparative charts for the following test matrix at the parallel loading levels of 1,3,5,10 are shown below. The Report used to generate the screen shots is a better way to do the comparisons because the report allows any selected methods to be shown side-by-side on the same graph
|test case||Upsert Methods||Isolation level||Lock granularity|
The most remarkable piece of information from the row-at-a-time test cycles is unremarkable the effects of manipulating the selected options. Pretty sure you will agree that there is not much of interest here. (Do keep in mind that deadlocking when the explicit eXclusive table lock is taken occurs at a high frequency. No deadlocking observed for any of the tests in this post) . The TVP test cycles are more diverse.
Charts are posted consecutively with no comments to make it a little easier to compare them.
That fat pink hash in the TVP charts is a reminder of where the row-at-a-time upserts fit in the picture.
As you can see. My laptop has serious scalability problems in concurrent scenarios. In all test cases, running 3 concurrent processes was enough load to upset the apple cart. Page locks using row versioning appears to have the best scalability (e.g. the lest difference between the one-test-at-a-time curve and the 10-tests-at-a-time curve.) however it is also the only TVP option that is never better than row-at-a-time load methods. I have conducted this test on more than one well configured server and I can assure you the test is considerably more interesting on good hardware.
I put together a Reporting Services concurrency matrix with sparklines in each concurrent processes column grouping. From the screen shot it is apparent the ms/row is increasingly erratic for the TVP load the farther below 1000 the set size. The height of the chart rows is controlled by the range of the chart so tall rows indicate at least one occurence of poor throughput. Each test sample is a black dot on the blue line so the sample size and the number of wildly out of whack data points can be gleaned at a glance.
Here is the complete chart as a Word doc.
Next up, I’ll try to close out this series on persisting data with test results from a couple of hybrid load methods that combine techniques that work well from the test results covered so far with some easy contention reducers to maximize throughput.