SQL Server Integration Services Data Type Conversion Testing

There are two ways of converting data types within SQL Server Integration Services (SSIS).  Here is an overview of those options:

  • One way is using T-SQL code. Using the CAST function in your code can convert one data type into another. The scope of this tip is not intended on going into detail about the CAST function, but I will show a quick example.
  • The second way is using the Data Conversion Transformation Task. This task will convert the data in an input column to a different data type and then copy it to the output column. The task itself isn’t too complicating to setup as I’ll show you in this tip.

SQL Server Integration Services Data Conversion Task to Convert Data Types

Let’s look at the Data Conversion task first…

First, open Visual Studio (or Business Intelligence Dev Studio if you’re using pre SQL Server 2012) and create an SSIS project. Next, we can go ahead and configure a Connection Manager to our database. Right click Connection Managers in Solution Explorer and choose New Connection Manager:

SQL Freelancer SQL Server SSIS Data Type Conversion
Choose your Connection Manager type. In this example, we’ll use OLEDB. Next, configure the Connection Manager to point to your dataset. In this example, I’ll use localhost and the AdventureWorks2008R2 database:
SQL Freelancer SQL Server SSIS Data Type Conversion
Test the connection and click OK. Next, drag a Data Flow task from the SSIS toolbox onto the design screen:

SQL Freelancer SQL Server SSIS Data Type Conversion
Right click the Data Flow task and choose Edit. You are now inside the data flow task. This is where all the action happens. Drag an OLEDB source task from the SSIS toolbox to the design screen:

SQL Freelancer SQL Server SSIS Data Type Conversion
Right click the OLEDB task and choose Edit. This screen is where we will define the Connection Manager we created earlier. Under OLEDB connection manager choose the connection you created. Leave data access mode as Table or view. Change the name of the table or the view to the table that contains the data types to change. In this example, I’ll use a table named Sales.CurrencyRate:

SQL Freelancer SQL Server SSIS Data Type Conversion
To preview the data click Preview. In my example, I want to change the CurrencyRateDate and the ModifiedDate columns from a datetime data type to a date data type:

SQL Freelancer SQL Server SSIS Data Type Conversion
Click OK to close the OLEDB Source task. Drag the Data Conversion Transformation task onto the design screen. Connect the OLEDB Source task to the Data Conversion task:

SQL Freelancer SQL Server SSIS Data Type Conversion
Right click the Data Conversion task and choose Edit. Here is where we will convert our data types. Since I am converting CurrencyRateDate and ModifiedDate I will click on each of them in the Available Input Columns list:

SQL Freelancer SQL Server SSIS Data Type Conversion

Click here to view the rest of this post.

 

 

SQL Server Simple and Forced Parameterization

There are two different parameterization options that one can use in SQL Server. Simple parameterization and Forced parameterization. Let’s discuss each a little more in detail.

Simple Parameterization

When you execute a SQL statement without parameters, SQL Server internally will add parameters where needed so that it can try to reuse a cached execution plan. For example, if you look at the execution plan of the following statement you will see that SQL Server changes the WHERE value to a parameter (@1):

SELECT * FROM AdventureWorks2012.Sales.CreditCard WHERE CreditCardID = 11

SQL Freelancer SQL Server Parameterization

SQL Server builds this execution plan as if a parameter was the input instead of the number 11. Because of this parameterization, the following two statements show an example of SQL Server reusing the same execution plan even though the data results are different:

SELECT * FROM AdventureWorks2012.Sales.CreditCard WHERE CreditCardID = 11
SELECT * FROM AdventureWorks2012.Sales.CreditCard WHERE CreditCardID = 207

SQL Freelancer SQL Server Parameterization

This is the default behavior for Simple parameterization, however, it only applies to a small class of queries. If you want all your queries parameterized, you will need to enable the option, Forced parameterization.

Forced Parameterization

Forced parameterization is when the database engine parameterizes any literal value that appears in a SELECT, UPDATE, INSERT, or DELETE statement submitted in any form, but there are a few exceptions.  Refer to this article for a list of these exceptions.

Some applications use queries that pass in literals as opposed to stored procedures that pass in parameters. For these type of applications you may want to experiment with enabling Forced parameterization to see if it has a positive effect on the workload by reducing query compilations.

Running the following query in Simple parameterization produces the following execution plan where the WHERE clause is not parameterized:

SQL Freelancer SQL Server Parameterization
Click here to view the rest of this post.

Detecting CPU Pressure

There are a few different ways to detect CPU pressure, for example, Performance Monitor or SQL Profiler, but in this tip we’ll discuss the wait type SOS_SCHEDULER_YIELD and the DMV sys.dm_os_schedulers.

Some of you might have seen the wait type SOS_SCHEDULER_YIELD and wondered what in the world is this? Books Online (BOL) describes this wait type as:

Occurs when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed.

SOS_SCHEDULER_YIELD is a fairly common wait type and occurs when there is CPU pressure. SQL Server runs multiple threads and tries to allow all threads to run without problems. However, if all the threads are busy on each scheduler and can’t let other threads run, it will yield itself for another thread, which in turn creates the SOS_SCHEDULER_YIELD wait type.

You can see the top wait stats and the percentage of waits at the current point in time by running the following query by Tim Ford and Glenn Berry.

WITH Waits AS 
( 
SELECT 
wait_type, 
wait_time_ms / 1000. AS wait_time_s, 
100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct, 
ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn 
FROM sys.dm_os_wait_stats 
WHERE wait_type 
NOT IN 
('CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 
'SLEEP_TASK', 'SLEEP_SYSTEMTASK', 'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 
'CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT') 
) -- filter out additional irrelevant waits 
SELECT W1.wait_type, 
CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s, 
CAST(W1.pct AS DECIMAL(12, 2)) AS pct, 
CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct 
FROM Waits AS W1 
INNER JOIN Waits AS W2 ON W2.rn <= W1.rn 
GROUP BY W1.rn, 
W1.wait_type, 
W1.wait_time_s, 
W1.pct 
HAVING SUM(W2.pct) - W1.pct < 95; -- percentage threshold;

After running this query on one of my servers I can see that only 2.22% of my waits are due to CPU pressure so I can rule that out as a cause for poor performance. (This is only current since the last SQL Service restart):

SQL Freelancer SQL Server CPU Pressure DMV

If you see a high percentage of this wait type then one way of troubleshooting this is to use the DMV sys.dm_os_schedulers. BOL describes sys.dm_os_schedulers as:

Returns one row per scheduler in SQL Server where each scheduler is mapped to an individual processor. Use this view to monitor the condition of a scheduler or to identify runaway tasks.

This DMV has several columns including number or workers, active tasks, and the status of each scheduler, etc., which can help troubleshoot certain problems, but the most important column is the one used for measuring queued tasks, the runnable_tasks_count column. This column displays the count of tasks yielding to other tasks resulting in a SOS_SCHEDULER_YIELD wait type. If this column is frequently greater than 0 then CPU pressure may be present and blocking may occur.

To see an average of current tasks and current waiting tasks you can use the following query:

SELECT AVG(current_tasks_count) AS [Avg Current Task], 
AVG(runnable_tasks_count) AS [Avg Wait Task]
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255
AND status = 'VISIBLE ONLINE'

SQL Freelancer SQL Server CPU Pressure DMV
Click here to view the rest of this post.

Client Statistics

A feature often overlooked while tuning queries is Client Statistics located right on SSMS editor bar.

SQL Freelancer SQL Server Client Statistics
Other ways of opening Client Statistics include:

SHIFT + ALT + S and Menu Bar – Query -> Include Client Statistics

SQL Freelancer SQL Server Client Statistics

Client Statistics is useful when the user needs to gather information about execution times, processing times, the amount of data sent between client and server, etc. it’s very easy to use. Simply turn it on using one of the methods above and execute your query.

In this example I’m using the following query:

SELECT pc.FirstName
,pc.LastName
,pc.EmailAddress
,he.Title
,hh.Rate
FROM Person.Contact pc
JOIN HumanResources.Employee he ON pc.ContactID = he.ContactID
JOIN HumanResources.EmployeePayHistory hh ON he.EmployeeID = hh.EmployeeID
WHERE hh.Rate > 10
ORDER BY hh.Rate

After executing the query you will notice a new tab labeled Client Statistics:

SQL Freelancer SQL Server Client Statistics

After looking at the statistics provided for Trial 1 you can see the processing time = 39ms, execution time = 46ms, etc.

After the running the same query again I get the following results under Trial 2:

SQL Freelancer SQL Server Client Statistics
This time processing time went up to 45ms and execution time went up to 51ms. It also has an average column to compare overall results. The green and red arrows represent differences between trials. Green arrows indicate improved statistics and red arrows indicate degrading statistics.

A max of 10 trials can be run with the 11th trial dropping the 1st trial and so on. To reset statistics, on the menu bar, go to Query, Reset Client Statistics.

SQL Freelancer SQL Server Client StatisticsThis is great when tuning indexes because you can see different statistics as you add/delete indexes.

Statistics IO and Statistics TIME

Statistics IO and Statistics TIME can help in performance tuning from a granular level and saves time compared to Execution Plans, SQL Server Profiler, etc. Let’s looks at an example of these two commands:

There a few different ways in which you can turn on these commands inside SQL Server Management Studio. If you want every query window to open up with statistics you can go to Tools, Options, Query Execution, SQL Server, Advanced and check SET STATISTICS TIME and SET STATISTICS IO:

SQL Freelancer SQL Server SET STATISTICS TIME SET STATISTICS IOMost of the time you will just want to use statistics for a certain troublesome query. To turn these commands on for a specified query window, open the query window and go to Query, Query Options, Execution, Advanced and check SET STATISTICS TIME and SET STATISTICS IO:

SQL Freelancer SQL Server SET STATISTICS TIME SET STATISTICS IOOr you can simply use T-SQL and type the following to turn statistics on

SET STATISTICS IO ON
SET STATISTICS TIME ON

And to turn off simply use:

SET STATISTICS IO OFF
SET STATISTICS TIME OFF

In this example I’m going to run a query from AdventureWorks using statistics:

SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT pc.FirstName
,pc.LastName
,pc.EmailAddress
,he.Title
,hh.Rate
FROM Person.Contact pc
JOIN HumanResources.Employee he ON pc.ContactID = he.ContactID
JOIN HumanResources.EmployeePayHistory hh ON he.EmployeeID = hh.EmployeeID
WHERE hh.Rate > 10
ORDER BY hh.Rate

After the query finishes it will show you results like normal but if you click on the Messages tab you will see some extra information:

SQL Freelancer SQL Server SET STATISTICS TIME SET STATISTICS IO

You will see IO information at the top:

Scan Count – Number of index or table scans
Logical Reads – Number of pages read from the data cache
Physical Reads – Number of pages read from disk
Read-Ahead Reads – Number of pages placed into the cache for the query
LOB Logical Reads – Number of text, ntext, image, or large value pages read from the data cache
LOB Physical Reads – Number of text, ntext, image, or large value type pages read from disk
LOB Read-Ahead Reads – Number of text, ntext, image, or large value type pages placed into the cache for the query

Execution TIME information will be directly below:

CPU Time: How long the query worked with the CPU
Elapsed Time: How long the query took to gather data

Statistics, in general, is an excellent starting point to see why you might have a poor performing query. You can gather results such as execution time, compile time, reads, writes, cost, etc.

SQL Server performance tuning for each layer of an application

Performance tuning is a big subject and there are a lot of different pieces to troubleshooting a poor performing database or application. I like to use the 5 level process shown below:

  1. Server Hardware
  2. Operating System
  3. SQL Server
  4. Database
  5. Application

SQL Freelancer SQL Server Performance Tuning

Hardware

When troubleshooting poor performance don’t always assume it’s something wrong with the database itself. The problem often lies deeper. We’ll start with hardware. If the hardware isn’t up to par, your OS, database, and application will suffer.

One of the best tools to monitor hardware are counters that are part of the Performance Monitor, or PerfMon for short. I’m not going to go into how to use PerfMon, which can be covered in a different tip, but I will tell you some of the most important counters to watch and a description of what they do.

Network Counters:

  • Network Interface: Bytes Total/sec – Bytes Total/sec is the rate at which bytes are sent and received over each network adapter, including framing characters. Network Interface\Bytes Total/sec is a sum of Network Interface\Bytes Received/sec and Network Interface\Bytes Sent/sec. This value should be pretty low.
  • Network Interface: Output Queue Length -Output Queue Length is the length of the output packet queue (in packets). If this is longer than two, there are delays and the bottleneck should be found and eliminated, if possible. Since the requests are queued by the Network Driver Interface Specification (NDIS) in this implementation, this should always be 0.
  • Network Interface: Packets Outbound Errors – Packets Outbound Errors is the number of outbound packets that could not be transmitted because of errors. This value should stay at 0 also.

SQL Freelancer SQL Server Performance TuningClick here to view the rest of this post.