SQL Server SQL Notes
Basics
Now() equivalent
DateTime
GETDATE()
-- or
GETUTCDATE()
The ANSI-compliant equivalent of GETDATE() is CURRENT_TIMESTAMP, for example:
SELECT CURRENT_TIMESTAMP
DateTime2
SYSDATETIME()
-- or
SYSUTCDATETIME()
GO and ;
GO marks the end of a batch of SQL statements, while ';' marks the end of a single SQL statement.
GO is recognized only by client tools like SSMS, and it is never sent to the server. It can even be changed to some other word in the tool's options...
If there are 2 SQL statements and the first fails: when they are separated with GO, the 2nd statement can still get executed; when separated with ';', after the 1st fails, the 2nd won't be executed.
Note that GO must be used before some DDL statements such as 'CREATE VIEW', 'ALTER VIEW' and so on, otherwise there will be the following error:
Incorrect syntax: 'CREATE VIEW' must be the only statement in the batch.
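For example, a sketch of this batch rule (the table and view names are made up); without the GO separating the two statements, the error above is raised:

```sql
CREATE TABLE dbo.han_demo (a INT);
GO  -- CREATE VIEW must be the first statement in its own batch
CREATE VIEW dbo.han_demo_view AS SELECT a FROM dbo.han_demo;
GO
```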
GO can also be used to repeat a batch several times:
CREATE TABLE dbo.han_test_table (a INT)
GO
INSERT INTO dbo.han_test_table VALUES (1)
GO 10 -- exec insert statement 10 times
SELECT * FROM dbo.han_test_table
GO
Null & Empty
Comparison
- NULL like '%aaa%': not true (evaluates to UNKNOWN, so the row is filtered out)
- NULL not like '%aaa%': not true (also UNKNOWN)
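A minimal sketch of this behavior; both CASE expressions fall through to the ELSE branch because the predicate is UNKNOWN rather than true:

```sql
DECLARE @s VARCHAR(10) = NULL;
SELECT CASE WHEN @s LIKE '%aaa%' THEN 'match' ELSE 'no match' END;     -- no match
SELECT CASE WHEN @s NOT LIKE '%aaa%' THEN 'match' ELSE 'no match' END; -- no match
```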
Coalesce function
coalesce(expr1, expr2, ..., expr_n) is interpreted as the following case expression:
case when (expr1 is not null) then expr1
when (expr2 is not null) then expr2
...
when (expr_n is not null) then expr_n
end
So an expression in its parameter list can be evaluated twice! If the result of the expression can change (such as a query whose result changes as the data changes), it's suggested to evaluate it first and store it, then pass the stored result to coalesce().
By the way, isnull() of course evaluates its first expression only once.
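A sketch of the suggested pattern (dbo.Accounts is a made-up table): evaluate the subquery once into a variable, then pass the variable to coalesce():

```sql
-- instead of COALESCE((SELECT balance FROM dbo.Accounts WHERE id = 1), 0),
-- which may run the subquery twice:
DECLARE @balance DECIMAL(18, 2);
SELECT @balance = balance FROM dbo.Accounts WHERE id = 1;
SELECT COALESCE(@balance, 0);
```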
Isnull function
isnull(expr, replacement_value): evaluates the expression; if it's not null, returns its value; otherwise, returns 'replacement_value'. So isnull() returns a value that is considered 'not null', while the return value of coalesce() is considered nullable.
Nullif function
nullif(expr1, expr2)
: if expr1 = expr2, then return null; otherwise, return expr1.
nullif(expr1, '')
: if the value of expr1 is a string of blanks such as ' ', it still returns NULL, because SQL Server pads strings with trailing spaces for comparison, so ' ' = ''.
Check null or empty
Check if field is null or empty string or string made up of white spaces:
nullif(field, '') is null
Check if field is null or empty or certain string:
isnull(nullif(field, ''), 'N/A') = 'N/A'
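For example, with a hypothetical dbo.Orders table whose comment column may hold NULL, '' or spaces:

```sql
-- rows whose comment is NULL, empty, or all white space
SELECT * FROM dbo.Orders
WHERE NULLIF(comment, '') IS NULL;

-- show 'N/A' in place of such values
SELECT ISNULL(NULLIF(comment, ''), 'N/A') AS comment
FROM dbo.Orders;
```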
Assign variables
In SQL Server, you can use SET (standard) or SELECT (SQL Server specific) to assign values to variables. SET can assign only one variable at a time, while SELECT can assign one or several. When the query returns no rows, SET assigns NULL, while SELECT does nothing (i.e. the variable keeps its previous value).
declare @age as integer;
set @age = (
select age
from employee
where id = 123
);
declare @name as varchar(max) = 'N/A';
declare @age as integer = 0;
select @name = name,
@age = age
from employee
where id = 123;
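A sketch of the no-row difference, reusing the employee table above (id -1 is assumed to match nothing; @age2 avoids clashing with the earlier @age):

```sql
DECLARE @age2 INT = 99;
SELECT @age2 = age FROM employee WHERE id = -1;       -- no rows: @age2 stays 99
SET @age2 = (SELECT age FROM employee WHERE id = -1); -- no rows: @age2 becomes NULL
```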
Concat strings with group by
select key1, key2, stuff( (
select ', ' + t2.str
from myTable t2
where t2.key1 = t1.key1
and t2.key2 = t1.key2
order by t2.sortField
for xml path ('') ), 1, 2, '') concattedString
from myTable t1
group by t1.key1, t1.key2
Using 'for xml path' is a hack: since there is no field name and no tag name, it produces only the concatenated string instead of an XML document. But if the strings contain characters like '&', '<' or '>', they will be escaped into XML entities such as '&amp;'. To work around this issue, it can be written as:
select key1, key2, stuff( (
select ', ' + t2.str
from myTable t2
where t2.key1 = t1.key1
and t2.key2 = t1.key2
order by t2.sortField
for xml path (''), TYPE).value('(./text())[1]', 'VARCHAR(MAX)'), 1, 2, '') concattedString
from myTable t1
group by t1.key1, t1.key2
The 'TYPE' directive makes 'for xml' return a value of xml type instead of a string, so the special chars are not left XML-escaped. The '.value()' method (case sensitive!!!) then retrieves the text via an XPath expression and converts it to the SQL type specified as its second argument.
Note that sometimes it's convenient to use stuff(str, start, count, replaceStr) to strip a prefix: with the substring function you'd need to pass the string itself as well as a length, which means evaluating the string expression again, and in the above case that's extremely inconvenient.
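A minimal illustration of stuff() stripping the leading separator, as in the query above:

```sql
SELECT STUFF(', a, b, c', 1, 2, '');  -- 'a, b, c': deletes 2 chars at position 1
```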
With
with pseudoTable as (
select ...
)
select ...
from myTable mt
join pseudoTable pt on ... -- pseudoTable can be used as a table in a query
where ...
But putting WITH between 'insert into' and 'select' doesn't work:
-- won't work
insert into myTable
with pseudoTable as (...)
select ...
In the above case, 'select into' can be used (note it creates a new table):
with pseudoTable as (...)
select ...
into myTable
from ...
Of course, you can also do this:
with pseudoTable as (...)
insert into myTable
select ...
Case
Syntax:
-- simple CASE
case input_expression
when value then result
[when value then result ...]
[else result]
end
-- searched CASE
case
when condition then result
[when condition then result ...]
[else result]
end
Examples:
select
case name_field
when 'aaa' then 'AAA'
when 'bbb' then 'BBB'
else 'ERROR'
end
from my_table;
select
case
when price is null or price <= 0 then 'Not for sale'
when price < 100 then 'Under $100'
else 'Expensive'
end
from my_products;
Case can also be used in 'order by', update, having, etc.
Data Conversion
Change date to number 'yyyymmdd'
declare @date date = '2001-01-01';
SELECT CONVERT(BIGINT, CONVERT(VARCHAR(8), @date, 112));
A reference page for the date and time conversions:
https://www.mssqltips.com/sqlservertip/1145/date-and-time-conversions-using-sql-server/
Data Type
datetime
Use the time, date, datetime2 and datetimeoffset types for new work. These types align with the SQL standard and are more portable. time, datetime2 and datetimeoffset provide more seconds precision. datetimeoffset provides time zone support for globally deployed applications. datetime2 has a larger date range, a larger default fractional precision, and optional user-specified precision; depending on the user-specified precision, it may also use less storage.
date range:
- datetime: 1753-9999
- datetime2: 0001-9999
precision:
- datetime: 3 1/3 ms
- datetime2: 100ns
While datetime2 offers increased precision, some clients don't support date, time, or datetime2 and force you to convert to a string literal. If you're more concerned about compatibility than precision, use datetime.
Be careful when adding a .NET DateTime value as a parameter to a SqlCommand, because it likes to assume it's the old datetime type, and you'll get an error if you try to write a DateTime value that's outside the 1753-9999 year range unless you explicitly specify the type as System.Data.SqlDbType.DateTime2 for the SqlParameter.
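A quick check of the range difference (the dates are arbitrary):

```sql
SELECT CAST('1700-01-01' AS DATETIME2);   -- fine: datetime2 starts at year 0001
-- SELECT CAST('1700-01-01' AS DATETIME); -- error: out-of-range value for datetime
```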
Window Functions
Window functions
- lag(field) over (order by ...): get value of 'field' of the previous row
- lead(field) over (order by ...): get value of 'field' of the next row
- row_number() over (partition ... order by ...): get row number
SELECT
lag(v) OVER (ORDER BY v),
v,
lead(v) OVER (ORDER BY v)
FROM (
VALUES (1), (2), (3), (4)
) t(v)
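Similarly, a sketch of row_number() with a partition (the values are made up); the numbering restarts for each partition:

```sql
SELECT
    grp, v,
    ROW_NUMBER() OVER (PARTITION BY grp ORDER BY v) AS rn
FROM (
    VALUES ('a', 10), ('a', 20), ('b', 30)
) t(grp, v)
-- rn is 1, 2 for group 'a' and 1 for group 'b'
```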
Performance
Usually it's faster to retrieve data from a temp table than from a table variable, and you can create an index on a temp table.
Example: in a trade settlement report, I wrote a big query with a lot of 'with' tables. The one that removes duplicates from trade settlement failures was quite slow to reuse in later queries. I separated it out into a temp table and created an index on it (the index should be created after all the data is loaded), and the program became much faster.
Creating a view usually won't improve performance, but an indexed view can, although there are many restrictions on creating one. And if the data changes frequently, we'd better not use an indexed view, for maintaining the index might cost more than it saves.
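A sketch of the temp-table pattern described above (all table and column names are made up):

```sql
-- materialize the slow de-duplicating subquery once
SELECT DISTINCT tradeId, settleDate
INTO #Settlements
FROM dbo.TradeSettlementFailure;

-- create the index only after the data is loaded
CREATE INDEX IX_Settlements_tradeId ON #Settlements (tradeId);

-- later queries join against the indexed temp table
SELECT s.tradeId, t.amount
FROM #Settlements s
JOIN dbo.Trades t ON t.id = s.tradeId;
```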
get row number for table without id
Usually we'd write 'select row_number() over (order by my_field) from ...'. But if we don't want to pick a field to sort by, so as to keep the original order, a common hack is:
select row_number() over (order by (select 1)) from my_table
That is, just provide some constant expression there... Similarly, the following also work:
select row_number() over (order by (select null)) from my_table;
select row_number() over (order by (select 'a')) from my_table;
An application of this trick: select the bottom n rows:
select top 10 *
from (
select row_number() over (order by (select 1)) as id, * from my_table
) my_table1
order by id desc
Programmability
Create table-valued function
A user-defined table-valued function is usually preferable to stored procedures and views. A big problem with a stored procedure is that if it returns a table, we cannot use SQL to filter the data; a table-valued function, by contrast, can be used in a SQL query as if it were an actual table.
create function dbo.my_function (@param [as] integer)
returns @my_table table (
    col1 varchar(8),
    col2 varchar(8)
)
as
begin
    insert into @my_table values ('a', 'b');
    return;
end;
But there are also restrictions on table-valued functions: you cannot write to the DB. For example, if a temp table is needed, i.e. you need to create a table and insert / update data in it, then a table-valued function cannot be used, and you have to use a stored proc instead. If the temp table can be replaced with a table variable, then it's fine, but sometimes you must use a temp table for performance reasons.
Execute function
If the function is a table-valued function, it should be put in the position of a table:
select * from dbo.GetFxForwardsForBrokerRec ('20180530', 'citi', null)
where ...
order by ...
Default Parameters
In a stored proc, you can have default parameters: when the caller doesn't specify them, the default values are used.
In a user-defined function you can also declare default parameters, but the caller cannot simply omit them; it must pass the keyword DEFAULT to get the default value.
CREATE FUNCTION dbo.testFunc(@n AS INT = 1)
RETURNS INT
AS
BEGIN
RETURN @n;
END;
GO
SELECT dbo.testFunc(2);
SELECT dbo.testFunc(DEFAULT);
Or you can also handle 'null':
CREATE FUNCTION dbo.testFunc(@n AS INT)
RETURNS INT
AS
BEGIN
RETURN ISNULL(@n, 1);
END;
GO
SELECT dbo.testFunc(2);
SELECT dbo.testFunc(NULL);
Put values in variables
CREATE TABLE #IndexedNames (i int, name varchar(MAX));
WITH Names AS (
SELECT DISTINCT name
FROM dbo.Staffs
)
INSERT INTO #IndexedNames
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS i, name
FROM Names;
DECLARE @i AS INT;
DECLARE @n AS INT;
DECLARE @name AS VARCHAR(MAX);
-- assign with 'SELECT'
SELECT @i = MIN(i) - 1, @n = MAX(i)
FROM #IndexedNames;
CREATE TABLE #Errors (
name VARCHAR(MAX) NOT NULL,
oldValue DECIMAL(38, 0),
newValue DECIMAL(38, 0)
);
DECLARE @oldValue AS DECIMAL(38, 0);
DECLARE @newValue AS DECIMAL(38, 0);
WHILE @i < @n
BEGIN
-- assign with 'SET'. or: SELECT @i = @i + 1;
SET @i = @i + 1;
SELECT @name = name
FROM #IndexedNames
WHERE i = @i;
--PRINT STR(@i) + ', ' + @name;
SELECT @newValue = salary
FROM dbo.StaffSalary
WHERE name = @name;
-- the way to assign value to variable returned from a stored proc
EXEC @oldValue = dbo.GetSalary @name = @name;
IF @oldValue <> @newValue
BEGIN
PRINT 'ERROR: name = [' + @name + '], oldValue=' + STR(@oldValue) + ', newValue=' + STR(@newValue);
INSERT INTO #Errors (name, oldValue, newValue) VALUES (@name, @oldValue, @newValue);
END
END
PRINT 'Done!';
DROP TABLE #IndexedNames;
SQL Loop
In SQL Server, loop support is very weak: there is only 'WHILE'. And there is no array type, so usually we declare a table with a single column and put the items of the array into it as rows. A cursor is then typically used to loop through the values.
Example:
USE MyDB;
DECLARE @date AS DATETIME;
DECLARE @dates AS TABLE(d DATETIME);
INSERT INTO @dates VALUES
('20181120'),
('20181228'),
('20181231'),
('20190103');
DECLARE dateCursor CURSOR FOR
SELECT d FROM @dates;
OPEN dateCursor;
WHILE 0=0
BEGIN
FETCH NEXT FROM dateCursor INTO @date;
IF @@FETCH_STATUS <> 0 BREAK;
-- use @date to do something
PRINT 'Date: ' + CONVERT(VARCHAR, @date, 23);
END
CLOSE dateCursor;
DEALLOCATE dateCursor;
Batch Template
USE MyDB;
BEGIN TRY
BEGIN TRANSACTION;
-- do something here
-- when error, do following
-- THROW 50000, '<error message here>', 0;
COMMIT;
PRINT '';
PRINT 'Done.';
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
ROLLBACK;
PRINT ERROR_MESSAGE();
END CATCH