SQL Server SQL Notes
Basics
Now() equivalent
DateTime
GETDATE()
-- or
GETUTCDATE()
The ANSI-compliant equivalent of GETDATE() is CURRENT_TIMESTAMP, for example:
SELECT CURRENT_TIMESTAMP
DateTime2
SYSDATETIME()
-- or
SYSUTCDATETIME()
GO and ;
GO marks the end of a batch of SQL statements, while ';' marks the end of a single SQL statement.
GO is recognized only by client tools like SSMS, and it is never sent to the server. It can even be changed to some other word in the tool's options...
If there are 2 SQL statements and the first fails: when they are separated with GO, the 2nd statement can still get executed; when separated with ';', after the 1st fails, the 2nd won't be executed.
Note that GO must be used before some DDL statements such as 'CREATE VIEW', 'ALTER VIEW' and so on, otherwise there will be the following error:
Incorrect syntax: 'CREATE VIEW' must be the only statement in the batch.
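For example, a sketch of this batch rule (the table and view names are made up); without the GO separating the two statements, the error above is raised:

```sql
CREATE TABLE dbo.han_demo (a INT);
GO  -- CREATE VIEW must be the first statement in its own batch
CREATE VIEW dbo.han_demo_view AS SELECT a FROM dbo.han_demo;
GO
```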
GO can also be used to repeat a batch several times:
CREATE TABLE dbo.han_test_table (a INT)
GO
INSERT INTO dbo.han_test_table VALUES (1)
GO 10 -- exec insert statement 10 times
SELECT * FROM dbo.han_test_table
GO
Null & Empty
Comparison
- NULL like '%aaa%': not true (evaluates to UNKNOWN, so the row is filtered out)
- NULL not like '%aaa%': not true (also UNKNOWN)
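A minimal sketch of this behavior; both CASE expressions fall through to the ELSE branch because the predicate is UNKNOWN rather than true:

```sql
DECLARE @s VARCHAR(10) = NULL;
SELECT CASE WHEN @s LIKE '%aaa%' THEN 'match' ELSE 'no match' END;     -- no match
SELECT CASE WHEN @s NOT LIKE '%aaa%' THEN 'match' ELSE 'no match' END; -- no match
```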
Coalesce function
coalesce(expr1, expr2, ..., expr_n) is interpreted as the following case expression:
case when (expr1 is not null) then expr1
when (expr2 is not null) then expr2
...
when (expr_n is not null) then expr_n
end
So an expression in its parameter list can be evaluated twice! If the result of the expression can change (such as a query whose result changes as the data changes), it's suggested to evaluate it first and store it, then pass the stored result to coalesce().
By the way, isnull() of course evaluates its first expression only once.
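A sketch of the suggested pattern (dbo.Accounts is a made-up table): evaluate the subquery once into a variable, then pass the variable to coalesce():

```sql
-- instead of COALESCE((SELECT balance FROM dbo.Accounts WHERE id = 1), 0),
-- which may run the subquery twice:
DECLARE @balance DECIMAL(18, 2);
SELECT @balance = balance FROM dbo.Accounts WHERE id = 1;
SELECT COALESCE(@balance, 0);
```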
Isnull function
isnull(expr, replacement_value): evaluates the expression; if it's not null, returns its value; otherwise, returns 'replacement_value'. So isnull() returns a value that is considered 'not null', while the return value of coalesce() is considered nullable.
Nullif function
nullif(expr1, expr2)
: if expr1 = expr2, then return null; otherwise, return expr1.
nullif(expr1, '')
: if the value of expr1 is a string of blanks such as ' ', it still returns NULL, because SQL Server pads strings with trailing spaces for comparison, so ' ' = ''.
Check null or empty
Check if field is null or empty string or string made up of white spaces:
nullif(field, '') is null
Check if field is null or empty or certain string:
isnull(nullif(field, ''), 'N/A') = 'N/A'
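For example, with a hypothetical dbo.Orders table whose comment column may hold NULL, '' or spaces:

```sql
-- rows whose comment is NULL, empty, or all white space
SELECT * FROM dbo.Orders
WHERE NULLIF(comment, '') IS NULL;

-- show 'N/A' in place of such values
SELECT ISNULL(NULLIF(comment, ''), 'N/A') AS comment
FROM dbo.Orders;
```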
Assign variables
In SQL Server, you can use SET (standard) or SELECT (SQL Server specific) to assign values to variables. SET can assign only one variable at a time, while SELECT can assign one or several. When the query returns no rows, SET assigns NULL, while SELECT does nothing (i.e. the variable keeps its previous value).
declare @age as integer;
set @age = (
select age
from employee
where id = 123
);
declare @name as varchar(max) = 'N/A';
declare @age as integer = 0;
select @name = name,
@age = age
from employee
where id = 123;
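A sketch of the no-row difference, reusing the employee table above (id -1 is assumed to match nothing; @age2 avoids clashing with the earlier @age):

```sql
DECLARE @age2 INT = 99;
SELECT @age2 = age FROM employee WHERE id = -1;       -- no rows: @age2 stays 99
SET @age2 = (SELECT age FROM employee WHERE id = -1); -- no rows: @age2 becomes NULL
```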
Concat strings with group by
select key1, key2, stuff( (
select ', ' + t2.str
from myTable t2
where t2.key1 = t1.key1
and t2.key2 = t1.key2
order by t2.sortField
for xml path ('') ), 1, 2, '') concattedString
from myTable t1
group by t1.key1, t1.key2
Using 'for xml path' is a hack: since there is no field name and no tag name, it produces only the concatenated string instead of an XML document. But if the strings contain characters like '&', '<' or '>', they will be escaped into XML entities such as '&amp;'. To work around this issue, it can be written as:
select key1, key2, stuff( (
select ', ' + t2.str
from myTable t2
where t2.key1 = t1.key1
and t2.key2 = t1.key2
order by t2.sortField
for xml path (''), TYPE).value('(./text())[1]', 'VARCHAR(MAX)'), 1, 2, '') concattedString
from myTable t1
group by t1.key1, t1.key2
The 'TYPE' directive makes 'for xml' return a value of xml type instead of a string, so the special chars are not left XML-escaped. The '.value()' method (case sensitive!!!) then retrieves the text via an XPath expression and converts it to the SQL type specified as its second argument.
Note that sometimes it's convenient to use stuff(str, start, count, replaceStr) to strip a prefix: with the substring function you'd need to pass the string itself as well as a length, which means evaluating the string expression again, and in the above case that's extremely inconvenient.
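A minimal illustration of stuff() stripping the leading separator, as in the query above:

```sql
SELECT STUFF(', a, b, c', 1, 2, '');  -- 'a, b, c': deletes 2 chars at position 1
```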
With
with pseudoTable as (
select ...
)
select ...
from myTable mt
join pseudoTable pt on ... -- pseudoTable can be used as a table in a query
where ...
But putting WITH between 'insert into' and 'select' doesn't work:
-- won't work
insert into myTable
with pseudoTable as (...)
select ...
In the above case, 'select into' can be used (note it creates a new table):
with pseudoTable as (...)
select ...
into myTable
from ...
Of course, you can also do this:
with pseudoTable as (...)
insert into myTable
select ...
Case
Syntax:
-- simple CASE
case input_expression
when value then result
[when value then result ...]
[else result]
end
-- searched CASE
case
when condition then result
[when condition then result ...]
[else result]
end
Examples:
select
case name_field
when 'aaa' then 'AAA'
when 'bbb' then 'BBB'
else 'ERROR'
end
from my_table;
select
case
when price is null or price <= 0 then 'Not for sale'
when price < 100 then 'Under $100'
else 'Expensive'
end
from my_products;
Case can also be used in 'order by', update, having, etc.
Data Conversion
Change date to number 'yyyymmdd'
declare @date date = '2001-01-01';
SELECT CONVERT(BIGINT, CONVERT(VARCHAR(8), @date, 112));
A reference page for the date and time conversions:
https://www.mssqltips.com/sqlservertip/1145/date-and-time-conversions-using-sql-server/
Data Type
datetime
Use the time, date, datetime2 and datetimeoffset types for new work. These types align with the SQL standard and are more portable. time, datetime2 and datetimeoffset provide more seconds precision. datetimeoffset provides time zone support for globally deployed applications. datetime2 has a larger date range, a larger default fractional precision, and optional user-specified precision; depending on the user-specified precision, it may also use less storage.
date range:
- datetime: 1753-9999
- datetime2: 0001-9999
precision:
- datetime: 3 1/3 ms
- datetime2: 100ns
While datetime2 offers increased precision, some clients don't support date, time, or datetime2 and force you to convert to a string literal. If you're more concerned about compatibility than precision, use datetime.
Be careful when adding a .NET DateTime value as a parameter to a SqlCommand, because it likes to assume it's the old datetime type, and you'll get an error if you try to write a DateTime value that's outside the 1753-9999 year range unless you explicitly specify the type as System.Data.SqlDbType.DateTime2 for the SqlParameter.
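A quick check of the range difference (the dates are arbitrary):

```sql
SELECT CAST('1700-01-01' AS DATETIME2);   -- fine: datetime2 starts at year 0001
-- SELECT CAST('1700-01-01' AS DATETIME); -- error: out-of-range value for datetime
```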
Window Functions
Window functions
- lag(field) over (order by ...): get value of 'field' of the previous row
- lead(field) over (order by ...): get value of 'field' of the next row
- row_number() over (partition ... order by ...): get row number
SELECT
lag(v) OVER (ORDER BY v),
v,
lead(v) OVER (ORDER BY v)
FROM (
VALUES (1), (2), (3), (4)
) t(v)
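Similarly, a sketch of row_number() with a partition (the values are made up); the numbering restarts for each partition:

```sql
SELECT
    grp, v,
    ROW_NUMBER() OVER (PARTITION BY grp ORDER BY v) AS rn
FROM (
    VALUES ('a', 10), ('a', 20), ('b', 30)
) t(grp, v)
-- rn is 1, 2 for group 'a' and 1 for group 'b'
```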
Performance
Usually it's faster to retrieve data from a temp table than from a table variable, and you can create an index on a temp table.
Example: in a trade settlement report, I wrote a big query with a lot of 'with' tables. The one that removes duplicates from trade settlement failures was quite slow to reuse in later queries. I separated it out into a temp table and created an index on it (the index should be created after all the data is loaded), and the program became much faster.
Creating a view usually won't improve performance, but an indexed view can, although there are many restrictions on creating one. And if the data changes frequently, we'd better not use an indexed view, for maintaining the index might cost more than it saves.
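A sketch of the temp-table pattern described above (all table and column names are made up):

```sql
-- materialize the slow de-duplicating subquery once
SELECT DISTINCT tradeId, settleDate
INTO #Settlements
FROM dbo.TradeSettlementFailure;

-- create the index only after the data is loaded
CREATE INDEX IX_Settlements_tradeId ON #Settlements (tradeId);

-- later queries join against the indexed temp table
SELECT s.tradeId, t.amount
FROM #Settlements s
JOIN dbo.Trades t ON t.id = s.tradeId;
```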
get row number for table without id
Usually we'd write 'select row_number() over (order by my_field) from ...'. But if we don't want to pick a field to sort by, so as to keep the original order, a common hack is:
select row_number() over (order by (select 1)) from my_table
That is, just provide some constant expression there... Similarly, the following also work:
select row_number() over (order by (select null)) from my_table;
select row_number() over (order by (select 'a')) from my_table;
An application of this trick: select the bottom n rows:
select top 10 *
from (
select row_number() over (order by (select 1)) as id, * from my_table
) my_table1
order by id desc
Programmability
Create table-valued function
A user-defined table-valued function is usually preferable to stored procedures and views. A big problem with a stored procedure is that if it returns a table, we cannot use SQL to filter the data; a table-valued function, by contrast, can be used in a SQL query as if it were an actual table.
create function dbo.my_function (@param [as] integer)
returns @my_table table (
    col1 varchar(8),
    col2 varchar(8)
)
as
begin
    insert into @my_table values ('a', 'b');
    return;
end;
But there are also restrictions on table-valued functions: you cannot write to the DB. For example, if a temp table is needed, i.e. you need to create a table and insert / update data in it, then a table-valued function cannot be used, and you have to use a stored proc instead. If the temp table can be replaced with a table variable, then it's fine, but sometimes you must use a temp table for performance reasons.
Execute function
If the function is a table-valued function, it should be put in the position of a table:
select * from dbo.GetFxForwardsForBrokerRec ('20180530', 'citi', null)
where ...
order by ...
Default Parameters
In a stored proc, you can have default parameters: when the caller doesn't specify them, the default values are used.
In a user-defined function you can also declare default parameters, but the caller cannot simply omit them; it must pass the keyword DEFAULT to get the default value.
CREATE FUNCTION dbo.testFunc(@n AS INT = 1)
RETURNS INT
AS
BEGIN
RETURN @n;
END;
GO
SELECT dbo.testFunc(2);
SELECT dbo.testFunc(DEFAULT);
Or you can also handle 'null':
CREATE FUNCTION dbo.testFunc(@n AS INT)
RETURNS INT
AS
BEGIN
RETURN ISNULL(@n, 1);
END;
GO
SELECT dbo.testFunc(2);
SELECT dbo.testFunc(NULL);
Put values in variables
CREATE TABLE #IndexedNames (i int, name varchar(MAX));
WITH Names AS (
SELECT DISTINCT name
FROM dbo.Staffs
)
INSERT INTO #IndexedNames
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS i, name
FROM Names;
DECLARE @i AS INT;
DECLARE @n AS INT;
DECLARE @name AS VARCHAR(MAX);
-- assign with 'SELECT'
SELECT @i = MIN(i) - 1, @n = MAX(i)
FROM #IndexedNames;
CREATE TABLE #Errors (
name VARCHAR(MAX) NOT NULL,
oldValue DECIMAL(38, 0),
newValue DECIMAL(38, 0)
);
DECLARE @oldValue AS DECIMAL(38, 0);
DECLARE @newValue AS DECIMAL(38, 0);
WHILE @i < @n
BEGIN
-- assign with 'SET'. or: SELECT @i = @i + 1;
SET @i = @i + 1;
SELECT @name = name
FROM #IndexedNames
WHERE i = @i;
--PRINT STR(@i) + ', ' + @name;
SELECT @newValue = salary
FROM dbo.StaffSalary
WHERE name = @name;
-- the way to assign value to variable returned from a stored proc
EXEC @oldValue = dbo.GetSalary @name = @name;
IF @oldValue <> @newValue
BEGIN
PRINT 'ERROR: name = [' + @name + '], oldValue=' + STR(@oldValue) + ', newValue=' + STR(@newValue);
INSERT INTO #Errors (name, oldValue, newValue) VALUES (@name, @oldValue, @newValue);
END
END
PRINT 'Done!';
DROP TABLE #IndexedNames;
SQL Loop
In SQL Server, loop support is very weak: there is only 'WHILE'. And there is no array type, so usually we declare a table with a single column and put the items of the array into it as rows. A cursor is then typically used to loop through the values.
Example:
USE MyDB;
DECLARE @date AS DATETIME;
DECLARE @dates AS TABLE(d DATETIME);
INSERT INTO @dates VALUES
('20181120'),
('20181228'),
('20181231'),
('20190103');
DECLARE dateCursor CURSOR FOR
SELECT d FROM @dates;
OPEN dateCursor;
WHILE 0=0
BEGIN
FETCH NEXT FROM dateCursor INTO @date;
IF @@FETCH_STATUS <> 0 BREAK;
-- use @date to do something
PRINT 'Date: ' + CONVERT(VARCHAR, @date, 23);
END
CLOSE dateCursor;
DEALLOCATE dateCursor;
Batch Template
USE MyDB;
BEGIN TRY
BEGIN TRANSACTION;
-- do something here
-- when error, do following
-- THROW 50000, '<error message here>', 0;
COMMIT;
PRINT '';
PRINT 'Done.';
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
ROLLBACK;
PRINT ERROR_MESSAGE();
END CATCH