SQL Server

  

These are my personal notes that I use as a quick help in my work.
You are welcome to read them.


Introduction

 

master             system area
model              template for new databases
msdb               used by the SQL Server Agent
northwind, pubs    example databases
tempdb             temporary storage

 

Query Analyzer is the equivalent of SQL*Plus

The Enterprise Manager can view all servers

 

 Create a new db:

Create a new user: create a "Login". For properties, right-click the user, then Properties > User Mapping. Select the database. Then, for that database, grant the "public" role and, if needed, "db_owner".

Backup a database:

Note that when a database is deleted, its files are automatically removed from the file system, but the database must be removed manually from any database maintenance plans.

When the log file has grown so much that the disk is full, the log file must be truncated.

Transaction logs keep filling up until they are backed up. But warning: you cannot back up a transaction log that is completely full, because some space is needed to record the backup. In that case, truncate the log first, then take a full database backup.
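A sketch of that sequence in T-SQL (database and file names are placeholders; TRUNCATE_ONLY was removed in SQL Server 2008):

```sql
-- Discard the log records that cannot be backed up (breaks the log backup chain!)
BACKUP LOG theDB WITH TRUNCATE_ONLY
-- Shrink the physical log file (logical file name, target size in MB)
DBCC SHRINKFILE (theDB_log, 100)
-- Then take a full database backup to restart the log chain
BACKUP DATABASE theDB TO DISK = 'D:\backup\theDB_full.bak'
```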

 

SQL Backup

Types of backup

Suggestion: Full once a day, transaction backup several times an hour, differential a few times a day.
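The suggestion above could be scripted as follows (hypothetical database name and paths):

```sql
-- Full, once a day
BACKUP DATABASE theDB TO DISK = 'D:\backup\theDB_full.bak' WITH INIT
-- Differential (changes since the last full), a few times a day
BACKUP DATABASE theDB TO DISK = 'D:\backup\theDB_diff.bak' WITH DIFFERENTIAL
-- Transaction log, several times an hour (needs FULL or BULK_LOGGED recovery mode)
BACKUP LOG theDB TO DISK = 'D:\backup\theDB_log.trn'
```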

Recovery modes:

Implement with:

Restoring:

See also:

 

NT/W2000 services:

SQL Server 2005 services

SQL Server 2005 architecture:

Web applications use http. A web browser (html) connects to the web server. A web service needs a web service client and uses SOAP/XML over http. SQL Server itself can act as a web server, but only for SOAP requests.

The "Surface Area Configuration" is the tool to enable/disable the various features of SQL Server. In particular:

Architecture

The basic unit of data storage is the page, 8KB in size. Each page contains a 96-byte header with system information, including the page number, page type, and free space. Page types include data, index entries, text/image (large objects), and allocation/free-space pages. Row offsets are stored at the end of the page. A row cannot span pages, but parts of a row can be stored in other pages (row overflow).

Pages are grouped into extents of 8 contiguous pages (64KB). Extents can be associated with one object or can be mixed.

 

Security

SQL injection:

See, among others, http://www.unixwiz.net/techtips/sql-injection.html and http://www.owasp.org/index.php/OWASP_Guide_Project

Best Practices

Auditing

Auditing:

XML data:

Advantages:

Disadvantages


Environment

GO
delineates batches of T-SQL statements, limits scope of local user-defined variables. One syntax error and the whole batch fails.
 
SET NOCOUNT ON/OFF
Controls the "n rows affected" message (ON = do not show it)
EXEC
execute ...

Use SQL files (like Oracle)
to run, open in SQL Query Analyzer

add comments: ctrl+shift+C
remove cmts:  ctrl+shift+R
upper case / lower case: ctrl+shift+U or L
bookmark: add with ^F2, remove with ^shift+F2, navigate with F2 or shift+F2

SP_WHO2: see current processes

If needed, change the configuration parameters (set the value, then apply the value):
sp_configure 'option_name', value
GO
RECONFIGURE | RECONFIGURE WITH OVERRIDE
GO

Enterprise manager: Define the connection username and password : edit sql registration properties.

use database
Define the default database for the queries. Or, in query analyzer, use the drop-down list.

 

SET ANSI_NULLS ON:  NULL = something --> NULL (unknown, so never true)
SET ANSI_NULLS OFF: NULL = NULL      --> true
                    NULL = something --> false
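A quick demo of the behavior (the setting only affects comparisons with = and <>; IS NULL always works):

```sql
SET ANSI_NULLS OFF
SELECT CASE WHEN NULL = NULL THEN 'true' ELSE 'false' END  -- 'true'
SET ANSI_NULLS ON
SELECT CASE WHEN NULL = NULL THEN 'true' ELSE 'false' END  -- 'false' (result is unknown)
SELECT CASE WHEN NULL IS NULL THEN 'true' ELSE 'false' END -- 'true', regardless of the setting
```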

 

 

Oracle                              SQL Server
----------------------------------  ----------------------------------
schema and tablespace               database
data file                           data file
tablespace                          filegroup
segment                             heap/index
extent                              extent
DB blocks                           pages
system tablespace + control file    master database
rollback segments and redo logs     transaction log
temp tablespace                     tempdb database
data dictionary                     system catalog (resides partly in the system database and partly in the individual databases)
alert log file                      error logs (6 by default; plain text, can be viewed with notepad)
startup normal                      start the database service
startup restrict                    alter database ... set restricted_user
alter database open read only       alter database ... set read_only
alter tablespace off/online         alter database ... offline/online

Instead of multiplexing the control and redo logs, use striped and mirrored devices (RAID 0+1) for the transaction logs.

 

Installation

Note that multiple installations are possible. But the following must be managed: installation directory, service name for the SQLServerAgent, main registry hive, default TCP/IP port, named pipe addresses and performance counters. And some things can only exist once on a server, in particular the full-text search service.

To check the rights that I have with my SQL Server user account, use XP_CMDSHELL:
xp_cmdshell 'dir \\theServer\theShare'

Installation options:

The database files:

Maintenance

Database maintenance: exec sp_helpdb [[@dbname = ] 'database_name'].
Use sqlmaint.exe to run administrative commands.

Memory management is done automatically. But can be configured at server level and at database level. SQL Server tries to use the maximum amount of memory possible without swapping. See SQL Server's Performance Monitor utility for memory management.

Process management: at startup, executables create a process, which in turn spawns threads. Threads have the advantage of being able to share memory space. In some cases, fibers can be used; fibers are lightweight threads.

Space management: Increase file size by script, manually in Enterprise Manager or by setting an automatic increase in size. Shrinking is possible. But auto-shrink should be used with caution. Removing a database file is possible, but first remove all objects.

System stored procedures:

sp_helpsort
sort order defined for the database
sp_who
Who is running (also sp_who2 ?).  do "select @@spid" then do "exec sp_who @@spid"
exec sp_lock
For information on locks, or see Enterprise manager: Management > current Activity > view by object or process id
sysprocesses
information on processes
sysprocesses.lastwaittype
information on waits
sp_help table name
Description of the table
sp_helptext view or stored procedure
Text of the view
sp_depends
Objects used
exec sp_createstats
Create statistics
UPDATE STATISTICS table_name | view_name [index_name | statistics_name]
Update statistics
exec sp_helpindex [@objname = ] 'table_name'
 
sp_change_users_login @Action='Report';
Detect orphaned users: database users whose corresponding login no longer exists (or no longer matches) at the server level.
sp_change_users_login @Action='update_one', @UserNamePattern='<database_user>', @LoginName='<login_name>';
Relinks the server login account <login_name> with an orphaned database user <database_user>.
USE master
GO
sp_password @old=NULL, @new='password', @loginame='<login_name>'
change password (need "alter any login" permission).

 

SQL Server 2005:
By default, the server is closed to remote connections. To open, use start menu > "configuration tools" > "SQL Server Surface Area Configuration"
Open the database engine to both named pipes and TCP/IP connections. Then restart (only) the database engine service.

 

Default file structure:

Location: \Program Files\Microsoft SQL Server\Mssql$instance1

Sub-directories at this location:

Backup
Default location for backup file
Binn
executables
Data
System and sample database files
Ftdata
Full-text catalog files
Install
Scripts and output
Jobs
temporary job output
Log
Error logs
Repldata
Working directory for replication tasks
Upgrade
Upgrade from SQL Server 6.5 to SQL Server 2000

 


SQL Server Infrastructure

 

 

Performance tool in windows

64 bit allows

Number of CPUs is determined by

So, dual-core and hyperthreading: like four CPUs

PowerPath: allows a "fail-over" within one node of a cluster from one I/O path to another, basically for SAN storage. It assumes two or more network cards (and possibly two or more connection addresses to the target). When a connection fails, PowerPath routes the I/O through another network card or another connection to the target.

Note that there are no active-active SQL Server clusters. MS clusters assume active-passive only.

EMC storage: disks are grouped into LUNs (logical units). The host maps the LUNs as disks.

Network traffic consumers:

Non-paged pool: memory that cannot be swapped out. Used by the kernel.

One instance: the number of connections is more likely to be the limiting factor. Several instances on one machine: the sum of the maximum memory settings of the instances must be less than the available memory.

On a 32-bit CPU, to make 5 GB of memory available to the first instance, you must set the /3GB and /PAE switches in the boot.ini file and change the SQL Server instance configuration to allow AWE.

Data movement options:

Replication:

Log shipping:

Standby databases

extents are 64K, with 8 pages of size 8K
character data cannot span pages, therefore the maximum length is 8K. SQL Server 2005
allows going beyond this limit by implementing another structure (row-overflow storage).
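In SQL Server 2005 the large-value types use that other structure transparently (hypothetical table):

```sql
-- varchar(max) / nvarchar(max) / varbinary(max) hold up to 2 GB;
-- values move off-page automatically when the row exceeds the 8K page
CREATE TABLE dbo.big_text_demo
    ( id   int IDENTITY(1,1) PRIMARY KEY
    , body varchar(max) NOT NULL )
```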

 

Upgrade to SQL Server 2005

Compatibility level is 9.0 for SQL Server 2005.

Following are some tips for upgrading to SQL Server 2005.

Tips

Upgrade SQL Server database engine (see also http://msdn2.microsoft.com/en-us/library/ms144267.aspx):

Checklist for (any) upgrade:

Some comments resulting from an upgrade of the i... server:

Resources:

 


Syntax

set rowcount 100
-- limits the number of rows processed by subsequent statements (set rowcount 0 removes the limit)

SELECT TOP n [PERCENT] [WITH TIES] columnA ..... --> show first n rows, rows tied with last row are included if option is specified
ORDER BY columnB;

Aggregate functions: AVG, COUNT(expression) counts non-NULL values, COUNT(*) counts selected rows (the only aggregate that returns 0 when there are no rows; not affected by NULLs), MAX, MIN, SUM, STDEV (sample standard deviation), STDEVP (population standard deviation), VAR (sample variance), VARP (population variance)

SELECT ...
GROUP BY .. ORDER BY ..
COMPUTE { AVG | COUNT | MAX | MIN | SUM } (expr) [BY expr]

SELECT
GROUP BY ALL -->
show all groups, even those whose rows are excluded by the WHERE clause (their aggregates are NULL);
useless if a HAVING clause is specified on that column

SELECT ... col, GROUPING(col_name),
  GROUP BY ...
  ROLLUP  
  --> Shows additional lines with summaries. Null values in result or a "1" in GROUPING column indicate the level of summaries. GROUP BY ALL option not possible.

SELECT ... col, GROUPING(col_name),
  GROUP BY ...
  CUBE   
   --> Shows additional lines with summaries of column values (2**n additional lines, where n is the number of columns). Null values in result or a "1" in GROUPING column indicate summarized data. GROUP BY ALL option not possible.

SELECT ...
  ORDER BY ...
  COMPUTE SUM(col)
--> SUM at end of result set

COMPUTE { AVG, COUNT, MAX, MIN, SUM , (and others) } (one-of-the-columns) [BY column-list]
Note that the column list must start with the same columns as in the ORDER BY list, and be in the same sequence
Example: compute sum(sum(expr)), where sum(expr) is in the select clause.

SELECT ...
  ORDER BY sort_col
  COMPUTE SUM(colA) BY sort_col
--> SUM for each value of sort_col (same order of columns as specified in ORDER BY)

SELECT columns
  INTO new_table

SELECT
  FROM table1 join-type JOIN table2 ON cond [INNER] JOIN, LEFT|RIGHT|FULL [OUTER] JOIN, CROSS JOIN
(cartesian prod)

Subqueries:

 

select ... from ...
UNION [ALL]
select ... from ...;

select ... from ...
EXCEPT --
equivalent to MINUS
select ... from ...;

select ... from ...
INTERSECT
select ... from ...;

 

SELECT IDENTITYCOL   --> the table's identity column
     , ROWGUIDCOL    --> the table's uniqueidentifier column marked ROWGUIDCOL
  FROM ...
  WHERE expr { = | <> | > | >= | < | <= } expr
        str_expr [NOT] LIKE str_expr ESCAPE 'escape_char'
               % 0 or more chars
               _ a single char
               [] a single char in a range
               [^] single char not in range
        .. [NOT] BETWEEN .. AND .. (including end values)
        .. IS [NOT] NULL
        CONTAINS ( column or * , 'search condition')
        FREETEXT ( column or * , 'free text string')
        .. [NOT] IN (subquery)
        .. [NOT] IN (expr, expr, ...)    -- treated the same way as OR
        expr { = | <> | > | >= | < | <= } {ALL | SOME | ANY} (subquery)
        EXISTS (subquery)
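A few of these predicates in use (hypothetical table and columns):

```sql
SELECT *
  FROM dbo.members
 WHERE LastName LIKE 'Mc%'                              -- starts with "Mc"
   AND Code LIKE '[A-C]_%'                              -- first char in A-C, at least two chars
   AND HireDate BETWEEN '2004-01-01' AND '2004-12-31'   -- end values included
   AND Region IN ('EU', 'US')                           -- same as Region = 'EU' OR Region = 'US'
   AND ManagerID IS NOT NULL
```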

 

Checksum:

CHECKSUM ( * | expression [ ,...n ] ) --> int
BINARY_CHECKSUM ( * | expression [ ,...n ] )

Noncomparable data types are text, ntext, image, and cursor, as well as sql_variant with any of the above types as its base type. NULLs have a representation. Checksum and binary_checksum differ for string values where locale influences how strings are compared.

CHECKSUM_AGG ( [ ALL | DISTINCT ] expression )
ALL is the default, DISTINCT returns checksum of unique values. If one of the values in the expression list changes, the checksum of the list also usually changes. However, there is a small chance that the checksum will not change. Examples:
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM Products
SELECT CHECKSUM_AGG(CAST(UnitsInStock AS int)) FROM Products

Insertions

INSERT [INTO] tablename [(columns)] VALUES (...)
INSERT        tablename SELECT ... FROM ...
SELECT ... INTO temp_table FROM ...
   --> insert into new tables
#temporary local table, valid for session
##temporary global table, dropped when the creating session ends and the last statement referencing it completes
The default mode is autocommit
Implicit mode (SET IMPLICIT_TRANSACTIONS ON, or set via an API): a transaction starts automatically at the next statement and must be committed or rolled back explicitly

Pick up the key of a row that was just inserted:
insert ....
select @theLogID = cast(IDENT_CURRENT('LoadingLog') AS INT)
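Note that SCOPE_IDENTITY() is usually safer than IDENT_CURRENT, because the latter sees inserts from other sessions. A sketch (hypothetical column):

```sql
DECLARE @theLogID int
INSERT INTO dbo.LoadingLog (Message) VALUES ('load started')
SET @theLogID = SCOPE_IDENTITY()  -- identity generated by this statement, in this scope
-- IDENT_CURRENT('LoadingLog') returns the last identity for the table from ANY session,
-- so it can pick up a row inserted by somebody else in the meantime
```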

 

Deletions

DELETE [FROM] tableA [FROM table_for_joining INNER JOIN tableA ON ... ] ...
TRUNCATE TABLE tablename
  -- minimally logged (only the page deallocations are logged)

Updates

UPDATE table_to_update
  SET col = { expr | DEFAULT | NULL }
    , @var= expr
    , col = tableOther.col
  [ FROM table_to_update INNER JOIN tableOther ON table_to_update.key = tableOther.key ]
  WHERE cond
Rows are updated only once.
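A worked example of the UPDATE ... FROM join form (hypothetical tables):

```sql
UPDATE p
   SET p.UnitPrice = s.UnitPrice
  FROM dbo.Products p
       INNER JOIN dbo.ProductStaging s ON p.ProductID = s.ProductID
 WHERE s.UnitPrice IS NOT NULL
```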

 

 


 

Error Management

TRY...CATCH uses error functions to capture error information.

RAISERROR (N'text....',
           10, -- Severity. 1 to 19, int
           1); -- state, int
RAISERROR in the TRY block with severity 11 to 19 transfers control to the CATCH block.

RAISERROR ('Member ''%s'', ''%s'' does not exist', 16, 1, @FirstName, @LastName)

Rank

select ... , rank() over ([partition by ...] order by ... desc)
Other ranking functions: row_number(), dense_rank(), ntile(n), where n is the number of groups.

Recursive Queries Using Common Table Expressions

When using CTE, define it with two table definitions joined by a "UNION ALL":
The first table is the anchor. No reference to the CTE. It corresponds to the start of the hierarchy.
UNION ALL
select the children of previous. Navigate using a reference to the start, using the name of the CTE.

WITH cte_name ( col1, col2, col_level )
AS
(select x, y, the_level
from a              -- the anchor; do not mention cte_name here
where y is null     -- assuming that this defines the starting points
UNION ALL
select a.x, a.y, cte_name.col_level + 1
from a join cte_name on a.y = cte_name.col1)  -- child's parent = key of the previous level
--
Use the CTE immediately after the definition above:
select * from cte_name;

 

Checksum

CHECKSUM('a string')
Checksum of one field
CHECKSUM ( * | expression [ ,...n ] ) --> int  
Noncomparable data types are text, ntext, image, and cursor, as well as sql_variant with any of the above types as its base type. NULLs have a representation 
BINARY_CHECKSUM ( * | expression [ ,...n ] )  
checksum and binary_checksum differ for string values where locale influences how strings are compared. 
CHECKSUM_AGG ( [ ALL | DISTINCT ] expression ) 
ALL is the default, DISTINCT returns checksum of unique values. If one of the values in the expression list changes, the checksum of the list also usually changes. However, there is a small chance that the checksum will not change.  

Examples:
select primarykey,CHECKSUM_AGG(BINARY_CHECKSUM(col1,col2,...,colN)) from tablename group by primarykey
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM Products
SELECT CHECKSUM_AGG(CAST(UnitsInStock AS int)) FROM Products

 


Transactions

 

 

Transactions

BEGIN TRANSACTION -- this is the explicit mode
IF @@ERROR <> 0
  BEGIN
    RAISERROR ('....', 16, -1)
    ROLLBACK TRANSACTION
    RETURN -- do not fall through to the COMMIT
  END
COMMIT TRANSACTION

BEGIN TRY
  BEGIN TRANSACTION
  ....
  COMMIT
END TRY
BEGIN CATCH
  If @@TranCount > 0 ROLLBACK
  raiserror (..)
END CATCH

Isolation Levels

READ UNCOMMITTED: dirty reads are possible (no shared locks are taken)

READ COMMITTED: the default; only committed data is read

REPEATABLE READ: rows that have been read stay locked until the end of the transaction

SNAPSHOT: readers see the data as of the start of the transaction (row versioning, SQL Server 2005)

SERIALIZABLE: range locks; the result is as if the transactions had run one after the other

Concurrency control

Set isolation level with:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

Note that the isolation level is set for the whole session and continues for the rest of the session. If the isolation level is set within a stored procedure, it reverts to the previous value when the procedure exits.

See isolation level with: DBCC USEROPTIONS

Note that hints may modify the behavior for a single query.
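For example, a table hint applies READ UNCOMMITTED to a single table in a single query, whatever the session level is (hypothetical table):

```sql
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK)  -- same as WITH (READUNCOMMITTED)
-- similarly, WITH (HOLDLOCK) behaves like SERIALIZABLE for that table only
```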

Locks

 

Nested transactions:

Inner transaction commits only decrement the nesting level; nothing is made permanent until the outer transaction commits. If the outer transaction is committed, the inner transactions are committed. If the outer transaction is rolled back, the inner transactions are rolled back too, even if they were individually committed. The nesting level is given by @@TRANCOUNT; @@TRANCOUNT = 0 means not in a transaction.
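A sketch of the nesting behavior:

```sql
BEGIN TRANSACTION      -- @@TRANCOUNT = 1
  BEGIN TRANSACTION    -- @@TRANCOUNT = 2
  COMMIT               -- only decrements @@TRANCOUNT to 1; nothing is permanent yet
ROLLBACK               -- rolls everything back; @@TRANCOUNT = 0
-- a ROLLBACK always goes back to the outermost BEGIN TRANSACTION;
-- use SAVE TRANSACTION save_point for partial rollbacks
```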

 


Comments, etc

-- comment
/* ...multiple line */

names: alphabetic letters, _, @, #, numerals
@local_variable
#temporary_table
##global_object
[Bracketed identifiers]
"Quoted identifiers" (only if QUOTED_IDENTIFIER is on)

 


VARIABLES

DECLARE @a_var       type
      , @another_var type
SET @a_var = ...
SELECT @a_var = column_name FROM .....;
SELECT @a_var AS display_name
  --> displays the value

SET DATEFORMAT dmy --> sets how datetime input strings are interpreted (here day-month-year); it does not change the display format.

system variables:

@@spid
Process ID
@@version
Version
@@servername
Server name
@@rowcount
For example:
update ...
if @@rowcount=0 print 'no rows updated'

DECLARE @DBID INT;
SET @DBID = DB_ID();
DECLARE @DBNAME NVARCHAR(128);
SET @DBNAME = DB_NAME();

 

See data types

 


operators

- modulo %
- concat +

Precedence:
()
* / %
- +
+ (concat)
NOT
AND
OR
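Quick checks of the operators above:

```sql
SELECT 17 % 5        -- 2 (modulo)
     , 'ab' + 'cd'   -- 'abcd' (concatenation)
     , 2 + 3 * 4     -- 14 (* binds before +)
```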

 


types

 

Non-unicode uses code pages, but an appropriate code page does not always exist for a given character. So, better to use unicode.

If a database is unicode, but the ODBC is not (version 2.6 and earlier), then there is an implicit conversion to code pages. This leads to a possible corruption of data if the appropriate code page does not exist.

Use CONVERT() and CAST() to convert between unicode and non-unicode
Use UNICODE() and NCHAR() instead of ASCII and CHAR
Prefix Unicode character string constants with the letter N: N'a string'

 


Functions

Conversion functions, logical functions

CONVERT (target_type, expression, style)
See below for more details.
Examples:
CONVERT(VARCHAR(30), getdate(), n) with n = 102, 111, 113
Note: if no style is necessary, use CAST(expr AS datatype)
isnull(a.eclipseEndDate, getdate())
if value is null, then use current date (nvl in Oracle)
user_name()
current user name
app_name()
application

Arithmetic Functions

round(n, m[, t])
Round the number n. If m > 0 then this indicates positions to the right of the decimal point. If m < 0 then the number is rounded to the number of positions to the left of the decimal point. If t = 0 then the number is rounded. If t <> 0 then the number is truncated.

String functions

len(string)
Length of string
substring(string, start, len), left(string, 4), right(string, 4)
String functions
CHARINDEX ( 'a char sequence' , column or text containing the sequence [ , start_location ] )
Returns location of first occurrence
nchar(0x0F12)
Returns unicode characters (argument is decimal or hexadecimal numbers)
unicode(N'g')
Returns unicode for a character
ASCII ( character_expression ) CHAR ( integer_expression ) NCHAR ( integer_expression )
conversion char <--> integer
CHARINDEX ( expression1 , expression2 [ , start_location ] )
Returns the starting position of expression1 within expression2
DIFFERENCE ( character_expression , character_expression )
 
LEFT LEN LOWER UPPER LTRIM RTRIM RIGHT
 
PATINDEX ( '%pattern%' , expression )
 
QUOTENAME
 
REPLACE ('string to search', 'string to find', 'replace with this string')
Replace one substring with another
REPLACE(REPLACE('line1[CR-LF]line2' , char(13),' '),char(10),' '):removes end-of-line characters
REPLICATE ('string expr', n)
Repeat the string several times
REVERSE
 
SOUNDEX
 
SPACE (n)
Add n spaces. For other characters, see function replicate
STR (float expr, length, decimals after period)
Convert a float to string
STUFF (source string expr, deletion/insertion point , number of characters to delete, string to insert)
For the deletion/insertion point: 0 returns null, 1 is before the first char, 2 inserts just after the first char
For the number of characters to delete: put 0 to only insert and not delete
To pad with zeros: stuff(the_str, 1, 0, replicate('0', the_total_length - len(the_str)))
SUBSTRING
 
UNICODE
 
 

CASE expression WHEN expression THEN result
                WHEN expression THEN result
                ELSE                 result
END

Case WHEN a_field is not null and rtrim(a_field) <> ''
  THEN
    Case When cond
    THEN value
    ELSE value
    End
  Else 'UNKNOWN'
  END

 

Manipulating dates

GetDate()
Get current date
Datepart(d, GetDate())
Extract day of month from current date
convert(date, GetDate())
Today, just date part, without time
DATEADD(date part, n, date)
Date part: Year (yy, yyyy), quarter (qq, q), Month (mm, m), dayofyear (dy, y), Day (dd, d), Week (wk, ww), Hour (hh), minute (mi, n), second (ss, s), millisecond (ms)
 
DATEADD(d, -2, GetDate())
Two days ago, same time
DATEADD(d, -2, convert(date, GetDate()))
Two days ago, at 00 hours
DATEADD(hour, -1, GetDate())
One hour ago
DATEDIFF(date part, date1, date2)
Date part: see list above in DATEADD
 
SELECT DATEDIFF(s, '1970-01-01 00:00:00', GETUTCDATE())
Current epoch time

CONVERT ( data_type [ ( length ) ] , expression [ , style ] )
example: select CONVERT ( varchar , getdate() , 105 ) --> string mmm dd yyyy
example: select CONVERT ( datetime , '2005-08-04 00:00:00' , 120 ) --> a date
Note select CONVERT ( varchar , '2005-08-64 00:00:00' , 120 ) --> string, but watch out, because it is string to string with no validation

Style Input/Output
100 mon dd yyyy hh:miAM
101 mm/dd/yyyy
102 yyyy.mm.dd
103 dd/mm/yyyy
104 dd.mm.yyyy
105 dd-mm-yyyy
106 dd mon yyyy
107 Mon dd, yyyy
108 hh:mm:ss
109 mon dd yyyy hh:mi:ss:mmmAM
110 mm-dd-yyyy
111 yyyy/mm/dd
112 yyyymmdd
113 dd mon yyyy hh:mm:ss:mmm
114 hh:mi:ss:mmm
120 yyyy-mm-dd hh:mi:ss
121 yyyy-mm-dd hh:mi:ss.mmm
126 yyyy-mm-ddThh:mi:ss.mmm (ISO8601, XML)
130 dd mon yyyy hh:mi:ss:mmmAM
131 dd/mm/yyyy hh:mi:ss:mmmAM
Take off 100 for display without the century

DATEPART ( datepart , date ) with datepart = (without quotes) yy, yyyy or qq, q or mm, m or dy, y or dd, d or wk, ww or dw or hh or mi, n or ss, s or ms

DAY, MONTH, and YEAR functions are synonyms for DATEPART(dd, date), DATEPART(mm, date), and DATEPART(yy, date),

remove seconds: CAST(getdate() AS smalldatetime)

see help page "CAST and CONVERT" for details

 


dynamic query

use db
declare @dbname varchar(30), @table_name varchar(30)
set @dbname = '...'
set @table_name = '...'
execute (' use ' + @dbname + ' select * from ' + @table_name)
GO

 


DDL

 

if (exists(select * from information_schema.tables
           where table_schema = 'dbo' and table_name = 'the_table'))
     drop table dbo.the_table;

CREATE TABLE dbo.the_table
    (theID int identity (2, 1) -- start at 2, increment by 1
    , is_red int check (is_red in (0,1))
    , TestDate datetime Default Getdate()
    , CONSTRAINT pk_the_table PRIMARY KEY CLUSTERED (theID)
    );

CREATE TABLE dbo.detail_table
    (
    theID int CONSTRAINT fk_the_table FOREIGN KEY REFERENCES dbo.the_table (theID)
    );

Create table from another: SELECT ... INTO table_name (equivalent to "CREATE TABLE AS SELECT" in Oracle)

Alter table:

Indexes

A covering index is an index that has all the columns retrieved for a query. In this case, the data pages are not accessed. It is faster because there is more data in a page, and therefore there are fewer pages to retrieve. Remember that non-clustered indexes have the indexed field and the clustered field. Use covering indexes as a sort of materialized view.

Non-clustered index: contains pointers to where the data actually is.

A clustered index helps when retrieval is done on a range, when the output is sorted, or when the results are aggregated. Cluster the index on the field that is retrieved by ranges. Warning: clustered indexes need more maintenance on inserts (look at the fill factor). Note that non-clustered indexes on a clustered table use the full value of the clustering field as the row pointer, so cluster on a small field! Non-clustered indexes on a non-clustered table (heap) store row pointers with the indexed field.

cluster on the sort column: improvement shows. Not enough to offset the drawbacks.

Index with included columns (create nonclustered index ... on ... include (columns, ..);) allows an index to "cover" the needs for a query.
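A sketch (hypothetical table; INCLUDE requires SQL Server 2005):

```sql
CREATE NONCLUSTERED INDEX ix_orders_customer
    ON dbo.Orders (CustomerID)
    INCLUDE (OrderDate, TotalAmount)
-- The query below is covered: it is answered from the index pages alone
-- SELECT OrderDate, TotalAmount FROM dbo.Orders WHERE CustomerID = 42
```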

Notion of density. For MS, high density is where you have many rows with few distinct values in the index. Highest density: for 10 adjacent rows, all have the same value. Conversely, low density is where you have many distinct values in the index. Lowest density: for 10 adjacent rows, all have distinct values.

In heap, when a line is modified and does not fit, then it is inserted elsewhere and the old location has a "forwarding address". This is a forwarding pointer.

list the indexes:

print 'Indexes'
SELECT object_name(i.object_id) AS TableName,
      i.name AS IndexName,
      c.name AS ColumnName,
      CASE ic.is_descending_key
      WHEN 1 THEN 'DESC'
      ELSE 'ASC'
      END AS ColumnSort
      ,i.type_desc -- also use i.type
      , case i.is_unique when 1 then 'UK' else ' ' end is_unique
      , case i.is_primary_key when 1 then 'PK' else ' ' end is_primary_key
      , case i.is_unique_constraint when 1 then 'U constr' else ' ' end is_unique_constraint
  FROM sys.indexes i
    JOIN sys.index_columns ic ON i.object_id = ic.object_id
      AND i.index_id = ic.index_id
    JOIN sys.columns c ON ic.object_id = c.object_id
      AND ic.column_id = c.column_id
      and object_name(i.object_id) not like 'sys%'
  ORDER BY TableName, IndexName, ic.key_ordinal

SQL Server indexes:
data page size: 8KB, in groups of 8 pages called extents

Index-organized table: use the "CLUSTERED" keyword on the primary key constraint.

Temporary table:
CREATE TABLE #table_name ...
CREATE TABLE ##table_name ...

SQL Server also has a type of column "computed column". The value is not physically stored.
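For example (hypothetical table; PERSISTED requires SQL Server 2005):

```sql
ALTER TABLE dbo.OrderLines
    ADD LineTotal AS (Quantity * UnitPrice)            -- computed, not stored
ALTER TABLE dbo.OrderLines
    ADD LineTotalP AS (Quantity * UnitPrice) PERSISTED -- computed and stored physically
```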

Indexes:
CREATE [UNIQUE] [NONCLUSTERED] INDEX index_name ON table_name (column_name, ... )
FILLFACTOR is equivalent to Oracle PCTFREE (FILLFACTOR 90 = PCTFREE 10)

Rebuild:
DBCC INDEXDEFRAG (database name, table name, index_name) : can be done on-line, without stopping queries.
or
DBCC DBREINDEX ( [dbname.owner.]table, { index_name | '' }, fillfactor) : rebuild indexes, but with locks. Fillfactor = 80 for example.
or
CREATE INDEX ... WITH (DROP_EXISTING = ON)

Drop index: DROP INDEX table_name.index_name (SQL Server 2005: DROP INDEX index_name ON table_name)

Triggers

Triggers exist in SQL Server too. Syntax is slightly different. See doc.
exec sp_helptrigger [@tabname = ] 'table_name'
exec sp_helptext [@objname = ] 'trigger_name'

Constraints

 

Views:

select tab.TABLE_CATALOG, tab.TABLE_SCHEMA, tab.TABLE_NAME, tab.TABLE_TYPE
   from INFORMATION_SCHEMA.TABLES tab
-- only objects for which user has permission are visible
   order by tab.table_type;

select a1 from (
  select 'print '''+ table_name +':''' as a1, table_name as a2, 1 as a3
    from information_schema.tables where table_name like 'WEBH%'
  union all
  select 'select '''+ table_name +''', count(*) from ' + table_name + ';' as a1, table_name as a2, 2 as a3
    from information_schema.tables where table_name like 'WEBH%'
  union all
  select 'select top 10 * from ' + table_name + ';' as a1, table_name as a2, 3 as a3
    from information_schema.tables where table_name like 'WEBH%'
) a
order by a2, a3;

SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE
JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS
ON INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE.CONSTRAINT_NAME = INFORMATION_SCHEMA.TABLE_CONSTRAINTS.CONSTRAINT_NAME
WHERE INFORMATION_SCHEMA.TABLE_CONSTRAINTS.TABLE_NAME = 'the table' AND CONSTRAINT_TYPE = 'PRIMARY KEY'

SELECT fkey.Table_name, fkey.Column_name
FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE rkey JOIN
Information_schema.REFERENTIAL_CONSTRAINTS Ref on rkey.CONSTRAINT_NAME = Ref.Unique_Constraint_Name
JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE fkey ON Ref.CONSTRAINT_NAME = fkey.CONSTRAINT_NAME
WHERE rkey.Table_Name = 'Users'

 

-- Create a hash index.
ALTER TABLE xyz ADD cs_Pname AS checksum(ProductName) -- create the column as a function
CREATE INDEX Pname_index ON Products (cs_Pname) -- add the index

 

Create a trigger:
CREATE TRIGGER [ schema_name.]trigger_name ON { table | view }
{ FOR | AFTER | INSTEAD OF }
{ [ INSERT ] [ , ] [ UPDATE ] [ , ] [ DELETE ] }
[ NOT FOR REPLICATION ]
AS sql_statement [ ; ]
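A minimal example, assuming a hypothetical audit table with matching columns:

```sql
CREATE TRIGGER trg_the_table_upd ON dbo.the_table
AFTER UPDATE
AS
  INSERT INTO dbo.the_table_audit (theID, changed_on)
  SELECT theID, getdate()
    FROM inserted    -- "inserted" holds the new version of the updated rows
```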

 

Enable or disable a constraint:
ALTER Table theTable   CHECK CONSTRAINT ALL -- enable (existing rows are not re-validated; use WITH CHECK CHECK CONSTRAINT ALL for that)
ALTER Table theTable NOCHECK CONSTRAINT ALL -- disable

 

Data abstraction

Data abstraction consists of using views and stored procedures so as to create a layer around the tables.

Generalization

NEWID() generates a new uniqueidentifier.
An int field can have the identity property on it, with a seed value and an increment.
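Both in one table definition (hypothetical names):

```sql
CREATE TABLE dbo.id_demo
    ( theID   int IDENTITY(100, 5)                -- seed 100, increment 5
    , theGuid uniqueidentifier DEFAULT NEWID()    -- a new random GUID per row
    , theName varchar(50) )
-- SQL Server 2005 adds NEWSEQUENTIALID() (DEFAULT constraints only), whose
-- ordered GUIDs behave better as clustered keys
```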

 

 


Objects

EXEC sp_helptext 'name'
this describes the object.

 


VIEWS

CREATE VIEW ... AS
SELECT ...
GO

Cannot use COMPUTE, COMPUTE BY, ORDER BY (except with TOP), INTO

View source of view: exec sp_helptext view_name

 


Stored procedures

Triggers can be linked to tables (DML triggers) or, in SQL Server 2005, to DDL events. A trigger is part of the transaction that fired it.

CREATE PROCEDURE procname AS
...

CREATE FUNCTION owner.function_name
  ( [ @param type [=default] [, ...] ] )
RETURNS type
[WITH function_option]
AS BEGIN
...
RETURN ...
END

Latches are simpler than locks. They are held for much shorter periods.

 

if condition
  begin
    ..
  end
  else
  begin
    ..
  end

 

Cursor:

CREATE PROCEDURE proc_name AS
Declare @Temp_Total numeric
Declare @Total_Premium numeric
select @temp_total = 0
select @total_premium = 0

Declare c1 cursor scroll for
select ....

open c1
fetch next from c1 into @Temp_Total
while @@fetch_status = 0
   begin
    Set @Total_Premium = @Temp_Total + @Total_Premium
    fetch next from c1 into @Temp_Total;
   end
close c1
deallocate c1
GO

 

Example:

use Data_mart
go
set nocount on
go
drop table dbo.a_table
CREATE TABLE dbo.a_table (
    a varchar (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    b bigint NOT NULL CONSTRAINT bb DEFAULT (0)
    ) ON [PRIMARY]
GO

drop function dbo.a_function
go
CREATE FUNCTION dbo.a_function (@one_char nchar(1))
RETURNS nvarchar(50)
AS BEGIN
declare @the_result nvarchar(50)
    if @one_char is null set @the_result = ''
    else if @one_char = ' ' set @the_result = 'space'
    else if 0x0000 <= unicode(@one_char) and unicode(@one_char) <= 0x007F set @the_result = 'Latin'
    else set @the_result = 'unknown (code=' + cast(unicode(@one_char) as nvarchar(20)) + ')'
return @the_result
END
go

begin
    set @the_result = dbo.a_function(@one_char)
    if @the_result <> 'Latin' and @the_result <> ''
        set @the_result = N'The character "' + isnull(@one_char,'?') + N'" ' + cast(unicode(@one_char) as nvarchar(10)) + N' is in block ' + @the_result
    else
        set @the_result = N''
    return @the_result
end

execute a procedure:
       exec dbo.character_a_function_counts_store @table_name , @column_name, @country_code, @one_block

execute a dynamic SQL statement:
set @the_sql = 'declare @one_string nvarchar(300), @country_code varchar(2)' + char(13) + char(10) +
     '' + char(13) + char(10) +
     'Declare c1 cursor scroll for ' + char(13) + char(10) +

print @the_sql
exec (@the_sql)
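An alternative to exec() worth noting: sp_executesql runs dynamic SQL with real parameters, which avoids quoting problems and lets the server reuse the plan. A sketch, reusing the dbo.a_table example table from above:

```sql
DECLARE @the_sql nvarchar(500), @cnt int
SET @the_sql = N'SELECT @cnt_out = count(*) FROM dbo.a_table WHERE a = @val'
EXEC sp_executesql @the_sql,
     N'@val varchar(50), @cnt_out int OUTPUT',  -- parameter declarations
     @val = 'x', @cnt_out = @cnt OUTPUT         -- parameter values
PRINT @cnt
```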

More samples

if(@diagramname is null)
begin
  RAISERROR ('Invalid value', 16, 1);
  return -1
end
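Since SQL Server 2005, TRY/CATCH is an alternative to testing @@ERROR after every statement. A sketch:

```sql
BEGIN TRY
    BEGIN TRANSACTION
    -- ... statements ...
    COMMIT TRANSACTION
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
    SELECT ERROR_NUMBER() AS err_no, ERROR_MESSAGE() AS err_msg
    RAISERROR ('Operation failed', 16, 1)
END CATCH
```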

EXECUTE AS CALLER
select @theId = DATABASE_PRINCIPAL_ID() -- UNDONE: more work
select @IsDbo = IS_MEMBER(N'db_owner')
if(@owner_id is null)
select @owner_id = @theId
REVERT

 

Characters:

DECLARE @CRLF char(2)
SET @CRLF = CHAR(13) + CHAR(10)

SET @Task = 'INSERT #TEMP_TABLE FROM ' + CHAR(39) + ... + CHAR(39) + '...' -- CHAR(39) is the single quote
EXECUTE (@Task)
SET @Return = @@ERROR

set nocount on
begin
declare @i int, @one_line nchar(128), @line_len int
set @line_len = 64 -- This is the length of the line to be displayed
set @one_line = N' '
set @i=0
while @i <= 65536 -- this is the maximum character to display
begin
    if (@i >= 32) -- do not display the control characters
      set @one_line = @one_line + nchar(@i)
    else
      set @one_line = @one_line + N' '

    if ((@i+1) % @line_len) = 0
    begin
        print N'line ' + right('    (last is ' + cast(@i as varchar(10)), 15) + '):' + isnull(@one_line, 'null')
        set @one_line = N' '
    end
    set @i = @i + 1
end
print N'line ' + right(' (last is ' + cast(@i-1 as varchar(10)), 15) + '): ' + isnull(@one_line, 'null')
end
go

 


Tuning

 

 

MS-SQL Server methodology:
schema design --> query optimisation --> indexing --> locking --> server tuning

The first step is to establish a baseline for comparison (include numbers for day-time and batch-time). Use the performance monitor for one instance, or third-party software for the whole enterprise.

Tools:

Locks: see the Activity Monitor (requires the "view server state" permission): in Object Explorer, server > Management > SQL Server Logs > Activity Monitor

Identify locking and blocking:

SQL Server profiler

Manual baseline:

-- Flush cache
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

DECLARE @Unit SmallINT
SET @Unit = 1  -- Edit to the current unit
DECLARE @StartTime DateTime

SET @StartTime = getdate()

--execute...

INSERT PerfTest (Unit, Code, Duration )
	VALUES (@Unit, 'procedure name', GetDate() - @StartTime)
SELECT Code, [1],[2],[3],[4],[5], [6]
  FROM
  (Select Unit, Code,
    (DatePart(n, Duration) * 60000) +
    (DatePart(s,Duration) * 1000) +
    DatePart(ms, Duration) as ms
  from PerfTest) p
PIVOT
  (
  avg(ms)
  FOR Unit IN ([1],[2],[3],[4],[5], [6])
  ) AS pvt
ORDER BY Code

Show the execution time:
declare @startTime datetime
SET @StartTime = getdate()
exec the_stored_procedure
select 'Execution in milliseconds: ' , datediff(ms, @StartTime, getdate())

Statistics

Statistics degrade over time. SQL Server detects when they have degraded beyond a threshold (e.g. 10% of rows changed) and automatically re-runs the analysis. Sometimes SQL Server decides, just before running a query, that the statistics need updating! Note: run UPDATE STATISTICS before establishing a baseline.

Statistics are done only on data.
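Commands for refreshing statistics by hand before a baseline (dbo.a_table is reused from the example above; the statistics name is hypothetical):

```sql
UPDATE STATISTICS dbo.a_table WITH FULLSCAN        -- one table, full scan
EXEC sp_updatestats                                -- all tables in the current database
DBCC SHOW_STATISTICS ('dbo.a_table', 'stat_name')  -- inspect one statistics object
```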

 


OLAP

An intersection is a data point
A slice is a "qualified data reference"

In ANSI92 SQL:
select ...
from ...
group by ...
with cube   -- or: with rollup (one or the other, not both)
This returns additional aggregation rows in the result.
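A sketch with a hypothetical dbo.sales table; the extra rows carry NULL in the rolled-up columns:

```sql
SELECT country, city, SUM(amount) AS total
FROM dbo.sales
GROUP BY country, city
WITH ROLLUP
-- Adds one subtotal row per country (city is NULL) and one grand-total row.
-- WITH CUBE would instead add rows for every combination of the grouped columns.
```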

MSSQLserverOLAPservice is a service (see services)

MOLAP: the files cannot be moved, but you can back up and restore on another server.
The cube can be read by other tools, but if another tool modifies it, MS SQL OLAP can no
longer read it.
ROLAP: use when the data is fairly static
HOLAP: not really useful

 

Properties for dimensions are located both in the dimension editor and in the cube editor
Properties for the measures are only in the cube editor.
Member properties also show in the upper right-hand corner.
Private dimensions have to be edited in the cube editor,
shared dimensions have to be edited outside the cube editor.

Note that member key column may be different than the member name column.

Snowflake with levels:
======================

Use wizard.
The key for the leaf level is probably the ID for the table.
In snowflake design, start with lowest or highest table, not a middle table.

Parent-child dimensions:
========================

They are changing, i.e. if there is a change, the dimension does not have to be reprocessed.
But this means that the queries are slower because they are not pre-calculated.
Always create with the wizard: member key is the lower level, parent is the upper level.
Add the member name (generally the name corresponding to the ID).
Property:
Members with data: leaf members only / non-leaf members hidden (bizarre results) / non-leaf members visible
Data member caption template: how the non-leaf members are displayed. Empty: same as other members.
member property "Level Naming Template": when entering with the "...", click in box next to "*"
member property "skipped_levels": allows the level naming from template to correspond to levels.


Time dimension
==============

use wizard. Choose "time dimension".
The levels in a time dimension are identified by the "level type" property of the levels.

Text time: add the same column three times. There is a hierarchy based on the same column.
Modify the member key first (left(,n) or right(,m) or substring(,n,m)), then the member name.


Hierarchy of dimensions
=======================

Click the "create hierarchy of dimensions" at the end of the wizard.
dim_name.xyz
for time dimensions, one is the default time dimension, the other is standard.
But make it "time" when creating and then change to standard in the advanced dimension properties.


Grouping
========

Add the same field, and put for example left(..., 2) in both the key and the name
If needed, set "hide member if" to "no name". This hides the blank members.

formulas / functions
====================

convert ID to string for display and concatenation
rtrim(convert(char, "dbo"."Product"."Product_ID")) + ' - ' + "dbo"."Product"."Product_Name"
rtrim because converting to fixed-length char pads with trailing spaces.

Some examples of measures
source column whatever, aggregate function: max --> returns the maximum value

Calculated members:
margin: [Measures].[Warehouse Sales]-[Measures].[Warehouse Cost]
Difference from one year to the next: [Time].[Year].&[1998]-[Time].[Year].&[1997]
Total of USA and Canada: Sum({[Store].[Store Country].&[USA], [Store].[Store Country].&[Canada]})
SUM(set). The brackets {} return a set of two members.
Average: [Measures].[Store Sales]/[Measures].[Unit Sales]
Average of all members: Avg([Customer].[Customer].Members) (parent member is the "all")

Time from date: DatePart('yyyy',a date)
DatePart('q',a date)
DatePart('m',a date)
Format (a date, 'mmmm') --> full month

Integer to string: format$()


member properties
=================

member properties of members show in the upper right-hand corner when browsing.
Note that these virtual dimensions are not stored in the cube (no storage space) but have
to be extracted each time, so use them for ad-hoc queries.
Create a member property by dragging a field in the "member properties" below the member name.

Virtual dimensions
==================

First create the member property, or click "Display member keys and names" in the wizard
Wizard is best for creating virtual dimensions.
A virtual dimension can be transformed into a non-virtual dimension (I guess that this means
that the aggregation values get stored in the database)
A virtual dimension always has a value for the "depends on dimension"
Another way is to create a virtual dimension:
- define the "Depends on Dimension" D
- set the value of the dimension to the field which is the member property in dimension D
- set "virtual" property to true


Types of dimensions
- star schema
- snowflake
- parent-child
- virtual dimension
- (mining model)

Dimension Properties
Top level:
All level. If yes, one last aggregation is shown. The caption is defined in property "All Caption".

Type: more of an indication than a real way to change properties.

Default member: often empty. Put the member that is displayed by default.

Depends on dimension: "(none)". Used for virtual dimensions.

Changing: Generally false.
True for example for parent-child dimensions

Write-enabled: false generally

Member keys unique: if true, the processing can be faster
Member names unique: if true, the processing can be faster
Allow duplicate names:

Source table filter:

Storage mode: MOLAP, ROLAP or hybrid

Enable real-time updates:

Virtual: if true, indicates a virtual dimension (see above).
The aggregates are not stored in the cube.

All member formula.

Members with data: see parent-child
Data Member Caption template.

Double quotes around table and field names, single quotes around the strings.
"customer"."lname" + ', ' + "customer"."fname" + ' ' + "customer"."mi"

Count measure: add the key/ID to the measures, and set the
aggregate function to count or distinct count.


One table has several fields which constitute a hierarchy:
start with the highest (or leaf/lowest) level, not the middle level.
Create the dimension, then drag and drop the other fields in.


Unary operator:
===============

if true, a field is defined for the rollup.
The field, which is stored in the database, should contain +, / or ~.
Use this in a child-parent dimension. The unary operators property is on the level.

Storage
=======

1) Type of storage: MOLAP, ROLAP or hybrid.

2) Aggregation options:
The tool sets the aggregation options (the aggregations themselves are calculated when
the cube is processed). Processing builds aggregates until all of them have been built,
or until the maximum size is reached (storage option), or until a given percentage of
performance gain is reached (start with 10% and increase if querying the cube is too slow,
i.e. there is not enough aggregation; generally 50% is enough), or until you click Stop
(watch for the point where the performance gain levels off in the graph).

3) Save the options for processing later or process now.

Updating
========

Updating dimension data.
If new data is available for the dimension, i.e. new rows in the dimension source table,
I can add it incrementally. The advantage is that the cubes do not have to be fully
reprocessed. Just after an incremental process of the dimension, the new data shows
in the cube in the dimensions, but with no facts. In this case, process the cube with the
refresh option.
Before re-processing, the ROLAP cube will show the updated detail at the lowest level, but
not any change in aggregations. The aggregations do not correspond to the details, because
the details are retrieved directly from the relational tables.

An incremental processing of the cube (not dimensions) using the "normal" source data
will in effect double all measures. If this is done, refresh the cube so as to get
the correct data again.

The big question: refresh or incrementally update
Refresh: remove and redo everything
Incremental: add to existing data

One option, if I can control what is being added to the dimension.
Suppose that I incrementally update the dimension and I know that I am adding element X
So I incrementally update the dimension D then I incrementally update the cube with a filter
on D='X' so as to only add the new data and I do not have to refresh the whole cube.
However, I have to know which members are new to the dimension.

It is smart to always keep the original RDBMS fact tables, even if it is on a tape.

Optimize
========

Define the member key as the foreign key in the fact table instead of the
primary key in the reference table. Do this in the cube editor: menu tools > optimize
This can only be done with shared dimensions. Private dimensions do not work this way.


Partitions
==========


Each partition has separate storage and storage characteristics.
N.B. Options exist to define both slice and partition filters. Confusing mix: be careful.
Drill-through is also defined differently for each partition.
Processing is done partition by partition.
View the slices used in the metadata. The edit forces a re-design of the partition.
The filter info has to be viewed with the editor. Just click cancel at the end.

In merging partitions, be careful in the cases when the source data is from different tables.

Partitions can be dropped easily
Partitions cannot be renamed. I created a new partition and dropped the original one.

Drill-through is most useful with partitions; without partitions it matters less,
because actions can achieve the equivalent of drill-through.


Calculated members
==================

In cube editor
Note that adding a calculated member does not need a re-processing of the cube.
Compare one year to another in time dimension: create a calculated member in
the time dimension as (2001)-(2000). This member will calculate measures for 2001 minus values for 2000.
Examples:
Avg([Product].[Category].Members)
Sum([Product].[Subcategory].Members)/Count([Product].[Category].Members) (equivalent to average)
Sum({[Store].[Store Country].&[USA],[Store].[Store Country].&[Canada]})

Virtual Cubes
==============

Cubes in the virtual cube must have at least one dimension in common.
Rename measure names with F2

Pivot tables in Excel
====================

Menu data > pivot table and chart
Option "external data source"
Get data: tab "OLAP cubes". New data source. Provider "...OLAP Services 8.0"
Choose the analysis server option and enter the server name (localhost). UserID blank.
Create local data cube with "client-server" options in the pivot table drop-down menu.

Actions
=======

For example: HTML "<html><head>...." + dimension_name.currentmember.properties + ".href=""literal with two double quotes""....." + dimension_name.currentmember.name + "...</html>"
To a certain extent, actions are similar to drill through (when there are no partitions)

Write-back
==========

A separate table stores the delta. The original data is not modified. The client displays
the values from the cube plus the deltas from the write-back table.
There is an option to merge write-back data into a new partition: old partition plus write-back deltas.

 


Miscellaneous

SQL Server does the following:
parse statement, then resolve, optimize (determine the execution plan), compile and execute.
After compilation, kept in procedure cache, so subsequent execution skips all up to the compilation.

Add query plans to the source control (query plans from test then from production server) so as to have a baseline for comparing back to.

The Database Engine Tuning Advisor replaces the old Index Tuning Wizard

DDL (create, alter, drop)
DCL (data control language, grant, deny, revoke)
DML (data manipulation language: select, insert, update, delete)

 

Query Analyzer

CSV formatted output in Query Analyzer: menu > tools > options > tab "results". Choose "results to text" in the "default output format". Then choose "Comma Delimited" in the "Results Output Format". Make sure that the number in "Maximum Characters per Column" is larger than the longest string. In the SQL query, manually put double quotes around long text, or text that could contain commas or carriage returns, with:
"'" + table.text_col + "'" -- That is double quote, single quote, double quote

Output to CSV

 

Search Service

SQL Server's solution for full-text indexing. It requires:
A single-column primary key (or unique key) on each table.
Full-text indexes, which keep track of the significant words in the text columns. They reside on the file system.
Full-text catalogs: collections of full-text indexes. They also reside on the file system; one or more catalogs per database.

See documentation about stored procedures and properties.

WHERE CONTAINS (column_name, 'expression')
WHERE FREETEXT (column_name, 'free text')
FROM CONTAINSTABLE (table, column_name, 'expression') -- table-valued: returns [KEY] and RANK
FROM FREETEXTTABLE (table, column_name, 'free text')  -- table-valued: returns [KEY] and RANK
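CONTAINSTABLE and FREETEXTTABLE are table-valued: they return a [KEY] column (the table's unique key) and a RANK, and are joined back in the FROM clause. A sketch with a hypothetical dbo.docs table that has a full-text index on its body column:

```sql
SELECT d.title, k.[RANK]
FROM CONTAINSTABLE(dbo.docs, body, '"backup" NEAR "restore"') AS k
JOIN dbo.docs AS d ON d.doc_id = k.[KEY]
ORDER BY k.[RANK] DESC
```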

OSQL

osql -? --> syntax
Example: osql -S "SERVER" -E -i "script_file.sql"
osql -S "SCDW002" -E -i "glu.sql"

osql -E -Q "the SQL" -o "C:\Program Files\Microsoft SQL Server\File.fmt" -s "" -w 8000

Use osql.exe like SQL*Plus

osql -S "MIA-SQL\SQLINST1" -E -Q "create database Accounts
ON(FILENAME='E:\Labfiles\Starter\Preparation\Accounts.mdf')
LOG ON (FILENAME='E:\Labfiles\Starter\Preparation\Accounts_Log.ldf')
FOR ATTACH"

osql -S "MIA-SQL\SQLINST1" -E -Q "EXECUTE sp_detach_db 'Accounts'"

sqlcmd -E -S MIA-SQL\SQLINST1 -i "UT Create Orders.sql"

Notification Service

The subscriptions define who gets notified.

 

Traps

See SQL Notes/some traps

 

XML

DECLARE @XMLDepartments XML;
SET @XMLDepartments =
'<Departments>
  <Department Name="Accounting">
    <GroupName>Executive General and Administration</GroupName>
  </Department>
  <Department Name="International Sales">
    <GroupName>Sales and Marketing</GroupName>
  </Department>
  <Department Name="Software Development">
    <GroupName>Research and Development</GroupName>
  </Department>
</Departments>';

SELECT Doc.Dep.value('@Name', 'NVARCHAR(50)') AS DepartmentName,
  Doc.Dep.value('GroupName[1]','NVARCHAR(50)') AS GroupName
FROM @XMLDepartments.nodes('//Departments/Department') AS Doc(Dep);


--DepartmentName                                     GroupName
---------------------------------------------------- --------------------------------------------------
--Accounting                                         Executive General and Administration
--International Sales                                Sales and Marketing
--Software Development                               Research and Development
--
--(3 row(s) affected)
--


DECLARE @docHandle INT;

EXEC sp_xml_preparedocument @docHandle OUTPUT, @XMLDepartments;

SELECT * FROM OPENXML(@docHandle, '//Departments/Department')
WITH (DepartmentName   NVARCHAR(50) '@Name',
      GroupName        NVARCHAR(50) 'GroupName');

EXEC sp_xml_removedocument @docHandle; -- release the parsed document's memory


--DepartmentName                                     GroupName
---------------------------------------------------- --------------------------------------------------
--Accounting                                         Executive General and Administration
--International Sales                                Sales and Marketing
--Software Development                               Research and Development
--
--(3 row(s) affected)
--