
SQL Update if Record Exists or Insert

Quite often we need to insert a record only if it is not already present, for example to manage a relationship. There are a few different ways of doing it, like "IF EXISTS" or "INSERT ... ON DUPLICATE KEY" (MySQL), but the best I found is from this MSDN blog entry: http://blogs.msdn.com/b/miah/archive/2008/02/17/sql-if-exists-update-else-insert.aspx

The approach is simple: try to update the record first; if the record is present it gets updated, and if not, a new one is inserted based on the "rows affected" count.
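
A minimal sketch of the pattern, using a hypothetical dbo.ProductStock table and parameters @ProductID and @Quantity (not from the original post):

-- try the update first
UPDATE dbo.ProductStock
SET    Quantity = @Quantity
WHERE  ProductID = @ProductID;

-- @@ROWCOUNT is the "rows affected" count of the UPDATE above;
-- zero means no such record exists yet, so insert it instead
IF @@ROWCOUNT = 0
BEGIN
    INSERT INTO dbo.ProductStock (ProductID, Quantity)
    VALUES (@ProductID, @Quantity);
END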


Advanced SQL: Finding Hierarchical Data Using a Recursive CTE

Often we have a table that stores hierarchical data; for example, any shopping cart will have a product category table that stores the parent category within the same table. We use such information all the time. The typical structure of the table is

ID, Name, ParentID

where ParentID refers to an ID within the same table, and for top-level rows it is either NULL or zero. In such tables we often want to find the children of children so that we can list them in a tree view, i.e.

L0
  L1
    L2
  L1-1
    L2-1
 ....

Now, to produce this kind of result you have two options: 1. write the complete logic in your application code, or 2. make SQL do it for you. For those who prefer the second method, here is the sample query I used:


-- @Company is assumed to be a table variable (or table) holding CompanyID, Name, ParentCompanyID
DECLARE @ParentCompanyID INT = 9;
WITH RecComp
AS
(
    -- anchor member: start from the direct children of @ParentCompanyID
    SELECT  crt.CompanyID,
            crt.Name,
            crt.ParentCompanyID,
            1 AS Lvl,
            N'/' + CONVERT(NVARCHAR(4000),crt.CompanyID) + N'/' AS CompanyNode_AsChar
    FROM    @Company crt
    WHERE   crt.ParentCompanyID = @ParentCompanyID
    UNION ALL
    -- recursive member: join each company found so far to its children
    SELECT  cld.CompanyID,
            cld.Name,
            cld.ParentCompanyID,
            prt.Lvl + 1,
            prt.CompanyNode_AsChar + CONVERT(NVARCHAR(4000), cld.CompanyID) + N'/'
    FROM    RecComp prt -- parent
    INNER JOIN @Company cld ON prt.CompanyID = cld.ParentCompanyID
)
SELECT  *,
        CONVERT(HIERARCHYID, CompanyNode_AsChar) AS CompanyNode
FROM    RecComp
ORDER BY CompanyNode;

This query makes use of the CTE feature of SQL Server [I personally tested it on SQL Server 2008 R2, 2012 and SQL Azure] and HIERARCHYID to show the result as desired. More can be read in my thread on StackOverflow here.



Reading Large Binary Files: Child's Play

Yes, that is a funny title, but once you have experienced it you will agree with me. So, here is the story. I have been working on software that reads recordings from a hardware device into a database; we get 45 records per second for 30 days, so it is about 30x24x60x60 record entries with 45 columns each. We had to make a desktop application, so we chose .NET for it. The first version of the software was released by my company 3 years ago, and as for the reading, well, we were inexperienced at the time in managing that much data, and what we got was about 2 hours to read all of it into the database. Oh, I forgot to mention that those 30 days of entries were from one hardware device and we have 3-4 devices :), so it took us 2 hours to read, say, all 4 devices. That is not acceptable. So, we decided to rewrite the complete software to make use of some parallelism, as that was the only way my team thought it was going to work.

I started the rewrite with the only hope of reducing the 2-hour job to 30-45 minutes, but this time we made one change from last time: instead of using ASCII text files or an SQLite database, we opted to use SQL Server to store our data. The reason: the first pre-release version of the software used text files, and we never got that part working for more than 15 days of data; it always ran out of memory for one reason or another. Then we started using SQLite, which is 5 times lighter on the hardware, and that sped up reading and information access; however, one part still used text files. So, in order to avoid two sources, we opted for the database only, and since the client already had SQL Server on a separate machine, we thought it was good to have a separate machine storing the database, for the long term and for the obvious LAN benefits. Since the client already had SQL Server and we were using .NET, I decided to go with SQL Server only.

So, we started reading the binary file and issuing an insert query for each record [just to establish the lowest speed], and it went through in 12-13 minutes. Wow, we had already reduced a 30-40 minute job to 12 minutes, just by using a database full time. The next challenge was to speed it up with known bulk-import methods. A couple of the ones I tried are:

1. Using a DataSet and the data adapter's Update feature,

2. Using a long insert query to send 100 or 1000 records in one "Execute" (see the sketch after this list),

3. Using the SqlBulkCopy feature in .NET. This is the obvious choice for speed, but in a few cases it failed for me, so I had to look at the first two options as well.
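
For option 2, one way to build such a batched insert is a multi-row VALUES list (SQL Server 2008 and later allows up to 1000 rows per list); the dbo.Readings table and its columns below are made up for illustration:

INSERT INTO dbo.Readings (DeviceID, ReadingTime, Value1, Value2)
VALUES (1, '2012-01-01 00:00:00', 10.5, 20.1),
       (1, '2012-01-01 00:00:01', 10.7, 19.8),
       (1, '2012-01-01 00:00:02', 10.6, 20.0); -- ...and so on, batching 100-1000 rows per statement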

So, in the end we went with SqlBulkCopy as our tool, but it does not simply import the data into the database on its own. We had to prepare the ground for it: we used SqlBulkCopy to send data read from CSV files, so we create a CSV file from the binary read, then import that file into a staging SQL table, and then transfer the data from the staging table to the main table. All of this is done in 2-3 minutes flat. Yes, 40 minutes of work is done in 3 minutes. Period.
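
The last step of that pipeline, moving rows from the staging table into the main table, is plain T-SQL. A rough sketch, assuming hypothetical dbo.Staging_Readings and dbo.Readings tables with matching columns:

-- copy the bulk-loaded rows into the main table in one set-based statement
INSERT INTO dbo.Readings (DeviceID, ReadingTime, Value1, Value2)
SELECT DeviceID, ReadingTime, Value1, Value2
FROM   dbo.Staging_Readings;

-- clear the staging table so it is ready for the next file
TRUNCATE TABLE dbo.Staging_Readings;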

The trick is that we reduced the number of operations needed to write to disk. We still create a CSV, but previously we were creating an XML file, and we had multiple file-write procedures going; we removed all of that. In fact, to achieve that speed we even stopped writing a log file to trace errors and use SQL Server to record those errors for us. Replacing the text log file with an SQL error log by itself sped things up from 5 minutes to 3 minutes. The logs are not that long, but disk writes are very slow compared to database inserts.
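
The SQL-based error logging is nothing fancy; a minimal sketch with made-up table and column names:

CREATE TABLE dbo.ImportErrorLog
(
    ErrorID  INT IDENTITY(1,1) PRIMARY KEY,
    LoggedAt DATETIME       NOT NULL DEFAULT GETDATE(),
    Source   NVARCHAR(200)  NULL,          -- which file or device the error came from
    Message  NVARCHAR(4000) NOT NULL
);

-- the reader issues this instead of appending to a text log file
INSERT INTO dbo.ImportErrorLog (Source, Message)
VALUES (@Source, @Message);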

All said, I will hardly ever use plain text files in the future for long data reads; it will be some sort of database for sure.


Database Optimization

Well, most programs use a database; eventually every web application has one. But how many of those web applications really work well? Quite few. I have been in web programming for more than 7 years now, and during these years I have seen a lot of developers building websites using PHP, ASP.NET, ASP or Perl with SQL Server, MySQL, Oracle or other databases, and I always had a sense of making better applications than them. But today, when I look back, I realise that I was bad at writing code and building applications in my initial years. Not that I didn't want to write good code, but my approach was not right. Some of the important things I learned over the years from my experience are listed below. Before that, though, I want to say that not just me but all of us have read them in books; most of us just haven't used them, thinking we don't need them because things work well enough otherwise. So the major points are:

1. Fast Database

2. Object-oriented code

3. Serialized flow of the application.

In this post I will cover only the fast database. By a fast database I do not mean picking the fastest database on the planet through benchmark testing, nor do I mean using heavy hardware to make the database run faster. These two things can make your database appear faster, but not really fast. I often asked a question in forums: I have a few million records to parse, can the DB engine handle it? I always got the same answer: if a database cannot handle a few million records, why would we need it at all?

That is a very true statement, but are you facing a problem where you have only half a million records and your site starts taking 20-30 seconds to access the database? If so, welcome to my past world. Well, don't worry: you don't need new hardware, you need a new approach to programming your software and, most importantly, a better database structure.

Starting with the most commonly used and, I guess, most commonly forgotten thing: indexes. Yes, this is the most common way to make your site fast. Okay, so you have already defined a primary key in your table and it automatically creates an index, but that is still not good enough. You need to index the columns that you use in your searches and/or in joins. Indexing one column does help, but an index on a combination of 2 columns is much better for queries that search on those two fields. For example, if you are searching by employee last name and city in an Employee table, having separate indexes on LastName and City produces results slower than having a single index on LastName and City together.
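
A quick sketch of that composite index (the Employee table and column names are just for illustration):

-- one index covering both search columns instead of two separate single-column indexes
CREATE INDEX IX_Employee_LastName_City ON dbo.Employee (LastName, City);

-- a search on both fields can then be answered from this one index
SELECT EmployeeID, LastName, City
FROM   dbo.Employee
WHERE  LastName = @LastName
  AND  City = @City;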

Now moving to the next part. I have an SQL query where the client code and user code are in separate columns, and I have a text box that allows the user to enter ClientCode-UserCode, which I then have to search in the database. I got a little lazy, and what I did was take the textbox value as-is and do something like

where ClientCode + '-' + UserCode = @MyClientCode

Now, this looks fine and worked for me, but it is not a good way to write it. In a WHERE clause, never combine columns like that. Search each column separately, i.e.

where ClientCode = @ClientCode and UserCode = @UserCode

The difference makes your search work 10-15 times faster if you have a lot of records.

The third thing that comes to my mind is normalization of the database, and then creating views, with indexes on the views, to join them if needed. I often see that the database is not normalized and a lot of text fields are used to store combinations of values. For example, if you want a product to be in different categories, make a Product_Category table that stores ProductID and CategoryID, instead of storing comma-separated category IDs in the product table. This seems awkward to change at first, but the advantages are real. You can search products by category much faster, as only integer values are compared instead of pattern matching against comma-separated values, and you can index the values in the relation table but not in a text field. This mistake is very common among developers in the subcontinent; if I am not wrong, 70% of the sites I see from other developers have it.
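
A minimal sketch of the relation table and the category lookup; the dbo.Product table and column names are illustrative:

CREATE TABLE dbo.Product_Category
(
    ProductID  INT NOT NULL,
    CategoryID INT NOT NULL,
    PRIMARY KEY (ProductID, CategoryID)
);

CREATE INDEX IX_Product_Category_CategoryID ON dbo.Product_Category (CategoryID);

-- find all products in a category with a simple indexed join,
-- instead of pattern-matching a comma-separated list in a text column
SELECT p.ProductID, p.Name
FROM   dbo.Product p
INNER JOIN dbo.Product_Category pc ON pc.ProductID = p.ProductID
WHERE  pc.CategoryID = @CategoryID;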

These are some database design errors I have seen, and some I made myself at the start of my career, but I changed my approach and today I handle databases with 10-20 million entries easily. In fact, my site takes less than 1 second to search with a three-table join; I remember the old version of that join took 15 seconds. That was a great optimization.

Anyway, making software is your work; making it great is your choice.


Case Sensitive Search in SQL Server

MySQL mostly provides a case-sensitive search because of its default collation configuration. SQL Server, however, uses a different collation and hence is not case sensitive by default. It is good not to have case-sensitive search by default, as apart from password comparison you rarely need it. Do you? Well, I never really needed it, anyway. Those who wonder how they can get a case-sensitive search should look at this good article:

http://blog.sqlauthority.com/2007/04/30/case-sensitive-sql-query-search/

It is simply a case of changing the collation of the column for which you want case sensitivity, by adding COLLATE Latin1_General_CS_AS to the field declaration.

You can check the existing collation of each field, and change it for a particular column, using:

EXEC sp_help [tablename];
ALTER TABLE [tablename] ALTER COLUMN Password VARCHAR(32) COLLATE Latin1_General_CS_AS;
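
If you only need case sensitivity in one query, you can also apply the collation inline instead of altering the column; a quick sketch with a hypothetical Users table:

SELECT UserID
FROM   dbo.Users
WHERE  Password COLLATE Latin1_General_CS_AS = @Password;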

However, why are you worrying about case sensitivity for a password field? Aren't you already hashing your passwords with MD5 or something else for security? :)


SQL Optimization

Today, I experienced a slowdown in one of the websites I had just finished building. It was an interesting project with the ability to create dynamic forms and hence the related dynamic database tables. The problem was in the part where we preload user information and later show it on our dynamic forms when the user code matches in the database. During development I was using a list of about 30-40K records and it worked just fine, but within a couple of days of the project going live we found that our user database was about 800K records, and now the search took as long as 30 seconds to return results.

I created a few more indexes on the tables concerned, as the SQL was joining 3 tables: the first table has 800K records, the other tables have about 200 records each. After the indexes, the time was shortened to 25s, which is still far too long for me. So, I thought about rewriting the query in a different manner, but then I realised that for user-code matching we were using 2 fields from the main table, and I was concatenating those two fields into one and passing the combined value to the query. I then tested it without the concatenation, and magic happened: the same joins now take less than a second to fetch the records.

So the moral of the story is: concatenation at runtime is not recommended, especially in a WHERE clause.


Reseed Auto Increment Value in SQL Server

You can reseed the auto-increment field to the desired number by using the following SQL statement [I think it is specific to SQL Server; other databases have their own methods].

DBCC CHECKIDENT (<TableName>, RESEED, 0)
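
For example, to restart the identity on a hypothetical dbo.Orders table so that the next inserted row gets 1 (with the default increment of 1):

DBCC CHECKIDENT ('dbo.Orders', RESEED, 0);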

Works great for me on SQL Server 2008.