
Batch Insert Script for Large Tables: Optimize Large SQL Server Inserts by Using Batches

The scenario here is to insert millions of records into SQL Server tables in an optimized way.

Issue
Sometimes you may have to perform a large insert, such as bringing millions of rows from an application table into a data warehouse table.

Challenge
If we perform this as a single insert, it may take hours or even days to complete, and it can cause locking/blocking and fill up the transaction log.
  
To avoid large log growth and minimize locks on the tables, we can take a better-optimized approach and insert the data in batches.

Implementation:
This can be done in many ways. Below is a sample script that moves data in batches from an application table to its corresponding DW table.
The insert is limited to 100,000 records per batch: once 100,000 rows are inserted, the script picks up the next 100,000 and inserts those, and so on. It also checks whether the data already exists in the destination table.

Here the source and destination tables are on different database servers, so a linked server is used to connect to the source database.
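For reference, a linked server such as [AppDB] could be set up roughly as follows. This is a minimal sketch: the provider, data source name, and security mapping shown here are placeholders, and you should adjust them to your own environment.

-- Sketch: create a linked server named [AppDB] pointing at the source instance.
-- The data source name below is hypothetical.
EXEC sp_addlinkedserver
    @server     = N'AppDB',
    @srvproduct = N'',
    @provider   = N'MSOLEDBSQL',
    @datasrc    = N'APPSERVER\SQLINSTANCE';

-- Map logins (sketch: connect using the current login's own credentials).
EXEC sp_addlinkedsrvlogin
    @rmtsrvname = N'AppDB',
    @useself    = N'TRUE';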

Sample script:

DECLARE @Count int;
SET @Count = 1;

WHILE @Count > 0
BEGIN
    INSERT [Audit] ( -- destination DW table
        AuditID,
        ModifiedDate,
        ModifiedBy,
        CreatedDate,
        CreatedBy,
        DWIsDeleted,
        DWStartDate,
        DWEndDate
    )
    SELECT TOP (100000)
        AuditID,
        ModifiedDate,
        ModifiedBy,
        CreatedDate,
        CreatedBy,
        0             AS DWIsDeleted,
        [CreatedDate] AS [DWStartDate],
        NULL          AS DWEndDate
    FROM OPENQUERY(
        [AppDB], -- linked server to access the application DB
        'SELECT
             AuditID,
             ModifiedDate,
             ModifiedBy,
             CreatedDate,
             CreatedBy
         FROM [stgdb].[dbo].[Audit] AS x WITH (NOLOCK)' -- source table
    ) AS x
    WHERE NOT EXISTS ( -- skip rows that already exist in the destination
        SELECT 1
        FROM [Audit] AS dw WITH (NOLOCK)
        WHERE dw.DWEndDate IS NULL
          AND dw.[AuditID] = x.[AuditID]
    );

    SET @Count = @@ROWCOUNT; -- loop ends when a batch inserts zero rows
END
GO
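One detail worth noting: inserting in batches only keeps the log small if log space can be reused between batches. Under the SIMPLE recovery model, committed log space is freed at checkpoints; under the FULL recovery model, regular log backups are needed during the load.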


Conclusion:
This is one way of doing a batchwise insert for large tables. There are other ways to accomplish the same thing, and which one to choose depends on the scenario; one alternative is sketched below.
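For example, if the source table has an ever-increasing key such as AuditID, a common alternative is to page by key range instead of re-checking existence on every pass. This is a minimal sketch, assuming AuditID is an increasing integer key; the four-part linked server name is illustrative.

-- Sketch: key-range batching, assuming AuditID is an ever-increasing key.
DECLARE @LastID int = 0;
DECLARE @Rows   int = 1;

WHILE @Rows > 0
BEGIN
    INSERT [Audit] (AuditID, ModifiedDate, ModifiedBy, CreatedDate, CreatedBy,
                    DWIsDeleted, DWStartDate, DWEndDate)
    SELECT TOP (100000)
        AuditID, ModifiedDate, ModifiedBy, CreatedDate, CreatedBy,
        0, CreatedDate, NULL
    FROM [AppDB].[stgdb].[dbo].[Audit] -- four-part name through the linked server
    WHERE AuditID > @LastID
    ORDER BY AuditID;                  -- deterministic batches by key

    SET @Rows = @@ROWCOUNT;

    -- Advance the watermark to the highest key loaded so far.
    SELECT @LastID = ISNULL(MAX(AuditID), 0) FROM [Audit];
END
GO

Paging by key avoids the NOT EXISTS probe against the destination on every batch, at the cost of requiring a reliable, increasing key in the source.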
