
How to delete large data of table in SQL without log?

optionbox 2020. 8. 14. 07:40



I have a large data table. There are 10 million records in this table.

What is the best way to run this query?

   Delete LargeTable where readTime < dateadd(MONTH,-7,GETDATE())

  1. If you are deleting all the rows in that table, the simplest option is to truncate the table:

    TRUNCATE TABLE LargeTable
    GO
    

    TRUNCATE TABLE simply empties the table. You cannot use a WHERE clause to limit the rows being deleted, and no triggers will be fired.

  2. On the other hand, if you are deleting more than 80-90% of the data, say you have 11 million rows in total and want to delete 10 million, another way is to insert the 1 million rows you want to keep into a separate staging table, truncate this large table, and then insert that 1 million rows back.
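A minimal sketch of that keep-copy-truncate approach, using the question's table and column names (the staging table name is an assumption):

```sql
-- Copy the rows you want to keep into a staging table.
SELECT *
INTO   LargeTable_Staging
FROM   LargeTable
WHERE  readTime >= DATEADD(MONTH, -7, GETDATE());

-- Empty the large table with minimal logging.
TRUNCATE TABLE LargeTable;

-- Put the kept rows back, then discard the staging copy.
INSERT INTO LargeTable
SELECT * FROM LargeTable_Staging;

DROP TABLE LargeTable_Staging;
```

If LargeTable has an identity column, the INSERT would also need an explicit column list and SET IDENTITY_INSERT LargeTable ON to preserve the original key values.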

  3. Alternatively, if permissions, views, or other objects that use this table as their base table would not be affected by dropping it, you can move the relatively small number of rows into another table, drop this table, create another table with the same schema, and import the rows back into this ex-large table.
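A sketch of that variant, assuming nothing depends on the table object itself (sp_rename lets the new table take over the old name instead of re-inserting the kept rows a second time):

```sql
-- Move the rows to keep into a brand-new table.
SELECT *
INTO   LargeTable_New
FROM   LargeTable
WHERE  readTime >= DATEADD(MONTH, -7, GETDATE());

-- Drop the old table and take over its name.
DROP TABLE LargeTable;
EXEC sp_rename 'LargeTable_New', 'LargeTable';
```

Note that SELECT ... INTO copies only the data: indexes, constraints, triggers, and identity seeds must be recreated on the new table afterwards.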

  4. The last option I can think of is to change the database's Recovery Mode to SIMPLE and then delete the rows in smaller batches using a while loop like this:

    DECLARE @Deleted_Rows INT;
    SET @Deleted_Rows = 1;
    
    
    WHILE (@Deleted_Rows > 0)
      BEGIN
       -- Delete some small number of rows at a time
         DELETE TOP (10000)  LargeTable 
         WHERE readTime < dateadd(MONTH,-7,GETDATE())
    
      SET @Deleted_Rows = @@ROWCOUNT;
    END
    

And don't forget to change the recovery mode back to FULL. I believe you also have to take a backup for the change to take full effect.
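Wrapping the batched delete in the recovery-model switch might look like this (the database name and backup path are placeholders):

```sql
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;

-- ... run the batched delete loop here ...

ALTER DATABASE MyDatabase SET RECOVERY FULL;

-- A full (or differential) backup restarts the log backup chain,
-- which is broken by the switch away from FULL.
BACKUP DATABASE MyDatabase TO DISK = N'C:\Backups\MyDatabase.bak';
```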


@m-ali's answer is right, but keep in mind that the log can still grow a lot if you don't commit the transaction after each chunk and perform a CHECKPOINT. This is how I would do it, taking http://sqlperformance.com/2013/03/io-subsystem/chunk-deletes as a reference, with performance tests and graphs:

DECLARE @Deleted_Rows INT;
SET @Deleted_Rows = 1;


WHILE (@Deleted_Rows > 0)
  BEGIN

   BEGIN TRANSACTION

   -- Delete some small number of rows at a time
     DELETE TOP (10000)  LargeTable 
     WHERE readTime < dateadd(MONTH,-7,GETDATE())

     SET @Deleted_Rows = @@ROWCOUNT;

   COMMIT TRANSACTION
   CHECKPOINT -- for simple recovery model
END

You can also use GO + the number of times you want the same query to run:

DELETE TOP (10000)  [TARGETDATABASE].[SCHEMA].[TARGETTABLE] 
WHERE readTime < dateadd(MONTH,-1,GETDATE());
-- how many times you want the query to repeat
GO 100

This variation of M.Ali's works well for me. It deletes some rows, clears the log, and repeats. I can watch the log grow, drop, and start over.

DECLARE @Deleted_Rows INT;
SET @Deleted_Rows = 1;
WHILE (@Deleted_Rows > 0)
  BEGIN
   -- Delete some small number of rows at a time
    delete top (100000) from InstallLog where DateTime between '2014-12-01' and '2015-02-01'
    SET @Deleted_Rows = @@ROWCOUNT;
    dbcc shrinkfile (MobiControlDB_log,0,truncateonly);
END

@Francisco Goldenstein, just a minor correction. The COMMIT must come after you set the variable; otherwise the WHILE executes only once:

DECLARE @Deleted_Rows INT;
SET @Deleted_Rows = 1;

WHILE (@Deleted_Rows > 0)
BEGIN
    BEGIN TRANSACTION

    -- Delete some small number of rows at a time
    DELETE TOP (10000)  LargeTable 
    WHERE readTime < dateadd(MONTH,-7,GETDATE())

    SET @Deleted_Rows = @@ROWCOUNT;

    COMMIT TRANSACTION
    CHECKPOINT -- for simple recovery model

END

If you are willing (and able) to implement partitioning, that is an effective technique for removing large quantities of data with little run-time overhead. Not cost-effective for a once-off exercise, though.
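As a sketch of why partitioning is so cheap at run time: if the table were partitioned by month on readTime, an old partition can be switched out to an empty staging table as a metadata-only operation and then dropped. The schema below is illustrative only; the staging table must match the partitioned table's columns and live on the same filegroup.

```sql
-- Staging table with the identical schema (columns assumed for illustration).
CREATE TABLE LargeTable_Switch (
    readTime DATETIME     NOT NULL,
    payload  VARCHAR(100) NULL
) ON [PRIMARY];

-- Move partition 1 (the oldest month) out of the big table.
-- This is a metadata change, not a row-by-row delete.
ALTER TABLE LargeTable SWITCH PARTITION 1 TO LargeTable_Switch;

-- The switched-out rows can now be discarded cheaply.
DROP TABLE LargeTable_Switch;
```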


I was able to delete 19 million rows from my table of 21 million rows in a matter of minutes. Here is my approach.

If you have an auto-incrementing primary key on this table, then you can make use of it.

  1. Get the minimum value of the primary key of the large table where readTime < dateadd(MONTH,-7,GETDATE()). (Add an index on readTime if one is not already present; this index will be deleted along with the table in step 3 anyway.) Let's store it in a variable 'min_primary'.

  2. Insert all the rows having primary key > min_primary into a staging table (a memory-optimized table if the number of rows is not large).

  3. Drop the large table.

  4. Recreate the table. Copy all the rows from staging table to main table.

  5. Drop the staging table.
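The steps above could be sketched as follows. This assumes the identity key (called `id` here, a placeholder) grows with readTime, so the boundary between rows to delete and rows to keep is the highest key among the old rows:

```sql
-- 1. Find the boundary key: the largest id among rows old enough to delete.
DECLARE @min_primary BIGINT;
SELECT @min_primary = MAX(id)
FROM   LargeTable
WHERE  readTime < DATEADD(MONTH, -7, GETDATE());

-- 2. Stage the rows to keep (everything above the boundary).
SELECT *
INTO   LargeTable_Keep
FROM   LargeTable
WHERE  id > @min_primary;

-- 3 & 4. Drop and recreate the large table (script out its DDL first):
-- DROP TABLE LargeTable;
-- CREATE TABLE LargeTable ( ... original schema ... );
-- then copy the kept rows back:
-- INSERT INTO LargeTable SELECT * FROM LargeTable_Keep;

-- 5. Clean up.
-- DROP TABLE LargeTable_Keep;
```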


You can delete small batches using a while loop, something like this:

DELETE TOP (10000)  LargeTable 
WHERE readTime < dateadd(MONTH,-7,GETDATE())
WHILE @@ROWCOUNT > 0
BEGIN
    DELETE TOP (10000)  LargeTable 
    WHERE readTime < dateadd(MONTH,-7,GETDATE())
END

Another option:

SET ROWCOUNT 1000 -- Buffer

DECLARE @DATE AS DATETIME = dateadd(MONTH,-7,GETDATE())

DELETE LargeTable  WHERE readTime < @DATE
WHILE @@ROWCOUNT > 0
BEGIN
   DELETE LargeTable  WHERE readTime < @DATE
END
SET ROWCOUNT 0

Optional:

If the database is using the FULL recovery model, switch it to SIMPLE to minimize logging (the transaction log itself cannot be disabled in SQL Server):

ALTER DATABASE dbname SET RECOVERY SIMPLE;

Shorter syntax

SELECT 1 -- seeds @@ROWCOUNT so the loop body runs at least once
WHILE (@@ROWCOUNT > 0)
BEGIN
  DELETE TOP (10000) LargeTable 
  WHERE readTime < dateadd(MONTH,-7,GETDATE())
END

If you are using SQL Server 2016 or higher, and your table has partitions created on the column you are trying to delete by (for example, a timestamp column), then you can use this newer command to delete data partition by partition:

TRUNCATE TABLE table_name WITH ( PARTITIONS ( { <partition_number_expression> | <range> } [ , ...n ] ) )

This deletes the data in the selected partition(s) only, and should be the most efficient way to delete data from part of a table: it is minimally logged and runs about as fast as a regular truncate, but without deleting all the data in the table.
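A concrete example of the syntax above, assuming partitions 1 and 2 hold the months to purge (the partition numbers are placeholders for your own partition scheme):

```sql
-- Removes only partitions 1 and 2; the rest of the table is untouched.
TRUNCATE TABLE LargeTable
WITH (PARTITIONS (1, 2));

-- Ranges also work, e.g. partitions 1 through 5:
-- TRUNCATE TABLE LargeTable WITH (PARTITIONS (1 TO 5));
```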

The drawback is that if your table is not set up with partitions, you need to go old school: delete the data with the regular approach and then recreate the table with partitions so that you can do this in the future, which is what I did. I added the partition creation and deletion into the insertion procedure itself. I had a table with 500 million rows, so this was the only option to reduce deletion time.

For more details, refer to the links below: https://docs.microsoft.com/en-us/sql/t-sql/statements/truncate-table-transact-sql?view=sql-server-2017

SQL server 2016 Truncate table with partitions

Below is what I did first to delete the data before I could recreate the table with partitions containing the required data. This query runs for days during the specified time window until the data is deleted.

:connect <<ServerName>>
use <<DatabaseName>>

SET NOCOUNT ON;
DECLARE @Deleted_Rows INT;
DECLARE @loopnum INT;
DECLARE @msg varchar(100);
DECLARE @FlagDate datetime;
SET @FlagDate =  getdate() - 31;
SET @Deleted_Rows = 1;
SET @loopnum = 1;

/*while (getdate() < convert(datetime,'2018-11-08 14:00:00.000',120))
BEGIN
    RAISERROR( 'WAIT for START' ,0,1) WITH NOWAIT   
    WAITFOR DELAY '00:10:00'
END*/
RAISERROR( 'STARTING PURGE' ,0,1) WITH NOWAIT   

WHILE (1=1)
BEGIN
    WHILE (@Deleted_Rows > 0 AND (datepart(hh, getdate() ) >= 12 AND datepart(hh, getdate() ) <= 20)) -- (getdate() < convert(datetime,'2018-11-08 19:00:00.000',120) )
      BEGIN
       -- Delete some small number of rows at a time
         DELETE TOP (500000)  dbo.<<table_name>>
         WHERE timestamp_column < convert(datetime, @FlagDate,102)
         SET @Deleted_Rows = @@ROWCOUNT;
         WAITFOR DELAY '00:00:01'
         select @msg = 'ROWCOUNT' + convert(varchar,@Deleted_Rows);
         set @loopnum = @loopnum + 1
         if @loopnum > 1000
             begin 
                 begin try
                        DBCC SHRINKFILE (N'<<databasename>>_log' , 0, TRUNCATEONLY)
                        RAISERROR( @msg ,0,1) WITH NOWAIT
                 end try
                 begin catch
                     RAISERROR( 'DBCC SHRINK' ,0,1) WITH NOWAIT  
                 end catch
                 set @loopnum = 1
             end
        END
WAITFOR DELAY '00:10:00'
END 
select getdate()

Source: https://stackoverflow.com/questions/24213299/how-to-delete-large-data-of-table-in-sql-without-log
