当前位置:  开发笔记 > 数据库 > 正文

DBMS级别的管道和过滤器:拆分MERGE输出流

如何解决《DBMS级别的管道和过滤器:拆分MERGE输出流》经验,为你挑选了3个好方法。

我的理解是,主要障碍是OUTPUTSQL Server中子句的限制.它允许将结果集返回给调用者的一个OUTPUT INTO table和/或一个OUTPUT.

您希望以MERGE两种不同的方式保存语句的结果:

MERGE收集统计信息影响的所有行

只插入行 queue


简单的变种

我会用你的S2解决方案.至少从一开始.这是很容易理解和维护,应该是相当有效的,因为资源最密集的操作(MERGETarget本身将只执行一次).下面有第二个变体,比较它们在实际数据上的表现会很有趣.

所以:

OUTPUT INTO @TempTableMERGE

无论是INSERT全部由行@TempTableStats插入前或聚合.如果您只需要聚合统计信息,那么汇总此批处理的结果并将其合并到final中Stats而不是复制所有行是有意义的.

INSERTQueue只进入"插入"行@TempTable.

我将从@ i-one的答案中获取样本数据.

架构

-- I'll return to commented lines later

CREATE TABLE [dbo].[TestTarget](
    -- [ID] [int] IDENTITY(1,1) NOT NULL,
    [foo] [varchar](10) NULL,
    [bar] [varchar](10) NULL
);

CREATE TABLE [dbo].[TestStaging](
    [foo] [varchar](10) NULL,
    [bar] [varchar](10) NULL,
    [baz] [varchar](10) NULL
);

CREATE TABLE [dbo].[TestStats](
    [MergeAction] [nvarchar](10) NOT NULL
);

CREATE TABLE [dbo].[TestQueue](
    -- [TargetID] [int] NOT NULL,
    [foo] [varchar](10) NULL,
    [baz] [varchar](10) NULL
);

样本数据

TRUNCATE TABLE [dbo].[TestTarget];
TRUNCATE TABLE [dbo].[TestStaging];
TRUNCATE TABLE [dbo].[TestStats];
TRUNCATE TABLE [dbo].[TestQueue];

INSERT INTO [dbo].[TestStaging]
    ([foo]
    ,[bar]
    ,[baz])
VALUES
    ('A', 'AA', 'AAA'),
    ('B', 'BB', 'BBB'),
    ('C', 'CC', 'CCC');

INSERT INTO [dbo].[TestTarget]
    ([foo]
    ,[bar])
VALUES
    ('A', 'A_'),
    ('B', 'B?');

合并

DECLARE @TempTable TABLE (
    MergeAction nvarchar(10) NOT NULL,
    foo varchar(10) NULL,
    baz varchar(10) NULL);

MERGE INTO TestTarget AS Dst
USING TestStaging AS Src
ON Dst.foo = Src.foo
WHEN MATCHED THEN
UPDATE SET
    Dst.bar = Src.bar
WHEN NOT MATCHED BY TARGET THEN
INSERT (foo, bar)
VALUES (Src.foo, Src.bar)
OUTPUT $action AS MergeAction, inserted.foo, Src.baz
INTO @TempTable(MergeAction, foo, baz)
;

INSERT INTO [dbo].[TestStats] (MergeAction)
SELECT T.MergeAction
FROM @TempTable AS T;

INSERT INTO [dbo].[TestQueue]
    ([foo]
    ,[baz])
SELECT
    T.foo
    ,T.baz
FROM @TempTable AS T
WHERE T.MergeAction = 'INSERT'
;

SELECT * FROM [dbo].[TestTarget];
SELECT * FROM [dbo].[TestStats];
SELECT * FROM [dbo].[TestQueue];

结果

TestTarget
+-----+-----+
| foo | bar |
+-----+-----+
| A   | AA  |
| B   | BB  |
| C   | CC  |
+-----+-----+

TestStats
+-------------+
| MergeAction |
+-------------+
| INSERT      |
| UPDATE      |
| UPDATE      |
+-------------+

TestQueue
+-----+-----+
| foo | baz |
+-----+-----+
| C   | CCC |
+-----+-----+

第二个变种

在SQL Server 2014 Express上测试.

OUTPUT子句可以将其结果集发送到表和调用者.所以,OUTPUT INTO可以Stats直接进入,如果我们将MERGE语句包装到存储过程中,那么我们就可以使用INSERT ... EXECQueue.

如果你检查执行计划,你会看到INSERT ... EXEC在幕后创建一个临时表(参见Adam Machanic 的INSERT EXEC隐藏成本),所以我希望你创建临时表时整体性能与第一个变量相似明确.

还有一个问题需要解决:Queue表应该只有"插入"行,而不是所有受影响的行.要实现这一点,您可以使用Queue表上的触发器来丢弃"插入"以外的行.另一种可能性是定义一个唯一索引,IGNORE_DUP_KEY = ON并以这样的方式准备数据,即"未插入"行将违反唯一索引,并且不会插入到表中.

所以,我会ID IDENTITYTarget表格中添加一列,然后我会在表格中添加一TargetIDQueue.(在上面的脚本中取消注释它们).另外,我将为Queue表添加一个索引:

CREATE UNIQUE NONCLUSTERED INDEX [IX_TargetID] ON [dbo].[TestQueue]
(
    [TargetID] ASC
) WITH (
PAD_INDEX = OFF, 
STATISTICS_NORECOMPUTE = OFF, 
SORT_IN_TEMPDB = OFF, 
IGNORE_DUP_KEY = ON, 
DROP_EXISTING = OFF, 
ONLINE = OFF, 
ALLOW_ROW_LOCKS = ON, 
ALLOW_PAGE_LOCKS = ON)

重要的是UNIQUEIGNORE_DUP_KEY = ON.

这是以下存储过程MERGE:

CREATE PROCEDURE [dbo].[TestMerge]
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    MERGE INTO dbo.TestTarget AS Dst
    USING dbo.TestStaging AS Src
    ON Dst.foo = Src.foo
    WHEN MATCHED THEN
    UPDATE SET
        Dst.bar = Src.bar
    WHEN NOT MATCHED BY TARGET THEN
    INSERT (foo, bar)
    VALUES (Src.foo, Src.bar)
    OUTPUT $action INTO dbo.TestStats(MergeAction)
    OUTPUT CASE WHEN $action = 'INSERT' THEN inserted.ID ELSE 0 END AS TargetID, 
    inserted.foo,
    Src.baz
    ;

END

用法

TRUNCATE TABLE [dbo].[TestTarget];
TRUNCATE TABLE [dbo].[TestStaging];
TRUNCATE TABLE [dbo].[TestStats];
TRUNCATE TABLE [dbo].[TestQueue];

-- Make sure that `Queue` has one special row with TargetID=0 in advance.
INSERT INTO [dbo].[TestQueue]
    ([TargetID]
    ,[foo]
    ,[baz])
VALUES
    (0
    ,NULL
    ,NULL);

INSERT INTO [dbo].[TestStaging]
    ([foo]
    ,[bar]
    ,[baz])
VALUES
    ('A', 'AA', 'AAA'),
    ('B', 'BB', 'BBB'),
    ('C', 'CC', 'CCC');

INSERT INTO [dbo].[TestTarget]
    ([foo]
    ,[bar])
VALUES
    ('A', 'A_'),
    ('B', 'B?');

INSERT INTO [dbo].[TestQueue]
EXEC [dbo].[TestMerge];

SELECT * FROM [dbo].[TestTarget];
SELECT * FROM [dbo].[TestStats];
SELECT * FROM [dbo].[TestQueue];

结果

TestTarget
+----+-----+-----+
| ID | foo | bar |
+----+-----+-----+
|  1 | A   | AA  |
|  2 | B   | BB  |
|  3 | C   | CC  |
+----+-----+-----+

TestStats
+-------------+
| MergeAction |
+-------------+
| INSERT      |
| UPDATE      |
| UPDATE      |
+-------------+

TestQueue
+----------+------+------+
| TargetID | foo  | baz  |
+----------+------+------+
|        0 | NULL | NULL |
|        3 | C    | CCC  |
+----------+------+------+

在以下期间会有额外的消息INSERT ... EXEC:

Duplicate key was ignored.

如果MERGE更新了一些行.当唯一索引过程中丢弃一些行发送此警告消息INSERTIGNORE_DUP_KEY = ON.

将重复键值插入唯一索引时,将出现警告消息.只有违反唯一性约束的行才会失败.

@VladimirBaranov:IGNORE_DUP_KEY是我在阅读完所有OP要求后首先想到的.不要分析任何东西只是抛出一切队列表.OP甚至可以批量插入 - 这比`INSERT ... NOT EXIST(..)`更快 (2认同)


i-one.. 7

考虑以下两种方法来解决问题:

将数据合并到单个语句中插入到队列中的目标和输出中,并在目标上创建的触发器中汇总统计信息.批处理标识符可以通过临时表传递给触发器.

将数据合并到单个语句中插入到队列中的目标和输出中,并使用内置的更改跟踪功能在合并后立即汇总统计信息,而不是在触发器中执行此操作.

方法1(合并数据并在触发器中收集统计信息):

示例数据设置(为简单起见省略了索引和约束):

create table staging (foo varchar(10), bar varchar(10), baz varchar(10));
create table target (foo varchar(10), bar varchar(10));
create table queue (foo varchar(10), baz varchar(10));
create table stats (batchID int, inserted bigint, updated bigint, deleted bigint);

insert into staging values
    ('A', 'AA', 'AAA')
    ,('B', 'BB', 'BBB')
    ,('C', 'CC', 'CCC')
    ;

insert into target values
    ('A', 'A_')
    ,('B', 'B?')
    ,('E', 'EE')
    ;

触发收集插入/更新/删除的统计信息:

create trigger target_onChange
on target
after delete, update, insert
as
begin
    set nocount on;

    if object_id('tempdb..#targetMergeBatch') is NULL
        return;

    declare @batchID int;
    select @batchID = batchID from #targetMergeBatch;

    merge into stats t
    using (
        select
            batchID = @batchID,
            cntIns = count_big(case when i.foo is not NULL and d.foo is NULL then 1 end),
            cntUpd = count_big(case when i.foo is not NULL and d.foo is not NULL then 1 end),
            cntDel = count_big(case when i.foo is NULL and d.foo is not NULL then 1 end)
        from inserted i
            full join deleted d on d.foo = i.foo
    ) s
    on t.batchID = s.batchID
    when matched then
        update
        set
            t.inserted = t.inserted + s.cntIns,
            t.updated = t.updated + s.cntUpd,
            t.deleted = t.deleted + s.cntDel
    when not matched then
        insert (batchID, inserted, updated, deleted)
        values (s.batchID, s.cntIns, s.cntUpd, cntDel);

end

合并声明:

declare @batchID int;
set @batchID = 1;-- or select @batchID = batchID from ...;

create table #targetMergeBatch (batchID int);
insert into #targetMergeBatch (batchID) values (@batchID);

insert into queue (foo, baz)
select foo, baz
from
(
    merge into target t
    using staging s
    on t.foo = s.foo
    when matched then
        update
        set t.bar = s.bar
    when not matched then
        insert (foo, bar)
        values (s.foo, s.bar)
    when not matched by source then
        delete
    output $action, inserted.foo, s.baz
) m(act, foo, baz)
where act = 'INSERT'
    ;

drop table #targetMergeBatch

检查结果:

select * from target;
select * from queue;
select * from stats;

目标:

foo        bar
---------- ----------
A          AA
B          BB
C          CC

队列:

foo        baz
---------- ----------
C          CCC

统计:

batchID  inserted   updated   deleted
-------- ---------- --------- ---------
1        1          2         1

方法2(使用更改跟踪功能收集统计信息):

示例数据设置与以前的情况相同(只需删除所有内容,包括触发器并从头开始重新创建表),除了在这种情况下我们需要在目标上使用PK来使示例工作:

create table target (foo varchar(10) primary key, bar varchar(10));

在数据库上启用更改跟踪:

alter database Test
    set change_tracking = on

在目标表上启用更改跟踪:

alter table target
    enable change_tracking

在此之后立即合并数据并获取统计信息,按更改上下文进行过滤以仅计算受合并影响的行:

begin transaction;
declare @batchID int, @chVersion bigint, @chContext varbinary(128);
set @batchID = 1;-- or select @batchID = batchID from ...;
SET @chVersion = change_tracking_current_version();
set @chContext = newid();

with change_tracking_context(@chContext)
insert into queue (foo, baz)
select foo, baz
from
(
    merge into target t
    using staging s
    on t.foo = s.foo
    when matched then
        update
        set t.bar = s.bar
    when not matched then
        insert (foo, bar)
        values (s.foo, s.bar)
    when not matched by source then
        delete
    output $action, inserted.foo, s.baz
) m(act, foo, baz)
where act = 'INSERT'
    ;

with ch(foo, op) as (
    select foo, sys_change_operation
    from changetable(changes target, @chVersion) ct
    where sys_change_context = @chContext
)
insert into stats (batchID, inserted, updated, deleted)
select @batchID, [I], [U], [D]
from ch
    pivot(count_big(foo) for op in ([I], [U], [D])) pvt
    ;

commit transaction;

检查结果:

select * from target;
select * from queue;
select * from stats;

它们与之前的样本相同.

目标:

foo        bar
---------- ----------
A          AA
B          BB
C          CC

队列:

foo        baz
---------- ----------
C          CCC

统计:

batchID  inserted   updated   deleted
-------- ---------- --------- ---------
1        1          2         1


dsz.. 5

我建议使用三个独立的AFTER INSERT / DELETE / UPDATE触发器来提取统计数据:

create trigger dbo.insert_trigger_target
on [dbo].[target]
after insert
as
insert into dbo.[stats] ([action],[count])
select 'insert', count(1)
from inserted;
go

create trigger dbo.update_trigger_target
on [dbo].[target]
after update
as
insert into dbo.[stats] ([action],[count])
select 'update', count(1) from inserted -- or deleted == after / before image, count will be the same
go

create trigger dbo.delete_trigger_target
on [dbo].[target]
after delete
as
insert into dbo.[stats] ([action],[count])
select 'delete', count(1) from deleted
go

如果你需要更多的上下文,可以放入一些内容CONTEXT_INFO并从触发器中取出.

现在,我要断言AFTER触发器都没有贵,但你需要测试以确保万无一失.

已经处理了这一点,你就可以自由使用OUTPUT条款( OUTPUT INTO中)MERGE,然后使用嵌套在选择子集,你想进入的数据queue表.

理由

由于需要从两者中访问列staging并且target为了构建数据queue,因此使用选项in 来完成此HAS,因为没有其他任何东西可以访问"双方".OUTPUTMERGE

然后,如果我们劫持了该OUTPUT条款queue,我们如何重新使用该功能呢?考虑AFTER到您所描述的统计数据的要求,我认为触发器将起作用.实际上,如果需要,根据可用的图像,统计数据可能非常复杂.我断言,该AFTER触发器是"不贵"因为之前和之后都必须始终可用,以便交易可以同时提交的数据 BACK轧制-是的,需要对数据进行扫描(即使获得计数)但这似乎不太费钱.

在我自己的分析中,扫描为执行计划的基本成本增加了约5%

听起来像解决方案?



1> Vladimir Bar..:

我的理解是,主要障碍是OUTPUTSQL Server中子句的限制.它允许将结果集返回给调用者的一个OUTPUT INTO table和/或一个OUTPUT.

您希望以MERGE两种不同的方式保存语句的结果:

MERGE收集统计信息影响的所有行

只插入行 queue


简单的变种

我会用你的S2解决方案.至少从一开始.这是很容易理解和维护,应该是相当有效的,因为资源最密集的操作(MERGETarget本身将只执行一次).下面有第二个变体,比较它们在实际数据上的表现会很有趣.

所以:

OUTPUT INTO @TempTableMERGE

无论是INSERT全部由行@TempTableStats插入前或聚合.如果您只需要聚合统计信息,那么汇总此批处理的结果并将其合并到final中Stats而不是复制所有行是有意义的.

INSERTQueue只进入"插入"行@TempTable.

我将从@ i-one的答案中获取样本数据.

架构

-- I'll return to commented lines later

CREATE TABLE [dbo].[TestTarget](
    -- [ID] [int] IDENTITY(1,1) NOT NULL,
    [foo] [varchar](10) NULL,
    [bar] [varchar](10) NULL
);

CREATE TABLE [dbo].[TestStaging](
    [foo] [varchar](10) NULL,
    [bar] [varchar](10) NULL,
    [baz] [varchar](10) NULL
);

CREATE TABLE [dbo].[TestStats](
    [MergeAction] [nvarchar](10) NOT NULL
);

CREATE TABLE [dbo].[TestQueue](
    -- [TargetID] [int] NOT NULL,
    [foo] [varchar](10) NULL,
    [baz] [varchar](10) NULL
);

样本数据

TRUNCATE TABLE [dbo].[TestTarget];
TRUNCATE TABLE [dbo].[TestStaging];
TRUNCATE TABLE [dbo].[TestStats];
TRUNCATE TABLE [dbo].[TestQueue];

INSERT INTO [dbo].[TestStaging]
    ([foo]
    ,[bar]
    ,[baz])
VALUES
    ('A', 'AA', 'AAA'),
    ('B', 'BB', 'BBB'),
    ('C', 'CC', 'CCC');

INSERT INTO [dbo].[TestTarget]
    ([foo]
    ,[bar])
VALUES
    ('A', 'A_'),
    ('B', 'B?');

合并

DECLARE @TempTable TABLE (
    MergeAction nvarchar(10) NOT NULL,
    foo varchar(10) NULL,
    baz varchar(10) NULL);

MERGE INTO TestTarget AS Dst
USING TestStaging AS Src
ON Dst.foo = Src.foo
WHEN MATCHED THEN
UPDATE SET
    Dst.bar = Src.bar
WHEN NOT MATCHED BY TARGET THEN
INSERT (foo, bar)
VALUES (Src.foo, Src.bar)
OUTPUT $action AS MergeAction, inserted.foo, Src.baz
INTO @TempTable(MergeAction, foo, baz)
;

INSERT INTO [dbo].[TestStats] (MergeAction)
SELECT T.MergeAction
FROM @TempTable AS T;

INSERT INTO [dbo].[TestQueue]
    ([foo]
    ,[baz])
SELECT
    T.foo
    ,T.baz
FROM @TempTable AS T
WHERE T.MergeAction = 'INSERT'
;

SELECT * FROM [dbo].[TestTarget];
SELECT * FROM [dbo].[TestStats];
SELECT * FROM [dbo].[TestQueue];

结果

TestTarget
+-----+-----+
| foo | bar |
+-----+-----+
| A   | AA  |
| B   | BB  |
| C   | CC  |
+-----+-----+

TestStats
+-------------+
| MergeAction |
+-------------+
| INSERT      |
| UPDATE      |
| UPDATE      |
+-------------+

TestQueue
+-----+-----+
| foo | baz |
+-----+-----+
| C   | CCC |
+-----+-----+

第二个变种

在SQL Server 2014 Express上测试.

OUTPUT子句可以将其结果集发送到表和调用者.所以,OUTPUT INTO可以Stats直接进入,如果我们将MERGE语句包装到存储过程中,那么我们就可以使用INSERT ... EXECQueue.

如果你检查执行计划,你会看到INSERT ... EXEC在幕后创建一个临时表(参见Adam Machanic 的INSERT EXEC隐藏成本),所以我希望你创建临时表时整体性能与第一个变量相似明确.

还有一个问题需要解决:Queue表应该只有"插入"行,而不是所有受影响的行.要实现这一点,您可以使用Queue表上的触发器来丢弃"插入"以外的行.另一种可能性是定义一个唯一索引,IGNORE_DUP_KEY = ON并以这样的方式准备数据,即"未插入"行将违反唯一索引,并且不会插入到表中.

所以,我会ID IDENTITYTarget表格中添加一列,然后我会在表格中添加一TargetIDQueue.(在上面的脚本中取消注释它们).另外,我将为Queue表添加一个索引:

CREATE UNIQUE NONCLUSTERED INDEX [IX_TargetID] ON [dbo].[TestQueue]
(
    [TargetID] ASC
) WITH (
PAD_INDEX = OFF, 
STATISTICS_NORECOMPUTE = OFF, 
SORT_IN_TEMPDB = OFF, 
IGNORE_DUP_KEY = ON, 
DROP_EXISTING = OFF, 
ONLINE = OFF, 
ALLOW_ROW_LOCKS = ON, 
ALLOW_PAGE_LOCKS = ON)

重要的是UNIQUEIGNORE_DUP_KEY = ON.

这是以下存储过程MERGE:

CREATE PROCEDURE [dbo].[TestMerge]
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    MERGE INTO dbo.TestTarget AS Dst
    USING dbo.TestStaging AS Src
    ON Dst.foo = Src.foo
    WHEN MATCHED THEN
    UPDATE SET
        Dst.bar = Src.bar
    WHEN NOT MATCHED BY TARGET THEN
    INSERT (foo, bar)
    VALUES (Src.foo, Src.bar)
    OUTPUT $action INTO dbo.TestStats(MergeAction)
    OUTPUT CASE WHEN $action = 'INSERT' THEN inserted.ID ELSE 0 END AS TargetID, 
    inserted.foo,
    Src.baz
    ;

END

用法

TRUNCATE TABLE [dbo].[TestTarget];
TRUNCATE TABLE [dbo].[TestStaging];
TRUNCATE TABLE [dbo].[TestStats];
TRUNCATE TABLE [dbo].[TestQueue];

-- Make sure that `Queue` has one special row with TargetID=0 in advance.
INSERT INTO [dbo].[TestQueue]
    ([TargetID]
    ,[foo]
    ,[baz])
VALUES
    (0
    ,NULL
    ,NULL);

INSERT INTO [dbo].[TestStaging]
    ([foo]
    ,[bar]
    ,[baz])
VALUES
    ('A', 'AA', 'AAA'),
    ('B', 'BB', 'BBB'),
    ('C', 'CC', 'CCC');

INSERT INTO [dbo].[TestTarget]
    ([foo]
    ,[bar])
VALUES
    ('A', 'A_'),
    ('B', 'B?');

INSERT INTO [dbo].[TestQueue]
EXEC [dbo].[TestMerge];

SELECT * FROM [dbo].[TestTarget];
SELECT * FROM [dbo].[TestStats];
SELECT * FROM [dbo].[TestQueue];

结果

TestTarget
+----+-----+-----+
| ID | foo | bar |
+----+-----+-----+
|  1 | A   | AA  |
|  2 | B   | BB  |
|  3 | C   | CC  |
+----+-----+-----+

TestStats
+-------------+
| MergeAction |
+-------------+
| INSERT      |
| UPDATE      |
| UPDATE      |
+-------------+

TestQueue
+----------+------+------+
| TargetID | foo  | baz  |
+----------+------+------+
|        0 | NULL | NULL |
|        3 | C    | CCC  |
+----------+------+------+

在以下期间会有额外的消息INSERT ... EXEC:

Duplicate key was ignored.

如果MERGE更新了一些行.当唯一索引过程中丢弃一些行发送此警告消息INSERTIGNORE_DUP_KEY = ON.

将重复键值插入唯一索引时,将出现警告消息.只有违反唯一性约束的行才会失败.


@VladimirBaranov:IGNORE_DUP_KEY是我在阅读完所有OP要求后首先想到的.不要分析任何东西只是抛出一切队列表.OP甚至可以批量插入 - 这比`INSERT ... NOT EXIST(..)`更快

2> i-one..:

考虑以下两种方法来解决问题:

将数据合并到单个语句中插入到队列中的目标和输出中,并在目标上创建的触发器中汇总统计信息.批处理标识符可以通过临时表传递给触发器.

将数据合并到单个语句中插入到队列中的目标和输出中,并使用内置的更改跟踪功能在合并后立即汇总统计信息,而不是在触发器中执行此操作.

方法1(合并数据并在触发器中收集统计信息):

示例数据设置(为简单起见省略了索引和约束):

create table staging (foo varchar(10), bar varchar(10), baz varchar(10));
create table target (foo varchar(10), bar varchar(10));
create table queue (foo varchar(10), baz varchar(10));
create table stats (batchID int, inserted bigint, updated bigint, deleted bigint);

insert into staging values
    ('A', 'AA', 'AAA')
    ,('B', 'BB', 'BBB')
    ,('C', 'CC', 'CCC')
    ;

insert into target values
    ('A', 'A_')
    ,('B', 'B?')
    ,('E', 'EE')
    ;

触发收集插入/更新/删除的统计信息:

create trigger target_onChange
on target
after delete, update, insert
as
begin
    set nocount on;

    if object_id('tempdb..#targetMergeBatch') is NULL
        return;

    declare @batchID int;
    select @batchID = batchID from #targetMergeBatch;

    merge into stats t
    using (
        select
            batchID = @batchID,
            cntIns = count_big(case when i.foo is not NULL and d.foo is NULL then 1 end),
            cntUpd = count_big(case when i.foo is not NULL and d.foo is not NULL then 1 end),
            cntDel = count_big(case when i.foo is NULL and d.foo is not NULL then 1 end)
        from inserted i
            full join deleted d on d.foo = i.foo
    ) s
    on t.batchID = s.batchID
    when matched then
        update
        set
            t.inserted = t.inserted + s.cntIns,
            t.updated = t.updated + s.cntUpd,
            t.deleted = t.deleted + s.cntDel
    when not matched then
        insert (batchID, inserted, updated, deleted)
        values (s.batchID, s.cntIns, s.cntUpd, cntDel);

end

合并声明:

declare @batchID int;
set @batchID = 1;-- or select @batchID = batchID from ...;

create table #targetMergeBatch (batchID int);
insert into #targetMergeBatch (batchID) values (@batchID);

insert into queue (foo, baz)
select foo, baz
from
(
    merge into target t
    using staging s
    on t.foo = s.foo
    when matched then
        update
        set t.bar = s.bar
    when not matched then
        insert (foo, bar)
        values (s.foo, s.bar)
    when not matched by source then
        delete
    output $action, inserted.foo, s.baz
) m(act, foo, baz)
where act = 'INSERT'
    ;

drop table #targetMergeBatch

检查结果:

select * from target;
select * from queue;
select * from stats;

目标:

foo        bar
---------- ----------
A          AA
B          BB
C          CC

队列:

foo        baz
---------- ----------
C          CCC

统计:

batchID  inserted   updated   deleted
-------- ---------- --------- ---------
1        1          2         1

方法2(使用更改跟踪功能收集统计信息):

示例数据设置与以前的情况相同(只需删除所有内容,包括触发器并从头开始重新创建表),除了在这种情况下我们需要在目标上使用PK来使示例工作:

create table target (foo varchar(10) primary key, bar varchar(10));

在数据库上启用更改跟踪:

alter database Test
    set change_tracking = on

在目标表上启用更改跟踪:

alter table target
    enable change_tracking

在此之后立即合并数据并获取统计信息,按更改上下文进行过滤以仅计算受合并影响的行:

begin transaction;
declare @batchID int, @chVersion bigint, @chContext varbinary(128);
set @batchID = 1;-- or select @batchID = batchID from ...;
SET @chVersion = change_tracking_current_version();
set @chContext = newid();

with change_tracking_context(@chContext)
insert into queue (foo, baz)
select foo, baz
from
(
    merge into target t
    using staging s
    on t.foo = s.foo
    when matched then
        update
        set t.bar = s.bar
    when not matched then
        insert (foo, bar)
        values (s.foo, s.bar)
    when not matched by source then
        delete
    output $action, inserted.foo, s.baz
) m(act, foo, baz)
where act = 'INSERT'
    ;

with ch(foo, op) as (
    select foo, sys_change_operation
    from changetable(changes target, @chVersion) ct
    where sys_change_context = @chContext
)
insert into stats (batchID, inserted, updated, deleted)
select @batchID, [I], [U], [D]
from ch
    pivot(count_big(foo) for op in ([I], [U], [D])) pvt
    ;

commit transaction;

检查结果:

select * from target;
select * from queue;
select * from stats;

它们与之前的样本相同.

目标:

foo        bar
---------- ----------
A          AA
B          BB
C          CC

队列:

foo        baz
---------- ----------
C          CCC

统计:

batchID  inserted   updated   deleted
-------- ---------- --------- ---------
1        1          2         1



3> dsz..:

我建议使用三个独立的AFTER INSERT / DELETE / UPDATE触发器来提取统计数据:

create trigger dbo.insert_trigger_target
on [dbo].[target]
after insert
as
insert into dbo.[stats] ([action],[count])
select 'insert', count(1)
from inserted;
go

create trigger dbo.update_trigger_target
on [dbo].[target]
after update
as
insert into dbo.[stats] ([action],[count])
select 'update', count(1) from inserted -- or deleted == after / before image, count will be the same
go

create trigger dbo.delete_trigger_target
on [dbo].[target]
after delete
as
insert into dbo.[stats] ([action],[count])
select 'delete', count(1) from deleted
go

如果你需要更多的上下文,可以放入一些内容CONTEXT_INFO并从触发器中取出.

现在,我要断言AFTER触发器都没有贵,但你需要测试以确保万无一失.

已经处理了这一点,你就可以自由使用OUTPUT条款( OUTPUT INTO中)MERGE,然后使用嵌套在选择子集,你想进入的数据queue表.

理由

由于需要从两者中访问列staging并且target为了构建数据queue,因此使用选项in 来完成此HAS,因为没有其他任何东西可以访问"双方".OUTPUTMERGE

然后,如果我们劫持了该OUTPUT条款queue,我们如何重新使用该功能呢?考虑AFTER到您所描述的统计数据的要求,我认为触发器将起作用.实际上,如果需要,根据可用的图像,统计数据可能非常复杂.我断言,该AFTER触发器是"不贵"因为之前和之后都必须始终可用,以便交易可以同时提交的数据 BACK轧制-是的,需要对数据进行扫描(即使获得计数)但这似乎不太费钱.

在我自己的分析中,扫描为执行计划的基本成本增加了约5%

听起来像解决方案?

推荐阅读
N个小灰流_701
这个屌丝很懒,什么也没留下!
DevBox开发工具箱 | 专业的在线开发工具网站    京公网安备 11010802040832号  |  京ICP备19059560号-6
Copyright © 1998 - 2020 DevBox.CN. All Rights Reserved devBox.cn 开发工具箱 版权所有