我找不到以抽象和一般方式描述我的问题的方法,所以我只提供一个最小的例子:
假设我有这3个简单的表格:
CREATE TABLE Document( [Id] int IDENTITY(1, 1) NOT NULL PRIMARY KEY, [Title] nvarchar(MAX), [Patient] nvarchar(MAX) ); CREATE TABLE Link( DocumentId INT FOREIGN KEY REFERENCES Document(Id), Text nvarchar(max) ); CREATE TABLE ReadStatus( DocumentId INT FOREIGN KEY REFERENCES Document(Id), IsRead Bit NOT NULL, UserId Int NOT NULL );
我们有一套文件
文档可以包含0个或更多链接
用户可以读取文档 - 这由ReadStatus
表跟踪,该表将用户与文档相关联,并且IsRead=1
表示文档已被该用户读取,并且IsRead=0
表示该用户尚未读取该文档.
如果对于文档X
和用户A
,ReadStatus
表中不存在行,我们假设用户A
尚未读取文档X
.
现在,我需要运行一个查询来选择所有患者.对于每位患者,我需要可用的文件总数和已经阅读的文件数量(即IsRead=1
).这是我到目前为止:
SELECT d.Patient, COUNT(DISTINCT d.Id) AS DocumentCount, COUNT(NULLIF(rs.IsRead,0)) AS ReadDocumentCount, COUNT(*) OVER () AS TotalPatientCount FROM Document d LEFT OUTER JOIN ReadStatus AS rs ON d.Id = rs.DocumentId AND rs.UserId = 123 INNER JOIN Link AS l ON d.Id = l.DocumentId AND l.Text IN ('Link W', 'Link X', 'Link T', 'Link Z') GROUP BY d.Patient
当文档(已经读取)具有多个链接时,会发生此问题.如果该文档有3个链接,则由INNER JOIN和Link表生成的笛卡尔积将使ReadDocumentCount
选择为3而不是1.
换句话说,鉴于此数据:
INSERT INTO Document(Title, Patient) VALUES('Doc A', 'Mike') INSERT INTO Document(Title, Patient) VALUES('Doc B', 'Mike') INSERT INTO Link(DocumentId, Text) VALUES(1, N'Link W') INSERT INTO Link(DocumentId, Text) VALUES(1, N'Link X') INSERT INTO Link(DocumentId, Text) VALUES(1, N'Link Y') INSERT INTO Link(DocumentId, Text) VALUES(2, N'Link Z') INSERT INTO ReadStatus(DocumentID, IsRead, UserId) VALUES(1, 1, 123) INSERT INTO ReadStatus(DocumentID, IsRead, UserId) VALUES(2, 0, 123)
我得到了这个结果:
Patient DocumentCount ReadDocumentCount TotalPatientCount Mike 2 3 1
虽然这是我想要的:
Patient DocumentCount ReadDocumentCount TotalPatientCount Mike 2 1 1
SQL小提琴:http://sqlfiddle.com/#!6/ e06bf/3
您也可以COUNT(DISTINCT)
有条件地使用:
SELECT d.Patient, COUNT(DISTINCT d.Id) AS DocumentCount, COUNT(DISTINCT (CASE WHEN rs.IsRead <> 0 THEN d.id END)) AS ReadDocumentCount, COUNT(*) OVER () AS TotalPatientCount FROM Document d LEFT OUTER JOIN ReadStatus rs ON d.Id = rs.DocumentId AND rs.UserId = 123 INNER JOIN Link l ON d.Id = l.DocumentId AND l.Text IN ('Link W', 'Link X', 'Link T', 'Link Z') GROUP BY d.Patient;