我试图在postgresql上写下面的查询:
select name, author_id, count(1), (select count(1) from names as n2 where n2.id = n1.id and t2.author_id = t1.author_id ) from names as n1 group by name, author_id
这肯定适用于Microsoft SQL Server,但它在postegresql上完全不起作用.我稍微阅读了它的文档,似乎我可以将其重写为:
select name, author_id, count(1), total from names as n1, (select count(1) as total from names as n2 where n2.id = n1.id and n2.author_id = t1.author_id ) as total group by name, author_id
但是这会在postegresql上返回以下错误:"FROM中的子查询不能引用相同查询级别的其他关系".所以我被卡住了.有谁知道我怎么能做到这一点?
谢谢
我不确定我是否理解你的意图,但也许以下内容可能与你想要的一致:
select n1.name, n1.author_id, count_1, total_count from (select id, name, author_id, count(1) as count_1 from names group by id, name, author_id) n1 inner join (select id, author_id, count(1) as total_count from names group by id, author_id) n2 on (n2.id = n1.id and n2.author_id = n1.author_id)
不幸的是,这增加了按id以及name和author_id对第一个子查询进行分组的要求,我认为不需要.我不知道如何解决这个问题,因为你需要有id可以加入第二个子查询.也许其他人会提出更好的解决方案.
分享和享受.
我只是在这里回答我需要的最终sql的格式化版本,基于我在上面的评论中发布的Bob Jarvis答案:
select n1.name, n1.author_id, cast(count_1 as numeric)/total_count from (select id, name, author_id, count(1) as count_1 from names group by id, name, author_id) n1 inner join (select author_id, count(1) as total_count from names group by author_id) n2 on (n2.author_id = n1.author_id)
补充@Bob Jarvis和@dmikam回答,Postgres在你不使用LATERAL时没有执行一个好的计划,在模拟之下,在这两种情况下查询数据结果是相同的,但成本是非常不同的
表结构
CREATE TABLE ITEMS ( N INTEGER NOT NULL, S TEXT NOT NULL ); INSERT INTO ITEMS SELECT (random()*1000000)::integer AS n, md5(random()::text) AS s FROM generate_series(1,1000000); CREATE INDEX N_INDEX ON ITEMS(N);
执行JOIN
与GROUP BY
子查询无LATERAL
EXPLAIN SELECT I.* FROM ITEMS I INNER JOIN ( SELECT COUNT(1), n FROM ITEMS GROUP BY N ) I2 ON I2.N = I.N WHERE I.N IN (243477, 997947);
结果
Merge Join (cost=0.87..637500.40 rows=23 width=37) Merge Cond: (i.n = items.n) -> Index Scan using n_index on items i (cost=0.43..101.28 rows=23 width=37) Index Cond: (n = ANY ('{243477,997947}'::integer[])) -> GroupAggregate (cost=0.43..626631.11 rows=861418 width=12) Group Key: items.n -> Index Only Scan using n_index on items (cost=0.43..593016.93 rows=10000000 width=4)
运用 LATERAL
EXPLAIN SELECT I.* FROM ITEMS I INNER JOIN LATERAL ( SELECT COUNT(1), n FROM ITEMS WHERE N = I.N GROUP BY N ) I2 ON 1=1 --I2.N = I.N WHERE I.N IN (243477, 997947);
结果
Nested Loop (cost=9.49..1319.97 rows=276 width=37) -> Bitmap Heap Scan on items i (cost=9.06..100.20 rows=23 width=37) Recheck Cond: (n = ANY ('{243477,997947}'::integer[])) -> Bitmap Index Scan on n_index (cost=0.00..9.05 rows=23 width=0) Index Cond: (n = ANY ('{243477,997947}'::integer[])) -> GroupAggregate (cost=0.43..52.79 rows=12 width=12) Group Key: items.n -> Index Only Scan using n_index on items (cost=0.43..52.64 rows=12 width=4) Index Cond: (n = i.n)
我的Postgres版本是 PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1)
我知道这是旧的,但是自Postgresql 9.3以来,有一个选项可以使用关键字"LATERAL"在JOINS中使用RELATED子查询,因此问题中的查询看起来像:
SELECT name, author_id, count(*), t.total FROM names as n1 INNER JOIN LATERAL ( SELECT count(*) as total FROM names as n2 WHERE n2.id = n1.id AND n2.author_id = n1.author_id ) as t ON 1=1 GROUP BY n1.name, n1.author_id