Eliminating duplicates with SQL UNION ALL

Author: 郑小蒜9299_941611_G | 2023-09-06 21:52


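I found this sample interview question and answer posted on toptal, but I don't really understand the code. How can a UNION ALL turn into a UNION (distinct)? Also, why is this code faster?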

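Question

Write a SQL query using UNION ALL (not UNION) that uses the WHERE clause to eliminate duplicates. Why might you want to do this?

Answer

You can avoid duplicates using UNION ALL, and still run much faster than UNION DISTINCT (which is actually the same as UNION), by running a query like this: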

SELECT * FROM mytable WHERE a=X UNION ALL SELECT * FROM mytable WHERE b=Y AND a!=X

The key is the AND a!=X part. This gives you the benefits of the UNION (a.k.a., UNION DISTINCT) command, while avoiding much of its performance hit.

1> Bill Karwin:

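In that example, the first query has a condition on column a, whereas the second query has a condition on column b. This probably came from a query that's hard to optimize: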

SELECT * FROM mytable WHERE a=X OR b=Y

This query is hard to optimize with simple B-tree indexing. Does the engine search an index on column a? Or on column b? Either way, searching the other term requires a table-scan.

Hence the trick of using UNION to split the search into two queries, one per term. Each subquery can use the best index for its search term. Then combine the results using UNION.

But the two subsets may overlap, because some rows where b=Y may also have a=X in which case such rows occur in both subsets. Therefore you have to do duplicate elimination, or else see some rows twice in the final result.
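The overlap is easy to reproduce. Here is a minimal sketch using Python's built-in sqlite3 module; the id column, the three sample rows, and the stand-in values X = 1 and Y = 2 are assumptions made up for illustration, not part of the original question.

```python
import sqlite3

# Hypothetical sample data: row 2 satisfies both a = 1 and b = 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, a, b) VALUES (?, ?, ?)",
    [(1, 1, 0), (2, 1, 2), (3, 0, 2)],
)

X, Y = 1, 2  # stand-ins for the X and Y in the text

# A plain UNION ALL just concatenates the two result sets.
naive_rows = conn.execute(
    "SELECT * FROM mytable WHERE a = ? "
    "UNION ALL "
    "SELECT * FROM mytable WHERE b = ?",
    (X, Y),
).fetchall()

ids = sorted(row[0] for row in naive_rows)
print(ids)  # id 2 is listed twice because it matches both conditions
```

Without some form of duplicate elimination, the row that matches both terms is returned once per subquery.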

SELECT * FROM mytable WHERE a=X 
UNION DISTINCT
SELECT * FROM mytable WHERE b=Y

UNION DISTINCT is expensive because typical implementations sort the rows to find duplicates, just as they do for SELECT DISTINCT.

It also feels like even more "wasted" work if the two subsets of rows you are unioning have a lot of rows occurring in both subsets. It's a lot of rows to eliminate.

But there's no need to eliminate duplicates if you can guarantee that the two sets of rows are already distinct. That is, if you guarantee there is no overlap. If you can rely on that, then it would always be a no-op to eliminate duplicates, and therefore the query can skip that step, and therefore skip the costly sorting.

If you change the queries so that they are guaranteed to select non-overlapping subsets of rows, that's a win.

SELECT * FROM mytable WHERE a=X 
UNION ALL 
SELECT * FROM mytable WHERE b=Y AND a!=X

These two sets are guaranteed to have no overlap. If the first set has rows where a=X and the second set has rows where a!=X then there can be no row that is in both sets.

The second query therefore only catches some of the rows where b=Y, but any row where a=X AND b=Y is already included in the first set.
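To check that the rewritten query returns the same rows, the three forms can be compared side by side. This is a sketch under assumptions: the table layout, sample rows, and the helper fetch_ids are hypothetical, both columns are declared NOT NULL, and SQLite's plain UNION stands in for UNION DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, a INTEGER NOT NULL, b INTEGER NOT NULL)"
)
# Hypothetical rows: id 2 matches both terms, id 4 matches neither.
conn.executemany(
    "INSERT INTO mytable (id, a, b) VALUES (?, ?, ?)",
    [(1, 1, 0), (2, 1, 2), (3, 0, 2), (4, 0, 0)],
)

X, Y = 1, 2  # stand-ins for the X and Y in the text


def fetch_ids(sql, params):
    """Run a query and return the sorted id column of its result."""
    return sorted(row[0] for row in conn.execute(sql, params))


# The original OR query.
or_ids = fetch_ids("SELECT * FROM mytable WHERE a = ? OR b = ?", (X, Y))

# Two subqueries with duplicate elimination (UNION means UNION DISTINCT).
union_ids = fetch_ids(
    "SELECT * FROM mytable WHERE a = ? "
    "UNION "
    "SELECT * FROM mytable WHERE b = ?",
    (X, Y),
)

# Two non-overlapping subqueries, so no duplicate elimination is needed.
union_all_ids = fetch_ids(
    "SELECT * FROM mytable WHERE a = ? "
    "UNION ALL "
    "SELECT * FROM mytable WHERE b = ? AND a != ?",
    (X, Y, X),
)

print(or_ids, union_ids, union_all_ids)
```

On this data all three queries return the same set of ids, and the UNION ALL form contains no repeated row.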

So the query achieves an optimized search for two OR terms, without producing duplicates, and requiring no UNION DISTINCT operation.
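One caveat applies if column a can be NULL: a != X evaluates to unknown for a NULL value, so a row with b=Y and a NULL a is dropped by the second subquery even though the OR query returns it. The sketch below illustrates this with made-up data; the table layout, sample rows, and the null-safe predicate (a != X OR a IS NULL) are assumptions for illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")
# Hypothetical rows: id 2 has a NULL in column a but matches b = 2.
conn.executemany(
    "INSERT INTO mytable (id, a, b) VALUES (?, ?, ?)",
    [(1, 1, 0), (2, None, 2), (3, 0, 2)],
)

X, Y = 1, 2  # stand-ins for the X and Y in the text


def fetch_ids(sql, params):
    """Run a query and return the sorted id column of its result."""
    return sorted(row[0] for row in conn.execute(sql, params))


or_ids = fetch_ids("SELECT * FROM mytable WHERE a = ? OR b = ?", (X, Y))

# a != X is neither true nor false when a is NULL, so id 2 is filtered out.
plain_ids = fetch_ids(
    "SELECT * FROM mytable WHERE a = ? "
    "UNION ALL "
    "SELECT * FROM mytable WHERE b = ? AND a != ?",
    (X, Y, X),
)

# Explicitly letting NULLs through keeps the two subsets disjoint
# while matching the OR query again.
null_safe_ids = fetch_ids(
    "SELECT * FROM mytable WHERE a = ? "
    "UNION ALL "
    "SELECT * FROM mytable WHERE b = ? AND (a != ? OR a IS NULL)",
    (X, Y, X),
)

print(or_ids, plain_ids, null_safe_ids)
```

If a is declared NOT NULL, the simpler a != X predicate is enough.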
