
A Step-by-Step Guide to Hash Join in the Oracle Database

Source: 赛迪网 (CCID Net)  Author: 若水  2008-05-04 10:10


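During development, many people use data structures such as Hash Map or Hash Set, which are characterized by fast insertion and lookup. When an object is added to the collection, a hash algorithm is called to obtain its hash code, and the storage location is assigned according to that hash code. On access, the hash code leads directly to the storage location.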
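The idea can be sketched with a toy hash set. This is only an illustration under simple assumptions (separate chaining, a fixed number of buckets), not how Java's collections or Oracle implement hashing; the names `TinyHashSet` and `num_buckets` are hypothetical.

```python
class TinyHashSet:
    """Toy hash set with separate chaining; illustrative only."""

    def __init__(self, num_buckets=8):
        # Each bucket is a list holding the keys that hash to its index.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # The hash code decides which bucket the key is stored in.
        return hash(key) % len(self.buckets)

    def add(self, key):
        bucket = self.buckets[self._index(key)]
        if key not in bucket:
            bucket.append(key)

    def __contains__(self, key):
        # Lookup scans one bucket only, not the whole collection.
        return key in self.buckets[self._index(key)]


s = TinyHashSet()
for customer_id in (101, 202, 303):
    s.add(customer_id)
print(202 in s, 404 in s)  # True False
```

A lookup touches a single bucket, which is why access stays fast as long as each bucket holds few keys.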

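Oracle hash join is a very efficient join algorithm that spends CPU (hash computation) and memory (building the hash table) to achieve maximum efficiency. It is generally used to join a large table with a small one: the small table is built into an in-memory hash table and is called the build table, and the large table is called the probe table.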
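A minimal sketch of the build and probe steps follows. It assumes both inputs are small enough to be Python lists of dicts and uses a plain dictionary as the hash table; the function name `hash_join` and the sample rows are hypothetical, and Oracle's real implementation is considerably more involved.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Inner-join two lists of dicts on `key` using an in-memory hash table."""
    # Build phase: hash the small input on the join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)

    # Probe phase: scan the large input once and look up each key.
    joined = []
    for row in probe_rows:
        for match in table.get(row[key], ()):
            joined.append({**match, **row})
    return joined


customers = [{"customerid": 1, "name": "A"}, {"customerid": 2, "name": "B"}]
sales = [
    {"customerid": 1, "amount": 10},
    {"customerid": 2, "amount": 20},
    {"customerid": 1, "amount": 30},
    {"customerid": 9, "amount": 40},  # no matching customer, dropped by the inner join
]
result = hash_join(customers, sales, "customerid")
print(len(result))  # 3
```

The large input is read exactly once, and each of its rows costs one hash lookup rather than a scan of the small input.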

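Efficiency

Hash join is efficient for two reasons:

1. Hash lookup: a value is found through the hash mapping, without traversing the whole data structure.

2. Memory access is more than ten thousand times faster than disk access.

In the ideal case, the cost of a hash join approaches that of a single scan of the large table.

First, let us compare the efficiency of the different join methods. Left to itself, the optimizer chooses a hash join.

Note that Cost = 221.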

SQL> select * from vendition t,customer b WHERE t.customerid = b.customerid;

100000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3402771356

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |  106K |   22M |   221   (3)| 00:00:03 |
|*  1 |  HASH JOIN         |           |  106K |   22M |   221   (3)| 00:00:03 |
|   2 |   TABLE ACCESS FULL| CUSTOMER  |  5000 |  424K |     9   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| VENDITION |  106K |   14M |   210   (2)| 00:00:03 |
--------------------------------------------------------------------------------

Next, without hash join: the USE_MERGE hint makes the optimizer use a sort-merge join.

Note that the cost rises sharply, to 3507.

SQL> select /*+ USE_MERGE (t b) */* from vendition t,customer b WHERE t.customerid = b.customerid;

100000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1076153206

-----------------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |  106K |   22M |       |  3507   (1)| 00:00:43 |
|   1 |  MERGE JOIN         |           |  106K |   22M |       |  3507   (1)| 00:00:43 |
|   2 |   SORT JOIN         |           |  5000 |  424K |       |    10  (10)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| CUSTOMER  |  5000 |  424K |       |     9   (0)| 00:00:01 |
|*  4 |   SORT JOIN         |           |  106K |   14M |   31M |  3496   (1)| 00:00:42 |
|   5 |    TABLE ACCESS FULL| VENDITION |  106K |   14M |       |   210   (2)| 00:00:03 |
-----------------------------------------------------------------------------------------
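The two SORT JOIN steps in this plan are where most of the cost goes: both inputs must be sorted on the join key before they can be merged. The following is a rough sketch of a sort-merge join under the assumption that both inputs fit in memory as lists of dicts; the function name `sort_merge_join` and the sample rows are hypothetical, and it is not a description of Oracle's internal code.

```python
def sort_merge_join(left, right, key):
    """Inner-join two lists of dicts on `key` by sorting and merging."""
    # SORT JOIN steps: both inputs are sorted on the join key first.
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])

    joined, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key, then pair it
            # with every left row that has the same key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    joined.append({**left[i], **r})
                i += 1
            j = j_end
    return joined


customers = [{"customerid": 2, "name": "B"}, {"customerid": 1, "name": "A"}]
sales = [
    {"customerid": 1, "amount": 10},
    {"customerid": 2, "amount": 20},
    {"customerid": 1, "amount": 30},
    {"customerid": 9, "amount": 40},
]
merged = sort_merge_join(customers, sales, "customerid")
print(len(merged))  # 3
```

Sorting the 106K-row input is the expensive part, which is consistent with the 31M of temporary space shown against the second SORT JOIN step.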

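What about nested loops? After a long wait, the cost turned out to be a striking 828K, accompanied by 3,814,337 consistent gets (no index had been created), so nested loops is the least efficient method in this test. After a unique index was created on customerid, the cost dropped to 106K, which is still nearly 500 times the hash join cost of 221.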

SQL> select /*+ USE_NL(t b) */* from vendition t,customer b WHERE t.customerid = b.customerid;

100000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2015764663

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |  106K |   22M |   828K  (2)| 02:45:41 |
|   1 |  NESTED LOOPS      |           |  106K |   22M |   828K  (2)| 02:45:41 |
|   2 |   TABLE ACCESS FULL| VENDITION |  106K |   14M |   210   (2)| 00:00:03 |
|*  3 |   TABLE ACCESS FULL| CUSTOMER  |     1 |    87 |     8   (0)| 00:00:01 |
--------------------------------------------------------------------------------
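A back-of-the-envelope count of row visits shows why the unindexed nested loops plan is so much more expensive. This is a crude model assumed for illustration only (it counts rows touched, not blocks or optimizer cost units), and the function names are hypothetical.

```python
def nested_loop_visits(outer_rows, inner_rows):
    # Without an index, every outer row triggers a full scan of the inner table.
    return outer_rows * inner_rows


def hash_join_visits(build_rows, probe_rows):
    # One pass to build the hash table, one pass to probe it.
    return build_rows + probe_rows


vendition_rows, customer_rows = 100_000, 5_000
nl = nested_loop_visits(vendition_rows, customer_rows)
hj = hash_join_visits(customer_rows, vendition_rows)
print(nl, hj, nl // hj)  # 500000000 105000 4761
```

The ratio of a few thousand is of the same order of magnitude as the ratio between the two plan costs (828K against 221), although the optimizer's cost model accounts for far more than row counts.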

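Hash Join Internals

In Oracle 9i and earlier, HASH_AREA_SIZE was an important parameter affecting hash join performance. This changed in 10g: Oracle advises against using the parameter unless you are running in MTS (shared server) mode, and recommends automatic PGA management (setting PGA_AGGREGATE_TARGET and WORKAREA_SIZE_POLICY) instead. My test environment runs in MTS mode with automatic memory management, so I discuss only hash join under MTS here.

Under MTS, the PGA contains only stack space information, while the UGA resides in the large pool. Work areas for operations such as hash, sort, and merge are therefore allocated from the large pool, which is itself managed automatically and is governed by SGA_TARGET. Memory allocation is quite flexible in this configuration.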

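Depending on how much memory is allocated, a hash join runs in one of three modes:

1. optimal: memory is fully sufficient

2. onepass: memory cannot hold the whole small table

3. multipass workarea executions: memory is severely insufficient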
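The three modes can be illustrated with a toy model. The sketch below assumes memory is measured in rows rather than bytes, partitions the build input by hash value, and keeps partitions in memory greedily; the function `classify_hash_join`, its parameters, and the keep policy are all hypothetical simplifications, not Oracle's actual algorithm.

```python
def classify_hash_join(build_keys, memory_rows, fanout=8):
    """Classify a hash join as optimal, onepass or multipass (toy model)."""
    # Split the build input into `fanout` partitions by hash value.
    partitions = [[] for _ in range(fanout)]
    for k in build_keys:
        partitions[hash(k) % fanout].append(k)

    # Keep partitions in memory while they fit; the rest would go to disk.
    kept, used = [], 0
    for part in partitions:
        fits = used + len(part) <= memory_rows
        kept.append(1 if fits else 0)
        if fits:
            used += len(part)

    spilled = [p for p, k in zip(partitions, kept) if k == 0]
    if not spilled:
        mode = "optimal"
    elif all(len(p) <= memory_rows for p in spilled):
        # Each spilled partition can be reloaded and joined in one extra pass.
        mode = "onepass"
    else:
        # Some partition is still too big and must be processed repeatedly.
        mode = "multipass"
    return mode, kept


keys = range(5000)
print(classify_hash_join(keys, memory_rows=10_000)[0])  # optimal
print(classify_hash_join(keys, memory_rows=1_000)[0])   # onepass
print(classify_hash_join(keys, memory_rows=100)[0])     # multipass
```

The `kept` list plays the same role as the kept=1 / kept=0 flag that the 10104 trace prints for each partition.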

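Below, the small table is tested at 50, 500, and 5000 rows to see how memory is allocated (in every case the table fits entirely in memory).

Vendition table: 100,000 rows

Customer table: 5000 rows

Customer_small: 500 rows, built from the first 500 rows of Customer

Customer_pity: 50 rows, built from the first 50 rows of Customer

The table statistics are as follows: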

SQL> SELECT s.table_name,S.BLOCKS,S.AVG_SPACE,S.NUM_ROWS,S.AVG_ROW_LEN,S.EMPTY_BLOCKS FROM user_tables S WHERE table_name IN ('CUSTOMER','VENDITION','CUSTOMER_SMALL','CUSTOMER_PITY') ;

TABLE_NAME      BLOCKS AVG_SPACE NUM_ROWS AVG_ROW_LEN EMPTY_BLOCKS
CUSTOMER            35      1167     5000          38            5
CUSTOMER_PITY        4      6096       50          37            4
CUSTOMER_SMALL       6      1719      500          36            2
VENDITION          936      1021   100000          64           88

Enable event 10104 tracing (hash join trace):

ALTER SYSTEM SET EVENTS '10104 TRACE NAME CONTEXT FOREVER, LEVEL 2';

Test SQL

SELECT * FROM vendition a,customer b WHERE a.customerid = b.customerid;

SELECT * FROM vendition a,customer_small b WHERE a.customerid = b.customerid;

SELECT * FROM vendition a,customer_pity b WHERE a.customerid = b.customerid;

Trace analysis for the 50-row small table:

*** 2008-03-23 18:17:49.467

*** SESSION ID:(773.23969) 2008-03-23 18:17:49.467

kxhfInit(): enter

kxhfInit(): exit

*** RowSrcId: 1 HASH JOIN STATISTICS (INITIALIZATION) ***

Join Type: INNER join

Original hash-area size: 3883510

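PS: The hash area size, about 3.7 MB. Even the largest build table in these tests, CUSTOMER, occupies only 35 blocks (about 280 KB at an 8 KB block size), so the build input fits entirely in memory.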

Memory for slot table: 2826240

Calculated overhead for partitions and row/slot managers: 1057270

Hash-join fanout: 8

Number of partitions: 8

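PS: The hash table data does not even fill one block, yet Oracle still partitions it. This differs from the claim in some earlier documents that data is partitioned only when memory is insufficient.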

Number of slots: 23

Multiblock IO: 15

Block size(KB): 8

Cluster (slot) size(KB): 120

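PS: The size of the cluster that holds the rows of a partition. The value matches Multiblock IO x Block size: 15 x 8 KB = 120 KB.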

Minimum number of bytes per block: 8160

Bit vector memory allocation(KB): 128

Per partition bit vector length(KB): 16

Maximum possible row length: 270

Estimated build size (KB): 0

Estimated Build Row Length (includes overhead): 45

# Immutable Flags:

Not BUFFER(execution) output of the join for PQ

Evaluate Left Input Row Vector

Evaluate Right Input Row Vector

# Mutable Flags:

IO sync

kxhfSetPhase: phase=BUILD

kxhfAddChunk: add chunk 0 (sz=32) to slot table

kxhfAddChunk: chunk 0 (lbs=0x2a97825c38, slotTab=0x2a97825e00) successfuly added

kxhfSetPhase: phase=PROBE_1

qerhjFetch: max build row length (mbl=44)

*** RowSrcId: 1 END OF HASH JOIN BUILD (PHASE 1) ***

Revised row length: 45

Revised build size: 2KB

kxhfResize(enter): resize to 12 slots (numAlloc=8, max=23)

kxhfResize(exit): resized to 12 slots (numAlloc=8, max=12)

Slot table resized: old=23 wanted=12 got=12 unload=0

*** RowSrcId: 1 HASH JOIN BUILD HASH TABLE (PHASE 1) ***

Total number of partitions: 8

Number of partitions which could fit in memory: 8

Number of partitions left in memory: 8

Total number of slots in in-memory partitions: 8

Total number of rows in in-memory partitions: 50

(used as preliminary number of buckets in hash table)

Estimated max # of build rows that can fit in avail memory: 66960

### Partition Distribution ###

Partition:0 rows:5 clusters:1 slots:1 kept=1

Partition:1 rows:6 clusters:1 slots:1 kept=1

Partition:2 rows:4 clusters:1 slots:1 kept=1

Partition:3 rows:9 clusters:1 slots:1 kept=1

Partition:4 rows:5 clusters:1 slots:1 kept=1

Partition:5 rows:9 clusters:1 slots:1 kept=1

Partition:6 rows:4 clusters:1 slots:1 kept=1

Partition:7 rows:8 clusters:1 slots:1 kept=1

PS: Each partition holds fewer than 10 rows. Note the important kept flag: 1 means the partition is in memory, 0 means it is on disk.

*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***

PS: This is the first phase of the hash join. Observing later phases requires a higher trace level and is skipped here.

Revised number of hash buckets (after flushing): 50

Allocating new hash table.

*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***

Requested size of hash table: 16

Actual size of hash table: 16

Number of buckets: 128

Match bit vector allocated: FALSE

kxhfResize(enter): resize to 14 slots (numAlloc=8, max=12)

kxhfResize(exit): resized to 14 slots (numAlloc=8, max=14)

freeze work area size to: 2359K (14 slots)

*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***

Total number of rows (may have changed): 50

Number of in-memory partitions (may have changed): 8

Final number of hash buckets: 128

Size (in bytes) of hash table: 1024

kxhfIterate(end_iterate): numAlloc=8, maxSlots=14

*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***

### Hash table ###

# NOTE: The calculated number of rows in non-empty buckets may be smaller

# than the true number.

Number of buckets with 0 rows: 86

Number of buckets with 1 rows: 37

Number of buckets with 2 rows: 5

Number of buckets with 3 rows: 0

PS: Rows per bucket. The fullest bucket holds only 2 rows; in theory, the fewer rows per bucket, the better the performance.

Number of buckets with 4 rows: 0

Number of buckets with 5 rows: 0

Number of buckets with 6 rows: 0

Number of buckets with 7 rows: 0

Number of buckets with 8 rows: 0

Number of buckets with 9 rows: 0

Number of buckets with between 10 and 19 rows: 0

Number of buckets with between 20 and 29 rows: 0

Number of buckets with between 30 and 39 rows: 0

Number of buckets with between 40 and 49 rows: 0

Number of buckets with between 50 and 59 rows: 0

Number of buckets with between 60 and 69 rows: 0

Number of buckets with between 70 and 79 rows: 0

Number of buckets with between 80 and 89 rows: 0

Number of buckets with between 90 and 99 rows: 0

Number of buckets with 100 or more rows: 0

### Hash table overall statistics ###

Total buckets: 128 Empty buckets: 86 Non-empty buckets: 42

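PS: 128 buckets were created. The formula used since Oracle 7 is

Number of buckets = 0.8 * hash_area_size / (hash_multiblock_io_count * db_block_size)

but it does not give this result, so the calculation has presumably changed in 10g.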

Total number of rows: 50

Maximum number of rows in a bucket: 2

Average number of rows in non-empty buckets: 1.190476
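To see how far the Oracle 7 era formula is from the 128 buckets reported, it can be evaluated with the values printed in this trace (hash-area size 3883510 bytes, Multiblock IO 15, block size 8 KB). The function name below is hypothetical, and the mapping of trace values onto the formula's parameters is an assumption.

```python
def buckets_by_old_formula(hash_area_size, multiblock_io, block_size):
    # Bucket count = 0.8 * hash_area_size / (hash_multiblock_io_count * db_block_size)
    return 0.8 * hash_area_size / (multiblock_io * block_size)


# Values taken from the 50-row trace.
estimate = buckets_by_old_formula(3883510, 15, 8 * 1024)
print(round(estimate, 2))  # 25.28
```

The result, about 25, is nowhere near 128. It happens to be close to the 23 slots reported in the same trace, though that may be a coincidence.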

Trace analysis for the 500-row small table

Original hash-area size: 3925453

Memory for slot table: 2826240

...

Hash-join fanout: 8

Number of partitions: 8

...

### Partition Distribution ###

Partition:0 rows:52 clusters:1 slots:1 kept=1

Partition:1 rows:63 clusters:1 slots:1 kept=1

Partition:2 rows:55 clusters:1 slots:1 kept=1

Partition:3 rows:74 clusters:1 slots:1 kept=1

Partition:4 rows:66 clusters:1 slots:1 kept=1

Partition:5 rows:66 clusters:1 slots:1 kept=1

Partition:6 rows:54 clusters:1 slots:1 kept=1

Partition:7 rows:70 clusters:1 slots:1 kept=1

PS: The number of rows in each partition increases.

...

Number of buckets with 0 rows: 622

Number of buckets with 1 rows: 319

Number of buckets with 2 rows: 71

Number of buckets with 3 rows: 10

Number of buckets with 4 rows: 2

Number of buckets with 5 rows: 0

...

### Hash table overall statistics ###

Total buckets: 1024 Empty buckets: 622 Non-empty buckets: 402

Total number of rows: 500

Maximum number of rows in a bucket: 4

Average number of rows in non-empty buckets: 1.243781

Trace analysis for the 5000-row small table

Original hash-area size: 3809692

Memory for slot table: 2826240

...

Hash-join fanout: 8

Number of partitions: 8

Number of slots: 23

Multiblock IO: 15

Block size(KB): 8

Cluster (slot) size(KB): 120

Minimum number of bytes per block: 8160

Bit vector memory allocation(KB): 128

Per partition bit vector length(KB): 16

Maximum possible row length: 270

Estimated build size (KB): 0

...

### Partition Distribution ###

Partition:0 rows:588 clusters:1 slots:1 kept=1

Partition:1 rows:638 clusters:1 slots:1 kept=1

Partition:2 rows:621 clusters:1 slots:1 kept=1

Partition:3 rows:651 clusters:1 slots:1 kept=1

Partition:4 rows:645 clusters:1 slots:1 kept=1

Partition:5 rows:611 clusters:1 slots:1 kept=1

Partition:6 rows:590 clusters:1 slots:1 kept=1

Partition:7 rows:656 clusters:1 slots:1 kept=1

...

# than the true number.

Number of buckets with 0 rows: 4429

Number of buckets with 1 rows: 2762

Number of buckets with 2 rows: 794

Number of buckets with 3 rows: 182

Number of buckets with 4 rows: 23

Number of buckets with 5 rows: 2

Number of buckets with 6 rows: 0

...

### Hash table overall statistics ###

Total buckets: 8192 Empty buckets: 4429 Non-empty buckets: 3763

Total number of rows: 5000

Maximum number of rows in a bucket: 5

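PS: Even with the small table at 5000 rows, no bucket holds more than 5 rows. Note that if buckets hold too many rows, the cost of traversing them degrades performance severely.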

Average number of rows in non-empty buckets: 1.328727
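The bucket statistics in the three traces can be checked against a simple model. Assuming the hash function spreads rows uniformly and independently (an assumption, not something the trace states), the expected number of empty buckets is buckets * (1 - 1/buckets) ^ rows. The sketch below compares that with the observed figures and recomputes the average rows per non-empty bucket; the function name is hypothetical.

```python
def expected_empty(rows, buckets):
    # Under uniform hashing a given bucket stays empty with
    # probability (1 - 1/buckets) ** rows.
    return buckets * (1 - 1 / buckets) ** rows


# (rows, total buckets, empty buckets reported in the trace)
traces = [(50, 128, 86), (500, 1024, 622), (5000, 8192, 4429)]
for rows, buckets, observed_empty in traces:
    non_empty = buckets - observed_empty
    print(rows, round(expected_empty(rows, buckets)), observed_empty,
          round(rows / non_empty, 6))
```

The model predicts roughly 86, 628, and 4449 empty buckets against the observed 86, 622, and 4429, so the rows appear to be spread about as evenly as a uniform hash would spread them. The recomputed averages (rows divided by non-empty buckets) match the trace values 1.190476, 1.243781, and 1.328727.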

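Conclusion:

In Oracle Database 10g, memory is no longer the primary constraint on hash join. Hardware keeps getting cheaper, and environments with 2 GB, 8 GB, or 64 GB of memory are common. When tuning hash joins, pay more attention to diagnosing how data is distributed across partitions and buckets.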
