关于统计信息过期的性能落差(r5笔记第36天)
今天客户反馈某一个应用部署补丁的时候,执行了一个脚本一个多小时还没有执行完。
语句是下面这样的形式。
insert into em1_rater_00068_01
(select *
from em1_rater_00050_01_backup a
where a.record_id <= 65971543
and not exists (select b.record_id
from em1_rater_00068_01 b
where a.record_id = b.record_id));
查看执行计划发现语句的执行计划信息真是惊人,执行计划中竟然出现了27T的字样,但是查看预估的时间却只有35秒左右。而且这个预估是在4个并行的基础上。
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | 2879 (100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | 15M| 27T| 2879 (3)| 00:00:35 | | | Q1,03 | P->S | QC (RAND) |
如果不启用并行,执行计划的情况就更糟糕了。
Execution Plan
----------------------------------------------------------
Plan hash value: 3489211022
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25M| 45T| 25M (1)| 83:25:53 | | |
| 1 | PARTITION RANGE ALL | | 1251K| 2319G| 387 (1)| 00:00:05 | 1 | 5 |
| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| EM1_RATER_00050_01_BACKUP | 1251K| 2319G| 387 (1)| 00:00:05 | 1 | 5 |
|* 3 | INDEX RANGE SCAN | EM1_RATER_00050_01_BK_PK | 225K| | 26 (4)| 00:00:01 | 1 | 5 |
| 4 | PARTITION RANGE ALL | | 2 | 26 | 1 (0)| 00:00:01 | 1 | 5 |
|* 5 | INDEX RANGE SCAN | EM1_RATER_00068_01_PK | 2 | 26 | 1 (0)| 00:00:01 | 1 | 5 |
--------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("A"."RECORD_ID"<=65971543)
filter( NOT EXISTS (SELECT 0 FROM "EM1_RATER_00068_01" "B" WHERE "B"."RECORD_ID"=:B1))
5 - access("B"."RECORD_ID"=:B1)
实际上这个表中的数据只有几十G,根本不会出现几十T的可能。
可以看出执行计划落差很大,查看了表的统计信息,发现还是存在很大的落差,先启用并行收集统计信息。
exec dbms_stats.gather_table_stats(OWNNAME=>null,tabname=>'EM1_RATER_00050_01_BACKUP',estimate_percent =>dbms_stats.auto_sample_size,METHOD_OPT =>'FOR ALL INDEXED COLUMNS SIZE 1',granularity=>'DEFAULT',cascade=>TRUE,degree=>8,block_sample=>TRUE);
PL/SQL procedure successfully completed.
Elapsed: 00:03:14.68
可以借着这个机会看到收集统计信息的时候,后台还是做了大量的信息计算。
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8h0z7512pn17a, child number 0
-------------------------------------
/* SQL Analyze(0) */ select /*+ full(t) parallel(t,8)
parallel_index(t,8) dbms_stats cursor_sharing_exact use_weak_name_resl
dynamic_sampling(0) no_monitoring no_substrb_pad
*/to_char(count("RECORD_ID")),to_char(substrb(dump(min("RECORD_ID"),16,0
,32),1,120)),to_char(substrb(dump(max("RECORD_ID"),16,0,32),1,120)),to_c
har(count("PERIOD_KEY")),to_char(substrb(dump(min("PERIOD_KEY"),16,0,32)
,1,120)),to_char(substrb(dump(max("PERIOD_KEY"),16,0,32),1,120)),to_char
(count("RECORD_STATUS")),to_char(count("RECORD_TYPE")),to_char(count("RE
SOLUTION_STATUS")),to_char(count("FIELD_00033_O")),to_char(count("FIELD_
00033_C")),to_char(count("FIELD_00279_O")),to_char(count("FIELD_00279_C"
)),to_char(count("FIELD_00436_O")),to_char(count("FIELD_00436_C")),to_ch
ar(count("FIELD_00361_O")),to_char(count("FIELD_00361_C")),to_char(count
("FIELD_00148_O")),to_char(count("FIELD_00148_C")),to_char(count("FIELD_
00341_O")),to_char(count("FIELD_00341_C")),to_char(count("FIELD_00116_O"
)),to_char(count("FIELD_00116。。。。。。
如果这个时候好奇想查看收集统计信息的语句的执行计划,发现更是惊人,里面有901T的字样,绝对是海量数据。
Plan hash value: 2890548601
--------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 251K(100)| | | | | | |
| 1 | SORT AGGREGATE | | 1 | 1933K| | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 1933K| | | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 1933K| | | | | Q1,00 | PCWP | |
| 5 | APPROXIMATE NDV AGGREGATE| | 500M| 901T| 251K (42)| 00:50:14 | | | Q1,00 | PCWP | |
| 6 | PX BLOCK ITERATOR | | 500M| 901T| 251K (42)| 00:50:14 | 1 | 5 | Q1,00 | PCWC | |
|* 7 | TABLE ACCESS FULL | EM1_RATER_00050_01_BACKUP | 500M| 901T| 251K (42)| 00:50:14 | 1 | 5 | Q1,00 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - access(:Z>=:Z AND :Z<=:Z)
经过了短暂等待的3分钟,一切就绪,再次查看语句的执行计划,指标一下子就降了下来。
然后对这个语句进行了初步分析,发现其实还是可以尝试使用minus操作来做数据过滤。
insert into em1_rater_00068_01
(select b.*
from em1_rater_00050_01_backup b,(select record_id
from em1_rater_00050_01_backup
where record_id <= 65971543
minus
select record_id
from em1_rater_00068_01
where record_id <= 65971543)temp
where b.record_id=temp.record_id
)
对过滤后的数据再次关联就会轻松很多。在不启用并行的情况下执行计划如下:
Plan hash value: 3652964767
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 9709K| 24G| | 2412K (1)| 08:02:31 | | |
| 1 | LOAD TABLE CONVENTIONAL | EM1_RATER_00068_01 | | | | | | | |
|* 2 | HASH JOIN | | 9709K| 24G| 231M| 2412K (1)| 08:02:31 | | |
| 3 | VIEW | | 9709K| 120M| | 70833 (3)| 00:14:10 | | |
| 4 | MINUS | | | | | | | | |
| 5 | SORT UNIQUE | | 9709K| 55M| 111M| | | | |
| 6 | PARTITION RANGE ALL| | 9709K| 55M| | 2560 (1)| 00:00:31 | 1 | 5 |
|* 7 | INDEX RANGE SCAN | EM1_RATER_00050_01_BK_PK | 9709K| 55M| | 2560 (1)| 00:00:31 | 1 | 5 |
| 8 | SORT UNIQUE | | 10M| 57M| 115M| | | | |
| 9 | PARTITION RANGE ALL| | 10M| 57M| | 3195 (1)| 00:00:39 | 1 | 5 |
|* 10 | INDEX RANGE SCAN | EM1_RATER_00068_01_PK | 10M| 57M| | 3195 (1)| 00:00:39 | 1 | 5 |
| 11 | PARTITION RANGE ALL | | 9709K| 24G| | 1079K (2)| 03:35:51 | 1 | 5 |
| 12 | TABLE ACCESS FULL | EM1_RATER_00050_01_BACKUP | 9709K| 24G| | 1079K (2)| 03:35:51 | 1 | 5 |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("B"."RECORD_ID"="TEMP"."RECORD_ID")
7 - access("RECORD_ID"<=65971543)
10 - access("RECORD_ID"<=65971543)
26 rows selected.
使用并行后,执行计划就好多了,根据初步的测试大概在10分钟左右。
- 对windows密码抓取神器mimikatz的逆向分析
- Keras中神经网络模型的5阶段生命周期
- java的断言(assert)
- Android studio中Rendering Problems不能可视化操作的解决办法
- 使用 Referer Meta 标签控制 referer—详解 referrer-policy
- 网站抓取引子 - 获得网页中的表格
- Android Firebase 服务简介
- CVE-2015-0393:Oracle发布严重安全漏洞预警
- 研究人员发现一种利用Siri窃取苹果iPhone/iPad数据的方法
- 关于Python中的__main__和编程模板
- 世界大战尽在掌控:盘点全球网络攻击实时追踪系统
- Activity数据传递
- apache反向代理一、泛解析域名二、APACHE配置
- Python学习 - 可视化变量赋值、循环、程序运行过程
- JavaScript 教程
- JavaScript 编辑工具
- JavaScript 与HTML
- JavaScript 与Java
- JavaScript 数据结构
- JavaScript 基本数据类型
- JavaScript 特殊数据类型
- JavaScript 运算符
- JavaScript typeof 运算符
- JavaScript 表达式
- JavaScript 类型转换
- JavaScript 基本语法
- JavaScript 注释
- Javascript 基本处理流程
- Javascript 选择结构
- Javascript if 语句
- Javascript if 语句的嵌套
- Javascript switch 语句
- Javascript 循环结构
- Javascript 循环结构实例
- Javascript 跳转语句
- Javascript 控制语句总结
- Javascript 函数介绍
- Javascript 函数的定义
- Javascript 函数调用
- Javascript 几种特殊的函数
- JavaScript 内置函数简介
- Javascript eval() 函数
- Javascript isFinite() 函数
- Javascript isNaN() 函数
- parseInt() 与 parseFloat()
- escape() 与 unescape()
- Javascript 字符串介绍
- Javascript length属性
- javascript 字符串函数
- Javascript 日期对象简介
- Javascript 日期对象用途
- Date 对象属性和方法
- Javascript 数组是什么
- Javascript 创建数组
- Javascript 数组赋值与取值
- Javascript 数组属性和方法