An Attempt at Lowering the High Water Mark (r3 notes, day 47)

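A while ago, several large partitioned tables in production had become so fragmented that queries against them were very slow, even though each table held only a few dozen rows. This is clearly a high water mark problem. The usual fix for this kind of issue, backup -> truncate -> insert, is conservative but not suited to online operation. The shrink feature introduced in 10g looks like a better option: it is expected to be fast, it runs online, and it can release the wasted space in a table in batches. So I tried using this feature in production to lower the high water mark.

The production tables pub_log and sub_log are both partitioned, with a modest number of partitions (a few dozen). Before shrinking, the table has to be set to enable row movement; note that this invalidates the package bodies that depend on the table. You can run shrink space compact first to compact the data, and then run shrink space during an idle window to lower the high water mark. Shrink is restricted for function-based indexes, however, so that needs to be weighed before using it.

The tables that need their high water mark lowered are PUB_LOG and SUB_LOG, so after some brief preparation I wrote the following script.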


alter session force parallel ddl parallel 8;     -- enable parallel DDL for this session
alter table PUB_LOG enable row movement;         -- enable row movement
alter table PUB_LOG shrink space compact;        -- compact the table's space first
alter table PUB_LOG shrink space;                -- lower the table's high water mark
alter index PUB_LOG_PK shrink space compact;     -- the same steps can be applied to indexes
alter index PUB_LOG_PK shrink space;
alter table PUB_LOG disable row movement;

alter table SUB_LOG enable row movement;
alter table SUB_LOG shrink space compact;
alter table SUB_LOG shrink space;
alter index SUB_LOG_PK shrink space compact;
alter index SUB_LOG_PK shrink space;
alter index SUB_LOG_1IX shrink space compact;
alter index SUB_LOG_1IX shrink space;
alter table SUB_LOG disable row movement;
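
The script above shrinks each table as a whole. Since shrink is restricted by function-based indexes and is meant to release space in batches, a pre-check and a per-partition variant might look like the sketch below. This is only an illustration under assumptions: the partition name P_2014_10 is made up, and the behavior should be confirmed on the target Oracle version before relying on it.

```sql
-- Pre-check: list any function-based indexes on the target tables,
-- because segment shrink is restricted when they exist.
select table_name, index_name, index_type
  from user_indexes
 where table_name in ('PUB_LOG', 'SUB_LOG')
   and index_type like 'FUNCTION-BASED%';

-- List the partitions so they can be processed one at a time.
select table_name, partition_name
  from user_tab_partitions
 where table_name in ('PUB_LOG', 'SUB_LOG')
 order by table_name, partition_position;

-- Shrink a single partition (P_2014_10 is a hypothetical name;
-- substitute a real one from the query above).
alter table PUB_LOG enable row movement;
alter table PUB_LOG modify partition P_2014_10 shrink space compact;
alter table PUB_LOG modify partition P_2014_10 shrink space;
alter table PUB_LOG disable row movement;
```

Working partition by partition keeps each statement short, which makes it easier to stop between steps if the system gets busy.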

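In the test environment this ran quickly; all the operations finished within 5 minutes. I then handed the script to the customer to run, and that night I got a call: the first step, alter table PUB_LOG shrink space compact, had been running for almost 3 hours and still had not finished. The customer eventually killed the session.

Looking into it the next day, I found that while shrink space compact was running, several sessions were executing update and delete operations, and fairly frequently. So shrink needs to be used with care, because the workload in production can be far more complex than in a test run. After evaluating the options, we switched back to the truncate approach.

The truncate procedure is old-fashioned, but there are still plenty of details to get right. First, the backup can be taken with exp/expdp or, if the data volume is small, with a table-level backup. I tried exp first and ran into a problem: the table has only 68 rows, yet the export took about 1 minute.

Export terminated successfully without warnings.

real 1m7.111s
user 0m0.104s
sys 0m0.065s

Checking the corresponding index, it looks like the export is still affected by the high water mark.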


INDEX_NAME  TABLESPACE  INDEX_TYPE  UNIQUENES  PAR  COLUMN_LIST                          TABLE_TYPE  STATUS  NUM_ROWS  LAST_ANAL  G
----------  ----------  ----------  ---------  ---  -----------------------------------  ----------  ------  --------  ---------  -
PUB_LOG_PK              NORMAL      UNIQUE     YES  BUFFER_ID,PUB_TRX_ID,SOURCE_COMP_ID  TABLE       N/A           15  23-OCT-14  N
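
To see how far the allocated space has drifted from the actual row counts, the data dictionary can be queried directly. The following is a minimal sketch, assuming the objects live in the current schema and that optimizer statistics are reasonably fresh (num_rows and blocks in user_tab_partitions only reflect the last statistics gathering).

```sql
-- Allocated size of each partition segment for the tables and the PK index.
select segment_name, partition_name, segment_type, blocks,
       round(bytes / 1024 / 1024) as size_mb
  from user_segments
 where segment_name in ('PUB_LOG', 'SUB_LOG', 'PUB_LOG_PK')
 order by blocks desc;

-- Row counts and used blocks per partition as of the last statistics run.
-- Many blocks with very few rows suggests a high water mark problem.
select table_name, partition_name, num_rows, blocks, last_analyzed
  from user_tab_partitions
 where table_name in ('PUB_LOG', 'SUB_LOG')
 order by table_name, partition_position;
```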

select count(*) from pub_log is also slow. Because the index leads with buffer_id, rewriting the query so that it goes through the index indirectly made it much faster.

select count(*) from trb1_pub_log where buffer_id in (select buffer_id from trb1_pub_log group by buffer_id);

  COUNT(*)
----------
        68

Elapsed: 00:00:00.17

Applying the same predicate to exp through the query parameter brought the export time down to 5 seconds.

time exp xxx/xxx tables=pub_log file=pub_log_bak.dmp query=' where buffer_id in (select buffer_id from pub_log group by buffer_id)' buffer=9102000 statistics=none grants=n indexes=n

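real 0m5.064s
user 0m0.039s
sys 0m0.037s

That could have been the end of it, but for partitioned tables that contain LOB columns the export is still considerably slower. After another adjustment, I found that a table-level backup (create table ... as select) works well.

create table tmp_bak_pub_log nologging as select * from pub_log where buffer_id in (select buffer_id from pub_log group by buffer_id);

Elapsed: 00:00:01.69

-- sub_log contains a LOB column, so exp is still much slower; the table-level backup is far quicker.
create table tmp_bak_sub_log nologging as select * from sub_log where queue_id in (select queue_id from sub_log group by queue_id);

Elapsed: 00:00:00.58

Once the backup is done, truncate the tables.

truncate table pub_log reuse storage;
truncate table sub_log reuse storage;

Finally, insert the data back.

insert into pub_log select * from tmp_bak_pub_log;
commit;
insert into sub_log select * from tmp_bak_sub_log;
commit;

Overall, a new feature needs a lot of testing before it is used, and it pays to be careful and conservative. Even seemingly simple operations reward careful, detailed work.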
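
After the reload it is worth confirming that nothing was lost before the backup tables are removed. The statements below are one possible way to do that, not part of the original procedure; the statistics refresh and the commented-out drop statements are additions made here for illustration.

```sql
-- Compare row counts between each table and its backup copy.
select (select count(*) from pub_log)         as pub_log_rows,
       (select count(*) from tmp_bak_pub_log) as pub_bak_rows,
       (select count(*) from sub_log)         as sub_log_rows,
       (select count(*) from tmp_bak_sub_log) as sub_bak_rows
  from dual;

-- Refresh optimizer statistics so they reflect the reloaded tables and indexes.
begin
  dbms_stats.gather_table_stats(ownname => user, tabname => 'PUB_LOG', cascade => true);
  dbms_stats.gather_table_stats(ownname => user, tabname => 'SUB_LOG', cascade => true);
end;
/

-- Drop the backup tables only after the counts above match.
-- drop table tmp_bak_pub_log purge;
-- drop table tmp_bak_sub_log purge;
```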