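Why an 11g standby database could not enable ADG (r7 notes, day 62)
Today I ran into a somewhat strange problem, though behind every strange symptom there is an underlying cause.
This afternoon, while checking an environment, I found the standby database sitting in the mount state. This is an 11gR2 database, and leaving ADG unused on it is a real waste.
So I tried to open it, but the open mode stayed at READ ONLY; with ADG it should be READ ONLY WITH APPLY.
I used dg broker to set the standby to READ-ONLY. The dg broker log on the standby shows: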
Standby Database: stestdb3, Enabled Physical Standby (0x02010000)
08/14/2014 16:03:28
version check on database stestdb3 detected stale metadata,
requesting update from primary database
Creating process RSM0
12/29/2015 16:28:11
Command EDIT DATABASE stestdb3 SET STATE = READ-ONLY completed
Read-Only state no longer supported
12/29/2015 16:29:10
Nothing obvious there. Checking the configuration with dg broker reported the following error.
DGMGRL> show configuration;
Configuration - testdb
Protection Mode: MaxPerformance
Databases:
testdbbak93 - Primary database
stestdb3 - Physical standby database
Error: ORA-16766: Redo Apply is stopped
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
The standby's alert log shows:
Data Guard Broker initializing...
Data Guard Broker initialization complete
Tue Dec 29 16:47:15 2015
SMON: enabling cache recovery
No Resource Manager plan active
Physical standby database opened for read only access.
Completed: alter database open
Tue Dec 29 16:47:16 2015
idle dispatcher 'D000' terminated, pid = (18, 1)
Tue Dec 29 16:51:40 2015
Primary database is in MAXIMUM PERFORMANCE mode
RFS[3]: Assigned to RFS process 3596
RFS[3]: Selected log 7 for thread 1 sequence 72606 dbid -1549369665 branch 746558785
Tue Dec 29 16:51:41 2015
RFS[4]: Assigned to RFS process 3590
RFS[4]: Selected log 8 for thread 1 sequence 72605 dbid -1549369665 branch 746558785
Tue Dec 29 16:51:42 2015
Archived Log entry 69432 added for thread 1 sequence 72605 ID 0xa829ec3b dest 2:
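From the above it is clear that MRP had not started working; only RFS was receiving redo and archiving it.
I then used dg broker to set the standby to the ONLINE state and ran the dg broker check again. This time the check reported no problem.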
DGMGRL> show configuration;
Configuration - testdb
Protection Mode: MaxPerformance
Databases:
testdbbak93 - Primary database
stestdb3 - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
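The overall impression was that this did not behave like an 11g database.
I tried again, opening the standby manually, and it still showed READ ONLY. The problem persisted after a restart.
For a problem like this the best approach is still to read the logs. The standby was last restarted a year ago, and fortunately the alert log from that time was still there. Nothing else in that startup looked wrong.
But I did notice the compatible parameter, because its value stands out in an 11g database, and that made me curious.
Searching MOS with that suspicion turned up several related articles. Apparently this was another legacy issue, and there is a related bug description.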
ACTIVE DATAGUARD (ADG) NOT POSSIBLE WITH COMPATIBLE < 11.1.0.0.0 (Doc ID 1363396.1)
BUG:13032521 - ADG PHYSICAL STANDBY GOES TO MOUNT STATE INSTEAD OF READ ONLY WITH APPLY
With the problem roughly located, I checked the parameter on both the primary and the standby: it was 10.2.0.5.0 on both.
SQL> show parameter compa
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
compatible string 10.2.0.5.0
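The comparison behind this finding can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the threshold is taken from the title of the MOS note quoted above (COMPATIBLE < 11.1.0.0.0), not from any Oracle API.

```python
# Hypothetical helper, not part of any Oracle tool.
# Threshold assumed from the MOS note title: ADG is not possible
# when COMPATIBLE is below 11.1.0.0.0.
ADG_MIN_COMPATIBLE = (11, 1, 0, 0, 0)


def parse_version(text):
    """Turn '10.2.0.5.0' into (10, 2, 0, 5, 0), padding short values with zeros."""
    parts = [int(p) for p in text.strip().split(".")]
    return tuple(parts + [0] * (5 - len(parts)))


def compatible_allows_adg(compatible):
    """True if the given compatible value is at or above the assumed ADG minimum."""
    return parse_version(compatible) >= ADG_MIN_COMPATIBLE


if __name__ == "__main__":
    for value in ("10.2.0.5.0", "11.2.0.3.0"):
        print(value, compatible_allows_adg(value))
```

Comparing tuples of integers rather than raw strings matters here, because a plain string comparison would order "9.2" after "10.2".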
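According to the workaround in the bug description, compatible on the standby should be set to 11.1.0.7 or higher. Changing this parameter requires an instance restart, so the impact is significant, and the primary could not be restarted at that point. Because compatible is a static parameter, trying to change it without scope=spfile fails: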
SQL> alter system set compatible='11.2.0.3.0';
alter system set compatible='11.2.0.3.0'
*
ERROR at line 1:
ORA-02095: specified initialization parameter cannot be modified
So I changed it on the standby first, to see whether that alone would work.
SQL> alter system set compatible='11.2.0.3.0' scope=spfile;
System altered.
On restart, the standby's alert log contained the following output.
Tue Dec 29 17:25:26 2015
Spfile /U01/app/oracle/product/11.2.3/db_1/dbs/spfiletestdb.ora is in old pre-11 format and compatible >= 11.0.0; converting to new H.A.R.D. compliant format.
Completed: alter database mount
But after setting the standby to ONLINE again, the open mode was still READ ONLY rather than READ ONLY WITH APPLY.
SQL> select open_mode from v$database;
OPEN_MODE
--------------------
READ ONLY
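So changing only the standby is not enough; the primary has to be changed to match.
The alert log also contains the following section, which shows that MRP started and then failed.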
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE
Attempt to start background Managed Standby Recovery process (testdb)
Tue Dec 29 17:57:03 2015
MRP0 started with pid=29, OS id=17740
MRP0: Background Managed Standby Recovery process started (testdb)
started logmerger process
Tue Dec 29 17:57:08 2015
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 16 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /U01/app/oracle/fra/StestDB3/archivelog/2015_12_29/o1_mf_1_72606_c84n0xml_.arc
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE
Errors with log /U01/app/oracle/fra/StestDB3/archivelog/2015_12_29/o1_mf_1_72606_c84n0xml_.arc
MRP0: Background Media Recovery terminated with error 38800
Errors in file /U01/app/oracle/diag/rdbms/stestdb3/testdb/trace/testdb_pr00_17745.trc:
ORA-38800: Cannot start Redo Apply on the open physical standby database
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
MRP0: Background Media Recovery process shutdown (testdb)
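When an alert log is long, the MRP failures can be pulled out mechanically. The following Python sketch is only an illustration: the function name is hypothetical, and the pattern is based solely on the "terminated with error" line shown in this log, so other Oracle versions may word it differently.

```python
import re

# Pattern assumed from the alert log excerpt above.
MRP_TERMINATED = re.compile(
    r"MRP0: Background Media Recovery terminated with error (\d+)"
)


def mrp_errors(lines):
    """Return the error numbers with which MRP0 terminated, in log order."""
    errors = []
    for line in lines:
        match = MRP_TERMINATED.search(line)
        if match:
            errors.append(int(match.group(1)))
    return errors


if __name__ == "__main__":
    sample = [
        "MRP0: Background Managed Standby Recovery process started (testdb)",
        "MRP0: Background Media Recovery terminated with error 38800",
        "ORA-38800: Cannot start Redo Apply on the open physical standby database",
        "MRP0: Background Media Recovery process shutdown (testdb)",
    ]
    print(mrp_errors(sample))
```

The same function can be fed an open file object, since it only iterates over lines.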
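This parameter change clearly has a big impact. I decided to get the standby back to its normal state first and handle the rest once a primary restart could be coordinated, so I began reverting the parameter, setting compatible back to 10.2.0.5.0.
But the restart then started to fail.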
SQL> alter database mount;
alter database mount
*
ERROR at line 1:
ORA-00201: control file version 11.2.0.3.0 incompatible with ORACLE version
10.2.0.5.0
ORA-00202: control file: '/U01/app/oracle/oradata/testdb/control01.ctl'
This still looked recoverable: create a standby control file on the primary, copy it over to the standby, and the mount succeeds.
Primary:
SQL> alter database create standby controlfile as '/tmp/std1.ctl';
Database altered.
Standby:
SQL> alter database mount standby database;
Database altered.
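But the standby's alert log at this point showed that things had gotten worse. The datafile headers had already been modified and were no longer in sync.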
ALTER DATABASE RECOVER managed standby database disconnect from session
Attempt to start background Managed Standby Recovery process (testdb)
Tue Dec 29 18:28:13 2015
MRP0 started with pid=30, OS id=24283
MRP0: Background Managed Standby Recovery process started (testdb)
started logmerger process
Tue Dec 29 18:28:18 2015
Managed Standby Recovery not using Real Time Apply
Read of datafile '/U01/app/oracle/oradata/testdb/system01.dbf' (fno 1) header failed with ORA-01130
Rereading datafile 1 header failed with ORA-01130
MRP0: Background Media Recovery terminated with error 1110
Errors in file /U01/app/oracle/diag/rdbms/stestdb3/testdb/trace/testdb_pr00_24288.trc:
ORA-01110: data file 1: '/U01/app/oracle/oradata/testdb/system01.dbf'
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/U01/app/oracle/oradata/testdb/system01.dbf'
ORA-01130: database file version 11.2.0.3.0 incompatible with ORACLE version 10.2.0.5.0
Slave exiting with ORA-1110 exception
Errors in file /U01/app/oracle/diag/rdbms/stestdb3/testdb/trace/testdb_pr00_24288.trc:
ORA-01110: data file 1: '/U01/app/oracle/oradata/testdb/system01.dbf'
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/U01/app/oracle/oradata/testdb/system01.dbf'
ORA-01130: database file version 11.2.0.3.0 incompatible with ORACLE version 10.2.0.5.0
Recovery Slave PR00 previously exited with exception 1110
MRP0: Background Media Recovery process shutdown (testdb)
Completed: ALTER DATABASE RECOVER managed standby database disconnect from session
The corresponding trace file reads:
*** 2015-12-29 18:28:18.495 4320 krsh.c
Managed Standby Recovery not using Real Time Apply
Read of datafile '/U01/app/oracle/oradata/testdb/system01.dbf' (fno 1) header failed with ORA-01130
Rereading datafile 1 header failed with ORA-01130
V10 STYLE FILE HEADER:
Compatibility Vsn = 186647296=0xb200300
Db ID=2745597631=0xa3a67ebf, Db Name='testDB'
Activation ID=0=0x0
Control Seq=1=0x1, File size=147200=0x23f00
File Number=1, Blksiz=8192, File Type=3 DATA
Tablespace #0 - SYSTEM rel_fn:1
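The Compatibility Vsn value in the trace can be decoded by hand, and doing so confirms what the ORA-01130 message says. The sketch below assumes a bit layout of 8, 4, 4, 8 and 8 bits for the five version components; that layout is an assumption inferred from this trace (it yields the same 11.2.0.3.0 that ORA-01130 reports), and the function name is hypothetical.

```python
def decode_compat_vsn(value):
    """Decode a packed version number into dotted form.

    Assumed layout (inferred from the trace above, not from Oracle docs):
    bits 31-24 major, 23-20 and 19-16 the next two components,
    15-8 and 7-0 the last two.
    """
    fields = (
        (value >> 24) & 0xFF,
        (value >> 20) & 0xF,
        (value >> 16) & 0xF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )
    return ".".join(str(f) for f in fields)


if __name__ == "__main__":
    # Value from the trace: Compatibility Vsn = 186647296=0xb200300
    print(decode_compat_vsn(186647296))
```

For the value in the trace this prints 11.2.0.3.0: the datafile header was already stamped with the higher version, which is why an instance running with compatible 10.2.0.5.0 refuses to read it.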
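In this situation, restoring the standby's 11g control file and restarting the primary should in fact resolve the problem. But restarting the primary means coordinating a time and finding a maintenance window, so it cannot be done right away. Disaster recovery is the top priority in the meantime: if the primary failed, the impact would be serious. So the last resort was to rebuild the standby.
The standby can of course still be built with the 11g active duplication method.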
rman target sys@xxxxx auxiliary sys@xxxx nocatalog
RMAN> duplicate target database for standby from active database nofilenamecheck;
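And that was the end of it: the standby was rebuilt successfully. It felt like a lot of work for nothing, and I was left with mixed feelings.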