A Summary of String Matching and Searching (Day 43)

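Suppose you need to find the rows in which a character column contains a given string at least 3 times. If you just want the simplest thing that works, LIKE will do (the test table clob_test is created a little further down):

    SQL> select content from clob_test where content like '%is%is%is%';

    CONTENT
    --------------------------------------------------------------------------------
    this is a test,and it is very useful

In real applications the requirement is often less convenient, for example finding strings that occur 20 times or more. Writing that with LIKE turns into manual labor. If the column is a CLOB, dbms_lob.instr can do the job. DESCRIBE shows several overloads; this is one of them:

    FUNCTION INSTR RETURNS NUMBER(38)
     Argument Name                  Type                    In/Out Default?
     ------------------------------ ----------------------- ------ --------
     FILE_LOC                       BINARY FILE LOB         IN
     PATTERN                        RAW                     IN
     OFFSET                         NUMBER(38)              IN     DEFAULT
     NTH                            NUMBER(38)              IN     DEFAULT

The overload shown above is the BFILE one. The overload used below for a CLOB column takes a CLOB locator and a VARCHAR2 pattern, with the same OFFSET and NTH arguments.

Here is a simple test.

    SQL> create table clob_test(content clob);

    Table created.

    SQL> insert into clob_test values('this is a test,and it is very useful');

    1 row created.

    SQL> insert into clob_test values('here it is');

    1 row created.

    SQL> commit;

    Commit complete.

Find the rows in which 'is' occurs at least 3 times, that is, rows where a third occurrence exists:

    SQL> select content from clob_test where dbms_lob.instr(content,'is',1,3)>0;

    CONTENT
    --------------------------------------------------------------------------------
    this is a test,and it is very useful

For a VARCHAR2 column it might seem that LIKE is the only option, but it is not. The built-in INSTR accepts the same occurrence argument, so instr(col,'is',1,3)>0 works as well. And if you are on 10g and want to compute the count yourself, you can compare the string length before and after removing the pattern:

    SQL> select content from clob_test where (length(content)-length(replace(content,'is',null)))/(length('is'))>=3;

    CONTENT
    --------------------------------------------------------------------------------
    this is a test,and it is very useful

To take this one step further, 11g provides regexp_count:

    SQL> select content from clob_test where regexp_count(content,'is')>=3;

    CONTENT
    --------------------------------------------------------------------------------
    this is a test,and it is very useful

As the example shows, this function is very practical and saves a lot of extra processing. The related functions regexp_substr, regexp_instr and regexp_like (available since 10g) are just as handy. To try them on a VARCHAR2 column, add one to the test table:

    SQL> alter table clob_test add(content2 varchar2(1000));

    Table altered.

    SQL> insert into clob_test(content2) values('stringtest=100#stringtest=50');

    1 row created.

    SQL> insert into clob_test(content2) values('stringtest=200#stringtest=60');

    1 row created.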
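Before moving on to the new column, the two counting techniques above can be checked side by side. The following is only a minimal sketch: it runs against literals selected from DUAL instead of the test table, and the subquery name samples, the column s and the extra 'isis' row are made up for illustration. The NVL wrapper is an addition of this sketch: REPLACE returns NULL when it removes every character, and LENGTH(NULL) is NULL, so without the guard the computed count for such a row would be NULL. REGEXP_COUNT needs 11g or later.

```sql
-- Illustrative only: "samples" and "s" are hypothetical names, not part of clob_test.
with samples as (
  select 'this is a test,and it is very useful' as s from dual
  union all
  select 'here it is' from dual
  union all
  select 'isis' from dual  -- extra row: REPLACE removes every character here
)
select s,
       -- number of characters removed by REPLACE, divided by the pattern length;
       -- NVL guards against LENGTH(NULL) when nothing is left after REPLACE
       (length(s) - nvl(length(replace(s, 'is')), 0)) / length('is') as cnt_replace,
       -- available in 11g and later
       regexp_count(s, 'is') as cnt_regexp
from samples;
```

For these three sample rows both columns should come out as 3, 1 and 2. Both techniques count non-overlapping occurrences, so they agree with each other; the REPLACE variant is the one to reach for when regexp_count is not available.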
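The goal now is to extract the 100 from the middle of the string "stringtest=100#stringtest=50". The conventional approach combines SUBSTR and INSTR (11 is the length of 'stringtest='):

    select TO_NUMBER (
             SUBSTR (
               content2,
               INSTR (content2, 'stringtest=') + 11,
               INSTR (
                 SUBSTR (content2, INSTR (content2, 'stringtest=') + 11),
                 '#') - 1)) content3
    from clob_test where content2 is not null;

      CONTENT3
    ----------
           100
           200

With regexp_substr, a single line may be enough:

    SQL> select
      2  to_number(replace(regexp_substr(content2,'[^stringtest=]+',1,1) ,'#','')) context3 from clob_test where content2 is not null;

      CONTEXT3
    ----------
           100
           200

Note that '[^stringtest=]+' is a bracket expression, not a negated string: it matches the first run of characters that are none of s, t, r, i, n, g, e or =. For these rows that run is '100#' (and '200#'), and the REPLACE then strips the '#'. It works here only because the value and the delimiter contain none of those characters.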
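If you would rather not rely on which characters happen to appear in the value, one alternative is to match the key literally and capture only the digits. The sketch below assumes 11g or later, where regexp_substr accepts a sixth argument selecting a subexpression; the samples subquery and the aliases first_value and second_value are made up for illustration and simply mirror the two test rows.

```sql
-- Illustrative only: "samples" is a hypothetical inline data set mirroring the two test rows.
with samples as (
  select 'stringtest=100#stringtest=50' as content2 from dual
  union all
  select 'stringtest=200#stringtest=60' from dual
)
select
       -- arguments: source, pattern, start position, occurrence, match parameter, subexpression
       -- the final 1 returns only the text captured by the first pair of parentheses
       to_number(regexp_substr(content2, 'stringtest=([0-9]+)', 1, 1, null, 1)) as first_value,
       -- same pattern, second occurrence: the value after the '#'
       to_number(regexp_substr(content2, 'stringtest=([0-9]+)', 1, 2, null, 1)) as second_value
from samples;
```

This should return 100 and 50 for the first row and 200 and 60 for the second. Because only the captured digits are returned, no REPLACE is needed to strip the '#', and changing the occurrence argument is enough to pick a different key=value pair.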