
I need to get the extensions of filenames. Extensions can be any length (not just 3 characters), and they may be missing entirely, in which case I need null returned. I know I could easily write a PL/SQL function that does this and call it from the query, but I was hoping to do it all inline. I don't care how long the solution is; what I need is the fastest one, because this will be run against a very large table. This is what I have so far...

/*
The same method is being used in all 5 examples.
It works for all of them except the first one.
The first one I need to return null
*/

SELECT substr(filename,instr(filename,'.',-1)+1,length(filename)-instr(filename,'.',-1))
  FROM (select 'no_extension_should_return_null' filename from dual);
--returns: no_extension_should_return_null

SELECT substr(filename,instr(filename,'.',-1)+1,length(filename)-instr(filename,'.',-1))
  FROM (select 'another.test.1' filename from dual);
--returns: 1

SELECT substr(filename,instr(filename,'.',-1)+1,length(filename)-instr(filename,'.',-1))
  FROM (select 'another.test.doc' filename from dual);
--returns: doc

SELECT substr(filename,instr(filename,'.',-1)+1,length(filename)-instr(filename,'.',-1))
  FROM (select 'another.test.docx' filename from dual);
--returns: docx

SELECT substr(filename,instr(filename,'.',-1)+1,length(filename)-instr(filename,'.',-1))
  FROM (select 'another.test.stupidlong' filename from dual);
--returns: stupidlong

So is there a fast way to accomplish this inline or should I just write this in a PL/SQL function?

This is what I'm working with...

select * from v$version;
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0  Production
TNS for 64-bit Windows: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

UPDATE I'm moving this code into a function and will set up a test that calls it a million times to see whether the function slows it down. I suspect it won't make an impact, since it's just string manipulation.

UPDATE Thanks for the answers so far. I ended up making a PL/SQL function that does what I need...

create or replace function extrip(filename varchar2) return varchar2 as
begin
    if ( instr(filename,'.',-1) = 0 ) then
        return null;
    end if;

    return substr(filename,instr(filename,'.',-1)+1,length(filename)-instr(filename,'.',-1));
end;

I then ran two tests against a table with 2 million rows. When I viewed the explain plan for both they were 100% IDENTICAL. How could that be?

select regexp_substr(filename, '\.[^\.]*$') ext from testTable;

select extrip(filename) ext from testTable;

UPDATE I added an order by ext to both of those, then reran the tests, and there was a difference. The regexp took 9sec and the function took 17sec. I guess that without the order by, TOAD was just returning the first X number of recs. So @Brian McGinity was right. I still need the regexp method to NOT return the dot "." though.

5 Answers

15

It will run fastest when done in 100% SQL, as you have.

substr and instr are natively compiled functions in Oracle.

If you put this in a PL/SQL function, it will run slower due to context switching between SQL and PL/SQL:

select extrip( filename ) from million_row_table 

What you have is faster.
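
One way to check the context-switching cost yourself is to time both forms over generated rows. The sketch below assumes the extrip function from the question already exists. The CONNECT BY LEVEL row generator and the made-up 'file' || level || '.txt' names are stand-ins for the real table, and max() is only there so that every row is evaluated instead of the client fetching just the first page of rows.

```sql
-- Hypothetical timing harness; in SQL*Plus, run SET TIMING ON first.
-- Each inline view generates one million made-up filenames.

-- Inline SQL version (the expression from the question)
select max(substr(filename, instr(filename, '.', -1) + 1,
                  length(filename) - instr(filename, '.', -1))) ext
  from (select 'file' || level || '.txt' filename
          from dual
       connect by level <= 1000000);

-- PL/SQL function version (extrip is the function from the question)
select max(extrip(filename)) ext
  from (select 'file' || level || '.txt' filename
          from dual
       connect by level <= 1000000);
```

Timings will depend on the system, so treat any difference as indicative rather than exact.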

Update:

try this:

select s,
       substr(s,   nullif( instr(s,'.', -1) +1, 1) )
from ( 
     select 'no_extension_should_return_null' s from dual union
     select 'another.test.1'                    from dual union
     select 'another.test.doc'                  from dual union
     select 'another.test.docx'                 from dual union
     select 'another.test.stupidlng'            from dual 
     )
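
Why this works, as far as I can tell: instr returns 0 when there is no dot, so instr + 1 is 1, nullif turns that 1 into NULL, and substr with a NULL position returns NULL. A few edge cases are worth checking before relying on it. The filenames below are made up for illustration, and the comments state what I would expect from Oracle's documented substr behavior, not output I have captured.

```sql
select s,
       substr(s, nullif(instr(s, '.', -1) + 1, 1)) ext
from (
     select 'README'   s from dual union all  -- no dot: expect NULL
     select 'archive.'   from dual union all  -- trailing dot: expect NULL (start is past the end)
     select '.bashrc'    from dual union all  -- leading dot only: expect 'bashrc'
     select 'a.tar.gz'   from dual            -- several dots: expect 'gz'
     );
```

If dotfiles such as '.bashrc' should count as having no extension, an extra condition would be needed.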
3
  • Look at the UPDATE I added to my question. I ran a test and the function seemed to have no effect on performance. Am I not viewing the explain plan correctly (I use TOAD)?
    – gfrobenius
    Commented Jan 18, 2014 at 21:45
  • I added an order by to the tests; it appears you were right. The function took 17sec and the regexp way took 9sec.
    – gfrobenius
    Commented Jan 18, 2014 at 22:00
  • Thank you. This worked in 1sec! Final version: select substr(filename, nullif( instr(filename,'.', -1) +1, 1) ) ext from testTable order by ext;
    – gfrobenius
    Commented Jan 18, 2014 at 22:30
5

You need to use regular expressions.

Try

select regexp_substr(filename, '\.[^\.]*$')
from
    (select 'no_extension_should_return_null' filename from dual);

I don't have an Oracle database to test this on but this should be pretty close.

Check the Oracle docs on regexp_substr and Using regular expressions in Oracle database for more info.

Update

To drop the period from the file extension:

select substr(regexp_substr(filename, '\.[^\.]*$'), 2)
from
    (select 'abc.def' filename from dual);
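
Another option, assuming an Oracle version where regexp_substr accepts the sixth (subexpression) argument, which 11g does as far as I know, is to capture only the part after the last period instead of trimming it off afterwards. This is a sketch, not something I have benchmarked:

```sql
-- Arguments: source, pattern, start position, occurrence, match parameter, subexpression number.
-- Subexpression 1 is the text captured by ([^.]*), i.e. everything after the last period.
select filename,
       regexp_substr(filename, '\.([^.]*)$', 1, 1, null, 1) ext
from (
     select 'no_extension_should_return_null' filename from dual union all
     select 'another.test.docx'                        from dual
     );
```

Filenames without a period do not match the pattern at all, so ext should be NULL for them.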
3
  • Thanks, I'll look into the regexp method, but this answer fails on filenames with multiple periods like another.test.docx
    – gfrobenius
    Commented Jan 18, 2014 at 21:16
  • Thanks! I voted up. I need fastest so once I have a handful of methods to try I'll run all against millions of recs and see which is fastest.
    – gfrobenius
    Commented Jan 18, 2014 at 21:18
  • Actually I don't need the "." to be part of the result. I just ran this example against a 2 million row table. Seems pretty dang fast.
    – gfrobenius
    Commented Jan 18, 2014 at 21:38
2
SELECT NULLIF(substr(filename,instr(filename,'.',-1)+1,length(filename)-instr(filename,'.',-1)) from (select 'no_extension_should_return_null' filename from dual) t1, SELECT filename from t1);

Sorry, I have no Oracle instance to test it on, but I'm sure you get the idea.

3
  • +1 . . . Despite the bad formatting and the untested answer, this is probably the fastest method -- which is what the OP is asking for.
    Commented Jan 18, 2014 at 21:48
  • I couldn't get it to run as is, syntax error somewhere. I'll play with it in a little bit and see how it does.
    – gfrobenius
    Commented Jan 18, 2014 at 21:52
  • NULLIF was the ticket. I reformatted it correctly and it ran in 2secs with an order by, just like the other tests against a 2 million row table. Final result: SELECT NULLIF(substr(filename,instr(filename,'.',-1)+1,length(filename)-instr(filename,'.',-1)),filename) ext from testTable order by ext;
    – gfrobenius
    Commented Jan 18, 2014 at 22:05
0

As I understand it, you can use the DECODE function. The query goes as follows:

SELECT substr(filename,instr(filename,'.',-1)+1,length(filename)- DECODE(INSTR(filename,'.',-1),0,LENGTH(filename),INSTR(filename,'.',-1))) from (select 'no_extension_should_return_null' filename from dual);
0

Perhaps the simplest would be to use

regexp_substr(filename, '[^\.]*$')

It works on filenames with multiple periods and returns no period.


For filenames without an extension, the following could be used:

select case when filename like '%.%' then regexp_substr(filename, '[^.]*$') end EXT from (select 'no_extension_should_return_null' filename from dual)

1
  • The only problem with this one is when the filename has no extension. For example, regexp_substr('test', '[^\.]*$') will return 'test'.
    – idavid2013
    Commented Sep 1, 2016 at 6:10
