Nov 252013
 

On Friday I asked how this query could be fixed without actually changing it:

select *
from   this_simple_join tsj, 
       does_not_work dnw
where  tsj.column_1 = dnw.column_1
and    tsj.column_2 = dnw.column_2;

ORA-01722: invalid number 

As the commenters identified it’s to do with implicit conversions. A few different workarounds were found, with Matthias getting the essence of my solution. For an explanation of my setup, keep reading.

I created my tables. Crucially, one of the join columns (column_1) is a varchar2 in one table and an integer in the other:

create table this_simple_join (
  column_1 varchar2(10), 
  column_2 varchar2(10)
);

create table does_not_work (
  column_1 integer, 
  column_2 varchar2(10)
);

I then put some data in. The varchar2 column contains has two records, one with a letter and one with a number:

insert into this_simple_join values ('a','Y');
insert into this_simple_join values ('1','N');

insert into does_not_work values (1,'N');

commit;

If we now run the query, we get an “invalid number” exception. This happens because we’re comparing a string to a number. When we do this, Oracle implicitly converts the string to a number. Because ‘a’ isn’t a number we get the exception:

select *
from   this_simple_join tsj, 
       does_not_work dnw
where  tsj.column_1 = dnw.column_1
and    tsj.column_2 = dnw.column_2;

ORA-01722: invalid number 

One way to avoid the error is to prevent the row where column_1 = ‘a’ from being accessed. How can we do this? By creating an index!

create index i on this_simple_join (column_2);

Now when we run the query, we have an index range scan on the THIS_SIMPLE_JOIN table, rather than a full-table scan. This is because DOES_NOT_WORK only joins to records in THIS_SIMPLE_JOIN where COLUMN_1 is a number. This means we never touch the records with the letter ‘a’, no implicit conversion is applied and we see the result we expect!

select *
from   this_simple_join tsj, 
       does_not_work dnw
where  tsj.column_1 = dnw.column_1
and    tsj.column_2 = dnw.column_2;

COLUMN_1   COLUMN_2     COLUMN_1 COLUMN_2
---------- ---------- ---------- ----------
1          N                   1 N         

Note however, this only works because none of the rows in DOES_NOT_WORK match rows where THIS_SIMPLE_JOIN.COLUMN_1 is a string. If we put a row in DOES_NOT_WORK, where COLUMN_2 = ‘Y’, we’re back where we started…

insert into does_not_work values (1,'Y');

select *
from   this_simple_join tsj, 
       does_not_work dnw
where  tsj.column_1 = dnw.column_1
and    tsj.column_2 = dnw.column_2;

ORA-01722: invalid number 

Beware implicit conversions!

  3 Responses to “SQL Quiz: How can we fix this query – the answer”

  1. In my opinion the important sentence here is Beware implicit conversions!. Even if the index or other subtle changes resolve the error with the current data, this might not be true if the plan changes (e.g. new statistics) or – as you pointed out – if the data within the tables changes.
    So always have matching datatypes or convert explicitly.
    By the way: Is there a table in Oracle where one can see if implicit conversions occurred when executing the DMLs used?

  2. I hope this quiz has no relevance in praxis.
    It should be clear, that implicit conversions *are always a bug* and have to be fixed in the application code.

    This example shows, that SQLs with implicit conversions might behave non-deterministic, which should never be the case in my opinion.
    This is not a fault of the Oracle-Code rather than the application code.
    In this sense, I find my first (pseudo-)fix “delete all data which can cause an exception when executed this SQL” an even better one than your (pseudo-)fix.

    @Salek:

    select
    *
    from
    v$sql_plan p
    where
    p.filter_predicates like '%TO_NUMBER%' or
    p.filter_predicates like '%TO_DATE%'

    might probably be a starting point

    • I agree Matthias – these are bugs, there are cases where you’re not able to modify the application however (3rd party software for example).

      I certainly wouldn’t recommend this as a solution to the problem in the real world, more of an interesting thought exercise :) It’s something I found in an ETL process – the plan changed which meant we ended up with (new) implicit conversions. We got around it by applying a regex to the character column to ensure it is a number.

 Leave a Reply

(required)

(required)

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>