May 05 2014
 

There’s a neat optimization in Oracle I found while tinkering around (look closely at the predicate section):

select oracle_can_now_use,
       a_function_based_index
from   queries_with_no_functions
where  a_date_col_with_an_fbi = :a_date
and    oracle_version >= '11.2.0.2';

-------------------------------------------------------------------------
| Id  | Operation                   | Name                      | Rows  |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                           |     1 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| QUERIES_WITH_NO_FUNCTIONS |     1 |
|*  2 |   INDEX RANGE SCAN          | QUWF_DATE_FBI             |     1 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ORACLE_VERSION">='11.2.0.2' AND "A_DATE_COL_WITH_AN_FBI"=:A_DATE)
   2 - access(TRUNC(INTERNAL_FUNCTION("A_DATE_COL_WITH_AN_FBI"))=TRUNC(:A_DATE))

Line 2 of the predicates states that the INDEX RANGE SCAN was accessed via TRUNC(INTERNAL_FUNCTION()) – a function-based index was used even though no function appears in the where clause!
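
For reference, a setup along these lines could produce a plan like the one above – the key point is that the index is on TRUNC() of the date column (the datatypes of the non-date columns are my assumptions):

create table queries_with_no_functions (
  a_date_col_with_an_fbi date,
  oracle_version         varchar2(20),
  oracle_can_now_use     varchar2(30),
  a_function_based_index varchar2(30)
);

-- the function-based index referenced in the plan: TRUNC() of the date column
create index quwf_date_fbi
  on queries_with_no_functions ( trunc(a_date_col_with_an_fbi) );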

This optimization appeared in 11.2.0.2 (fix 9263333) – I have to thank the people of Oracle-L (particularly Velikikh Mikhail) for identifying when this improvement came in.

This optimization isn’t restricted to TRUNC() on date columns however.
Continue reading »

Apr 11 2014
 

Over the past few articles we’ve looked at database joins. It started out with me noticing that joins appear to be getting bad press recently and wondering whether they really are the root of all evil that some people seem to think they are.

We’ve seen that denormalizing removes joins, giving performance benefits to primary key lookups. This benefit can come at the cost of non-primary key lookup queries though – often the extra work for these “search” queries outweighs the gains made for the primary key lookups.

Query performance (particularly for “small” systems) is just part of the story though. The real cost of removing joins is the impact on data modification. Aside from (potentially) vastly increasing the number of records we must update to keep data consistent, denormalization can introduce waiting issues for concurrent updates. In the worst case we may introduce application bugs in the form of deadlocks.
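
As a quick illustration (the table and column names below are invented for the example): if a customer’s address is copied onto every order row, changing that address means updating – and locking – all of their orders, instead of one row in a customers table.

-- denormalized: the address is repeated on every order row
update orders_with_customer_details
set    customer_address = '1 New Street'
where  customer_id      = 42;

-- normalized: a single row holds the address, so only one row is updated and locked
update customers
set    customer_address = '1 New Street'
where  customer_id      = 42;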

We’ve seen there are some cases where joins do result in inefficient queries. These are usually the result of poor design decisions or limitations of the optimizer. Better design and use of Oracle features can overcome the worst of these, however.

I hope you’ve found this series (and the blog as a whole!) useful and informative. If there’s anything else you’d like to see on the subject of joins, get in touch or say so in the comments.

I’m going to take a break from regular blogging for a while to work on other projects. Thanks to those of you who’ve been following. If you’d like to receive posts when I start again, just enter your email address in the form below!

Apr 07 2014
 

So far in the joins series we’ve looked at the effect removing joins (via denormalization) has on performance. We’ve seen that joins can cause primary key lookups to do more work. Lowering the normalization level to remove these can negatively impact “search” style queries though. More importantly, we’ve seen that the real cost of denormalizing to remove joins comes when updating records, potentially leading to concurrency waits and application bugs.

So are joins always “good”?

The fastest way to do anything is to not do it at all. If joins aren’t necessary to answer your queries, including them will add some overhead. Also, like any tool, there are situations where adding a join may substantially slow your query down.

Here are some examples of where joins may be “expensive”, along with strategies for coping with them.
Continue reading »

Apr 04 2014
 

In the previous article in the joins series we compared query performance between a third normal form schema and the same schema denormalized to second normal form. We then extended the example so our denormalized schema was in just first normal form.

The normalized approach performed better overall. The differences were small though – generally just a few consistent gets, and all the queries executed in under a second. As Jeff Atwood points out, hardware is powerful enough that for most systems the performance difference between more normalized and less normalized schemas won’t make much difference to your queries. I disagree slightly with one of his conclusions though – that normalization “doesn’t matter”.

To see why, let’s look at a different use-case: data modification. While this isn’t to do with joins directly, it is a very important consideration when deciding to denormalize to “remove joins”.

I’m not going to get into the details of the performance of the updates – there’s a much bigger problem waiting for us. Can you see what it is?
Continue reading »

Mar 31 2014
 

Continuing the series on joins, I’m going to look at denormalization. This process reduces the number of joins necessary to return results for a schema.

One of the big arguments against normalizing data is “for performance”. The process of normalization creates new tables as relations are decomposed according to their functional dependencies. This means (more) joins are necessary to return the same results.

A google of “database normalization performance” turns up several articles like this, this and this, all advocating denormalizing your data to improve performance. There’s no concrete discussion or test cases showing why you should denormalize, just hand-wavy arguments about joins being bad.

I wanted to test this to see if normalizing really makes performance worse. If you’ve been preaching “you must denormalize for performance”, my conclusions may surprise you.
Continue reading »

Mar 24 2014
 

The following update forms the basis of a DIY connection pool. It is intended to allocate an unused session to whoever requests it:

declare

  sess connection_pool.session_name%type;
  
begin
  
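  -- claim the least recently used free session and return its name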
  update connection_pool
  set    in_use = 'Y',
         last_used = systimestamp
  where  rowid in (
      select rowid 
      from   (
          select rowid
          from   connection_pool
          where  in_use = 'N'
          order  by last_used
      )
      where  rownum = 1
  )
  returning session_name 
  into      sess;
  
  dbms_output.put_line ( sess );
  
end;
/

There’s a serious flaw with this approach however – can you spot it?

Answers in the comments!

Mar 17 2014
 
create table from_12c_onwards (
  you_can_have_your_pk number 
    generated always as identity,
  so_you_dont_have_to_type_it varchar2(10)
);

insert into from_12c_onwards (
  so_you_dont_have_to_type_it
) values (
  'yes'
);

insert into from_12c_onwards (
  so_you_dont_have_to_type_it
) values (
  'Yes'
);

insert into from_12c_onwards (
  so_you_dont_have_to_type_it
) values (
  'YES!'
);

select * from from_12c_onwards;

YOU_CAN_HAVE_YOUR_PK SO_YOU_DON
-------------------- ----------
                   1 yes
                   2 Yes
                   3 YES!

If you’d like more details, there’s a nice article on oracle-base discussing how identity columns work. If you prefer, you can assign a sequence to be a column default instead, as discussed here.
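
For completeness, the sequence-default approach looks something like this (the sequence and table names here are made up for illustration):

create sequence pk_values_seq;

create table uses_a_sequence_default (
  the_pk  number default pk_values_seq.nextval,
  a_value varchar2(10)
);

insert into uses_a_sequence_default ( a_value )
values ( 'works' );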

Mar 14 2014
 

Somehow I ended up with two rows in my table both with the same primary key value:

select the_pk 
from   dodgy_data;

    THE_PK
----------
         1
         1

The primary key does exist and THE_PK is the only column in it, so we can’t insert another row with THE_PK = 1:

select constraint_name, constraint_type
from   user_constraints
where  table_name = 'DODGY_DATA';

CONSTRAINT_NAME      CONSTRAINT_TYPE
-------------------- ---------------
TAB_PK               P              

select column_name 
from   user_cons_columns
where  constraint_name = 'TAB_PK';

COLUMN_NAME
-----------
THE_PK     

insert into dodgy_data ( the_pk )
values ( 1 );

SQL Error: ORA-00001: 
  unique constraint (CHRIS.TAB_PK) violated

How? Answers in the comments!

Mar 10 2014
 
alter table from_11g add 
  column_with_a varchar2(100) default 
    'is instant provided you set it' not null;

Prior to 11g the above statement could take a very long time if the table was “big”. I was reminded of this by a recent discussion on Oracle-L. Tom Kyte also discusses this in more detail on asktom.
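
A rough way to try this yourself is to create a throwaway copy of the table first (the row count and padding column below are arbitrary choices of mine):

create table from_11g as
  select rownum id, rpad('x', 100, 'x') padding
  from   dual
  connect by level <= 100000;

set timing on

alter table from_11g add 
  column_with_a varchar2(100) default 
    'is instant provided you set it' not null;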

For a quick and dirty demo, keep reading.
Continue reading »

Mar 07 2014
 
with recursive_query_subfactoring (
    you_can_generate_rows, yes
) as (
  select 1 you_can_generate_rows,
         'from nothing!' yes
  from   dual
  union  all
  select you_can_generate_rows + 1,
         rpad(yes, 
              13 + you_can_generate_rows, 
              '!') yes
  from   recursive_query_subfactoring
  where  you_can_generate_rows < 100
)
select you_can_generate_rows, yes
from   recursive_query_subfactoring
where  rownum <= 10;

YOU_CAN_GENERATE_ROWS YES               
--------------------- --------------------
                    1 from nothing!       
                    2 from nothing!!      
                    3 from nothing!!!     
                    4 from nothing!!!!    
                    5 from nothing!!!!!   
                    6 from nothing!!!!!!  
                    7 from nothing!!!!!!! 
                    8 from nothing!!!!!!!!
                    9 from nothing!!!!!!!!! 
                   10 from nothing!!!!!!!!!! 

If you want more details, Rob van Wijk has a nice detailed post about CTEs and Jonathan Lewis discusses some possible issues with hinting these.

Personally I’ll be sticking with connect by for now as it’s quicker to type! ;)
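
For comparison, a connect by version of the same row generator could look like this (a sketch of my own):

select level you_can_generate_rows,
       rpad('from nothing!', 12 + level, '!') yes
from   dual
connect by level <= 10;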