Jan 132014
 

In addition to SQL I love games of all kinds. Scrabble is one of my favourite games, so I thought I’d combine these passions and build a scrabble word finder using just SQL!

First up, we need to create a table holding all the letters and the points values for each (adjust the points values accordingly if not using the English edition):

create table letters as (
  select  'a' letter, 1 points from dual union 
  select  'b' letter, 3 points from dual union
  select  'c' letter, 3 points from dual union
  select  'd' letter, 2 points from dual union
  select  'e' letter, 1 points from dual union
  select  'f' letter, 4 points from dual union
  select  'g' letter, 2 points from dual union
  select  'h' letter, 4 points from dual union
  select  'i' letter, 1 points from dual union
  select  'j' letter, 8 points from dual union
  select  'k' letter, 5 points from dual union
  select  'l' letter, 1 points from dual union
  select  'm' letter, 3 points from dual union
  select  'n' letter, 1 points from dual union
  select  'o' letter, 1 points from dual union
  select  'p' letter, 3 points from dual union
  select  'q' letter, 10 points from dual union
  select  'r' letter, 1 points from dual union
  select  's' letter, 1 points from dual union
  select  't' letter, 1 points from dual union
  select  'u' letter, 1 points from dual union
  select  'v' letter, 4 points from dual union
  select  'w' letter, 4 points from dual union
  select  'x' letter, 8 points from dual union
  select  'y' letter, 4 points from dual union
  select  'z' letter, 10 points from dual
);

Next, we need to build a list of possible words. I’ve used the TWL06 list downloadable at http://www.scrabblestop.com/lists-of-scrabble-words.html and then loaded these into the table WORDS:

create table words (
  word varchar2(30) primary key
);

We can now use recursive subquery factoring (requires 11gR2) to enter all your tiles and generate the list of possible combinations. Limit to eight characters for now – your seven tiles, plus linking to one on the board. (credit to Radislav Golian: this is based on his Boggle SQL solver (scroll down the comments)):

with tiles (
    not_used_chars, word, max_length) as (
  select :tiles not_used_chars,
         cast(null as varchar2(16)) word,
         8 max_length
  from   dual
  union  all
  select substr(b.not_used_chars, 1, a.lev - 1) ||
           substr(b.not_used_chars, a.lev + 1) 
             as not_used_chars, 
         b.word || 
           substr(b.not_used_chars, a.lev, 1) 
             as word,
         max_length
  from   tiles b,
         (select level lev 
          from   dual 
          connect by level <= 8) a 
  where  nvl(length(b.word), 0) < b.max_length 
  and    a.lev <= 
             nvl(length(b.not_used_chars), 0) 
)
select * from (
  select t.word
  from   tiles t
  where  length(t.word) > 1
);

To calculate the value of each combination, we can add the score in the recursive query. This is done by joining the letters table we built earlier. By then linking this to the words table we can return only those combinations that are real words. You can supply an underscore “_” to represent blanks:

with tiles (
    not_used_chars, word, max_length, score) as (
  select :tiles not_used_chars,
         cast(null as varchar2(16)) word,
         8 max_length,
         0 score
  from   dual
  union  all
  select substr(b.not_used_chars, 1, a.lev - 1) ||
           substr(b.not_used_chars, a.lev + 1) 
             as not_used_chars, 
         b.word || 
           substr(b.not_used_chars, a.lev, 1) 
             as word,
         max_length,
         score + nvl(
           (select points 
            from   letters
            where  letter = 
              substr(b.not_used_chars, a.lev, 1)
         ), 0) score
  from   tiles b,
         (select level lev 
          from   dual 
          connect by level <= 8) a 
  where  nvl(length(b.word), 0) < b.max_length 
  and    a.lev <= 
             nvl(length(b.not_used_chars), 0) 
)
select * from (
  select distinct t.word, t.score
  from   tiles t, words w
  where  length(t.word) > 1
  and    w.word like t.word
  order  by score desc
)
where  rownum <= 3

This works pretty well. If we enter ‘scrabble’ for the tiles, we find the top words are:

WORD     SCORE
-------- -----
scrabble    14
clabbers    14
clabber	    13

Good stuff, but the most points are scored by using the bonus squares. We can add in the letter bonuses by creating a “multiplier” table. This has the possible positions in the word for the bonus square and the value (double or treble). The following shows a treble letter in the first position (sticking with eight positions for your seven plus one already on the board):

with multiplier as (
  select 1 pos, 3 mul from dual union
  select 2 pos, 1 mul from dual union
  select 3 pos, 1 mul from dual union
  select 4 pos, 1 mul from dual union
  select 5 pos, 1 mul from dual union
  select 6 pos, 1 mul from dual union
  select 7 pos, 1 mul from dual union
  select 8 pos, 1 mul from dual
)

By selecting this in the recursive subquery we can multiply the letter values calculating the running score. Putting it all together we have:

with multiplier as (
  select 1 pos, 3 mul from dual union
  select 2 pos, 1 mul from dual union
  select 3 pos, 1 mul from dual union
  select 4 pos, 1 mul from dual union
  select 5 pos, 1 mul from dual union
  select 6 pos, 1 mul from dual union
  select 7 pos, 1 mul from dual union
  select 8 pos, 1 mul from dual
), tiles (
    not_used_chars, word, max_length, score, p) as (
  select :tiles not_used_chars,
         cast(null as varchar2(16)) word,
         8 max_length,
         0 score,
         1 p
  from   dual
  union  all
  select substr(b.not_used_chars, 1, a.lev - 1) ||
           substr(b.not_used_chars, a.lev + 1) 
             as not_used_chars, 
         b.word || 
           substr(b.not_used_chars, a.lev, 1) 
             as word,
         max_length,
         score + nvl(
           (select points *
              (select mul
               from   multiplier
               where  pos = p)
            from   letters
            where  letter = 
              substr(b.not_used_chars, a.lev, 1)
         ), 0) score,
         p + 1 p
  from   tiles b,
         (select level lev 
          from   dual 
          connect by level <= 8) a
  where  nvl(length(b.word), 0) < b.max_length 
  and    a.lev <= 
             nvl(length(b.not_used_chars), 0) 
)
select * from (
  select distinct t.word, t.score
  from   tiles t, words w
  where  length(t.word) > 1
  and    w.word like t.word
  order  by score desc
)
where  rownum <= 3

Using the letters from ‘scrabble’ again and triple letter bonus on the first letter, we get:

WORD     SCORE
-------- -----
clabbers    20
clabber	    19
barbels	    17

We still need to calculate total word multipliers. It would also be good if the bonus locations were dynamically generated, so we didn’t have to go through and enter all the combinations separately.

Can you extend the SQL above to do this? Is there a better way to generate the highest scoring words?

If you have better alternatives, put your approaches in the comments!

 Leave a Reply

(required)

(required)

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>