Interesting SQL: Grouped ranked rows


The following is some SQL code written in Sqlite SQL syntax to output a rank column on rows of grouped items. The statement is pure SQL and doesn’t use a user defined function or external program logic to calculate the rank.

Here is and example of what the output looks like using the sqlite3 command line tool.

Example Output:

f1|f2|fy
a|blue|1
a|yellow|2
b|green|1
c|black|1
c|false|2
c|true|3
d|sdsdsd|1
e|orange|1

The above example is a little contrived but will translate on to real world queries with a little effort.

It has rows with values (column f2) which are members of groups (column f1). The groups being a single letter a, b, c, d and e.

Column f1 contains values that are repeated on several rows and contain the values that are part of a specific group. The column f2 contains the real data we are interested in. Column fy contains the rank (ie. a running count of 1st, 2nd, 3rd) in the particular group the value is in. for example the value ‘yellow’ is the second in the group ‘a’.

Here is the sql that produced the above output including the create table, insert and finally the select statement that produced the results.

The Code:

CREATE TABLE tbl1 (
  id INT primary key,
  f1 varchar(30),
  f2 varchar(30)
);

insert into tbl1 values (1, 'a', 'yellow');
insert into tbl1 values (2, 'a', 'blue');
insert into tbl1 values (3, 'b', 'green');
insert into tbl1 values (4, 'c', 'true');
insert into tbl1 values (5, 'e', 'orange');
insert into tbl1 values (6, 'c', 'false');
insert into tbl1 values (7, 'd', 'sdsdsd');
insert into tbl1 values (8, 'c', 'black');

.headers on

select
  t1.f1,
  t1.f2,
  (
    select
      t1.rowid - t3.rowid + 1
    from
      (
        select
          distinct f1, f2
        from
          tbl1
        order by f1
      ) t3
    where
      t3.f1 = t1.f1
  ) fy
from
  (select distinct f1, f2 from tbl1) t1
order by
  t1.f1,
  t1.f2,
  fy
;

It’s actually quite hard to get your head around a statement like this.
Most of the magic is done by using the Sqlite rowid of the inner select statement and the rowid of the outermost select. The actual rank value is calculated using a select statement that outputs a single value per output row. So the rank select in enclosed in brackets and placed as a column output in the outer select.

The actual rank in this query is determined by the order in which the appear in the outer select. You can see this by changing the order by from

order by
t1.f1,
t1.f2,
fy

to

order by
t1.f1,
t1.f2 desc,
fy

Other databases have build in features that do the same thing but the above is code can be made to work on databases that don’t have this feature. See the Microsoft MSDN documentation of the RANK() feature

Conclusion:

Obviously the performance of this query may not be the best depending on the number of rows and size of the groups being worked with, but its an interesting solution that only uses a few select statements and the rowid feature.

Advertisements

2 Comments

  1. Hi,
    I have for many years tried to keep track of an ever increasing birthday and address list.
    How difficult is it using SQLITE?

    1. Hi Wendy,
      Sqlite certainly could be used to keep track of calendar events. It could even do this in a web browser or on your phone!
      Having said that you would need to build the user interface in something else, as Sqlite is the raw data storage layer and does not contain forms and reports like MS Access.
      For a more insight into what Sqlite is, take a look at the official Sqlite web page: Appropriate Uses For SQLite
      There are also several administrative Sqlite Graphical User Interface tools that allow you to look at the data in an Sqlite database, but if you really want a full application you will need to built that half with some other tool that can produce an application GUI. You could use the sqlite shell to extract the data as is but it wouldn’t be very pretty or easy for users other than those technically proficient in SQL.
      A good place to start (but perhaps not for the faint hearted) would be to write a Tcl/Tk front end to your database. The Tk part is the user interface part (read more here http://en.wikipedia.org/wiki/Tk_(framework))

      This post was about the low level details and data manipulation in a specific manner and not a how-to built entire programs.
      Perhaps I will post a demo GUI to access a simple database when I have time.

      Thanks for your question.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s