Wednesday, November 26, 2008

select count(*) as omg_wtf from postgres;

This morning I discovered something that made me want to jump up on my desk and drop the f-bomb in my office with heart and soul. It was as though I'd just been sucker punched by a circus midget then fisted by his exceptionally well trained elephant as I doubled over in pain.

Let me tell you a story.

Once upon a time this morning, I was trying to figure out why the fuck one of our servers had gone postal and exploded, leaving behind a stinking cloud of piss mist and gaseous fecal matter yet again.

Trawling through the logs of one of the web apps on the machine, I found a bunch of requests that had each taken 15 seconds to complete. After some investigation, I found that these requests were all doing a "select count(*) ..." from a table with millions of rows. OMG. So I fired up a postgres client and did an explain for the same query, and it told me it was doing a sequential fucking scan to find the count!

I googled intensely, and what I found made my jaw drop and my tongue roll out like a red carpet: There is no way to optimize count operations in postgresql. Read here for a brief explanation.

OMG thanks postgres! Mind if I rape you?!

Solutions

There are a bunch of work arounds you can find on google, but they all boil down to 2 approaches: cache the count, or guess the count.

So.....

Triggers anyone?!

 1 -- Create foos table
 2 CREATE TABLE foos (
 3   id   SERIAL PRIMARY KEY,
 4   name VARCHAR(255)
 5 );
 6
 7 -- Create table foos_count to hold a single row that contains the number of
 8 -- rows in the foos table
 9 CREATE TABLE foos_count (
10   counter  INTEGER
11 );
12 insert into foos_count values(0);
13
14 -- Increment count function
15 CREATE FUNCTION inc_foo_count() RETURNS trigger AS '
16 BEGIN
17 UPDATE foos_count SET counter = counter + 1;
18 RETURN NEW;
19 END;
20 ' LANGUAGE plpgsql;
21
22 -- add a trigger to increment foos_count.counter when inserting into foos
23 CREATE TRIGGER inc_foo_counter AFTER INSERT ON foos
24   EXECUTE PROCEDURE inc_foo_count();
25
26 -- Decrement count function
27 CREATE FUNCTION dec_foo_count() RETURNS trigger AS '
28 BEGIN
29 UPDATE foos_count SET counter = counter - 1;
30 RETURN NEW;
31 END;
32 ' LANGUAGE plpgsql;
33
34 -- add a trigger to decrement foos_count.counter when inserting into foos
35 CREATE TRIGGER dec_foo_counter AFTER DELETE ON foos
36   EXECUTE PROCEDURE dec_foo_count();

Oh yeah. Nothing like a good trigger to make you feel like you just got probed by a bunch of aliens with a fetish for egg beaters.


How about rules?

 1 -- Create foos table
 2 CREATE TABLE foos (
 3   id   SERIAL PRIMARY KEY,
 4   name VARCHAR(255)
 5 );
 6
 7 -- Create foos_count to hold a single row that contains the number of rows in
 8 -- the foos table
 9 CREATE TABLE foos_count (
10   counter  INTEGER
11 );
12 insert into foos_count values(0);
13
14 -- add a rule to increment foos_count.counter when inserting into foos
15 CREATE RULE inc_foo_count AS ON INSERT TO foos DO ALSO
16     UPDATE foos_count set counter = counter + 1;
17
18 -- add a rule to decrement foos_count.counter when inserting into foos
19 CREATE RULE dec_foo_count AS ON DELETE TO foos DO ALSO
20     UPDATE foos_count set counter = counter - 1;

Rules don't look all that bad— certainly much nicer than triggers.


You could also obtain an estimate of the count with this:
SELECT reltuples FROM pg_class WHERE oid = 'foos'::regclass::oid;

This will give you the count at the time of the last ANALYSE.

My lists

Ok, so lets take a look at what impact this experience has had on my lists.

Postgres just jumped from my Christmas List over to my Shit List — fuck you postgres, I'm gonna find you, and I'm gonna take my knife, and I'm gonna use your cock for a sheath.

The List Of Things I Should Have Known By Now grew by 1 for select count sucking the scrote.

The List Of Things I Should Have Played With By Now also grew by 1 for rules since this is the first time I've actively investigated them.