Boost PostgreSQL Performance with Partial Indexes

Introduction

In our backend application, we often include unnecessary rows in our database indexes, making them larger than they need to be. This can lead to increased query latency because the database has to traverse a bigger index, and it also consumes more storage space. By optimizing indexes and only including the rows that are actually used in queries, you can save significant space and improve performance.

In this post, I’ll share some real-world examples where I reduced index size and improved query latency by carefully reviewing and optimizing our indexes. Let’s break it down.

Optimizing User Online/Offline Status Tracking

Let’s say we’re tracking user online/offline status in a table with two columns:

  • user_id (references the user)

  • is_online (a boolean indicating whether the user is online or not).

We have an API that takes a user_id (or a list of user_ids) and returns their online status.

Step 1: Basic Optimization

To optimize the query:

1
SELECT is_online FROM user_status WHERE user_id = 'user_id' LIMIT 1;  

We add an index on user_id:

1
CREATE INDEX user_id_index ON user_status(user_id);  

This gives us a significant performance boost. But can we do better?

Step 2: Leveraging Partial Indexes

Let’s assume that, at most, only 20% of users are online at the same time. This means 80% of the rows in the user_status table represent offline users, which we don’t need to index.

To optimize further, we can rewrite the query to only check for online users:

1
2
3
4
SELECT true AS is_online  
FROM user_status
WHERE user_id = 'user_id' AND is_online = true
LIMIT 1;

If the query returns a result, the user is online. If not, the user is offline.

Now, let’s create a partial index to support this query:

1
2
CREATE INDEX user_id_only_is_online ON user_status(user_id)  
WHERE is_online = true;

This index is 80% smaller than the original one because it only includes online users. As a result, it’s faster to query, uses less storage, and scales better to support more users.


This approach not only reduces the index size but also improves query performance by focusing only on the data that matters.

Optimizing Support Ticket Queries

Let’s say we have a table for support tickets where users can submit complaints or issues, and the support team helps resolve them. The tickets table has columns like id, user_id, status, and others. The status can be one of: pending, in_progress, or done.

Over time, the majority of tickets will be marked as done, while only a small portion will remain pending or in_progress. When clients visit their ticket page, we typically only show the pending and in_progress tickets because they likely don’t care about resolved (done) tickets.

Here’s the query we use to fetch these tickets:

1
2
3
SELECT * FROM tickets  
WHERE user_id = 'user_id' AND status != 'done'
LIMIT 10;

(Note: SELECT * is not recommended in production and is only used here for teaching purposes.)

Initial Approach: Basic Indexing

At first glance, you might think of adding an index on user_id or a multi-column index on (user_id, status). While these indexes help, they aren’t the most efficient for this specific query.

Better Approach: Partial Indexing

Since we only care about pending and in_progress tickets for this query, we can create a partial index that excludes done tickets entirely:

1
2
CREATE INDEX not_done_tickets ON tickets (user_id)  
WHERE status != 'done';

Why This Works

  1. Smaller Index Size: This index only includes pending and in_progress tickets, so it doesn’t grow as the number of done tickets increases.

  2. Faster Queries: The database only needs to traverse a smaller index, making the query faster.

  3. Space Efficiency: The index remains compact and scales better over time.

When to Use a Full Index on user_id

We still need a regular index on user_id for cases where we want to fetch all tickets for a user (e.g., for admin purposes or historical data). However, this isn’t the default behavior for the ticket page.

Conclusion

Before jumping into complex solutions like partitioning or sharding, consider simpler optimizations like partial indexing. In the examples above, we saw how small changes—like creating targeted indexes—can reduce storage, improve performance, and scale efficiently. Often, a simpler, smarter solution can save time and effort while delivering great results. Always ask: Is there an easier way?